Monoids.
The reason why do we need Monoid?
is that you can run things parallely and aggregate the results of type A
if we have monoid for type A.
If you can prove something is a Monoid, you can parallelize trivially.
Steps:
1. Write your algorithm..
2. If you can prove your algorithm is monoid, then we can run list of values with type as our algorithm, parallely.
/**
* Nondeterministically sequence `fs`, collecting the results using a `Monoid`.
*/
def aggregate[A: Monoid](fs: Seq[F[A]]): F[A] =
map(gather(fs))(_.foldLeft(implicitly[Monoid[A]].zero)((a,b) => implicitly[Monoid[A]].append(a,b)))
Nondeterminism[Task].aggregate()
val li:List[Int] = List(1,2,3)
//Identity element.
// Associativity: given two values of type A, it has to return one value of type A.
li.foldLeft(0)( (x,y) => x+y)
// here's our definition of a monoid again
trait Monoid[A] {
// an identity element
def id: A
// an associative operation
def op(x: A, y: A): A
}
object Monoid {
//easy to define for strings
implicit val stringMonoid = new Monoid[String] {
def id = ""
def op(x: String, y: String) = x + y
}
// here's where things start to get more interesting, this says "I
// can give you a monoid for any Option[A] if you can give me a
// monoid for A"
implicit def optionMonoid[A](implicit am: Monoid[A]): Monoid[Option[A]] =
new Monoid[Option[A]] {
def id = None
def op(x: Option[A], y: Option[A]): Option[A] = (x,y) match {
case (x, None) => x
case (None, y) => y
case (Some(x),Some(y)) => Some(am.op(x,y)) // here we use the A monoid to add two As
}
}
// given an monoid for B, I can give you a monoid for functions
// returning B, by running the functions on the input and adding the
// results
implicit def functionMonoid[A,B](implicit bm: Monoid[B]): Monoid[A => B] = new Monoid[A => B] {
def id = A => bm.id
def op(x: A => B, y: A => B): A => B = { a =>
bm.op(x(a), y(a))
}
}
// we can use a monoid to collapse a bunch of values, here we take a
// list and function that takes us to a value for which we have a
// Monoid, and we can then collapse the list into a single value.
implicit def fold[A](la: List[A])(implicit am: Monoid[A]): A =
la.foldLeft(am.id)(am.op)
}
import Monoid._
import scalaz.Nondeterminism
import scalaz.concurrent.Task
val fizz: Int => Option[String] = x => if(x % 3 == 0) Some("fizz") else None
val buzz: Int => Option[String] = x => if(x % 5 == 0) Some("buzz") else None
val bazz: Int => Option[String] = x => if(x % 7 == 0) Some("bazz") else None
//We have to combine functions.
//We have combined list of functions into one function. How cool is that? :)
val funcs:List[(Int) => Option[String]] = List(fizz,buzz,bazz)
val fizzbuzzbazz:Int => Option[String] = fold(funcs)
// handle the Nones
val fbbOrInt: Int => String = { i =>
(fizzbuzzbazz(i) getOrElse i.toString) + ","
}
val strings:List[String] = List(3,5,7,105) map fbbOrInt
strings
fold(strings)
// What prompted the post was a question about having some collection of IDs, doing various lookups/transformations on those IDs (some of which can fail), merging the results into a single result, and returning that result. The question didn't specify this result type, but one thought that occurred to me is that it could be a chunked HTTP response, which fairly obviously forms a monoid, i.e. there's an empty chunk and an associative append operation. In other words, if I do N things, each of which returns a chunked HTTP response, it's easy to turn that into one chunked HTTP response.
// OK. So scalaz has a very powerful tool for "doing stuff," including I/O, exception catching, and concurrency, called Task. Really describing Task would take us too far afield (read Tim's great post!), but a Task[A] is a Task that, when run, returns an A.
// Remember when I said Task also catches exceptions? That means the Task can end up in either a success or failure state. It's a lot like the standard Future in that sense. But there's also a function on Task, .attempt, that transforms a Task[A] to a Task[Throwable \/ A]. \/ is like Either, but works in for-comprehensions.
// So if I have a function:
// def doStuff(id: Int): Task[Chunk] = ???
// and a bunch of IDs:
// val ids = List(42, 96, 2, 17, 5, 9, 23...)
// then obviously I can do:
// val tasks: List[Task[Chunk]] = ids.map(doStuff)
// Now, what I'd really like is either all the Chunks combined, or a list of the errors that occurred in the Tasks building the Chunks. The first question is: is there a convenient way to run all the Tasks, maybe even in parallel, and accumulate the results? Actually, there are several, but the one I'm interested in is Nondeterminism[Task].aggregate. As you can see, it takes a Seq[F[A]] and returns a F[A], but it can only do that if A is a Monoid.
// Chunk may be a Monoid (we're assuming it is), but what about accumulating errors? It turns out scalaz has another handy type, ValidationNel, which is like \/ but accumulates errors on the left. In fact, the Nel refers to a NonEmptyList, i.e. a ValidationNel is either a "right" with a value or a "left" with a list of errors that can't be empty (because what would a "left, but empty list of errors" mean)?
// Now, if I have a \/, I can turn it into a ValidationNel easily: .validation.toValidationNel. Great. But is this a Monoid? It turns out that it is, if and only if its right is a Monoid. So instead of
// val tasks: List[Task[Chunk]] = ids.map(doStuff)
// I say:
// val tasks: List[Task[ValidationNel[Throwable, Chunk]]] = ids.map(doStuff(_).attempt.map(_.validation.toValidationNel))
// In other words, I have a list of monads of monoids, which means I can say:
// val task: Task[ValidationNel[Throwable, Chunk]] = Nondeterminism[Task].aggregate(tasks)
// Now I have one Task that will return the Chunk built by merging all the Chunks from all the Tasks, or a NonEmptyList of all the Throwables that caused (any of) theTasks to fail.
// I hope this helps. It's a bit long because I wanted to describe a real-world use case.