Recently, I've given a few talks on Asynchronous and Concurrent Programming in F#. In these talks, I gave a brief overview of the options you have when dealing with concurrency and asynchronous behavior. A few times I was asked where asynchronous computation expressions (workflows) fit and how they differ from doing things with the Task Parallel Library from the Parallel Extensions for .NET. There are some differences worth exploring, so I'll post some code snippets to compare and contrast the two.
The Challenge
Today's challenge is to take a sample from the Task Parallel Library samples under AsyncDownload. The purpose of this example is to download HTML from a given website and return the HTML in a tuple with the original URL. We'll take two approaches, one using the Task Parallel Library (TPL) and the other using asynchronous computation expressions. We'll compare and contrast the approaches used to achieve the end result.
Using the Task Parallel Library
Let's take the original sample as written in C# and rewrite it in F#. Instead of returning an IEnumerable<T>, I'm going to rewrite this using a sequence expression. The other change I'm going to make is using a tuple instead of a KeyValuePair<TKey, TValue> for my storage since it doesn't have to be so formal. This sample uses System.Net.WebClient to download the strings asynchronously. WebClient reports completion through events, so we subscribe to the DownloadStringCompleted event and then begin the download. Let's take a look at what this code might look like:
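// Written against the Parallel Extensions CTP. In the released .NET 4 API,
// Task.Create became Task.Factory.StartNew, and CountdownEvent's
// Increment/Decrement became AddCount/Signal.
let download (urls : string list) =
  seq { use results = new BlockingCollection<(string * string)>()
        // Start at 1 so the count cannot reach zero while URLs are still being queued
        use pagesRemain = new CountdownEvent(1)
        let _ = Task.Create(fun _ ->
          urls |> List.iter(fun url ->
            let wc = new WebClient()
            wc.DownloadStringCompleted.Add(fun args ->
              // Store only successful downloads, but always count the page as done
              if args.Error = null then
                results.Add(((args.UserState :?> string), args.Result))
              if pagesRemain.Decrement() then
                results.CompleteAdding()
            )
            pagesRemain.Increment()
            wc.DownloadStringAsync(new Uri(url), url)
          )
          // Release the initial count now that every download has been started
          if pagesRemain.Decrement() then results.CompleteAdding()
        )
        for result in results.GetConsumingEnumerable() do yield result
      }

let urls = ["http://microsoft.com"; "http://msn.com"]
let results = download urls
results |> Seq.iter(fun x -> printfn "%s : %s" (fst x) (snd x))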
What this code accomplishes is the following:
- Wrap the entire operation in a sequence expression.
- Create a BlockingCollection of a tuple for storing our results
- Create a CountdownEvent to track whether we are done or not.
- Create a Task for the TPL
- Iterate through each URL given
- Create a WebClient and add a handler that stores each successful result and completes the collection once the last download finishes
- Increment the remaining count and start the asynchronous download
- Release the initial count, then yield each result tuple as it arrives
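If you want to try the countdown-plus-collection pattern without hitting the network, here's a minimal sketch. It is not the TPL sample itself: fakeFetch and downloadAll are hypothetical names I made up for illustration, and the sketch assumes the released .NET API (Task.Run, CountdownEvent.AddCount and CountdownEvent.Signal).

```fsharp
open System
open System.Collections.Concurrent
open System.Threading
open System.Threading.Tasks

// Hypothetical stand-in for a real download, so the sketch runs offline.
let fakeFetch (url: string) = "<html>" + url + "</html>"

let downloadAll (urls: string list) =
    seq {
        use results = new BlockingCollection<string * string>()
        // Start at 1 so the count cannot reach zero while work is still being queued.
        use pagesRemain = new CountdownEvent(1)
        // Signal returns true when the count reaches zero: nothing is left to add.
        let signal () =
            if pagesRemain.Signal() then results.CompleteAdding()
        let workers =
            [ for url in urls do
                pagesRemain.AddCount()
                yield Task.Run(Action(fun () ->
                    try
                        results.Add((url, fakeFetch url))
                    finally
                        signal ())) ]
        // Release the initial count now that every worker has been queued.
        signal ()
        for result in results.GetConsumingEnumerable() do
            yield result
        // Make sure every worker has finished before the collection is disposed.
        Task.WaitAll(List.toArray workers)
    }

downloadAll [ "http://example.com/a"; "http://example.com/b" ]
|> Seq.iter (fun (url, html) -> printfn "%s : %s" url html)
```

Results arrive in completion order, not input order, so sort them if the order matters to you.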
Because the collection is shared state, somebody has to worry about locking and coordination while we add to it; here BlockingCollection and CountdownEvent take care of that for us. Sometimes shared state is nice for quick operations, but I quickly shy away from this approach should I need to scale to the nth degree. Instead, I'd advocate more of a shared-nothing approach through message passing. Each situation must be analyzed to see whether a shared state approach works or not. Functional languages such as F# tend to shy away from shared state, especially when it means dropping to the "Assembly Language" level of locks, mutexes, semaphores and other goodies. But, overall, I'm liking the abstraction layer over creating tasks, and I think it's getting to a point where we don't have to think about the concurrency constructs as much as we do today.
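To give a feel for what a shared-nothing, message-passing approach could look like, here's a rough sketch using F#'s MailboxProcessor. The Msg type, the collector agent, and the example URLs are all hypothetical names for illustration; the point is only that the agent owns the list of results and everyone else talks to it through messages.

```fsharp
// Messages the agent understands: add one result, or ask for everything so far.
type Msg =
    | Add of string * string
    | GetAll of AsyncReplyChannel<(string * string) list>

// The agent holds its state in the loop argument, so no locks are needed:
// messages are processed one at a time.
let collector =
    MailboxProcessor.Start(fun inbox ->
        let rec loop acc =
            async {
                let! msg = inbox.Receive()
                match msg with
                | Add (url, html) ->
                    return! loop ((url, html) :: acc)
                | GetAll reply ->
                    reply.Reply(List.rev acc)
                    return! loop acc
            }
        loop [])

collector.Post(Add ("http://example.com/a", "<html>a</html>"))
collector.Post(Add ("http://example.com/b", "<html>b</html>"))

// PostAndReply blocks until the agent answers the GetAll message.
let collected = collector.PostAndReply(GetAll)
collected |> List.iter (fun (url, html) -> printfn "%s : %s" url html)
```

Any number of downloaders could post Add messages at the same time; the mailbox queues them and the agent handles them one by one.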
Using Asynchronous Computation Expressions
Now, let's take a look at an approach using asynchronous computation expressions. This time, we'll use a monadic expression, much as we did above with the sequence one. We'll make the Async<'a> class the heart and soul of our operation. This allows us to represent a program fragment that will be executed at some point in the future. That fragment is, of course, an instance of the much-dreaded word, Monad. I agree with Simon Peyton Jones that they should be called "Warm Fuzzy Things" instead of Monads. We'll get into what that word really means in the future, but in the meantime, let's dig into the code. Note that we had to add the GetResponseAsync method back to the WebRequest because it was removed from the latest public bits of F#. As you can see, it's pretty trivial to extend any type that exposes asynchronous behavior through the Begin/End pattern.
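open System.IO
open System.Net

// Async.BuildPrimitive and Async.Run are the names in the F# bits used here;
// later releases renamed them Async.FromBeginEnd and Async.RunSynchronously.
type WebRequest with
  member x.GetResponseAsync() =
    Async.BuildPrimitive(x.BeginGetResponse, x.EndGetResponse)

let download(url:string) =
  async { let request = WebRequest.Create(url)
          // use! awaits the response and disposes it when the block exits
          use! response = request.GetResponseAsync()
          use stream = response.GetResponseStream()
          use reader = new StreamReader(stream)
          // let! awaits the read without blocking a thread
          let! html = reader.ReadToEndAsync()
          return (url, html)
        }

let siteList = ["http://www.microsoft.com/";"http://msn.com/"]
let results =
  Async.Run(Async.Parallel
    [for site in siteList -> download site])
results |> Seq.iter(fun x -> printfn "%s : %s" (fst x) (snd x))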
What this code is able to accomplish is the following:
- Create a WebRequest for the given URL.
- Asynchronously get the response
- Get the stream and put it into a reader
- Asynchronously read the HTML on the page to the end
- Return a tuple of the URL and the HTML
- Run the download for each URL in our site list in parallel, returning an array of results which I can iterate.
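To play with the fan-out step on its own, without the network, here's a minimal sketch. fakeDownload is a hypothetical stand-in for the real download function, and I'm assuming a later F# release where Async.Run goes by the name Async.RunSynchronously.

```fsharp
// Hypothetical stand-in for the real download: waits briefly, then returns
// the same (url, html) tuple shape as the real thing.
let fakeDownload (url: string) =
    async {
        do! Async.Sleep 10
        return (url, "<html>" + url + "</html>")
    }

// Async.Parallel turns a collection of asyncs into a single async that
// yields an array of results, in the same order as the inputs.
let pages =
    [ "http://example.com/a"; "http://example.com/b" ]
    |> List.map fakeDownload
    |> Async.Parallel
    |> Async.RunSynchronously

pages |> Array.iter (fun (url, html) -> printfn "%s : %s" url html)
```

Nothing runs until Async.RunSynchronously is called; up to that point we have only described the work.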
Seems pretty simple, doesn't it? Now if only concurrency were this easy. Oh wait, it just is... What we also get for free is resource lifetime management through the use keyword, binding with continuations, exception management and so on without much effort. I'll cover more of this in the future. I just wanted to whet the appetite for what is coming down the pike.
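As a small taste of the resource lifetime and exception management points, here's a sketch under a few assumptions: Tracked is a hypothetical disposable I made up so we can see when cleanup happens, and Async.Catch is the standard combinator that turns a failure into a value instead of letting it escape.

```fsharp
open System

// Hypothetical resource that records when it is disposed.
type Tracked(log: ResizeArray<string>) =
    interface IDisposable with
        member x.Dispose() = log.Add "disposed"

let log = ResizeArray<string>()

// The workflow fails partway through, after acquiring the resource.
let failing =
    async {
        use resource = new Tracked(log)
        log.Add "working"
        failwith "boom"
        return 42
    }

// Async.Catch yields Choice1Of2 on success or Choice2Of2 with the exception.
let outcome = failing |> Async.Catch |> Async.RunSynchronously

match outcome with
| Choice1Of2 value -> printfn "succeeded with %d" value
| Choice2Of2 ex -> printfn "failed with: %s" ex.Message

log |> Seq.iter (printfn "log: %s")
```

The use binding disposes the resource even though the workflow throws, which is the same guarantee the download function leans on for its response, stream and reader.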
Wrapping it Up
This was just a quick primer on the differences between using the TPL tasks and asynchronous computation expressions. I'll dive deeper into each, and how they tick, in the near future. Possibly I can also rewrite this to combine the two and see how well they complement each other.