Friday 29 October 2010

C# 5.0: The Asynchronous Revolution

imageIn his usual quiet and unpretentious way, Anders Hejlsberg stepped up on stage at Microsoft PDC yesterday – and proceeded to launch a revolution. Two new keywords added to the C# and VB.Net languages was all it took, but the impact of those changes will be felt by programmers for years to come. And not just programmers: unlike other changes made to the .Net languages across the versions, these new additions will produce tangible benefits for users.

The Status Quo

Anders started his presentation by noting a current trend in applications [watch]: they are becoming increasingly connected. We’ve had client-server applications for decades, but we’re now seeing more and more applications that have to call out to a number of different services and co-ordinate their responses. This brings a number of problems. First there’s the latency - the application having to wait as the requests and responses pass back and forth over the network. Then, if we’re not careful, the application appears to hang, unable to repaint the user-interface because it’s too busy waiting for the server to respond. Meanwhile, on the other side of the network, all the server’s threads are tied up, waiting for responses to come back from other servers, so it stops responding to incoming requests – and scalability plummets.

Now we all know what the answer is: asynchronous programming. Rather than tying up our main thread waiting for a response to come back, launch the request on another thread and set up some kind of event handler so that you can be notified when the response comes back, and then finish the work.

We have the answer, but it’s not a pretty one, as anybody who has programmed a serious client-server application in Silverlight will tell you. In Silverlight there is no way of calling the server synchronously: Silverlight forces you to call asynchronously so that the browser hosting your application can remain responsive. In fact, there’s no way of doing something as simple as showing a dialog synchronously: if the way your application continues is dependant on the users’ response, you have to move the continuation logic into an event handler attached to the dialog so that it can be executed once the user clicks their choice of button.

Inside-out Thinking

Programming like this requires you to turn your thinking inside out, because you have to prepare your response (in an event-handler or whatever) before you even make the request. And if in your response you’re going to make another asynchronous call, you have to prepare your response for that even before your first response. Anders gave a small example of that [watch]. Here’s a small method that queries the NetFlix web service. Nice straight-forward code:

image

Make that asynchronous and see what happens:

image

See how it’s turned inside out: in line 4 we prepare our what-happens-next by attaching a lambda to the DownloadStringCompleted event, and the call to download the string is the very last thing we do. Then notice that we can no longer return a value from the method: it isn’t available when the method returns. Instead we have to get the user to pass in an Action that we can call back when the result is ready. So we’re forced into changing all the places where we call the QueryMovies method too: the effects of this change ripple out through the code. As Anders said, “it becomes lambdas all the way down”.

No wonder we only do this stuff when absolutely necessary – and our customers pay a price, getting applications that become unresponsive when busy, and servers that are not as efficient as they could be. All because, for mere mortals at least, asynchronous programming is just too hard.

Not any longer.

Asynchronous Programming for the rest of us

Up on stage, Anders introduced his two new keywords, async and await [watch]. Here’s how the asynchronous version of QueryMovies looks in C# 5.0:

QueryMoveisAsync

It’s identical to our nice straight-forward but sadly synchronous version, with the exception of four small changes (highlighed in orange). The first is the addition of the async keyword in front of the method definition. This tells the compiler that it’s going to have to work some magic on this method and make it return its result asynchronously. This produces the second change: the method now returns a Task<Movie[]> instead of just Movie[] – but notice how the return statement is unaltered – the compiler is taking care of turning the return value into a Task.

In case you’re wondering, Task is a class that was introduced in .Net 4.0 as part of the Task Parallel Library. It simply encapsulates a value that is going to become available at some point in the future (hence, why in some languages the equivalent concept is called a Future). Importantly, Task provides the ContinueWith that allows you to schedule an event handler to be called when that value is available.

The most significant change is the introduction of the await keyword in line 4, and the change to call DownloadStringTaskAsync. DownloadStringTaskAsync is just a version of DownloadString that, instead of returning string, returns Task<string>. Again, this Task<string> is something that will deliver a string sometime in the future. Adding the await keyword in front of the method call allows us to await that future moment, at which point the string will be popped out of its Task wrapper and the method will continue as normal.

But get this: it is not waiting by blocking the current thread. Under the covers the compiler takes everything in the method that comes after the await keyword, everything that depends on the value we’re waiting for, and packages it up into a lambda function – an event handler. This lambda function it attaches to the Task<string> returned by DownloadStringTaskAsync so that it can be run when the string becomes available.

There are very few restrictions on what the bodies of async methods can look like. All the usual control flow operations, if statements, foreach, while, etc. work. Best of all, exception handling works without needing any changes. The compiler takes care of all the twistedness of the asynchronicity, leaving you to think in straight lines.

All this means that, once C# 5.0 is here, there will be no excuses for applications that suffer white-outs while they do some work. And it means some cheap improvements to scalability on the server side. Anders showed an example of a server application that processes some RSS feeds over which he waved the async wand [watch]. The time needed to process the page dropped from about 1.25 seconds to 0.25 seconds – 5 times faster. Here’s a snippet of the code so you can see again how few changes are needed:

RssAsyncExample

This is big news. There are other languages that can already do this kind of thing, F# being one of them. But I don’t know of any other mainstream language that does it, or does it as neatly as this. Anders and team have created an incredibly powerful feature, but exposed it a way that everybody will be able to use.

You can play with this now. Anders announced a Visual Studio Async CTP which contains new versions of the C# and VB.Net compilers, and the necessary supporting APIs in the framework (including Async versions of lots of the existing APIs). It installs on top of Visual Studio 2010, and I gather that it enables Asynchronous methods in both .Net 4.0 and Silverlight.

Further Reading

Thursday 21 October 2010

Microsoft PDC 2010 Speculations

We’re now less than a week away from Microsoft PDC 2010. The session list has just been published so I think it’s time to indulge in some speculation as to this years announcements.

C# 5

For me the highlight of recent PDCs has been discovering what Anders and his sous-chefs have been cooking in the C# kitchen. Of all the changes likely to be announced, it is changes to the C# language that are likely to make the most day-to-day difference to my work. In 2005 it was LINQ, in 2008 it was dynamic programming. What will it be in 2010? Well, Anders has been given a slot, and we already have some clues as to what he might say. At his presentation on the future of C# and VB.Net at PDC 2009, Luca Bolognese trailed some of the likely features, including:

  • Compiler as a Service – the C# compiler completely rewritten in C#, and available to use programmatically, allowing all kinds of interesting meta-programming, including the ability to write refactorings.
  • Better support for immutability built into the language
  • Resumable methods.

It was this last one that interested me most. Luca showed an example involving asynchronous programming. Today we do this by calling a Begin* method, and passing in a callback that will be executed when the asynchronous invocation has finished (and we never, ever, forget to call the corresponding End* method!). This can get quite difficult to wrap your head around if you have a whole chain of methods that need to be invoked asynchronously – say in the callback of the second one you kick off another asynchronous method, passing in a callback that runs a third – and so on.

This style of programming actually has a name – Continuation Passing Style (or CPS) – and, from recent developments on Eric Lippert’s blog it seems they’re planning to add support for CPS to the C# language to make it easier to write this kind of code without blowing a fuse in your brain (in typical Eric Lippert style, he is neither confirming or denying what hypothetical features might or might not hypothetically be added to a hypothetical future version of C#, hypothetically speaking).

No doubt Anders will reveal all.

LINQ

Since it was released in 2007, LINQ has colonised .Net code bases at an incredible pace. Parallel LINQ was another great innovation in .Net 4. And Microsoft aren’t done yet. The Reactive Framework (aka Linq-to-Events) is coming on nicely. And this year Bart de Smet has a session tantalisingly entitled “Linq, Take Two: Realizing the LINQ to Everything Dream”. Linq-to-Everything sounds pretty ambitious, so I’m looking forward to hearing what he has to say.

Silverlight

It looks like we can expect some big announcements for Silverlight this year – check the session list for proof: there’s a note against Joe Stegman’s session on Silverlight saying that the abstract is embargoed until after the PDC keynote.

What would I like to see? Well, one of the biggest pain-points I’ve found in my Silverlight programming is that Continuation-Passing-Style programming is thrust upon you whenever you want to show a dialog, or make a call to the server: because Silverlight runs in a browser, you aren’t allowed to block the UI thread, so every potential blocking operation has to be done asynchronously. It would be really nice if C# 5 does simplify CPS, and if that also applied to Silverlight.

Azure

I’ve already made my PDC prediction for Windows Azure: I’m guessing that Microsoft will announce free, enthusiast-scale hosting for Web Applications in Azure. This is looking even more likely now that Amazon have announced a free entry-level version of their EC2 cloud hosting platform.

Thursday 28th October will reveal all. Stay tuned – once again my company has graciously given me time to follow along with the PDC sessions (and this year they’re being transmitted live, as well as being available for download within 24 hours). As on previous occasions, I’m hoping to find time to share with you all what I learn.

Wednesday 20 October 2010

Quick Fix: Silverlight Application showing as a blank screen

I just diagnosed and fixed a problem with my Silverlight application that I’ve hit often enough to make me think it’s worth sharing my approach.

The problem was a simple one. When I deployed my application, and browsed to a page that should have displayed a Silverlight control, all I saw was a blank page.

So I fired up the diagnoser of all problems Http, the excellent Fiddler, and reloaded my page. This is what I saw:

FiddlerMissingFile

In glaring red, Fiddler shows that my Silverlight control made a request to download one of its dependencies (which are all packaged in zip files and stored in the ClientBin folder), and was told it was not available – a HTTP 404 error.

So the fix was obvious: just add the missing file to my installer. Sorted!

Thursday 7 October 2010

Getting started with Protocol Buffers – and introducing Protocol Buffers Workbench

So I’ve fallen out of love with Xml, and decided that Google Protocol Buffers is a better serialization format for a key (and voluminous) data structure in our application (as has Twitter, incidentally). For various reasons, I decided to write my own Reader and Writer rather than go with the existing implementations. The first thing I needed to do then, was to understand how the Protocol Buffers encoding works.

Google have published some very good documentation, both for the message definition language, and the encoding, but nothing beats seeing it in action. For that there’s a Protocol Buffers compiler, protoc.exe which, amongst other things, will handle encoding and decoding messages (get it from the Protocol Buffers website - choose “Protocol Buffers compiler – Windows Binary”). The only thing is, protoc.exe is a command-line tool, and not very friendly if you’re just wanting to fiddle around. So I’ve summoned all my WPF powers, and created a graphical wrapper around it. Allow me to introduce Protocol Buffers Workbench:

image

In brief

Using the Workbench is easy:

  1. Install it using the ClickOnce installer.
  2. Put your message definition in the first column, and, from the drop-down pick the message that will be at the root of your document
  3. Enter your message in the second column, then hit the encode button (“>”). The encoded message is shown in the third column.

Or, if you have an encoded message, you can paste it into the third column and hit decode (“<”) to have the message translated into text format. The Paste button above the third column is particularly useful if you want to decode protocol buffers messages stored in binary columns in Sql Server. If you view such columns in Sql Management Studio they are displayed like “0x0A1B2C4D…”: the Paste button will recognise text in the clipboard in that format and paste it as binary data.

If you’re wondering about the “C#” toggle beneath the binary editor: that will encode the binary data as a C# byte array – particularly useful if you’re wanting to seed your unit tests with authentic protocol buffer bytes!

And that’s it. Unless you want to know more about defining message types, and writing messages in text message format – in which case, read on.

Message Types

The first thing to understand are the message types. Message types are a bit like an Xml Schema. But whereas you can understand Xml just fine without a Schema, having the message types to hand is essential when parsing a Protocol Buffers message. That's because, in a bid to squeeze the output as small as possible, all the wordy field identifiers and even field type information is stripped out. All that's left for each piece of data is a field number, and a few bits telling you how big it is.

Message types look like this:

enum Binding { 
   PAPERBACK = 1;
   HARDBACK = 2;
} 
 
message Author { 
   required string name = 1; 
   optional string location = 2; 
} 
 
message Book { 
   required string title = 1; 
   required Binding binding = 2; 
   repeated Author author = 3; 
   optional int32 number_of_pages = 4; 
   optional double weight = 5; 
}

So for each field in a message, you define if it must be present ("required"), it may be present ("optional"), or if it can appear any number of times, including never ("repeated"). You then give its type - and you have a range to choose from, including enums and your own custom message types. Finally, you assign it a field number: this is the critical part, as this is all that will distinguish between the different pieces of data in the resulting messages.

Messages in Text Message Format

It took me a bit of puzzling to work out how to write messages in text format. For some reason, though Google have defined a text format for Protocol Buffer Messages, they don't appear to have documented it any where.

Here's an example, using the message types defined above:

title: "The Holy Bible"
binding: HARDBACK
author: {
  name: "Moses"
  location: "Sinai Peninsula"
}
author: {
   name: "Paul"
   location: "Tarsus"
}
author: {
   name: "Luke"
}
number_of_pages: 1723
weight: 0.225

As you can see, it has some similarities to JSON, but not many. Field names aren’t enclosed in strings, for example, and they aren’t terminated by commas.