Functional Fun: May 2008

Wednesday 28 May 2008

Links I Like #2

Software Development

Lazy instantiation one-liner of instance fields with the coalesce operator - handy tip from Frank Quednau
The Technology Cliff: How Time Off From Programming Affects Your Chops - Rob Walling on getting back into real work after time off being a manager!
Top Five Reasons why Prince2 Sucks - Jurgen Appelo on what he thinks is the UK's worst export.

Non-Technical

Fortune-tellers get a visit from Trading Standards Officers - did they see it coming?
Plants have rights too! Have the Swiss gotten too attached to Edelweiss?

Utilities

ZoomIt - "free screen zoom and annotation tool for technical presentations" (thanks to my colleague Roman for this).
Cropper - my favourite screen capture utility - written in C#, of course.

Thursday 22 May 2008

Project Euler Problem 12

This post tackles Problem 12 in the Project Euler Series

If you've ever played pool then you'll already be acquainted with the subject of today's problem: if not, meet the triangle numbers. Imagine you have a triangle of pool balls arranged on a green baize table. Now take a big black marker pen and start counting from the tip of the triangle. At the end of every row of balls write down on the table the total number of balls so far. The sequence of numbers you've written down are the first few triangle numbers; the black eye and the permanent ban from the pool club are because you weren't paying attention: I said imagine! Now you've learned the hard way, you should be able to see how to extend the sequence to find further triangle numbers. As you have just discovered, the first few numbers are 1, 3, 6, 10, 15, 21, and so on.

Today's problem requires us to calculate the number of divisors of each triangle number, and then report the first one to have over 500. Let's get started.

The first job is to generate the sequence of triangle numbers in code. This is bread-and-butter to us now, but I'll throw in a bit of salami to keep things interesting.

If we wanted to generate any particular triangle number - say the nth- we could just use the Sum() function on the sequence 1.To(n) - or, more abstractly, use the Aggregate function to add each item in the sequence to the running total. But we need to generate the whole infinite sequence of triangle numbers - because we don't know before hand how many we'll need.

To help with this, I've created the LazilyAggregate extension method. This acts on a sequence, and returns a sequence of partial aggregates. You use it like this:

var triangleNumbers = 2.To(int.MaxValue)
                      .LazilyAggregate(
                        1,
                        (item, accumulatedValue) => accumulatedValue + item);

The first argument is the seed value - this will be the first item in returned sequence. The second argument is a lambda function which takes two parameters: the first being the next item in the sequence to aggregate, the second, the aggregated value so far. The value returned by the lambda is pushed to the output sequence, and also fed through to the next call of the lambda.

The next step is to enumerate all the divisors of each triangle number. Unlike previous problems, we don't want just the prime factors, but every single number that divides into our candidate.

We can speed up the generation of this divisor list by remembering that divisors always come in pairs: one factor will be less than Sqrt(n), the other will be bigger (unless of course n is a square number!). In the code below, the inner from expression is generating a sequence of pairs of divisors, my Concat() extension method is unpacking the pairs to create one long sequence of divisors which is then counted. The outer from clause will pick out only the triangle numbers with more than 500 divisors - and we take just the first from that sequence. A nd we're done.

The complete code is shown below, and can be downloaded from the Project Euler page on the MSDN Code Gallery.

[EulerProblem(12, Title="What is the value of the first triangle number to have over five hundred divisors?")]
public class Problem12
{
    public void Solve()
    {
        var triangleNumbers = 2.To(int.MaxValue)
                              .LazilyAggregate(
                                1,
                                (item, accumulatedValue) => accumulatedValue + item);

        var result = (
                     from n in triangleNumbers
                     where
                            (from factor in 1.To((int)Math.Sqrt(n))
                            where n.IsDivisibleBy(factor)
                            select new int[] { factor, n / factor } )
                            .Concat()
                            .Count() > 500
                     select n
                     )
                     .First();

        result.DisplayAndPause();
    }
}

public static class Extensions
{
    public static IEnumerable<TAccumulate> LazilyAggregate<T, TAccumulate>(this IEnumerable<T> sequence, TAccumulate seed, Func<T, TAccumulate, TAccumulate> aggregator)
    {
        var accumulatedValue = seed;
        yield return seed;

        foreach (var item in sequence)
        {
            accumulatedValue = aggregator(item, accumulatedValue);
            yield return accumulatedValue;
        }
    }

    public static bool IsDivisibleBy(this long number, long factor)
    {
        return number % factor == 0;
    }

    public static IEnumerable<int> To(this int first, int last)
    {
        if (first == last)
        {
            yield return first;
        }
        else if (first < last)
        {
            for (var l = first; l <= last; l++)
            {
                yield return l;
            }
        }
        else
        {
            for (var l = first; l >= last; l--)
            {
                yield return l;
            }
        }
    }

    public static void DisplayAndPause(this object result)
    {
        Console.WriteLine(result);
        Console.ReadLine();
    }

    public static IEnumerable<T> Concat<T>(this IEnumerable<IEnumerable<T>> sequences)
    {
        return sequences.SelectMany(sequence => sequence);
    }

    public static IEnumerable<T> Concat<T>(this IEnumerable<T[]> sequences)
    {
        return Concat(sequences.Cast<IEnumerable<T>>());
    }
}

Monday 19 May 2008

DebuggerNonUserCode: Suppressing ignorable exceptions in the debugger

Have you ever written code that made you want to wash your hands (or your keyboard!) the minute you'd finished typing? So dirty it was, that you surreptitiously looked around to see if anyone else had noticed the smell emanating from your instance of Visual Studio. We don't like it, but sometimes it's unavoidable. The clock is ticking; Google mentions no alternatives, so you put a peg on your nose, and plunge on.

It doesn't happen often in the clean, fresh-smelling environment of C#; usually it's when I'm writing VBA code that I feel like putting on the rubber gloves. One case that I do encounter in .Net more often than I would like though, is when Exceptions are thrown in non-exceptional cases, and I have to catch those exceptions to determine what to next.

Take Enum.Parse as an example. It converts a string value to an enum value. More likely than not, the string value will have been entered by the user, so there's a good chance that it's not valid. If it can't be parsed, then an ArgumentException is thrown. So if you want to trap for invalid values, you're forced to do this:

enum MyEnum
{
    Value1,
    Value2,
}

bool TryParse(string value, out MyEnum convertedValue)
{
    try
    {
        convertedValue = (MyEnum)Enum.Parse(typeof(MyEnum), value);
        return true;
    }
    catch (ArgumentException)
    {
        convertedValue = 0;
        return false;
    }
}

Apart from looking nasty, there's another problem that you only notice when you start using the debugger. If you've set the debugger to break whenever an exception is thrown - as I usually do to make sure there are no problems sneaking through overly generic catch handlers - it will break in this method whenever there's invalid input, even though you have the catch statement there and it wouldn't really crash your app. You could configure the debugger to ignore ArgumentException - but then you'll likely miss cases that you really do want to know about.

As of .Net 2.0, there's an easy solution to this problem: DebuggerNonUserCode. This is an attribute that you put against a method to tell the debugger "Nothing to do with me guv'. Ain't my code!". The gullible debugger will believe you, and won't break in that method: using the attribute makes the debugger skip the method altogether, even when you're stepping through code; exceptions that occur, and are then caught within the method won't break into the debugger. It will treat it as if it were a call to a Framework assembly, and should an exception go unhandled, it will be reported one level up the call stack, in the code that called the method. All this is part of the Just-My-Code feature, which Mike Stall gives the low-down on here.

Obviously this is powerful stuff. Make sure you've tested your code well before you apply the attribute - or you won't be able to debug where it's going wrong (actually there is a way to do it - see the note below). For this reason, I'd suggest using it on the smallest method you can. Also make sure that you are catching very specific exceptions - notice that I'm using ArgumentException in this case, rather than a catch-all Exception.

I've found this especially useful when programming Excel. In the Excel object model, collections give you no API for discovering if particular members exist. The only way to find out is to make the call, and see if you get an exception. Here's an example of an extension method that checks whether a Worksheet exists, decorated with the DebuggerNonuserCode attribute:

using System.Diagnostics;
using XL = Microsoft.Office.Interop.Excel;

public static class WorkbookExtensions 
{
    [DebuggerNonUserCode]
    public static bool TryGetWorksheet(this XL.Workbook wb, string worksheetName, out XL.Worksheet retrievedWorksheet)
    {
        bool exists = false;
        retrievedWorksheet = null;

        try
        {
            retrievedWorksheet = GetWorksheet(wb, worksheetName);
            exists = retrievedWorksheet != null;
        }
        catch(COMException)
        {
            exists = false;
        }

        return exists;
    }

    [DebuggerNonUserCode]
    public static XL.Worksheet GetWorksheet(this XL.Workbook wb, string worksheetName)
    {
        return wb.Worksheets.get_Item(worksheetName) as XL.Worksheet;
    }
}

Notes:

If ever you need to see everything that's going on in your code, including in the DebuggerNonUserCode methods you can turn off the "Just My Code" feature in Visual Studio by going to Tools->Options, then expanding the Debugging node, and selecting General:
Jacob Carpenter has written up a Generic version of the TryParse pattern for enums; Michael Goldobin has actually created a TryParse that avoids the exception altogether - but don't read about that yet, because then you won't need my trick!.
The TryParse pattern is already implemented for parsing other primitive types; int, double, decimal, etc. all have their TryParse methods: go and vote for Enum to get it too. As Peter Richie points out in that Feedback issue I linked to, this forced use of exception handling that Enum.Parse makes you do violates the Framework Design Guidelines. Shame on you, Microsoft ;-).

Thursday 15 May 2008

(Mis)Using IDisposable: beyond resource clean-up

Anybody who has ever done any work with files, databases and so on, in C#, has had it drummed into them that they must be good citizens and always tidy up after themselves. Usually that means using using. But did you know that the using statement has uses that go far beyond resource cleanup?

As you know, using works in tandem with the IDisposable interface. Whenever you write something like this:

using (StreamReader reader = new StreamReader("MyFile.txt"))
{
    // do some stuff with the file
}

the compiler translates it into this:

StreamReader reader = new StreamReader("MyFile.txt");
try
{
    // do some stuff with the file
}
finally
{
    if (reader != null)
    {
        ((IDisposable)reader).Dispose();
    }
}

The effect of this is to ensure that, however dodgy the stuff you're doing with reader, whatever exceptions are thrown, its Dispose method will always be called.

Traditionally, this pattern has been used to ensure that files are closed, database connections are severed, GDI Brushes are washed up, etc. as soon as code has finished with them. But it's not limited to that. Think laterally. What does the using statement actually do? It guarantees that, at the end of the using block the Dispose method will be called on the variable that you are are using. But the Dispose method doesn't need to dispose of things. It can do anything you like!

Suppose for example that you need to change the state of something temporarily, whilst you do a job, but want to make sure that it gets changed back when you've finished. Then the using statement is here to help!

An example from VSTO Programming

For a practical example, consider the Excel object model. If you want to get good performance whilst you change lots of cells about, you have to tell Excel not to bother redoing calculations, or updating the screen for the duration. But when you've finished mucking about with its worksheets, you certainly do want Excel to repaint and recalculate.

Now suppose that you have several methods, each modifying cells, and each wanting to do it efficiently. Each of them could turn off recalculation and screen updating before they start, and back on when they've finished. But then, if those methods start calling each other you'll end up with a disco, as the screen flashes each time updating is turned off then on again. To get round this, each method should capture the current state, then restore that when it has finished. But think how mucky the code in each method would look if you started doing that in-line. Now consider this:

public void Update1()
{
    using (excel.DisableUpdating())
    {
        // change lots of cells, etc.
        Update2();
    }
}
public void Update2()
{
    using (excel.DisableUpdating())
    {
        // change other cells
    }
}

The DisableUpdating method takes care of capturing the current state, turning off repainting and recalculation, and then returns an IDisposable that will restore the original state. So whatever the nesting of method calls, the right thing will always happen.

Here's the extension method that makes this possible:

using Xl = Microsoft.Office.Interop.Excel;
public static class ApplicationExtensions
{
    public static IDisposable DisableUpdating(this Xl.Application application)
    {
        // preserve the current settings
        var originalScreenUpdating = application.ScreenUpdating;
        var originalCalculationMode = application.Calculation;

        // disable the updates
        application.ScreenUpdating = false;
        application.Calculation = Xl.XlCalculation.xlCalculationManual;

        return new DelegateInvokingDisposer(
            () =>
            {
                // when disposed, restore the original settings
                application.ScreenUpdating = originalScreenUpdating;
                application.Calculation = originalCalculationMode;
            }
            );
    }
}

You can see that this makes good use of closures to capture the existing state from Excel so that it is available for use by the Disposer lambda function.

The DelegateInvokingDisposer class looks like this:

public struct DelegateInvokingDisposer: IDisposable
{
    private readonly Action _disposeMethod;

    public DelegateInvokingDisposer(Action disposeMethod)
    {
        if (disposeMethod == null)
        {
            throw new ArgumentNullException("disposeMethod");
        }

        _disposeMethod = disposeMethod;
    }

    public void Dispose()
    {
        _disposeMethod();
    }
}

You can find other examples of this idea around the blogosphere. Ian Griffiths, for example, has used this technique to improve on the C# lock statement, creating a version that will timeout if unable to acquire a lock. You'll find another example if you look at the WPF framework, and the ItemCollection class, which supplies items to ListBoxes and the like. In WinForms, if you had a block of code that updated the Items property, you would top and tail it with calls to ListBox.BeginUpdate and ListBox.EndUpdate so that the ListBox didn't get redrawn for every additional item. ItemCollection does things differently. It has the DeferRefresh method which returns an IDisposable object. You use it like this:

using (listBox.Items.DeferRefresh())
{
    listBox.Items.Add("MyItem");
    // add lots more items
}

I'm sure, once you get used to the idea, you'll be able to find new uses of using for yourself.

Tuesday 13 May 2008

How-to: Create a Server Certificate for a WCF Service

One of the things I took a while to get used to when I started working with WCF were the certificates that are needed to secure Services.

Microsoft have configured WCF to be secure by default. This means that developers need to be much more imaginative in finding ways to open up gaping security holes in applications: there are several incantations you need to perform if you want to send passwords unencrypted over the wire, for example.

As a consequence however, if you do need to exchange security credentials, and are not in a situation where you can use Windows Credentials, you might find yourself needing to specify a Certificate that can be used to encrypt messages exchanged by client and server. You know you need one if you get errors like

System.InvalidOperationException: The service certificate is not provided. Specify a service certificate in ServiceCredentials.

The books I consulted (Learning WCF and Programming WCF Services) talked a lot about creating Certificates that could be used in a test environment (using makecert and the like), but almost nothing about creating certificates for a production environment. Using Test Certificates even during development is also less than ideal: because they are not properly signed by the relevant authorities you have to configure WCF not to do the full security checks on them; apparently there are also some performance problems.

After digging around a bit, I figured out that the X.509 certificates that WCF says it needs are the same thing as the SSL certificates you can purchase from Thawte, Verisign, and co - call me stupid for taking so long, but I haven't seen this actually spelled out. This led me to discover an easier way of generating test certificates. Comodo offer a very nice service letting you create free test SSL certificates that are valid for 90 days. As far as I can tell, they don't stop you recreating the certificates at the end of that time either. Because they don't validate domain names for these test certificates, you can use this service to create certificates for machines on your intranet. Their prices for full certificates seem very reasonable, particularly if you need them only for intranet use.

Getting and setting up a certificate

In the hope that I'll save somebody else some time, these are the steps I follow to set up my WCF service with a certificate:

Generate a Certificate Signing Request (CSR). The easiest way to do this is with IIS. Here's how if you use IIS 5.1 or 6.0 (see the bottom of this post for instructions using IIS7). Don't do this on a server that's serving up a site - it will change the IIS Certificate for the site. You can however create the certificate on one computer and export it to another.
1. In IIS, right click on one of your websites (there'll only be the Default Web Site if you're doing this in Windows XP), then select Properties. Switch to the Directory Security tab.
2. Click the Server Certificate button.
3. In the IIS Certificate Wizard, select "Create a new Certificate".
4. (If you see three options (Renew, Remove, Replace) that means you already have an SSL certificate configured. If you're sure it's safe to do so, you can select Remove to get back to the position of generating a new one. )
5. Fill out the screens that follow. The most important one to get right is "Common name". Here you should put the same DNS name that you intend client computers to use when connecting to the Server. If, for example, you only intend to connect from the same machine, then it's fine to use "localhost". Use the server's name if you intend clients to connect over an intranet. For servers being accessed from elsewhere, include the full domain name, e.g. mymachine.mydomain.com.
6. In the last but one step of the wizard, make sure you note down where IIS puts the Certificate Request File.
Get the certificate signed. This is where you'll need to use a Certification Authority (such as Thawte or Verisign, or one you set up yourself). As I mentioned, I've found that Comodo is the easiest to use for creating Test Certificates.
1. On the Comodo website look for the "Get It Free Now" link; or one of the other options if you're ready to splash out and buy a full certificate. Follow the link through to the Certificate request page.
2. At this point you'll need to open up the Certificate Request File that IIS created, and copy-and-paste its contents into the appropriate box on the web page.
3. Complete the rest of the request at the site, then wait for an email to come through with your certificate zipped up inside it.
Import the certificate into IIS. Now you'll need to hop back to IIS.
1. Get to the IIS Certificate Wizard as we did in step one.
2. This time, it should give you the option to "Process the pending request"
3. Follow through, and select the file that you got back from Comodo, or other CA.
4. Complete the Wizard: the certificate is now available for use in IIS if you need it.
Make sure the Certificate is available for WCF. To do this we'll need to use the Certificate Manager Snap-in.
1. Click the Start button, type mmc, then press enter. This will display an empty Management Console.
2. Click File->Add/Remove Snap-in.
3. In the Add/Remove dialog, click Add.
4. Select "Certificates" from the list and click Add. When prompted in the following dialog, select to manage certificates for the Computer account on the Local Computer.
5. Close and OK back to the main Console Window. The Certificates node should now be there.
6. Check that a Certificate with your machine's domain name appears in the Personal\Certificates folder.
  If you have created a certificate for localhost, you can use this Console to export the Certificate, and import it on another machine, to save having to create multiple test certificates. The same applies if you have created the certificate on a machine different to the Server that will use it. Right-click the certificate, then select All Tasks->Export to export from this machine; to Import on another machine, bring up this same console, then right click the Personal/Certificates folder and select All Tasks->Import.
Now you're ready to configure WCF. These steps assume you're using the config file to configure your service.
1. Open the Services config file. Look for the behaviour element that's used by your service. Under it, add a <serviceCredentials> node, then, inside that a <serviceCertificate> node. It should look something like this:
```
<configuration>
 ...
  <system.serviceModel>
    <behaviors>
      ...
      <serviceBehaviors>
        <behavior name="[YourServiceBehaviourName]">
          <serviceCredentials>
            <serviceCertificate findValue="localhost" storeLocation="LocalMachine"
              storeName="My" x509FindType="FindBySubjectName" />
    </serviceCredentials>
    ...
```
  Update the findValue attribute to include the domain name of your machine - whatever you put for the Common Name in the Certificate Signing Request. For completeness, I should mention that the value in the storeName attribute corresponds with the top level folder under the Certificates node in the Certificates Snap-in (thus the "Personal" folder becomes the "My" store). If for some reason IIS puts your certificate in a different folder, then you might need to change the value in storeName.
That's it. Your newly secured service is good to go.

Generating Certificate Requests in IIS7

[Update 10/11/08]

Kevan Brewer kindly sent me instructions for generating certificate requests in IIS7. These replace steps 1 and 3 above.

Generate a Certificate Request.
1. Click Start -> Run and type inetmgr.
2. Click on the root node in IIS manager and double-click Server Certificates from the centre panel.
3. In the Actions panel on the right, click on Create Certificate Request...
4. Fill in the form paying particular attention to the Common Name field on the first page. On the last page, click the browse button, browse to the directory of your choice and enter a file name e.g. testcertificate. (I found that if I just typed a name into the box, it either created the file somewhere random or didn’t create it at all.)
Get the Certificate Signed. This is the same as step 2 above.
Import the Certificate into IIS.
1. Back in IIS Manager, click on Complete Certificate Request... on the Actions panel.
2. Browse to the directory to where the zipped certificate files were extracted, change the file type to *.* since, by default, it only shows files with a .cer extension. Select the file which most closely matches the name you entered for the Common Name field in step 1.4.
3. Enter a suitable name for the friendly name (e.g. test certificate) and click OK.

Monday 12 May 2008

Project Euler Problem 11

What's your strategy for solving a word search? What - you haven't got one? OK, I'll be honest: I usually start out by just staring at the grid, hoping that words will magically stand out. Sometimes they do. But more often than not, I'm left with a list of words that refuse to be stared out of their hiding places. What then?

I ask, because Problem 11 is a mathematicians equivalent of a word search: given a 20 x 20 grid of numbers, find the line of four numbers that has the maximum product. Since they have yet to implement the "stare" keyword in C#, we need to find an methodical way for the computer to search the grid.

When I'm trying to unearth a particular stubborn word in a grid, I'll usually start from the bottom right (to force myself to look more carefully), scanning to find its first letter. Once I've got that, I'll look at all the letters encircling it; if any of them are promising, I just need to follow the line that the two letters begin to see if I've found the word. To search the number grid we can do something similar. The computer doesn't need to start at the bottom right: it will look carefully anyway. But we'll still get it to work through every square of the grid, looking at every line of four numbers that starts from that square: every possibility needs to be checked to be sure we find the maximum.

But we don't need to repeat ourselves. If I've already checked out the line that goes down from a square to the fourth row directly beneath it, I don't then need to check the line going up from that fourth row. In other words we only need to check four directions from each square: a line horizontally to the left, a line vertically down, a line diagonally down to the left, and a line diagonally down to the right. That will cover all the possibilities. We'll split the algorithm into two stages: generating a list of all the possible quadruplets in the grid, then checking the quadruplets to find the one with the maximum product.

Before we get ahead of ourselves, we need to convert the grid we're given into a form we can search. I've chosen to use a List of Lists of Integers, since this is the easiest to construct in a LINQ query. The outer list represents rows, the inner list columns of numbers.

The query to convert the input string to a List of Lists looks like this:

const string NumberString = @"
            08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
            49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
            81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
            52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
            22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
            24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
            32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
            67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21
            24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
            21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
            78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
            16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
            86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
            19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
            04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
            88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69
            04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
            20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16
            20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54
            01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48
            ";

// create a list of lists of ints representing the cells
var digits = NumberString.
                // split string into rows
                Split(new[]{Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries)
                .Select(line =>
                    // split row into columns
                    line.Split(' ')
                    // ignore any whitespace
                    .Where(token => token.IsInteger())
                    // convert string to number
                    .Select(token => int.Parse(token))
                    .ToList()
                    )
                .ToList();

Now we can start collecting the quadruplets. In each case we use two from clauses to iterate through all the possible starting positions in the grid. From each position we use an inner Select clause to select the appropriate line of four numbers out from that position.

var horizontals = 
  // start from the left, and stop four cells before the right edge
  from x in 0.To(16)
  // include all rows from top to bottom
  from y in 0.To(19)
  select (from offset in 0.To(3)
        select digits[x + offset][y]);

var verticals =
  from x in 0.To(19)
  from y in 0.To(16)
  select (from offset in 0.To(3)
        select digits[x][y + offset]);

var forwardDiagonals =
  from x in 0.To(16)
  from y in 0.To(16)
  select (from offset in 0.To(3)
        select digits[x + offset][y + offset]);

var reverseDiagonals =
  from x in 3.To(19)
  from y in 0.To(16)
  select (from offset in 0.To(3)
        select digits[x - offset][y + offset]);

Each of the queries above generates an individual sequence of quadruplets, one query for each direction - the order that we process the different directions doesn't matter since we have to cover them all anyway. We need to be able to process all these sequences together. The Enumerable class does have a Concat method that will join two sequences, returning one longer sequence, but we need to join 4. For that, I've create the Concat() extension method.

It does the same job as Enumerable.Concat(), but works on a sequence of sequences (which is why I've stuck the individual sequences together in an array in the code below: arrays of type T implement IEnumerable<T>). Then it's just a case of calculating the Product of each quadruplet, and finding the maximum of all products:

new[]
{
    horizontals,
    verticals,
    forwardDiagonals,
    reverseDiagonals
}
    .Concat()
    .Select(quadruplet => quadruplet.Product())
    .Max()
    .DisplayAndPause();

Here's the complete listing. The full code can be found on the MSDN Code Gallery site.

[EulerProblem(11, Title = "What is the greatest product of four numbers on the same straight line in the 20 by 20 grid?")]
public class Problem11
{
    public void Solve()
    {
        const string NumberString =
                    @"
                    08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
                    49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
                    81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
                    52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
                    22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
                    24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
                    32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
                    67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21
                    24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
                    21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
                    78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
                    16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
                    86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
                    19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
                    04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
                    88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69
                    04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
                    20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16
                    20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54
                    01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48
                    ";

        // create a list of lists of ints representing the cells
        var digits = NumberString.
            // split string into rows
            Split(new[] {Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries)
            .Select(
            line =>
            // split row into columns
            line.Split(' ')
                // ignore any whitespace
                .Where(token => token.IsInteger())
                // convert string to number
                .Select(token => int.Parse(token))
                .ToList()
            )
            .ToList();

        var horizontals = 
            // start from the left, and stop four cells before the right edge
            from x in 0.To(16)
            // include all rows from top to bottom
            from y in 0.To(19)
            select (from offset in 0.To(3)
                    select digits[x + offset][y]);

        var verticals =
            from x in 0.To(19)
            from y in 0.To(16)
            select (from offset in 0.To(3)
                    select digits[x][y + offset]);

        var forwardDiagonals =
            from x in 0.To(16)
            from y in 0.To(16)
            select (from offset in 0.To(3)
                    select digits[x + offset][y + offset]);

        var reverseDiagonals =
            from x in 3.To(19)
            from y in 0.To(16)
            select (from offset in 0.To(3)
                    select digits[x - offset][y + offset]);

        new[]
        {
            horizontals,
            verticals,
            forwardDiagonals,
            reverseDiagonals
        }
            .Concat()
            .Select(quadruplet => quadruplet.Product())
            .Max()
            .DisplayAndPause();
    }
}

public static class Extensions
{
    public static void DisplayAndPause(this object result)
    {
        Console.WriteLine(result);
        Console.ReadLine();
    }

    public static bool IsInteger(this string value)
    {
       int number;
       return int.TryParse(value, out number);
    }

    public static IEnumerable<T> Concat<T>(this IEnumerable<T[]> sequences)
    {
       return sequences.SelectMany(sequence => sequence);
    }

    public static int Product(this IEnumerable factors)
    {
        int product = 1;

        foreach (var factor in factors)
        {
            product *= factor;
        }

        return product;
    }
}

Thursday 8 May 2008

LINQ-to-Console

In the last episode we made a heady brew of Attributes, Reflection and LINQ to discover all the Problem-solving classes in my Project Euler Console App. The aim was to be able present the user with a list of the problems we'd solved, so they could choose one to run.

I showed the beginning of the process - discovering the classes - and the end - running the solution chosen by the user; I thought there would be nothing interesting about the middle part - displaying the list, and getting the user's choice, but I was wrong. C#3.0 makes even Console programming interesting again!

In the first place we've got the task of taking the meta-data about each problem, and making, not a meal out of it, but a menu. If you remember, we have a list of EulerProblemSolverMetadata objects, each containing the problem number, its title, and the information need by the Reflection APIs to invoke the Solve method in the class. What we need is a nicely formatted string that we can display in the Console.

For that we can use the Enumerable.Aggregate method. This, as you know, works its way through a sequence, combining each item with earlier items in whatever way you specify. We'll use it to add a line for each item to a StringBuilder:

var problemsSummary = solversMetaData.Aggregate(
                       new StringBuilder(),
                       (sb, solverData) =>
                         sb.AppendFormat("{0}: {1}\n", solverData.Number, solverData.Title)
                      );
Console.WriteLine(problemsSummary);

Output is one thing: getting input from a user is another. If users aren't being deliberately evil and trying to crash a program, then they'll often do something stupid; even the best of us sometimes have "senior moments". All of which means that a good proportion of many code bases is taken up with sanitising user input.

I started off with something pretty standard, as follows:

Console.WriteLine("Which problem would you like to solve?");
EulerProblemSolverMetadata chosenSolver = null;

while (chosenSolver == null)
{
    string response = Console.ReadLine();
    int problemNumber = 0;
    bool validResponse = int.TryParse(response, out problemNumber);

    if (!validResponse)
    {
        Console.WriteLine("Please type in a number");
        continue;
    }

    // check whether there is a Solver for that problem number
    chosenSolver = solversMetaData.FirstOrDefault(solver => solver.Number == problemNumber);

    if (chosenSolver != null)
    {
        break;
    }
    else
    {
        Console.WriteLine("That problem number wasn't recognised");
        continue;
    }
}

return chosenSolver;

That while loop made me feel very uncomfortable. It needs to be there to give the user more than one chance at giving a sensible answer: but it would ruin my Functional Programming credentials if I let that stand as my final answer. So I pondered the problem a little while longer (no pun intended!), squinting at that loop to see if I could discern anything that could be made to look LINQish.

Eventually I realised that what I had was essentially a pipeline: the code repeatedly gets user input, subjects it to a set of checks and successive conversions, until finally a valid answer emerges at the other end. If at any stage the pipeline rejects the item, it starts over again. That's almost the cannonical form of a LINQ query: data source, filters, conversions. Once that thought had clicked in my mind, I wrote down what I wanted my code to look like:

var chosenSolver =
       Console.In.ReadLines()
          .Validate(input => input.IsInteger(), "Please type in a number")
          .Select(input => int.Parse(input))
          .Validate(problemNumber =>
                   solversMetaData.Any(metaData => metaData.Number == problemNumber),
                   "That problem number wasn't recognised"
                   )
          .Select(problemNumber =>
                  solversMetaData.First(metaData => metaData.Number == problemNumber)
                  )
          .First();

return chosenSolver;

To make this work, we need just two new extension methods: ReadLines is an extension method on TextReader returning an Iterator to generate a sequence containing succesive pieces of user input:

public static IEnumerable<string> ReadLines(this TextReader reader)
{
  while (true)
  {
     yield return reader.ReadLine();
  }
}

The Validate method is identical to Where in that it filters a sequence using a lambda expression; the only difference is that when Validate rejects an item, it writes out the message that we pass to it:

public static IEnumerable<T> Validate<T>(this IEnumerable<T> sequence, Func<T,bool> predicate, string invalidItemMessage)
{
  foreach (var item in sequence)
  {
    if (predicate(item))
    {
        yield return item;
    }
    else
    {
        Console.WriteLine(invalidItemMessage);
    }
  }
}

With these extension methods in place the LINQ query works identically to the code in the while loop above. Much more beautiful, though, in my opinion, and no loops!

The code can be found in the Program.cs class in the download from MSDN Code Gallery.

Monday 5 May 2008

LINQ to Reflection

Today we've reached a milestone. 10 Project Euler problems solved, using Functional C# 3.0. To celebrate, I've put the code for all the solutions up on Microsoft's Code Gallery.

You can find them at http://code.msdn.Microsoft.com/projecteuler. I intend to keep this up to date, as I add new solutions.

Before going public, I wanted to spruce up my code a little. It's all held in a Console App. As I was adding new solutions to my project, I'd been creating them as classes, each with their own public static void Main() method, and changing the Startup object in Visual Studio to point to the one I wanted to run. Not very elegant.

What I wanted was a menu that would list all the problems numbers and their titles, and allow the user to pick the one to solve. Being lazy (a great virtue in a programmer!), I didn't want to have to update this menu each time I added a new solution to the project. Let the program find out for itself what solutions exist, I thought.

But how to do that? Enter, Attributes. Attributes are a way of decorating code with additional information. You'll have used an attribute when ever you've marked a class as being Serializable: they are the things that live inside square brackets just before the thing you're decorating. They are often used in combination with Reflection. This is a powerful area of the .Net Base Class libraries which allow you to inspect your code whilst it is running, create classes on the fly, call methods by name, and all kinds of exciting (and just a little scary) things like that.

I decided to create an Attribute to show my program that a particular class is a Project Euler problem solver. The attribute will hold the problem number, and its title.

You define it like this:

public class EulerProblemAttribute : Attribute
    {
        public EulerProblemAttribute(int problemNumber)
        {
            ProblemNumber = problemNumber;
        }

        public int ProblemNumber
        {
            get;
            set;
        }

        public string Title
        {
            get;
            set;
        }
    }

Simple, isn't it. It's like any other class, the key point being that it inherits from Attribute. This is how you use it to decorate a class:

[EulerProblem(7, Title="Find the 10001st prime.")]
    public class Problem7
    {
        public void Solve()
        {
  ...

The next step is to write some code that will find all the classes in the project that are marked with this attribute. This provides a nice opportunity to demonstrate a very practical use of LINQ: you could call it LINQ-to-Reflection.

What I've done is to create a class called EulerProblemSolverMetadata which will hold the information about each solver. The Reflection APIs provide classes that represent the various aspects of code in assemblies. For instance Type holds information about types - classes, structs, etc. MethodInfo holds information about a Method - and allows that method to be invoked dynamically on an object of the appropriate type. So in my EulerProblemSolverMetadata I've put a property called Class which holds the type of the solver, and a property called Method which holds a MethodInfo object representing the "Solve" method which I've decided each solver must have.

Here's the LINQ query that will populate my metadata class for me:

var problemSolvers =
     from type in Assembly.GetExecutingAssembly().GetTypes()
     // filter out types that are not decorated with the EulerProblemAttribute
     where Attribute.IsDefined(type, typeof(EulerProblemAttribute))
     // ensure that the type has a parameterless public Solve method
     let method = type.GetMethod("Solve", new Type[] { })
     where method != null
     // get hold of the attribute, so that we can get the metadata from it
     let attribute = Attribute.GetCustomAttribute(type, typeof(EulerProblemAttribute)) as EulerProblemAttribute
     // sort the problems into ascending order
     orderby attribute.ProblemNumber ascending
     select new EulerProblemSolverMetadata
            {
                Class = type,
                Number = attribute.ProblemNumber,
                Title = attribute.Title,
                Method = method
            };

Now that we've discovered what Solvers are available, we can present the list to the user and ask them to choose one. Once they've made a choice we need to use the metadata to run the appropriate piece of code. To do that we use the Activator class to create an instance of the appropriate Solver type. Then we invoke it's Solve method:

object instance = Activator.CreateInstance(chosenSolver.Class);
chosenSolver.Method.Invoke(instance, new object[] { });

Easy, when you know how.

You can see the full code in the project on the MSDN code gallery.

Next time: LINQ-to-Console!

Project Euler Problem 10

If you remember, we solved Project Euler Problem 10 along with Problem 7. You can find that solution here.

This post is just to keep the sequence tidy!

Thursday 1 May 2008

Project Euler Problem 9

I've never seen a popularity contest for Mathematical theorems. If one was ever held, I'd bet on Pythagoras's Theorem taking the number one spot - because his is the only theorem that most people have ever heard of. I'd further bet (if I was a betting man) that a good number of them would even be able to quote it - having had it caned into them at school:

"The square on the hypotenuse is equal to the sum of the squares on the other two sides".

Or as my old maths teacher used to say,

"The squaw on the hippopotamus is equal to the sum of the squaws on the other two hides."

(He had a really terrible joke, with this as its punchline; if you're feeling desperate enough, you can read it, and a few others, here)

And so we come to today's problem, Problem 9, which deals with Pythagorean triplets. These are just sets of three integers (a,b,c) with a ^ 2 + b ^2 = c^2; basically, numbers that satisfy Pythagoras's theorem. The problem is to find the unique set of numbers where a + b + c = 1000, and a < b < c.

We'll use a brute force method to solve this. We just need to note that since a must be less than b, and a + b + c = 1000, neither a nor b can exceed 500.

Here's the solution, using C# query syntax:

class Problem9
    {
        public static void Main()
        {
            (
             from a in 1.To(500)
             from b in a.To(500)
             let c = 1000 - a - b
             where a * a + b * b == c * c
             select new { a, b, c, product = a * b * c }
            )
            .Single()
            .DisplayAndPause();
        }
    }

(You can find the code for the two extension methods , To, and DisplayAndPause in the last post)

We're using the two from clauses to generate all possible pairs of numbers (a,b) where a is between 1 and 500, and b is between a and 500. The let clause calculates the value of c, storing it as an intermediate result; The where clause makes sure that a,b and c are truly pythagorean.

In the select clause I'm generating a new anonymous type using the new {...} expression. Anonymous types are just a shorthand way of grouping together a set of properties, without going to the trouble of defining a class to hold them. In many cases you don't even need to give names to those properties: C# will infer names from the variables or properties that you assign into them. So the first 3 properties in this anonymous type will have names "a", "b", and "c"; for the fourth property, I have to stop being lazy and think of a name ("product"), because it gets its value from an expression ("a * b * c"), and C# isn't imaginative enough to pick a name for that.

If you want to use anonymous types yourself, you should note two things. Firstly, the properties on the type are read-only: you can't change their values once they've been initialised. Secondly, you can only easily use anonymous types within the scope of a method: you can't return them from methods for the simple reason that you don't know the name of their type - they're anonymous, duh! Actually, I lie. You can return them from methods, but only if you return them as object - but then you'd need to use Reflection to get hold of their properties.

Because there's a unique answer to this problem, the query expression between the brackets will generate a sequence containing just one element. The Single extension method will return this for us - and will even throw an exception if it finds that there's more than one, which is a useful check that we've got the query expression correct.

As a nice touch, anonymous types have their ToString() method overridden, so that when you display them, as I do in the DisplayAndPause extension method, you see the values of all their properties, like so:

{ a=???, b=???, c=???, product=????}

(except you'd see the answers, rather than the question marks I've put there to stop you cheating!).

Straight forward today, wasn't it?

Pages