Tuesday, 15 July 2008

Project Euler Problem 15: City Grids and Pascal's Triangle

Problem 15 in the Project Euler series asks us to count the number of routes through a 20x20 grid starting from the top left and travelling to bottom right, always moving either rightwards or downwards, never backtracking. I can give you the solution in one line of code; explaining why it is the solution will take me many more lines of prose. The solution is:

return PascalsTriangle.GetEntry(2*20, 20);

(OK, you caught me: PascalsTriangle isn't a C# primitive type, so I guess I'll have to give some code for that as well. But all in good time.)

So what does Pascal's triangle have to do with City Grids? This is one of the things that I love about mathematics: seemingly unrelated problems often have subtle and beautiful connections. The connections are completely hidden on the surface, but once you uncover them, link at a time, you wonder that you never spotted them immediately.

How should we go about counting the number of routes through the grid? Here, try with this smaller grid - take my copy and scribble on it all you like:

A 6x6 city grid

My first insight was to start counting, not at the start of the grid, but at the end. Rather than try to work out all the possible routes from the starting point, let's count how many routes there are from points near the end, then work backwards.

Let me show you.

How many routes are there from the point F6 (marked "End") to the same point, F6? Stupid question, obvious answer: one - just follow the complicated directions: "Stand Still". From F5 to F6 there's one route: "Go South one block". From E6 to F6, one route: "Go East one block". But if I'm stood at E5 and want to get to End I have two choices, I can either go "East one block, then south one block" or "South one block, then East one block".

In fact, from anywhere along column F there's only one route to the end ("Go South x blocks"), because we're not allowed to turn back west; and from anywhere along row 6 there's only one route ("Go East x blocks") because we're not allowed to turn up north. On the diagram I've drawn what we've got so far - blue shows the number of routes from each point:

CityGridv3

Now lets consider how to find the number of routes from other spots on the grid. Take E4, for example. If we go east to F4 then we already know that there's only one way we can go on from there. If, however, we go south one block to E5 then we know that there are two routes we can take from there. So this makes a total of three possible routes from E4. By the same argument, we can see that there are 3 possible routes from D5. Then we can apply the same reasoning to cell D4: three possible routes if we go east to E4 plus three possible routes if we go south to D5 make for a total of 6 ways to get from D4 to the end. Another diagram to summarise:

CityGridv4

You see the way this is going now, don't you? To work out the total number of routes from each intersection we add up the total number of routes from the intersections one block to the east, and one block to the south.

Now lets spin the grid about its centre to put the point marked End at the top, drop the gridlines, and write out the numbers by themselves. Like this:

PascalsTriangle

What have we got? Pascal's Triangle - a fascinating mathematical object d'art that has been rediscovered many times throughout history in many different contexts. (Wikipedia will be delighted to tell you more about it- there's more to it than meets the eye). You construct it by writing 1s down the diagonal edges of the triangle, then forming each cell by summing the two neighbours in the row above.

I've highlighted the cells in red, because they are the ones that particularly concern us. Imagine spinning the triangle back so that it aligns with the grid. Then you see that "2" is the number of routes through a 1x1 grid, and "6" is the number of routes through a 2x2 grid.

Let's label the cells of the grid. Call the apex of the triangle row 0, and call the first "1" on each row column 0. Thus the number of routes through a 1x1 grid appears in row 2, column 1; the number of routes through a 2x2 grid appears at row 4 column 2. From there we can take a flying un-mathematical leap to a conclusion and say that the number of routes through an n x n grid will appear in row 2n, column n of Pascal's Triangle. That explains my solution: obvious now?

But how to calculate the number in that cell? My first thought was to write code that builds the triangle row by row. But there's a quicker way. Using Proof by Wikipedia, we can show that P(r,c) = ((r + 1 - c) / c) * P(r, c - 1), where P(r, c) is the value of Pascal's triangle at row r, column c. In other words, to calculate the value of a cell at column, c, in a particular row, r, we multiply the value of the cell to the left by the fraction (r + 1 - c) / c.

A sequence defined recursively like that makes me think of the Unfold method for generating it. Remember that after you have given it a seed value, Unfold generates a sequence for you by applying the lambda you give it to the previous term in the sequence. So our mathematical definition becomes:

public class PascalsTriangle
{
    public static long GetEntry(int row, int column)
    {
        // the L suffix on "Entry = 1L" is to force Entry to have a long type
        return Functional.Unfold(new {Entry = 1L, Column = 1},
                                 previous =>
                                 new
                                     {
                                         Entry = (previous.Entry*(row + 1 - previous.Column))/previous.Column,
                                         Column = previous.Column + 1
                                     })
            .SkipWhile(item => item.Column <= column)
            .First()
            .Entry;
    }
}

Note the use of the anonymous type to pass an additional piece of information through to the next iteration: Entry holds the cell value, and Column holds an incrementing count of the number of columns that we've calculated.

For reference, here's equivalent code that uses looping rather than functional concepts:

public static long GetEntry(int row, int column)
{
    long current = 1;

    for (int i = 1; i <= column; i++ )
    {
        current = (current * (row + 1 - i)) / i;
    }

    return current;
}

The complete code is shown below, and can also be downloaded from my Project Euler on the MSDN Code Gallery.

[EulerProblem(15, Title = "Starting in the top left corner in a 20 by 20 grid, how many routes are there to the bottom right corner?")]
public class Problem15
{
    public long Solve()
    {
        return PascalsTriangle.GetEntry(2*20, 20);
    }
}

public class PascalsTriangle
{

    public static long GetEntry(int row, int column)
    {
        // the L suffix on "Entry = 1L" is to force Entry to have a long type
        return Functional.Unfold(new { Entry = 1L, Column = 1 },
                                 previous =>
                                 new
                                     {
                                         Entry = (previous.Entry * (row + 1 - previous.Column)) / previous.Column,
                                         Column = previous.Column + 1
                                     })
            .SkipWhile(item => item.Column <= column)
            .First()
            .Entry;
    }
}

public static class Functional
{
    public static IEnumerable<T> Unfold<T>(this T seed, Func<T, T> generator)
    {
        // include seed in the sequence
        yield return seed;

        T current = seed;

        // now continue the sequence
        while (true)
        {
            current = generator(current);
            yield return current;
        }
    }
}

15 comments:

Tristan Reid said...

Still loving your blog.
I did this one slightly differently: in the post preceding this one you introduced memoization. Holding this new hammer, it was hard for me to not see a nail, so I created a Tuple class so that I could make multi-parameter memoize methods. Then the solution becomes:
Func < int, int, long > getPascalTriangleEntry = null;
getPascalTriangleEntry = (int row, int col) = >
{
if (row < = 1 || col == 0 || col == row)
{
return 1;
}
return getPascalTriangleEntry(row - 1, col) + getPascalTriangleEntry(row - 1, col - 1);
};
getPascalTriangleEntry = getPascalTriangleEntry.Memoize();

Cheers,

-t.

Anonymous said...

what if it is a 3x4? so its not n x n its m x n?

Unknown said...

Anonymous,
That's left as an excercise to the reader :-). Seriously though, I have no idea - never thought about it. What are your thoughts?

Anonymous said...

There are 40 steps to any given path (we have to go 20 steps down and 20 steps right). Since the definition of backtracking makes any given move down indistinguishable from any other move down, and any given move right indistinguishable from any other move right, the answer is 40!/(20!20!)..

Art of Designing algm lies in looking at the less obvious facts..my soln is way faster than yours mate

Unknown said...

for n x m grid the solution is direct extension of your method:
paths(m,n) = binomial(n+m, m), which becomes the same as the solution above when m = n. Neat huh?

Btw, I didn't know the solution till I saw your grid labeling illustration. That made me say "ahah!" So thank you for that,
max

Anonymous said...

Like Tristan I also wanted to use my new-found hammer.. but used it to tackle the problem as written:

Func findRouteCount = null;
findRouteCount =
(x, y) =>
{
if (x == 0 && y == 0) return 1;
long routes = 0;
if (x != 0) routes = checked(routes + findRouteCount(x - 1, y));
if (y != 0) routes = checked(routes + findRouteCount(x, y - 1));
return routes;
};

findRouteCount = findRouteCount.Memoize();
return findRouteCount(20, 20);

Anonymous said...

I found it easy to just create pascals triangle and then count every line with odd amount of entries and take the middle entry at line number 20. You have to create the triangle with an integer array on each line because with a string, it can be confusing with whitespace, and numbers larger than 10 are more than one char long.

Otherworld said...
This comment has been removed by the author.
Otherworld said...

For some reason this was straightforward.
I only looked at 2x2 grid with 6 routes and I was like "Wait, the combination formula might work, since 4 nCr 2 is 6." (4 is twice a two, the dimension of a grid)
Then I tried 6 nCr 3 and got 20. After that my only problem was the overflow.
But yeah, math helps :)

Otherworld said...

For some reason this was straightforward.
I only looked at 2x2 grid with 6 routes and I was like "Wait, the combination formula might work, since 4 nCr 2 is 6." (4 is twice a two, the dimension of a grid)
Then I tried 6 nCr 3 and got 20. After that my only problem was the overflow.
But yeah, math helps :)

maxsu said...

for n x m grid the solution is direct extension of your method:
paths(m,n) = binomial(n+m, m), which becomes the same as the solution above when m = n. Neat huh?

Btw, I didn't know the solution till I saw your grid labeling illustration. That made me say "ahah!" So thank you for that,
max

Khaldac said...

(m+n)!/m!n! is the actual formula. 2n!/n!n! is merely a special case where m=n.

Traviskendall7 said...

An overnight express company must include ten cities on its route. How many different routes are possible, assuming that it matters in which order the cities are included in the routing?

agamemnus said...

Great maths solution, but your implementation is very long. Simply iteratively generates each new base, while rotating around two arrays. (one as new base, one as old base)

Freebasic code. 22 lines. Can be squished together to 15 or less.

dim as integer levels = 20
dim as integer curSize = 1
dim as ulongint c1, c2
dim as integer j
dim as ulongint ptr arrayA = callocate (1, sizeof(ulongint))
dim as ulongint ptr arrayB = 0
arrayA[0] = 1

for i as integer = 0 to levels * 2 - 1
 curSize += 1
 arrayB = reallocate (arrayB, curSize * sizeof(ulongint))
 for j = 0 to curSize - 1
  if j = 0           then c1 = 0 else c1 = arrayA[j-1]
  if j = curSize - 1 then c2 = 0 else c2 = arrayA[j]
  arrayB[j] = c1 + c2
 next j
 swap arrayA, arrayB
next i

print arrayA[(curSize+1)/2-1]
sleep
system

Michell-1999 said...

ok,i have this question my science teacher asked us today about a 6x6 grid and how many pathes from top left to bottom right,i didn't understand a thing you guys were talking about,but I also guess im failing this question,any help,my answer is due tomorrow,I hope one of you can help me!

Post a Comment