Wednesday, 9 April 2008

Bouncing on your tail

Anybody using Recursion to write functions, as we did in the last episode, should be aware of the major pitfall: Stack Overflow.

Do you know why Stack Overflow errors happen? Let me tell you a story.

Inside your computer, as Sean Hargreaves has informed us, are lots of little elves who scribble away at fantastic speeds, doing all the calculations that we programmers throw at them. Each elf has his own little pile of paper on which he jots the results of his sums. As you would expect, the elves need to be very methodical, and they follow a particular system.

Imagine that Charles, Chief Processing elf, is working his way through a function. He scribbles busily on his piece of paper, humming to himself as he notes down the result of each calculation (that's him you can hear if you stick your head next to the case). Then his instructions tell him that he's got to call another function. So he reaches out, grabs a fresh piece of paper, writes the name of the function at the top (and underlines it) together with the values of the parameters, and then plugs away at it. When he gets to the end, and has found his answer, he screws the paper up, and throws it over his shoulder (ever wondered why your case gets clogged up with so much dust?). He writes the answer on the piece of paper now at the top of the pile, right where he left off before the function call.

So you see, for each nested function call, his pile of paper gets one sheet taller. If we give poor Charles a recursive function to work on, the pile will grow and grow and grow, and he'll never be able to chuck any pieces away. As the pile grows, he has to stand on his chair to reach the top. Then he has to fetch a stepladder. The pile keeps on growing. Now he's standing on tip-toes, and then, as he reaches out to get a fresh sheet for his next function call, Catastrophe! He wobbles, the ladder falls, the pile topples. Stack Overflow!

(You can make all this tie up with what Wikipedia says if you know that Charles calls his pile of paper a "Call Stack", and each piece on it a "Stack Frame").

Some Compilers (like the F# compiler for instance) are kind to poor Charles, and change his instructions slightly to help him avoid accidents. It can do this when the final instruction in a function is to call another function and immediately return its result. This is called a Tail Call. If the last instruction of a recursive function is the recursive call, then we call that Tail Recursion. You'll notice that the function we created last time for factoring primes is tail recursive.

So what can the compiler do? Well, because the last instruction Charles has to carry out is to call the next function and then return its result, he knows that he doesn't need to keep his workings-out for the current function - he isn't going to need them again. So the compiler can tell Charles that, rather than starting a fresh sheet of paper, he can scrub his current one with his eraser and reuse it for the next function. To put it more boringly, the function is changed so that rather than making a call, it does a goto back to the top of the function, and thus avoids using a new Stack Frame.

Unfortunately, the C# compiler has no compassion for poor elves, and does nothing to help them avoid Stack Overflow (though in certain circumstances the .Net Just-In-Time compiler does help out a bit ). Sam to the rescue!

Not so long ago, I read about a technique called Trampolining. I thought at first that it might have been named that way by lazy geeks who wanted to make themselves sound athletic. I now know the real reason: trampolining makes functions bounce! You start with a tail-recursive function (that's important: it won't work for ordinary recursive functions). Then you change it so that rather then actually calling itself or returning a value it returns a message. One of two kinds of message in fact. The one kind of message says, "I want to bounce: please call me again with these parameters"; the other kind says, "I've had enough bouncing: here's the final answer". This new function obviously won't work by itself: it needs to be wrapped in a trampoline. The trampoline keeps on calling the function, each time using the the new set of parameters, until the function finally gives an answer.

This is what it looks like, applied to the prime factorisation algorithm from last time:

public long FindGreatestPrimeFactor(long factor, long n)
{

  var function = Trampoline.MakeTrampoline((long factorGreaterThan, long number) =>
  {
      long upperBound = (long)Math.Ceiling(Math.Sqrt(number));


       // find next factor of number
      long nextFactor = Range(factorGreaterThan + 1, upperBound)
                              .SkipWhile(x => number % x > 0).FirstOrDefault();

      // if no other factor was found, then the number must be prime
      if (nextFactor == 0)
      {
          return Trampoline.ReturnResult<long,long,long>(number);
      }
      else
      {
          // find the multiplicity of the factor
          long multiplicity = Enumerable.Range(1, Int32.MaxValue)
                                  .TakeWhile(x => number % (long)Math.Pow(nextFactor, x) == 0)
                                  .Last();

          long quotient = number / (long)Math.Pow(nextFactor, multiplicity);

          if (quotient == 1)
          {
              return Trampoline.ReturnResult<long,long,long>(nextFactor);
          }
          else
          {
              return Trampoline.Recurse<long,long,long>(nextFactor, quotient);
          }
      }
  });

  return function(factor, n);
}

And here's the Trampoline class:

public static class Trampoline
{
  public static Func<T1,T2,TResult> MakeTrampoline<T1,T2,TResult>(Func<T1,T2,Bounce<T1,T2, TResult>> function)
  {
      Func<T1,T2,TResult> trampolined = (T1 arg1, T2 arg2) =>
      {
          T1 currentArg1 = arg1;
          T2 currentArg2 = arg2;

          while (true)
          {
              Bounce<T1,T2, TResult> result = function(currentArg1, currentArg2);

              if (result.HasResult)
              {
                  return result.Result;
              }
              else
              {
                  currentArg1 = result.Param1;
                  currentArg2 = result.Param2;
              }
          }
      };

      return trampolined;
  }


  public static Bounce<T1,T2, TResult> Recurse<T1,T2, TResult>(T1 arg1, T2 arg2)
  {
      return new Bounce<T1,T2, TResult>arg1, arg2);
  }

  public static Bounce<T1,T2, TResult> ReturnResult<T1,T2, TResult>(TResult result)
  {
      return new Bounce<T1,T2, TResult>result);
  }

}


public struct Bounce<T1,T2, TResult>
{
  public T1 Param1 {get; private set;}
  public T2 Param2 { get; private set; }

  public TResult Result { get; private set; }
  public bool HasResult { get; private set; }
  public bool Recurse { get; private set; }

  public Bounce(T1 param1, T2 param2) : this()
  {
      Param1 = param1;
      Param2 = param2;
      HasResult = false;

      Recurse = true;

  }

  public Bounce(TResult result) : this()
  {
      Result = result;
      HasResult = true;

      Recurse = false;

  }
}

MakeTrampoline is an example of a higher-order function: it's a function factory - a function that creates another function. It takes our modified recursive function - the one that returns messages - and creates from it the recursive version that actually returns the result.

The type of the parameter to MakeTrampoline, indicates that we're expecting a delegate, a pointer to a function of two arguments, returning an instance of the Bounce struct. Bounce is what I'm using to convey the messages from the wanna-be recursive function. It either holds the result of the recursion, or the two parameters that should be used for the next recursive call. You can see that I've created two factory methods on Trampoline, Recurse, and ReturnResult, to make the syntax for returning messages from the would-be recursive function a bit more obvious.

Inside the MakeTrampoline method, I've defined a closure. This is an anonymous function that captures some of the variables in the function that surrounds it. In this case, the closure captures the value of the function parameter. This means that the anonymous function - our trampoline- can then call function in a loop, each time doing as it requests: either simulating recursion by calling function again with the new set of parameters, or, ultimately, returning a value and jumping out of the loop. Because it is a function factory, MakeTrampoline returns a delegate pointing to this inner function, rather than actually calling it itself. Thus its up to calling functions, like FindGreatestPrimeFactor to call the newly minted function to get the result.

So what do you think? I've tried different variations on this theme to see if I can improve the usage syntax, but I haven't been able to come up with anything significantly better - if anybody has any ideas I'd be glad to hear them. In the source, you can find another overload of the MakeTrampoline method which takes a 1-parameter function. As you can see, it's straightforward to extend to Functions with any number of parameters, but will unfortunately involve code-duplication. I've tried to think of a way round that too, but the only methods I've come up with involve sacrificing some efficiency.

Now this Trampoline isn't a panacea. It won't save you from badly written recursive functions that never return: in that case you'll just exchange a StackOverflow exception for a hung application as the Trampoline goes into an infinite loop. And I don't recommend it for simple cases where you know that you've only got a few levels of recursion. Standard recursion will always perform better. But it might be useful in cases with very deep recursion, or in frameworks with less memory allocated to the Stack (I think the .Net Compact Framework is one such case). And this technique has suggested to me another, Lazy Trampolining, which I'll tell you about some other time.

The source for this article can be found here.

11 comments:

Anonymous said...

Just a small notice: you have two links pointing to Problem 4, when you are obviously modifying the method you used on Problem 3.

It confused me a bit in the begining. I was wondering what is so recursive about Problem 4 :)

Unknown said...

Dan,
Thanks for letting me know about this: it should be fixed now.

Sam

Anonymous said...

There's also a second wrong link in the begining, at
"Anybody using Recursion to write functions, as we did in the last episode, should be aware of the major pitfall: Stack Overflow".

Sorry, I tend to be a bit too perfectionist.

I should also add that I liked this article. Great explaining of the stack overflow issue.

Unknown said...

Dan,
I shall have to hire you as my proof-reader :-)

Keep the feedback coming - it's great to know that people are enjoying my articles and finding them useful. Please consider giving star ratings to any posts you particularly like (or dislike!).

Tristan Reid said...

Hey, just another comment to tell you that there are other people out there who enjoy your articles and find them useful. I'm in that intersection of people who like math and want to learn more functional programming. I'm working along on the Project Euler problems, and trying to only use your posts as a hint. It's really fun! Thanks again.

Unknown said...

Tristan,
Thanks. It's a real encouragement when I get comments like that. I shall have to get a move on with the next Project Euler post!

Sam

Anonymous said...

Hi Sam,

EXCELLENT article, immediately put you on my iGoogle rss.

I work with C# code generated with a DSL, so users can easily create stackoverflows. (Because they don;t know about all that.)

This trampolining approach seems like a possible solution to some of those problems.

Thanks, GJ

Unknown said...

Glad you liked it GJ!

Saliya Ekanayake said...

Nice post!

Andrey said...

Awesome. One of the clearest explanations I've even seen!

Samuel Jack said...

Thanks, Andrey!

Post a Comment