Friday 8 August 2008

Project Euler Problem 17: Converting numbers to words

Zimbabwe Prices: Picture Credit, SokwaneleFancy earning a Lotto prize of 1.2 quadrillion dollars? Then you should move to Zimbabwe! Don't be in too much of a hurry, though. By the time you'd brought it back home, your Zimbabwe-dollar haul would have shrunk to less than 4,000 US Dollars.

A couple of weeks back the BBC published an article talking about the problem of big numbers in Zimbabwe. Due to daily expenses measured in the trillions of dollars, with hyperinflation of 2,200,000% pushing that figure ever higher, the citizens of Zimbabwe have been googling to find the names for the next big numbers after trillions. Coincidentally, I was doing the same thing the day before I read the article as I went about solving Project Euler Problem 17.

The problem wants to know how many letters are needed to write out the numbers 1 to 1000 in words. Clearly this calls for an algorithm to convert numbers to words; since I always like to give a little extra to my loyal readers, I wanted to create an algorithm to handle numbers bigger than 1000. I ended up with one that can convert anything up to long.MaxValue: in words, nine quintillion, two hundred and twenty-three quadrillion, three hundred and seventy-two trillion, thirty-six billion, eight hundred and fifty-four million, seven hundred and seventy-five thousand, eight hundred and seven! Plenty powerful enough to write out Bill Gate's paychecks, and sufficient to keep the Zimbabweans going for the next few months!

So how does the algorithm work? A basic ingredient is a dictionary mapping the basic numbers to words. This covers the numbers 1 to 20, then every multiple of ten up to 100, then each larger number that has a special name. The key ingredient is a recursive function (functional programming had to come into it somewhere) that starts with the largest part of a number and repeatedly breaks it down into its components.

In Pseudo-code:

  1. Let n be the number to process
  2. For numbers smaller than 0, combine "Negative" with the result of calling the algorithm on the absolute value of n
  3. For n between 0 and 20, look up the result in the word dictionary
  4. For n between 20 and 100, calculate the number of 10s and number of units; look up the name for the number of tens; if there are any units, look up the appropriate name and combine the two results with a hyphen
  5. For n between 100 and 1000, calculate the number of 100s and the remainder; look up the name for the number of hundreds; if there is any remainder, recurse to get its wordy representation, then combine it with result for the hundreds using "and" to join the two parts
  6. For n bigger than 1000: decide which is the biggest named number (million, billion, etc.) that divides into n. Calculate the number of that base unit and the remainder. Recurse to convert the number of base units into words, and recurse to convert the remainder into words. Combine the two parts using ","

In C#:

public static class ConvertToWordsExtension
{
 private static Dictionary<long, string> _wordDictionary;
 private const int OneThousand = 1000;
 private const long OneMillion = 1000000;
 private const long OneBillion = 1000000000;
 private const long OneTrillion = 1000000000000;
 private const long OneQuadrillion = 1000000000000000;
 private const long OneQuintillion = 1000000000000000000;

 /// <summary>
 /// Converts a number to its English representation in words.
 /// </summary>
 /// <remarks>Uses the Short Scale for large numbers. See http://en.wikipedia.org/wiki/Long_and_short_scales for more details</remarks>
 public static string ConvertToWords(this long number)
 {
     EnsureWordDictionaryInitialised();

     if (number == long.MinValue)
     {
        throw new ArgumentOutOfRangeException();
     }

     return ConvertToWordsCore(number);
 }

 /// <summary>
 /// Converts a number to its English representation in words
 /// </summary>
 public static string ConvertToWords(this int number)
 {
     return ConvertToWords((long)number);
 }

 private static Dictionary<long, string> CreateWordDictionary()
 {
     return new Dictionary<long, string>
                {
                    {0, "zero"},
                    {1, "one"},
                    {2, "two"},
                    {3, "three"},
                    {4, "four"},
                    {5, "five"},
                    {6, "six"},
                    {7, "seven"},
                    {8, "eight"},
                    {9, "nine"},
                    {10, "ten"},
                    {11, "eleven"},
                    {12, "twelve"},
                    {13, "thirteen"},
                    {14, "fourteen"},
                    {15, "fifteen"},
                    {16, "sixteen"},
                    {17, "seventeen"},
                    {18, "eighteen"},
                    {19, "nineteen"},
                    {20, "twenty"},
                    {30, "thirty"},
                    {40, "forty"},
                    {50, "fifty"},
                    {60, "sixty"},
                    {70, "seventy"},
                    {80, "eighty"},
                    {90, "ninety"},
                    {100, "hundred"},
                    {OneThousand, "thousand"},
                    {OneMillion, "million"},
                    {OneBillion, "billion"},
                    {OneTrillion, "trillion"},
                    {OneQuadrillion, "quadrillion"},
                    {OneQuintillion, "quintillion"}
                };
 }

 private static void EnsureWordDictionaryInitialised()
 {
     // ensure thread-safety when caching our word dictionary
     // note: this doesn't prevent two copies of the word dictionary
     // being initialised - but that doesn't matter; only one would be
     // cached, the other garbage collected.
     if (_wordDictionary == null)
     {
         var dictionary = CreateWordDictionary();
         Interlocked.CompareExchange(ref _wordDictionary, dictionary, null);
     }
 }

 private static string ConvertToWordsCore(long number)
 {
     if (number < 0)
     {
         return "negative " + ConvertToWordsCore(Math.Abs(number));
     }

     // if between 1 and 19, convert to word
     if (0 <= number && number < 20)
     {
         return _wordDictionary[number];
     }

     // if between 20 and 99, convert tens to word then recurse for units
     if (20 <= number && number < 100)
     {
         return ProcessTens(number, _wordDictionary);
     }

     // if between 100 and 999, convert hundreds to word then recurse for tens
     if (100 <= number && number < OneThousand)
     {
         return ProcessHundreds(number, _wordDictionary);
     }

     if (OneThousand <= number && number < OneMillion)
     {
         return ProcessLargeNumber(number, OneThousand, _wordDictionary);
     }

     if (OneMillion <= number && number < OneBillion)
     {
         return ProcessLargeNumber(number, OneMillion, _wordDictionary);
     }

     if (OneBillion <= number && number < OneTrillion)
     {
         return ProcessLargeNumber(number, OneBillion, _wordDictionary);
     }

     if (OneTrillion <= number && number < OneQuadrillion)
     {
         return ProcessLargeNumber(number, OneTrillion, _wordDictionary);
     }

     if (OneQuadrillion <= number && number < OneQuintillion)
     {
         return ProcessLargeNumber(number, OneQuadrillion, _wordDictionary);
     }
     else
     {
         return ProcessLargeNumber(number, OneQuintillion, _wordDictionary);
     }
 }

 private static string ProcessLargeNumber(long number, long baseUnit, Dictionary<long, string> wordDictionary)
 {
     // split the number into number of baseUnits (thousands, millions, etc.)
     // and the remainder
     var numberOfBaseUnits = number / baseUnit;
     var remainder = number % baseUnit;
     // apply ConvertToWordsCore to represent the number of baseUnits as words
     string conversion = ConvertToWordsCore(numberOfBaseUnits) + " " + wordDictionary[baseUnit];
     // recurse for any remainder
                 conversion += remainder <= 0 ? ""
                            : (remainder < 100 ? " and " : ", ") + ConvertToWordsCore(remainder);
     return conversion;
 }

 private static string ProcessHundreds(long number, Dictionary<long, string> wordDictionary)
 {
     var hundreds = number / 100;
     var remainder = number % 100;
     string conversion = wordDictionary[hundreds] + " " + wordDictionary[100];
     conversion += remainder > 0 ? " and " + ConvertToWordsCore(remainder) : "";
     return conversion;
 }

 private static string ProcessTens(long number, Dictionary<long, string> wordDictionary)
 {
     Debug.Assert(0 <= number && number < 100);

     // split the number into the number of tens and the number of units,
     // so that words for both can be looked up independantly
     var tens = (number / 10) * 10;
     var units = number % 10;
     string conversion = wordDictionary[tens];
     conversion += units > 0 ? "-" + wordDictionary[units] : "";
     return conversion;
 }
}

Having coded the algorithm like that, I was casting around to find a way of making the ConvertToWordsCore method more compact. That was when I came across Igor Ostrovsky's post on the ternary conditional operation ("?") and was reminded of a neat trick. I re-wrote the method like this:

private static string ConvertToWordsCore(long number)
{
 return
     number < 0 ? "negative " + ConvertToWordsCore(Math.Abs(number))
         : 0 <= number && number < 20 ? _wordDictionary[number]
         : 20 <= number && number < 100 ? ProcessTens(number, _wordDictionary)
         : 100 <= number && number < OneThousand ? ProcessHundreds(number, _wordDictionary)
         : OneThousand <= number && number < OneMillion ? ProcessLargeNumber(number, OneThousand, _wordDictionary)
         : OneMillion <= number && number < OneBillion ? ProcessLargeNumber(number, OneMillion, _wordDictionary)
         : OneBillion <= number && number < OneTrillion ? ProcessLargeNumber(number, OneBillion, _wordDictionary)
         : OneTrillion <= number && number < OneQuadrillion ? ProcessLargeNumber(number, OneTrillion, _wordDictionary)
         : OneQuadrillion <= number && number < OneQuintillion ? ProcessLargeNumber(number, OneQuadrillion, _wordDictionary)
         : ProcessLargeNumber(number, OneQuintillion, _wordDictionary); // long.Max value is just over nine quintillion
}

That's much more compact, and to my eyes, a lot more elegant. The only downside I can see is that debugging might be more difficult. Which form do you prefer?

With this algorithm in place, solving Project Euler 17 is trivial:

[EulerProblem(17, Title="How many letters would be needed to write all the numbers in words from 1 to 1000?")]
public class Problem17
{
 public long Solve()
 {
     return
         1.To(1000)
             .Select(
                 number => number
                             .ConvertToWords()
                             .ToCharArray()
                             .Count(character => char.IsLetter(character))
             )
             .Sum();
 }
}

As always, full code (including some unit tests for the ConvertToWords method!) is available on my Project Euler code gallery page. You can also try out the algorithm in my handy online utility to convert numbers to words.

Update 27/5/09:Fixed bug where 'and' was missed out in numbers like 1006

10 comments:

Anonymous said...

Your code doesn't seem to convert thousands(or other large numbers) with a reminder under 100 but still over 0 to output the correct result. Isn't fx. 1042 supposed to output "[one] thousand and forty-two", not "[one] thousand, forty-two"?

Not that it matters in the case of project euler, but still something you should fix up.

Anyway; Awesome series on project euler you got here!

Unknown said...

Anonymous,
I wondered how long it would be before someone spotted this ;-). It should do as you say, I just haven't got round to fixing the code - solving new problems is more exciting than fixing bugs in solutions to old ones!

Thanks for reminding me anyway: now it's public, I might be shamed into fixing it!

Sam

Unknown said...

The "and" bug should be fixed now.

scias23 said...

hey, what's the formula for converting numbers up to 9999? i will code it on java.

Anonymous said...

This problem is quite boring compared to other Project Euler problems.

Anonymous said...

When you spell out numbers, you only use and where a decimal is. So in your example above 1042 would be one thousand forty-two, not one thousand and forty-two, and not one thousand, forty-two.
The and would be used for a number like 1042.2 - one thousand forty-two and two tenths.

Anonymous said...

sorry, I just found out that you crazy British may do it differently than we Americans do... my bad.

Jonathan Wood said...

Cool. I just did something similar. You can see my version at http://www.blackbeltcoder.com/Articles/strings/converting-numbers-to-words

Anonymous said...

wouldn't it be quicker to use a dictionary like this:
{0, 4},
{1, 3},
{2, 3},
{3, 5},
{4, 4},
{5, 4},
with the number, then the length of the word.
That's what I did.

balthazzarvagle said...

Because we don’t work for the on line casino, we can’t tell you when MyBookie is planning their next replace. As a designated VIP member at MyBookie, you’ll take pleasure in exclusive access to month-to-month on line casino cashback, never-before-seen bonuses, additional cost options, and much sooner payouts. If you’re using the MyBookie promo code MYB100, you’ll need to make an initial deposit of $50 or extra. You also needs to|must also} conscious of|concentrate on|pay attention to} MyBookie’s 10x rollover requirement. You have 30 full days to beat the clock, but we’d suggest an eye fixed on|maintaining a tally of|keeping track of} 먹튀사이트 먹튀프렌즈3 your free play steadiness.

Post a Comment