Friday, 1 May 2009

Practical LINQ #3: Compacting ranges of numbers

Here’s a problem that has popped up several times in one guise or another during the six years of my career:

Given a list of numbers, present it to the user in its most compact form.

So if you have 1, 3, 4, 5, 9, 11, 12, 13 it should be presented as 1, 3 – 5, 9, 11 – 13. It’s what your brain does automatically when you fill out the “Selected Pages” box in the Print dialog of Microsoft Word.

I’m sure you can imagine for yourself the kind of for-loop I crafted the last time I solved this problem. This time I wanted to make life more interesting.

The first thing is to make sure the numbers are marshalled in ascending order, with no duplicates in the ranks:

var preparedList = numbers.Distinct().OrderBy(x => x);

For the next step, I imagined into being a little Scanner class, with a nice fluent interface. I wrote down the following code, which was how I wanted the Scanner to work; then, with Resharper pointing out my omissions in glaring red text, I implemented the concepts to make it work:

public IEnumerable<Range> Compact(IEnumerable<int> numbers)
{
    var preparedList = numbers.Distinct().OrderBy(x => x);

    var scanner = new Scanner<int, Range>();

    scanner.ConfigurePattern()
        .StartMatchingWhenAny()
        .ContinueMatchingWhile((nextItem, matchedItems) => nextItem == matchedItems.Last() + 1)
        .Output(details => new Range(
                                    details.MatchedItems.First(),
                                    details.MatchedItems.Last()));

    return scanner.Scan(preparedList);
}

The idea is for the Scanner to walk a sequence of objects (ints in this case), producing from it a sequence of tokens (here, Range objects – simple structs with First and Last properties). The Scanner produces its output by matching patterns of contiguous input elements. The patterns are configured using a fluent interface:

  • the condition that must be met in order to start matching this pattern (any integer is accepted at the start of this simple pattern)
  • which items are considered part of the pattern: here an integer is accepted if it is only one bigger than the previous matched integer
  • how to generate a token from the matched part of the sequence

The internal workings of the scanner aren’t terribly interesting. See the download link below if you’re desperate for a look.

Now don’t laugh: String.Join is my discovery of the week. It concatenates an array of strings, putting a separator of your choice between them; leaving you free to fry bigger fish than how to put a comma after every item except the last.

So my final answer looks like this:

var numbers = new[] {1, 3, 4, 5, 9, 11, 12, 13};

var compacter = new RangeCompacter();
var ranges = compacter.Compact(numbers);

// create a pretty string, each range or number separated by a comma
var display = string.Join(
    ", ", 
    ranges
    .Select(range => range.ToString())
    .ToArray());

Assert.AreEqual("1, 3 - 5, 9, 11 - 13", display);

I’ve put all the code, including my Scanner class, on Code Gallery; there’s even a sprinkling of unit tests! If you find any other uses for it, do let me know.

3 comments:

Jamie said...

Ah, nice stuff

I remember, a long time ago, trying to make my own string.Split(), and not having it work very well, until my dad asked - "why not use string.Split?" *sigh*

Commenting still seems to not work on FF3, no matter what is tried, or maybe its just me.

Sævar said...

As always, a quality post.

Sam said...

Thanks guys. I'm sorry it's been a bit quiet round here lately!

Post a Comment