Wednesday 4 March 2009

Practical LINQ #2: Reporting duplicate names

So my Widget editor can helpfully suggest default names, but it also lets users change the names of widgets. Ohh! Danger ahead: what if they give two (or more) widgets the same name? We can’t have that! The poor dears might confuse themselves. So we better make sure we warn them; that means we have to write code to detect duplicate names. And that gives me another opportunity to show some elegant LINQy C# for solving this real-life problem.

My requirements are simple: I want to generate an error for all widgets with a shared name, except for the first one with that name – no point in overburdening a user with error messages. I also want to ignore Widgets without a name – I’ve got another rule that deals with them.

The code is straightforward: group the widgets by name, and ignore any groups containing just one widget. Then for each group generate error messages for all items except the first. Translating that into proper English:

private static IEnumerable<TMessage> DetectDuplicates<T,TMessage>(IEnumerable<T> instances, Func<T, string> nameSelector, Func<T, TMessage> messageGenerator)
{
   var groupedDuplicates = instances
       .GroupBy(nameSelector)
       .Where(group => group.Count() > 1 && !string.IsNullOrEmpty(group.Key));

   var errors = groupedDuplicates
       .SelectMany(group => group.
                                Skip(1)
                                .Select(messageGenerator));

   return errors;
}

All the angle brackets in the method signature can mean only one thing: I’ve made the method generic, so it doesn’t just work with my Widgets; it will work with anything that has a property vaguely resembling a name. Along with your list of instances, you pass in a function that can extract a name from each of your objects, and a function that will generate the appropriate message for each duplicate. In fact it doesn’t just have to a message: you could return a rich error object if you wanted to be really sophisticated.

Here’s an example1:

static void Main()
{
   var widgets = new[]
                     {
                         new Widget {Name = "Super"},
                         new Widget {Name = "Competent"},
                         new Widget {Name = "Competent"}
                     };

   var errors = DetectDuplicates(
       widgets,
       w => w.Name,
       w => "There's already a widget called " + w.Name + ". Be more original!");

   foreach (var error in errors)
   {
       Console.WriteLine(error);
   }

   Console.ReadLine();
}

Footnotes

  1. The name of one of the widgets here was inspired by a brand of AEG oven that I spotted the other day: they named them “Competence”. I thought brand names were supposed to be aspirational!

4 comments:

Anonymous said...

5 stars.
I love looking at code written by other people.

I even executed this one, as I first had the impression group.Skip(1) was skipping the first group, when it was actually skipping only the first element of each group, of course.

Sævar said...
This comment has been removed by the author.
Jamie said...

It's quite strange, the proxy server I was going through was designating your feedburner rss link as porn. Though now that I am using the one on your site, updates come as normal.

This code - its been something that I've done before, except without the easiness of Linq. It will come in useful in the future, thanks for it! (I wonder, is it directly stealable?)

Unknown said...

Jamie,
The code is definitely stealable. All the code on this site is published under the Creative Commons license - as you can see on the right.

Post a Comment