Regular expressions are a great, albeit sometimes overused, tool for parsing and extracting text data. Sometimes I find myself looking at a list of strings that I want to run a regular expression against. I could write a for loop or wrap a Regex match in a lambda, but both interrupt the flow and require a fair bit of boilerplate.
Instead, I decided to write several simple extension methods to make it easier on myself. Note that these rely on the [generic]System.Text.RegularExpressions[/generic] namespace.
[csharp]
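/// <summary>
/// Returns the first match from each string that matches the pattern.
/// </summary>
/// <param name="items">An enumerable set of strings.</param>
/// <param name="pattern">A regex pattern.</param>
public static IEnumerable<Match> Matches(this IEnumerable<string> items, string pattern)
{
    return Matches(items, new Regex(pattern));
}
/// <summary>
/// Returns the first match from each string that matches the pattern, using the given options.
/// </summary>
/// <param name="items">An enumerable set of strings.</param>
/// <param name="pattern">A regex pattern.</param>
/// <param name="options">The regex options to use.</param>
public static IEnumerable<Match> Matches(this IEnumerable<string> items, string pattern, RegexOptions options)
{
    return Matches(items, new Regex(pattern, options));
}
/// <summary>
/// Returns the first match from each string that the regex matches; non-matching strings are skipped.
/// </summary>
/// <param name="items">An enumerable set of strings.</param>
/// <param name="regex">Regex to run.</param>
public static IEnumerable<Match> Matches(this IEnumerable<string> items, Regex regex)
{
    foreach (var item in items)
    {
        var m = regex.Match(item);
        // Only successful matches are yielded, so callers never see failed Match objects.
        if (m.Success)
            yield return m;
    }
}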
[/csharp]
Here’s a simple usage example that parses a list of measurements encoded as text with different unit suffixes. Anything in millimeters is converted to inches; values already in inches pass through unchanged, and strings that don’t match the pattern are skipped.
[csharp]
var strings = new[] { "3.453\"", "4.2343 inches", "2.34 mm", "19.3 in", "13 milimeters", "hello world" };
var pattern = @”^[ ](?
var inches = strings.Matches(pattern)
    .Select(m => Convert.ToDecimal(m.Groups["number"].Value) * (m.Groups["mm"].Success ? 0.0393701m : 1m));
foreach (var item in inches)
{
    Console.WriteLine("{0:0.000#}\"", item);
}
[/csharp]
This produces the following output:
[raw linenumbers="false"]
3.453"
4.2343"
0.0921"
19.300"
0.5118"
[/raw]
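The pattern in the listing above is cut off, so here is a minimal, self-contained sketch of the same example that should compile as a console program. The pattern is an assumption on my part, not the original: it only needs a `number` group and an `mm` group that succeeds for millimeter suffixes. The class names `RegexExtensions` and `Program` and the helper `ToInches` are likewise made up for illustration, and I parse with the invariant culture so the decimal point behaves the same on any machine.

[csharp]
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical class name; extension methods must live in a static class.
public static class RegexExtensions
{
    public static IEnumerable<Match> Matches(this IEnumerable<string> items, Regex regex)
    {
        foreach (var item in items)
        {
            var m = regex.Match(item);
            if (m.Success)
                yield return m;
        }
    }
}

public static class Program
{
    // Assumed pattern (not the original): a number, optional whitespace,
    // then either a millimeter suffix (captured as "mm") or an inch suffix.
    const string Pattern =
        @"^\s*(?<number>\d+(?:\.\d+)?)\s*(?:(?<mm>mm|milimeters)|inches|in|"")\s*$";

    // Hypothetical helper: converts every parsable measurement to inches.
    public static IEnumerable<decimal> ToInches(IEnumerable<string> strings)
    {
        return strings.Matches(new Regex(Pattern))
            .Select(m => decimal.Parse(m.Groups["number"].Value, CultureInfo.InvariantCulture)
                         * (m.Groups["mm"].Success ? 0.0393701m : 1m));
    }

    public static void Main()
    {
        var strings = new[] { "3.453\"", "4.2343 inches", "2.34 mm", "19.3 in", "13 milimeters", "hello world" };
        foreach (var item in ToInches(strings))
        {
            Console.WriteLine(string.Format(CultureInfo.InvariantCulture, "{0:0.000#}\"", item));
        }
    }
}
[/csharp]

With this assumed pattern, the program should print the same five lines shown above and skip "hello world", since it never matches.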
You could also extend this for other purposes such as excluding certain lines from a text report, flattening match collections across multiple matches, etc. Feel free to share your own creations in the comments.
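As a rough sketch of those two ideas, something like the following might work. The class name `RegexFilterExtensions` and the method names `WhereNotMatch` and `AllMatches` are hypothetical, picked only for this illustration. The first drops every string the regex matches; the second flattens all matches from all strings into one sequence using `Regex.Matches`.

[csharp]
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical class and method names, for illustration only.
public static class RegexFilterExtensions
{
    // Keeps only the strings the regex does NOT match,
    // e.g. to drop comment lines from a text report.
    public static IEnumerable<string> WhereNotMatch(this IEnumerable<string> items, Regex regex)
    {
        return items.Where(item => !regex.IsMatch(item));
    }

    // Yields every match in every string, flattened into a single sequence.
    public static IEnumerable<Match> AllMatches(this IEnumerable<string> items, Regex regex)
    {
        return items.SelectMany(item => regex.Matches(item).Cast<Match>());
    }
}
[/csharp]

For example, given the lines `"# header"`, `"a=1 b=2"` and `"c=3"`, calling `WhereNotMatch(new Regex(@"^#"))` leaves the last two lines, and `AllMatches(new Regex(@"(?<key>\w+)=(?<value>\d+)"))` on those yields three matches. Both methods are lazy, just like `Matches`, so nothing runs until you enumerate the result.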