Regular Expressions in C# – Part 5 – Groups

In regular expressions we can use groups to capture the parts of the subject that match sub-patterns of our pattern. Groups are written with parentheses '()'. Each group is also available on the Match object, so we can retrieve the matched text per group. Groups are numbered from left to right by their opening parenthesis, starting at 1; group 0 always holds the match for the whole pattern.

Access to group match values

We will test some real-life examples of regular expression patterns with groups. But before we do that we need a little helper to print out the actual group matches. We do this by printing the Groups collection of the Match object.

using System.Diagnostics;
using System.Text.RegularExpressions;
 
namespace RegularExpressions.Tests.Helpers
{
    public class DebugWriter
    {
        public static void WriteGroups(Match match)
        {
            var index = 0;
 
            foreach (var group in match.Groups)
            {
                Debug.WriteLine("Group {0}: {1}", index, group);
                index++;
            }
        }
    }
}

Match a Postal Code

In the Netherlands we use a postal code format of four digits and two uppercase alphabetic characters (1234 AB). Sometimes there is a space between the numeric and alphabetic characters… sometimes not, but both are valid postal codes. See the test below for a regular expression that matches both occurrences and returns the numeric and alphabetic part as separate groups.

using System.Text.RegularExpressions;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using RegularExpressions.Tests.Helpers;
 
namespace RegularExpressions.Tests.Part05
{
    [TestClass]
    public class Groups
    {
        [TestMethod]
        public void Match_PostalCode_With_Space_Character()
        {
            const string pattern = @"^([0-9]{4}) ?([A-Z]{2})";
            const string subject = "4841 AB";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteGroups(match);
            }
 
            Assert.AreEqual(1, matches.Count);
 
            // Debug Trace:
            // Group 0: 4841 AB
            // Group 1: 4841
            // Group 2: AB
        }
 
        [TestMethod]
        public void Match_PostalCode_Without_Space_Character()
        {
            const string pattern = @"^([0-9]{4}) ?([A-Z]{2})";
            const string subject = "4841AB";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteGroups(match);
            }
 
            Assert.AreEqual(1, matches.Count);
 
            // Debug Trace:
            // Group 0: 4841AB
            // Group 1: 4841
            // Group 2: AB
        }
    }
}

The first group is a sequence of four digits ([0-9]{4}). The second group is a pair of uppercase alphabetic characters ([A-Z]{2}). Between these groups we have a space character followed by a question mark ' ?' to express that the space is optional. Three groups are returned: the whole match (group 0), the first group and the second group. Both tests return the same groups 1 and 2; only group 0 differs, because it contains the space when one is present.
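To read the groups in code rather than from the debug trace, we can index the Groups collection by number or by name. The sketch below is only an illustration: the class name, the group names digits and letters, and the trailing $ are assumptions that are not part of the tests above. The $ is added so that input such as "4841 ABC" is rejected.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PostalCodeGroupsSketch
{
    // Hypothetical variant of the pattern above: named groups plus a
    // trailing $ so that a third letter makes the match fail.
    private static readonly Regex PostalCode =
        new Regex(@"^(?<digits>[0-9]{4}) ?(?<letters>[A-Z]{2})$");

    // Returns "digits|letters" for a matching postal code, or null otherwise.
    public static string Split(string subject)
    {
        Match match = PostalCode.Match(subject);
        if (!match.Success)
        {
            return null;
        }

        // Groups can be read by name or by number; Groups[0] is the whole match.
        string digits = match.Groups["digits"].Value;
        string letters = match.Groups[2].Value;
        return digits + "|" + letters;
    }

    public static void Main()
    {
        Console.WriteLine(Split("4841 AB"));          // 4841|AB
        Console.WriteLine(Split("4841AB"));           // 4841|AB
        Console.WriteLine(Split("4841 ABC") == null); // True
    }
}
```

Named groups still get a number, counted from left to right, so Groups[2] and Groups["letters"] refer to the same group here.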

Regular Expressions in C# – Part 4 – Wild Character

We use the dot character '.' to match any character in a regular expression pattern. It is called a wildcard character. This includes spaces, but by default not the newline character. If we want to match only at word boundaries we use the \b anchor. If the characters between these boundaries must be word characters, we can use the shorthand \w instead of the dot. For ASCII text \w corresponds to [a-zA-Z0-9_]; in .NET it also matches other Unicode letters and digits by default. Here are a few examples.

using System.Text.RegularExpressions;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using RegularExpressions.Tests.Helpers;
 
namespace RegularExpressions.Tests.Part04
{
    [TestClass]
    public class WildCharacters
    {
        [TestMethod]
        public void Each_Character_Produces_Match_Except_NewLine()
        {
            const string pattern = ".";
            const string subject = "boy\ngirl";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteMatch(match, subject);
            }
 
            Assert.AreEqual(7, matches.Count);
 
            // Debug Trace:
            // 0: 1: b
            // 1: 1: o
            // 2: 1: y
            // 4: 1: g
            // 5: 1: i
            // 6: 1: r
            // 7: 1: l
        }
 
        [TestMethod]
        public void Matches_Each_Boundary_Of_Three_Characters()
        {
            const string pattern = @"\b.{3}\b";
            const string subject = "man bear pig xx";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteMatch(match, subject);
            }
 
            Assert.AreEqual(3, matches.Count);
 
            // Debug Trace:
            // 0: 3: man
            // 9: 3: pig
            // 12: 3:  xx <- Note: space-x-x is also a match
        }
 
        [TestMethod]
        public void Matches_Each_Word_Of_Three_Characters()
        {
            const string pattern = @"\b\w{3}\b";
            const string subject = "man bear pig xx";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteMatch(match, subject);
            }
 
            Assert.AreEqual(2, matches.Count);
 
            // Debug Trace:
            // 0: 3: man
            // 9: 3: pig
        }
 
        [TestMethod]
        public void Matches_Each_Word_Of_Any_Length_Of_Characters_Starting_With_P()
        {
            const string pattern = @"p\w+";
            const string subject = "man bear pig\n pothole";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteMatch(match, subject);
            }
 
            Assert.AreEqual(2, matches.Count);
 
            // Debug Trace:
            // 9: 3: pig
            // 14: 7: pothole
        }
    }
}
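The dot skips the newline only by default. As a small sketch (the class and method names are made up for this illustration), passing RegexOptions.Singleline makes the dot match the newline character as well, so the same subject yields one more match:

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotAndNewlineSketch
{
    // Counts how often the dot matches in the subject with the given options.
    public static int CountDotMatches(string subject, RegexOptions options)
    {
        return Regex.Matches(subject, ".", options).Count;
    }

    public static void Main()
    {
        const string subject = "boy\ngirl";

        // Default: the newline at index 3 is skipped.
        Console.WriteLine(CountDotMatches(subject, RegexOptions.None));       // 7

        // Singleline: the dot also matches the newline.
        Console.WriteLine(CountDotMatches(subject, RegexOptions.Singleline)); // 8
    }
}
```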

Regular Expressions in C# – Part 3 – Anchors

In regular expressions we can use the circumflex character (^) to express the beginning of a line or string. If we want to express the end of a line or a string, we can use the dollar sign ($). By default both anchors apply to the subject as a whole; the m modifier (?m) switches to multiline mode, in which ^ and $ also match at the start and end of every line. The DebugWriter’s WriteMatch helper method from the previous part prints the matches found in the tests below.

using System.Text.RegularExpressions;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using RegularExpressions.Tests.Helpers;
 
namespace RegularExpressions.Tests
{
    [TestClass]
    public class Anchors
    {
        [TestMethod]
        public void Boy_Should_Be_Found_At_The_Start_Of_The_First_Line()
        {
            const string pattern = "^boy";
            const string subject = "boygirlboy\nboy";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteMatch(match, subject);
            }
 
            Assert.AreEqual(1, matches.Count);
 
            // boy is found only at the first line
            // and not after the newline character
 
            // Debug Trace:
            // 0: 3: boy
        }
 
        [TestMethod]
        public void Boy_Should_Be_Found_At_The_Start_Of_Multiple_Lines()
        {
            const string pattern = "(?m)^boy";
            const string subject = "boygirlboy\nboy";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteMatch(match, subject);
            }
 
            Assert.AreEqual(2, matches.Count);
 
            // Since this is a multiline string, boy is found
            // on both lines at the beginning
 
            // Debug Trace:
            // 0: 3: boy
            // 11: 3: boy
        }
 
        [TestMethod]
        public void Boy_And_Girl_Should_Be_Found_At_The_Start_Of_Multiple_Lines()
        {
            const string pattern = "(?m)^boy|^girl";
            const string subject = "boygirlboy\ngirl\nboy";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteMatch(match, subject);
            }
 
            Assert.AreEqual(3, matches.Count);
 
            // Since this is a multiline string, both boy and girl
            // are found on all lines at the beginning
 
            // Debug Trace:
            // 0: 3: boy
            // 11: 4: girl
            // 16: 3: boy
        }
 
        [TestMethod]
        public void Only_Boy_Should_Be_Found_At_The_Start_Of_Multiple_Lines()
        {
            const string pattern = "(?m:^boy)|^girl";
            const string subject = "boygirlboy\ngirl\nboy";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteMatch(match, subject);
            }
 
            Assert.AreEqual(2, matches.Count);
 
            // the m modifier restricted to 'boy' only
            // girl is not found after the first line
 
            // Debug Trace:
            // 0: 3: boy
            // 16: 3: boy
        }
 
        [TestMethod]
        public void Boy_Should_Be_Found_At_The_End_Of_The_Subject()
        {
            const string pattern = "boy$";
            const string subject = "boygirlboy\nboy\n";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteMatch(match, subject);
            }
 
            Assert.AreEqual(1, matches.Count);
 
            // boy is found at the end of the string. This may
            // be followed by a single newline character
 
            // Debug Trace:
            // 11: 3: boy
        }
 
        [TestMethod]
        public void Boy_Should_Be_Found_At_The_End_Of_Multiple_Lines()
        {
            const string pattern = "(?m)boy$";
            const string subject = "boygirlboy\nboy";
            var regEx = new Regex(pattern);
            MatchCollection matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteMatch(match, subject);
            }
 
            Assert.AreEqual(2, matches.Count);
 
            // boy is now found on all lines, since this
            // is a multiline string
 
            // Debug Trace:
            // 7: 3: boy
            // 11: 3: boy
        }
    }
}
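Instead of writing (?m) inside the pattern, multiline mode can also be switched on from code with RegexOptions.Multiline. The sketch below uses hypothetical names and reuses the subject from the tests above; note that the option applies to the whole pattern, unlike the scoped (?m:...) form.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineOptionSketch
{
    // Counts how often the given word appears at the start of the subject,
    // or at the start of any line when RegexOptions.Multiline is passed.
    public static int CountLineStarts(string subject, string word, RegexOptions options)
    {
        var regEx = new Regex("^" + Regex.Escape(word), options);
        return regEx.Matches(subject).Count;
    }

    public static void Main()
    {
        const string subject = "boygirlboy\nboy";

        Console.WriteLine(CountLineStarts(subject, "boy", RegexOptions.None));      // 1
        Console.WriteLine(CountLineStarts(subject, "boy", RegexOptions.Multiline)); // 2
    }
}
```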

Regular Expressions in C# – Part 2 – Matches and NextMatch

In the previous example on Regular Expressions we only matched the first occurrence of a pattern in a subject string. By using the NextMatch method on the Match object we can iterate through all subsequent matches in the subject. As the name implies, calling NextMatch returns the next Match object from the subject.

Printing match results

Let’s write a little helper to print out the match results. It’s a simple method printing the start position, length and contents of the match to the debug output. We can use this in our tests.

using System.Diagnostics;
using System.Text.RegularExpressions;
 
namespace RegularExpressions.Tests.Helpers
{
    public class DebugWriter
    {
        internal static void WriteMatch(Match match, string subject)
        {
            Debug.WriteLine("{0}: {1}: {2}",
                            match.Index,
                            match.Length,
                            subject.Substring(match.Index, match.Length));
        }
    }
}

Using NextMatch

Using the NextMatch method is pretty straightforward. Be aware that it can result in unexpected behavior if we are not careful with repetition operators like ? and *. These operators always produce a successful match, even where the repeated pattern is not found; in that case they simply return a zero-length match.

using System.Text.RegularExpressions;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using RegularExpressions.Tests.Helpers;
 
namespace RegularExpressions.Tests
{
    [TestClass]
    public class NextMatch
    {
        [TestMethod]
        public void Pattern_Should_Be_Found_Five_Times_With_Star()
        {
            const string pattern = "a*";
            const string subject = "aaaabcaa";
            var regEx = new Regex(pattern);
            int counter = 0;
 
            Match match = regEx.Match(subject);
 
            while (match.Success)
            {
                counter++;
                DebugWriter.WriteMatch(match, subject);
                match = match.NextMatch();
            }
 
            Assert.AreEqual(5, counter);
 
            // Debug Trace:
            // 0: 4: aaaa
            // 4: 0: 
            // 5: 0: 
            // 6: 2: aa
            // 8: 0: 
        }
 
        [TestMethod]
        public void Pattern_Should_Be_Found_Nine_Times_With_Questionmark()
        {
            const string pattern = "a?";
            const string subject = "aaaabcaa";
            var regEx = new Regex(pattern);
            int counter = 0;
 
            Match match = regEx.Match(subject);
 
            while (match.Success)
            {
                counter++;
                DebugWriter.WriteMatch(match, subject);
                match = match.NextMatch();
            }
 
            Assert.AreEqual(9, counter);
 
            // Debug Trace:
            // 0: 1: a
            // 1: 1: a
            // 2: 1: a
            // 3: 1: a
            // 4: 0: 
            // 5: 0: 
            // 6: 1: a
            // 7: 1: a
            // 8: 0:
        }
 
        [TestMethod]
        public void Pattern_Should_Be_Found_Two_Times_With_Plus()
        {
            const string pattern = "a+";
            const string subject = "aaaabcaa";
            var regEx = new Regex(pattern);
            int counter = 0;
 
            Match match = regEx.Match(subject);
 
            while (match.Success)
            {
                counter++;
                DebugWriter.WriteMatch(match, subject);
                match = match.NextMatch();
            }
 
            Assert.AreEqual(2, counter);
 
            // Debug Trace:
            // 0: 4: aaaa
            // 6: 2: aa
        }
    }
}
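If the zero-length matches are not wanted, one option is to keep the NextMatch loop and simply skip matches with a Length of zero. This is a minimal sketch with hypothetical names, not the only way to do it; switching to the + operator, as in the last test above, avoids the empty matches in the first place.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NonEmptyMatchesSketch
{
    // Walks all matches with NextMatch and keeps only those with text in them.
    public static List<string> NonEmptyMatches(string pattern, string subject)
    {
        var values = new List<string>();
        Match match = Regex.Match(subject, pattern);

        while (match.Success)
        {
            if (match.Length > 0)
            {
                values.Add(match.Value);
            }

            match = match.NextMatch();
        }

        return values;
    }

    public static void Main()
    {
        // "a*" produces five matches on this subject, three of them empty.
        List<string> values = NonEmptyMatches("a*", "aaaabcaa");
        Console.WriteLine(string.Join(",", values)); // aaaa,aa
    }
}
```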

Using Matches

A more convenient way of iterating through all matches is the Matches method of the Regex class. It returns a MatchCollection that we can, for instance, query with LINQ.

using System.Text.RegularExpressions;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using RegularExpressions.Tests.Helpers;
 
namespace RegularExpressions.Tests
{
    [TestClass]
    public class Matches
    {
        [TestMethod]
        public void Pattern_Should_Be_Found_Two_Times_With_Plus()
        {
            const string pattern = "a+";
            const string subject = "aaaabcaa";
            var regEx = new Regex(pattern);
            var matches = regEx.Matches(subject);
 
            foreach (Match match in matches)
            {
                DebugWriter.WriteMatch(match, subject);
            }
 
            Assert.AreEqual(2, matches.Count);
 
            // Debug Trace:
            // 0: 4: aaaa
            // 6: 2: aa
        }
    }
}
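To query the MatchCollection with LINQ on the .NET Framework we first need Cast<Match>(), because the collection only implements the non-generic IEnumerable there. The sketch below assumes nothing beyond System.Linq; the class and method names are made up for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchesWithLinqSketch
{
    // Returns the text of every match, in the order the matches were found.
    public static string[] MatchValues(string pattern, string subject)
    {
        return Regex.Matches(subject, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    // Returns the text of the longest match, or null when nothing matches.
    public static string LongestMatch(string pattern, string subject)
    {
        return Regex.Matches(subject, pattern)
                    .Cast<Match>()
                    .OrderByDescending(m => m.Length)
                    .Select(m => m.Value)
                    .FirstOrDefault();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", MatchValues("a+", "aaaabcaa"))); // aaaa,aa
        Console.WriteLine(LongestMatch("a+", "aaaabcaa"));                  // aaaa
    }
}
```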

Regular Expressions in C# – Part 1 – Basics

Since I will be writing a lot of validation code in the coming weeks, I decided to dive a little deeper into creating and using Regular Expressions. These little cryptic patterns are very useful when it comes to validating business rules on data fields. It just takes a little practice to create useful regular expression patterns. First the basics…

From MSDN – Regular expressions provide a powerful, flexible, and efficient method for processing text. The extensive pattern-matching notation of regular expressions enables you to quickly parse large amounts of text to find specific character patterns; to validate text to ensure that it matches a predefined pattern (such as an e-mail address); to extract, edit, replace, or delete text substrings; and to add the extracted strings to a collection in order to generate a report.

Using the Regex class in .NET

In the System.Text.RegularExpressions namespace we find the Regex class. Just new it up with some regular expression pattern string. Once instantiated you can’t change this pattern. Call the Match method and pass it whatever subject you want to match the pattern with. It returns a Match object. The Success property of this object tells us if we have a match.

var regEx = new Regex(pattern);
var match = regEx.Match(subject);
 
if (match.Success)
{
    var result = "We have a match";
}

Concatenation

Matching on a sequence of characters is fairly simple. Finding a concatenation (like ‘cat’) in a subject string is done like in the test code below. We just check to see the starting index of the match and the length of the match:

using System.Text.RegularExpressions;
using Microsoft.VisualStudio.TestTools.UnitTesting;
 
namespace RegularExpressions.Tests
{
    [TestClass]
    public class Concatenation
    {
        [TestMethod]
        public void Cat_Should_Be_Found()
        {
            const string pattern = "cat";
            const string subject = "dogcat";
            var regEx = new Regex(pattern);
 
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 3 && match.Length == 3);
        }
 
        [TestMethod]
        public void Cat_Should_Be_Found_First_Occurrence()
        {
            const string pattern = "cat";
            const string subject = "catdogcat";
            var regEx = new Regex(pattern);
 
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 0 && match.Length == 3);
        }
 
        [TestMethod]
        public void Bird_Should_Not_Be_Found()
        {
            const string pattern = "bird";
            const string subject = "dogcat";
            var regEx = new Regex(pattern);
 
            Match match = regEx.Match(subject);
 
            Assert.IsFalse(match.Success);
        }
    }
}

Alternation

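So what if we want to find a cat or a dog in a subject? We use the ‘|’ alternation sign. The engine scans the subject from left to right, and at each position it tries the alternatives in the order they are written. The first alternative that matches wins, even if a later alternative would match more text.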

using System.Text.RegularExpressions;
using Microsoft.VisualStudio.TestTools.UnitTesting;
 
namespace RegularExpressions.Tests
{
    [TestClass]
    public class Alternation
    {
        [TestMethod]
        public void Dog_Should_Be_Found()
        {
            const string pattern = "cat|dog";
            const string subject = "dogcat";
            var regEx = new Regex(pattern);
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 0 && match.Length == 3);
        }
 
        [TestMethod]
        public void Dogcat_Should_Be_Found()
        {
            const string pattern = "dogcat|cat";
            const string subject = "dogcat";
            var regEx = new Regex(pattern);
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 0 && match.Length == 6);
        }
 
        [TestMethod]
        public void Dog_Should_Be_Found_First()
        {
            const string pattern = "cat|dog|dogcat";
            const string subject = "dogcat";
            var regEx = new Regex(pattern);
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 0 && match.Length == 3);
        }
 
        [TestMethod]
        public void Catcat_Should_Be_Found()
        {
            const string pattern = "catcatcat|catcat|cat";
            const string subject = "dogcatcatdogdog";
            var regEx = new Regex(pattern);
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 3 && match.Length == 6);
        }
    }
}

Repetition

If we want to find a sequence repeated a number of times, we can use the ‘*’ or ‘+’ operators. The ‘*’ operator (zero or more) is always successful, even if the sequence does not occur; in that case it simply returns a zero-length match. The ‘+’ operator (one or more) is pickier and fails if the sequence does not occur at least once. We can search for a range by using the {0,2} notation. This will find the repetition zero, one or two times (largest first). The ‘?’ is shorthand for {0,1}, which is zero or one occurrence.

using System.Text.RegularExpressions;
using Microsoft.VisualStudio.TestTools.UnitTesting;
 
namespace RegularExpressions.Tests
{
    [TestClass]
    public class Repetition
    {
        [TestMethod]
        public void Character_A_Should_Be_Found_Once()
        {
            const string pattern = "a*";
            const string subject = "abc";
            var regEx = new Regex(pattern);
 
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 0 && match.Length == 1);
        }
 
        [TestMethod]
        public void Character_A_Should_Be_Found_Repeated_Four_Times()
        {
            const string pattern = "a*";
            const string subject = "aaaabcaa";
            var regEx = new Regex(pattern);
 
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 0 && match.Length == 4);
        }
 
        [TestMethod]
        public void Character_A_Should_Be_Found_Length_Zero()
        {
            const string pattern = "a*";
            const string subject = "bcdefgh";
            var regEx = new Regex(pattern);
 
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 0 && match.Length == 0);
        }
 
        [TestMethod]
        public void Aa_Should_Be_Found_Repeated_Twice()
        {
            const string pattern = "(aa)*";
            const string subject = "aaaaa";
            var regEx = new Regex(pattern);
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 0 && match.Length == 4);
        }
 
        [TestMethod]
        public void Dog_Should_Be_Found_Length_Zero()
        {
            // Repetition matching with * is always successful
            // Simply returns 0 as length (empty match)
 
            const string pattern = "(cat)*";
            const string subject = "dogcatcat";
            var regEx = new Regex(pattern);
 
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 0 && match.Length == 0);
        }
 
        [TestMethod]
        public void Catcat_Should_Be_Found_Range_Zero_To_Two()
        {
            const string pattern = "(cat){0,2}";
            const string subject = "catcatcatcatdog";
            var regEx = new Regex(pattern);
 
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 0 && match.Length == 6);
        }
 
        [TestMethod]
        public void Cat_Should_Be_Found_Exactly_Once()
        {
            const string pattern = "(cat){1}";
            const string subject = "catcatcatcatdog";
            var regEx = new Regex(pattern);
 
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Success);
        }
 
        [TestMethod]
        public void Cat_Should_Be_Found_Once()
        {
            // ? is shortcut for zero or once
            const string pattern = "(cat)?";
            const string subject = "catcatcatcatdog";
            var regEx = new Regex(pattern);
 
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Success);
        }
 
        [TestMethod]
        public void Catcat_Should_Be_Found_Twice_With_Plus()
        {
            // The + operator requires at least one occurrence
            // and doesn't return a zero length match
 
            const string pattern = "(catcat)+";
            const string subject = "dogcatcatcatcatdog";
            var regEx = new Regex(pattern);
 
            Match match = regEx.Match(subject);
 
            Assert.IsTrue(match.Index == 3 && match.Length == 12);
        }
    }
}

Pretty basic stuff huh? Next time we dive in deeper…

Binomial Probability Example in C#

A statistical binomial experiment consists of a fixed number of independent trials (Bernoulli trials), each with a binary outcome: success or failure. The probability of success is the same for every trial. E.g. we flip a fair or loaded coin ten times and it can land on either heads or tails. If we want to calculate the probability that we throw heads exactly three times out of these ten flips, we are asking for the binomial probability of this occurrence.

The formula

Here’s the formula: it’s the calculation of the probability that we have an outcome of x successes having n trials with a success probability of p. The value q is the probability of failure which is the same as 1-p.

P(x) = n! / ((n - x)! * x!) * p^x * q^(n - x)

Calculate factorials

We need to calculate the factorial a few times in this formula (like n!), so let’s tackle this first. The factorial of, for instance, the number 4 is equal to 4 * 3 * 2 * 1. So the math is pretty straightforward, as you can see in this recursive extension method:

public static int Factorial(this int x)
{
    return x <= 1 ? 1 : x * Factorial(x - 1);
}
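One caveat: because the method returns an int, it only gives correct results up to 12!. From 13! onward the multiplication silently overflows. The sketch below is a hypothetical variant (the name FactorialChecked is not from the original code) that wraps the multiplication in a checked expression, so an overflow throws an OverflowException instead of returning a wrong number.

```csharp
using System;

public static class CheckedFactorialSketch
{
    // Same recursion as the method above, but the multiplication is
    // checked, so overflow throws instead of wrapping around.
    public static int FactorialChecked(this int x)
    {
        return x <= 1 ? 1 : checked(x * FactorialChecked(x - 1));
    }

    public static void Main()
    {
        Console.WriteLine(12.FactorialChecked()); // 479001600

        try
        {
            13.FactorialChecked();
        }
        catch (OverflowException)
        {
            Console.WriteLine("13! does not fit in an int");
        }
    }
}
```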

Calculate Binomial Probability

The method below takes three arguments. First we want to know the number of trials in this experiment (e.g. number of coin flips). Second we need to know the probability of a successful outcome of each trial (e.g. heads). Finally we take the exact number of successful outcomes we want to calculate the probability of (e.g. three times heads in this experiment). All that’s left is to calculate the probability by applying the formula.

using System;
 
namespace Statistics
{
    public static class Distribution
    {
        public static double BinomialProbability(
            int trials, double successProbability, int successes)
        {
            int possibilities = trials.Factorial() /
                                ((trials - successes).Factorial() * successes.Factorial());
 
            double result = possibilities *
                            Math.Pow(successProbability, successes) *
                            Math.Pow(1 - successProbability, (trials - successes));
 
            return result;
        }
 
        public static int Factorial(this int x)
        {
            return x <= 1 ? 1 : x * Factorial(x - 1);
        }
    }
}

We take two steps to calculate the binomial probability in this method. First we calculate all possible sequences with the number of successful outcomes. For instance if we flip a fair coin five times there are ten possible outcome sequences with three times heads.

5! / ((5 - 3)! * 3!) = 120 / (2 * 6) = 10

We then take this value and multiply it by the probabilities of success and failure, each raised to the power of its number of occurrences in the sequence. In the example above this would be

10 * 0.5^3 * 0.5^2 = 10 * 0.125 * 0.25 = 0.3125

So the probability of getting exactly three heads in five flips of a fair coin is about 31%.
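The number of possible sequences can also be computed without factorials, which keeps the intermediate values small and sidesteps the int overflow that factorials run into beyond 12!. The following is a sketch of that alternative and not the method used in this post; the names BinomialSketch, Choose and Probability are made up for the illustration.

```csharp
using System;

public static class BinomialSketch
{
    // Computes "n choose k" multiplicatively. After step i the running
    // value equals (n - k + i) choose i, so every division is exact.
    public static long Choose(int n, int k)
    {
        if (k < 0 || k > n)
        {
            return 0;
        }

        if (k > n - k)
        {
            k = n - k; // symmetry: n choose k equals n choose (n - k)
        }

        long result = 1;
        for (int i = 1; i <= k; i++)
        {
            result = checked(result * (n - k + i)) / i;
        }

        return result;
    }

    // Binomial probability of exactly `successes` successes in `trials` trials.
    public static double Probability(int trials, double successProbability, int successes)
    {
        return Choose(trials, successes) *
               Math.Pow(successProbability, successes) *
               Math.Pow(1 - successProbability, trials - successes);
    }

    public static void Main()
    {
        Console.WriteLine(Choose(5, 3));   // 10
        Console.WriteLine(Choose(20, 10)); // 184756, although 20! overflows an int

        // Five flips of a fair coin, exactly three heads: 10 * 0.125 * 0.25
        double p = Probability(5, 0.5, 3);
        Console.WriteLine(p == 0.3125);    // True
    }
}
```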

Happy path testing

using System.Diagnostics;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Statistics;
 
namespace StatisticsTests
{
    [TestClass]
    public class DistributionTests
    {
        [TestMethod]
        public void BinomialProbabilityFair()
        {
            const int trials = 10;
            const double successProbability = 0.5;
            const int successes = 5;
 
            double result = Distribution.BinomialProbability(
                trials, successProbability, successes);
 
            Assert.AreEqual(expected: 0.24609375, actual: result);
            Debug.WriteLine("Binomial Probability fair coin is {0}", result);
        }
 
        [TestMethod]
        public void BinomialProbabilityLoaded()
        {
            const int trials = 10;
            const double successProbability = 0.8;
            const int successes = 5;
 
            double result = Distribution.BinomialProbability(
                trials, successProbability, successes);
 
            Assert.AreEqual(expected: 0.0264241152, actual: result, delta: 1e-12);
            Debug.WriteLine("Binomial Probability loaded coin is {0}", result);
        }
 
        [TestMethod]
        public void Factorial_One()
        {
            int result = 1.Factorial();
 
            Assert.AreEqual(expected: 1, actual: result);
            Debug.WriteLine("Factorial of 1 is {0}", result);
        }
 
        [TestMethod]
        public void Factorial_Five()
        {
            int result = 5.Factorial();
 
            Assert.AreEqual(expected: 120, actual: result);
            Debug.WriteLine("Factorial of 5 is {0}", result);
        }
 
        [TestMethod]
        public void Factorial_Zero()
        {
            int result = 0.Factorial();
 
            Assert.AreEqual(expected: 1, actual: result);
            Debug.WriteLine("Factorial of 0 is {0}", result);
        }
 
        [TestMethod]
        public void Factorial_MinusOne()
        {
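            // Parentheses are required: -1.Factorial() parses as -(1.Factorial()).
            // The factorial is undefined for negative numbers; this
            // implementation simply returns 1 for every x <= 1.
            int result = (-1).Factorial();

            Assert.AreEqual(expected: 1, actual: result);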
            Debug.WriteLine("Factorial of -1 is {0}", result);
        }
    }
}


Simple Claims based Identity in .NET 4.5

For years we used the role-based Identity and Principal constructs built into the .NET Framework, like WindowsIdentity and GenericIdentity. Today’s systems are becoming more and more loosely coupled, often with separate authorization mechanisms. This requires a much more flexible way of expressing identity. That is why Microsoft made claims-based identity an integral part of .NET as of version 4.5.

What is a claim

We can look at an identity claim as being a statement about something or someone, made by something or someone else. I could say that you are a developer for instance. That’s me claiming that you’re a dev, which is some kind of role claim. Google could make a claim about your email address. An internal HR system claims your name and age. These claims could all be a part of your identity. It is all about trusting the claim issuer.
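In code, the party making the statement is recorded in the Issuer property of a Claim. The sketch below uses the Claim constructor that takes a type, value, value type and issuer; the class name, the email address and the issuer string are placeholders invented for this illustration.

```csharp
using System;
using System.Security.Claims;

public static class ClaimIssuerSketch
{
    // Creates an email claim and records who is making the statement.
    public static Claim CreateEmailClaim(string email, string issuer)
    {
        return new Claim(ClaimTypes.Email, email, ClaimValueTypes.String, issuer);
    }

    public static void Main()
    {
        // "HypotheticalIssuer" stands in for whoever vouches for the address.
        Claim claim = CreateEmailClaim("someone@example.org", "HypotheticalIssuer");

        Console.WriteLine(claim.Type);
        Console.WriteLine(claim.Value);
        Console.WriteLine(claim.Issuer);
    }
}
```

Code that consumes the claim can inspect Issuer to decide whether it trusts the statement.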

ClaimsIdentity and ClaimsPrincipal

Based on the IIdentity and IPrincipal interfaces, Microsoft added the ClaimsIdentity and ClaimsPrincipal classes to work with our new and existing code. We can find them in the System.Security.Claims namespace. Let’s take a look at how to create and use claims in an example project.

using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;
using System.Security.Principal;
using System.Threading;
using Microsoft.VisualStudio.TestTools.UnitTesting;
 
namespace ClaimsBasedIdentity
{
    [TestClass]
    public class ClaimTests
    {
        [TestInitialize]
        public void SetupClaims()
        {
            var claims = new List<Claim>
                {
                    new Claim("UserName", "lvbokhorst"),
                    new Claim(ClaimTypes.Name, "Leon van Bokhorst"),
                    new Claim(ClaimTypes.Email, "leonvanbokhorst@ilovehadoop.org"),
                    new Claim(ClaimTypes.Role, "nerds"),
                    new Claim("http://remondo.claims/room", "mancave")
                };
 
            var claimsIdentity = new ClaimsIdentity(
                claims, "Basic", "UserName", ClaimTypes.Role);
 
            Thread.CurrentPrincipal = new ClaimsPrincipal(claimsIdentity);
        }

In the SetupClaims method of this test class we first create a list of claims. Some use predefined ClaimTypes, like Name, Email and Role. The room and UserName claims use custom types.

With this list of claims we build a ClaimsIdentity. We pass it the claims list and specify the authentication type. We also specify which claim type supplies the Name property of the identity and which claim type expresses roles. This keeps code written for .NET 4.0 and earlier working, as the test below shows. From this ClaimsIdentity we create a ClaimsPrincipal object and assign it to Thread.CurrentPrincipal, as we always did.
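The effect of the name type argument is easiest to see next to the default. The sketch below is a stand-alone illustration, not part of the test class; NameClaimTypeSketch and its methods are hypothetical names. It relies on ClaimsIdentity falling back to ClaimTypes.Name when no name type is given.

```csharp
using System.Collections.Generic;
using System.Security.Claims;

public static class NameClaimTypeSketch
{
    // Two claims that could both serve as "the name" of the identity.
    private static List<Claim> SampleClaims()
    {
        return new List<Claim>
        {
            new Claim("UserName", "lvbokhorst"),
            new Claim(ClaimTypes.Name, "Leon van Bokhorst")
        };
    }

    // Name is taken from the custom "UserName" claim.
    public static string NameWithCustomType()
    {
        var identity = new ClaimsIdentity(
            SampleClaims(), "Basic", "UserName", ClaimTypes.Role);
        return identity.Name;
    }

    // Without a name type, the identity uses ClaimTypes.Name.
    public static string NameWithDefaultType()
    {
        var identity = new ClaimsIdentity(SampleClaims(), "Basic");
        return identity.Name;
    }
}
```

NameWithCustomType returns the user name, while NameWithDefaultType returns the full name from the ClaimTypes.Name claim.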

No legacy code broken

Since we specified in the ClaimsIdentity which ClaimType should be used for the identity name and principal roles, all of our existing code should still work. The test below proves it… we can still use the Identity property and the IsInRole method of the principal.


        [TestMethod]
        public void Using_Claims_In_Existing_Code()
        {
            const string notExpectedName = "Leon van Bokhorst";
            const string expectedName = "lvbokhorst";
            const string expectedRole = "nerds";
 
            IPrincipal principal = Thread.CurrentPrincipal;
 
            Assert.AreNotEqual(notExpectedName, principal.Identity.Name);
            Assert.AreEqual(expectedName, principal.Identity.Name);
            Assert.IsTrue(principal.IsInRole(expectedRole));
        }

Identity flexibility

The test below shows the much more flexible way of dealing with identity by using claims. We now have a much broader range of types, name-value pairs and lists to express these identity claims. We can use predefined query methods, like FindAll and FindFirst. We can query the Claims list with LINQ, or check whether a claim exists at all with HasClaim.

        [TestMethod]
        public void Using_Claims()
        {
            const string expectedName = "lvbokhorst";
 
            var principal = ClaimsPrincipal.Current;
 
            Claim userName = principal.FindFirst("UserName");
            Claim role = principal.Claims.First(c => c.Type == ClaimTypes.Role
                                                     && c.Value.StartsWith("nerd"));
            bool hasEmailClaim = principal.HasClaim(c => c.Type == ClaimTypes.Email);
 
            Assert.AreEqual(expectedName, userName.Value);
            Assert.IsTrue(principal.IsInRole(role.Value));
            Assert.IsTrue(hasEmailClaim);
        }
    }
}
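The test above uses FindFirst, LINQ and HasClaim. FindAll is useful when a claim type occurs more than once, for example with several roles. The following is a minimal sketch under that assumption; ClaimQuerySketch and the second role "bloggers" are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;

public static class ClaimQuerySketch
{
    // Builds a principal with two role claims ("bloggers" is hypothetical).
    public static ClaimsPrincipal BuildPrincipal()
    {
        var claims = new List<Claim>
        {
            new Claim("UserName", "lvbokhorst"),
            new Claim(ClaimTypes.Role, "nerds"),
            new Claim(ClaimTypes.Role, "bloggers")
        };

        return new ClaimsPrincipal(
            new ClaimsIdentity(claims, "Basic", "UserName", ClaimTypes.Role));
    }

    // FindAll returns every claim of the given type, not just the first one.
    public static List<string> RolesOf(ClaimsPrincipal principal)
    {
        return principal.FindAll(ClaimTypes.Role)
                        .Select(c => c.Value)
                        .ToList();
    }
}
```

Note that FindFirst returns null when no claim of the requested type exists, so check the result before reading its Value.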

A very nice implementation of Claims based Identity.

Create a Secure Password Hash with BCrypt

Right off the bat: there’s no such thing as a secure password hash. But we can make an attacker’s life harder if we try. For years we used (and sometimes still use) algorithms like MD5, SHA-1, SHA-256 and SHA-512 to store password hashes. We even added some salt to prevent easy cracking. It isn’t quite enough…

Nowadays it can be child’s play to “restore” an unsalted hashed password by browsing terabytes of rainbow tables. Salting defeats those tables, but because these algorithms are designed to be fast, brute force attacks are also much easier with the low cost of computing power. What we need is a clever and processor-intensive hashing algorithm that keeps attackers busy for ages. This is where BCrypt comes into play.

BCrypt is an adaptive cryptographic hash with an adjustable work factor. The higher this value, the longer the hash computation takes. As machines get faster over time, you can boost security by cranking up the work factor. This work factor gets stored in the actual 60-character hash:

$2a$12$2G56I4ikWfpnP7qW8U6K6OFMyJ.daxIgfiPrysKXoM2OH2aiOx6ri

Here we have, delimited by $-signs, 2a as the algorithm version and 12 as the work factor. The rest of the string contains the 22-character salt followed by the 31-character hash itself.
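The layout described above can be taken apart with a few lines of string handling. This is only a sketch for inspecting a hash; BCryptHashParts and Parts are hypothetical names, and for verifying passwords you should keep using the library’s own Verify method.

```csharp
using System;
using System.Globalization;

public static class BCryptHashParts
{
    public sealed class Parts
    {
        public string Version { get; set; }
        public int WorkFactor { get; set; }
        public string Salt { get; set; }
        public string Hash { get; set; }
    }

    // Splits "$<version>$<work factor>$<22 chars salt><31 chars hash>".
    public static Parts Parse(string hashed)
    {
        // The leading '$' yields an empty first field.
        string[] fields = hashed.Split('$');

        if (fields.Length != 4 || fields[0].Length != 0 || fields[3].Length != 53)
        {
            throw new FormatException("Not a 60-character BCrypt hash.");
        }

        return new Parts
        {
            Version = fields[1],
            WorkFactor = int.Parse(fields[2], CultureInfo.InvariantCulture),
            Salt = fields[3].Substring(0, 22),
            Hash = fields[3].Substring(22)
        };
    }
}
```

Parsing the example hash gives version 2a, work factor 12, the salt 2G56I4ikWfpnP7qW8U6K6O and the remaining 31 characters as the hash.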

Using BCrypt.NET

That’s easy. First browse the NuGet library for BCrypt.NET and add it to your project. Determine the proper work factor for your needs, call HashPassword and you’re done. I took a work factor of 12. In this test it takes about half a second to compute the hash.

using System.Diagnostics;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using BCrypter = BCrypt.Net.BCrypt;
 
namespace HashingTest
{
    [TestClass]
    public class HashingTests
    {
        [TestMethod]
        public void Hash_Using_BCrypt()
        {
            const string password = "@aRdIgMe!sjE*";
            const int workFactor = 12;
 
            string hashed = BCrypter.HashPassword(
                password, BCrypter.GenerateSalt(workFactor));
            
            Assert.IsTrue(BCrypter.Verify(password, hashed));
            Debug.WriteLine(hashed);
        }
    }
}
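The work factor is the base-2 logarithm of the number of rounds BCrypt runs, so every step up roughly doubles the computation time. With one measurement in hand, such as the half second at work factor 12 above, you can estimate other settings before trying them. The helper below is a rough sketch with hypothetical names (WorkFactorEstimate, Estimate); real timings depend on your hardware, so measure before you decide.

```csharp
using System;

public static class WorkFactorEstimate
{
    // Scales a measured hash time by 2^(target - measured), because each
    // extra work factor step doubles the number of rounds.
    public static TimeSpan Estimate(
        TimeSpan measured, int measuredWorkFactor, int targetWorkFactor)
    {
        double scale = Math.Pow(2, targetWorkFactor - measuredWorkFactor);
        return TimeSpan.FromMilliseconds(measured.TotalMilliseconds * scale);
    }
}
```

Starting from 500 ms at work factor 12, the estimate for work factor 14 is about 2 seconds, and for work factor 10 about 125 ms.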

BCrypt Cryptographic Hash Algorithm Test

Security is important. I’m not an expert so don’t take my word for it.
More information on how to safely store a password?