In statistics, both variance and standard deviation say something about the spread of a sequence of numbers. The spread tells us how far the data points typically lie from the mean average. This is what deviation means: the distance from the norm. Both are easy to calculate.

- We first have to calculate the **mean average** as shown in the previous example.
- Then we compute the **variance**, which is defined as the average of the squared differences from the mean. We do this by subtracting the mean from each number in the dataset and squaring the result (why square? Because otherwise the positive and negative differences would cancel each other out). The sum of the results divided by the count of all numbers in the dataset is the variance.
- From there it’s one step to the **standard deviation**. We simply take the square root of the variance. This gives us a nice estimate of the spread of the dataset, expressed in the same unit as the data itself.
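
Written as formulas, the three steps above could look like this. The notation is an assumption of this sketch and not part of the original text: $x_1, \dots, x_N$ are the numbers in the dataset, $\mu$ is their mean, $\sigma^2$ the variance and $\sigma$ the standard deviation. Because we divide by $N$, this is the population variance.

$$
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
\sigma = \sqrt{\sigma^2}
$$

As a small worked example, take the numbers 2, 4, 4, 4, 5, 5, 7, 9. Their sum is 40, so $\mu = 40 / 8 = 5$. The squared differences from the mean are 9, 1, 1, 1, 0, 0, 4, 16, which add up to 32, so $\sigma^2 = 32 / 8 = 4$ and $\sigma = \sqrt{4} = 2$.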

Now let’s see how we do this in code. Again we implement these calculations as extension methods on IEnumerable<double>. We call the Mean extension method from the previous example to get the mean average for the calculation of the variance. We square the difference between each number and the mean by calling Math.Pow (to the power 2.0).

    using System;
    using System.Collections.Generic;
    using System.Linq;

    namespace Statistics
    {
        public static class Spread
        {
            // Population variance: the average of the squared differences from the mean.
            public static double Variance(this IEnumerable<double> list)
            {
                // Materialize once so the sequence is not enumerated multiple times.
                List<double> numbers = list.ToList();
                double mean = numbers.Mean();
                double result = numbers.Sum(number => Math.Pow(number - mean, 2.0));
                return result / numbers.Count;
            }

            // Standard deviation: the square root of the variance.
            public static double StandardDeviation(this IEnumerable<double> list)
            {
                return Math.Sqrt(list.Variance());
            }
        }
    }
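
The Variance method above relies on the Mean extension method from the previous example, which is not repeated here. As a rough sketch, assuming it is a plain arithmetic mean built on LINQ's `Average`, it might look like the following. The class name `Averages` is hypothetical and only serves this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

namespace Statistics
{
    // Hypothetical stand-in for the class that holds Mean in the previous example.
    public static class Averages
    {
        // Arithmetic mean: the sum of all numbers divided by their count.
        // Enumerable.Average throws InvalidOperationException on an empty sequence.
        public static double Mean(this IEnumerable<double> list)
        {
            return list.Average();
        }
    }
}
```

With such a definition, `new[] {2.0, 4.0, 6.0}.Mean()` evaluates to 4.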

We test both extension methods on two datasets: one with a small spread and one with a much larger spread.

    using System.Diagnostics;
    using Microsoft.VisualStudio.TestTools.UnitTesting;
    using Statistics;

    namespace StatisticsTests
    {
        [TestClass]
        public class SpreadTests
        {
            private readonly double[] _testDataLargeSpread = new[] {10.9, 7.3, 1.2, 75.3, 2.3, 4.4, 1.9, 2.1, 53.0, 2.8};
            private readonly double[] _testDataSmallSpread = new[] {1.9, 1.3, 1.2, 0.3, 2.3, 0.4, 1.9, 2.1, 2.0, 1.8};

            [TestMethod]
            public void VarianceOnLargeSpreadListShouldReturnValid()
            {
                double result = _testDataLargeSpread.Variance();
                Assert.AreEqual(expected: 609.45959999999991, actual: result);
                Debug.WriteLine("Variance is {0}", result);
            }

            [TestMethod]
            public void VarianceOnSmallSpreadListShouldReturnValid()
            {
                double result = _testDataSmallSpread.Variance();
                Assert.AreEqual(expected: 0.44360000000000011, actual: result);
                Debug.WriteLine("Variance is {0}", result);
            }

            [TestMethod]
            public void StandardDeviationOnLargeSpreadListShouldReturnValid()
            {
                double result = _testDataLargeSpread.StandardDeviation();
                Assert.AreEqual(expected: 24.687235568204066, actual: result);
                Debug.WriteLine("Standard Deviation is {0}", result);
            }

            [TestMethod]
            public void StandardDeviationOnSmallSpreadListShouldReturnValid()
            {
                double result = _testDataSmallSpread.StandardDeviation();
                Assert.AreEqual(expected: 0.66603303221386856, actual: result);
                Debug.WriteLine("Standard Deviation is {0}", result);
            }
        }
    }
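
The tests above compare doubles for exact equality, which can become brittle if the order of the floating-point operations ever changes. One possible alternative is the MSTest overload `Assert.AreEqual(expected, actual, delta)`, which accepts a tolerance. The sketch below assumes the Spread class from this article is available and that a tolerance of 1e-9 is acceptable for this data. The class name, the method names and the tolerance are made up for illustration.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Statistics; // assumes the Spread class shown earlier in this article

namespace StatisticsTests
{
    [TestClass]
    public class SpreadToleranceTests // hypothetical class name
    {
        // Assumed tolerance; choose one that fits the magnitude of your data.
        private const double Tolerance = 1e-9;

        private readonly double[] _testDataSmallSpread =
            {1.9, 1.3, 1.2, 0.3, 2.3, 0.4, 1.9, 2.1, 2.0, 1.8};

        [TestMethod]
        public void VarianceOnSmallSpreadListShouldBeCloseToExpected()
        {
            double result = _testDataSmallSpread.Variance();
            // Passes when the absolute difference is at most Tolerance.
            Assert.AreEqual(0.4436, result, Tolerance);
        }

        [TestMethod]
        public void StandardDeviationShouldBeSquareRootOfVariance()
        {
            double variance = _testDataSmallSpread.Variance();
            double deviation = _testDataSmallSpread.StandardDeviation();
            // Squaring the standard deviation should give back the variance.
            Assert.AreEqual(variance, deviation * deviation, Tolerance);
        }
    }
}
```

The expected values also become easier to read this way, because they no longer need all seventeen digits.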

Should one convert the double types to double? to account for null values?
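
One possible answer, sketched under the assumption that a null means "no measurement" and should simply be skipped: keep the double versions as they are and add overloads for IEnumerable<double?> next to them. The class name `NullableSpread` is hypothetical, and throwing on a sequence without any values is a design choice of this sketch, not something taken from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace Statistics
{
    // Hypothetical companion to Spread, for illustration only.
    public static class NullableSpread
    {
        // Population variance over the non-null values.
        // Nulls are skipped; they are not treated as zero.
        public static double Variance(this IEnumerable<double?> list)
        {
            List<double> numbers = list
                .Where(number => number.HasValue)
                .Select(number => number.Value)
                .ToList();

            if (numbers.Count == 0)
            {
                throw new InvalidOperationException("Sequence contains no non-null values.");
            }

            double mean = numbers.Average();
            double result = numbers.Sum(number => Math.Pow(number - mean, 2.0));
            return result / numbers.Count;
        }

        // Standard deviation: the square root of the variance above.
        public static double StandardDeviation(this IEnumerable<double?> list)
        {
            return Math.Sqrt(list.Variance());
        }
    }
}
```

For example, `new double?[] {2.0, null, 4.0}.Variance()` only looks at 2 and 4, which have a mean of 3 and therefore a variance of 1. Callers that never deal with nulls keep using the original double overloads unchanged.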