Lazy<T> monad instance in C#

One of the most surprising and useful things I’ve learned about .NET this year is that it has quite good support for monads as of .NET 3.5 (it’s just missing higher-order polymorphism). What’s more, for those allergic to monads, you don’t need to understand anything in that previous sentence to follow this post. :)

In this post we’ll implement the couple of functions necessary to be able to compose instances of Lazy<T> together in interesting ways using LINQ and LINQ comprehensions.

Motivation

As mentioned last post, I’ve been messing around with Lazy<T> as a way of performing one-time, long-running retrieval of values in an asynchronous way. Some of these values depend upon each other, so I’d like a way to build a new Lazy<T> from some existing lazy values, then hand them off to a background thread to populate. We’ll see an example of this at the end of the post.

Lazy LINQ extension methods

To play nicely with LINQ we need to implement a couple of extension methods for Lazy<T>: Select and SelectMany. Types for which these functions are implemented are monads1, which means we can combine them in interesting ways. In Haskell, Select is known as fmap, SelectMany is known as >>= (pronounced "bind").

Mapping inside with Select

The Select implementation for Lazy<T> works similarly to how it does for IEnumerable<T>. For IEnumerable<T>, it takes a Func<T, TResult> and maps it over each value inside, and returns a new IEnumerable<TResult>. For Lazy<T>, it applies the selector function to the single value inside, returning a new Lazy<TResult>.

public static class LazyExtensions {
    public static Lazy<TResult> Select<T, TResult>(this Lazy<T> value, Func<T, TResult> selector) {
        return new Lazy<TResult>(() => selector(value.Value));
    }
}

Mapping and flattening with SelectMany

Now we come to SelectMany. Rather than taking a Func<T, TResult> like Select, it works on Func<T, Lazy<TResult>> instead.

public static Lazy<TResult> SelectMany<T, TResult>(this Lazy<T> lazy, Func<T, Lazy<TResult>> selector) {
    return new Lazy<TResult>(() => selector(lazy.Value).Value);
}

The output of selector(lazy.Value) is a Lazy<TResult>. We don’t want to return a Lazy<Lazy<TResult>>, so we need to flatten this out. We can do this by returning selector(lazy.Value).Value, to get the inner Lazy<TResult> out.2

To work well with the from .. select style of LINQ query we also want to implement another overload of SelectMany that has an additional result combining function. We can also re-write our previous SelectMany in terms of this new overload. Here’s the finished static class:

public static class LazyExtensions {
    public static Lazy<TResult> Select<T, TResult>(this Lazy<T> lazy, Func<T, TResult> selector) {
        return new Lazy<TResult>(() => selector(lazy.Value));
    }

    public static Lazy<TResult> SelectMany<T, TResult>(this Lazy<T> lazy, Func<T, Lazy<TResult>> selector) {
        return SelectMany(lazy, selector, (a, b) => b);
    }

    public static Lazy<TResult> SelectMany<T, TA, TResult>(this Lazy<T> lazy, Func<T, Lazy<TA>> selector, Func<T, TA, TResult> resultSelector) {
        return new Lazy<TResult>(() =>
                                 {
                                     var first = lazy.Value;
                                     var second = selector(first).Value;
                                     return resultSelector(first, second);
                                 });
    }
}

The last function is our new SelectMany overload, which has the additional step of combining the results of both the first lazy value, and the second lazy value returned from the selector function.

It’s alive! It’s alive!

We can now test out our abomination. Here’s how Select works:

[Test]
public void Map() {
    var lazyInt = new Lazy<int>(() => 42);
    var newLazy = lazyInt.Select(x => (x * 2).ToString());
    Assert.AreEqual("84", newLazy.Value);
}

And here’s how SelectMany works using LINQ comprehension syntax (we can also call the SelectMany extension method directly if we like):

[Test]
public void SelectMany() {
    var first = new Lazy<int>(() => 42);
    var second = new Lazy<string>(() => "nyan");

    var third = from x in first
                from str in second
                select str + " cat " + x;

    Assert.AreEqual("nyan cat 42", third.Value);
}

The from .. select syntax gives us a nice way to express how we want to combine the values inside the Lazy<T> instances. We call the value in the first instance x, and the string in the second instance str, then create a new lazy that combines str + " cat " + x, all without ever really accessing the values until the .Value property is called.

To illustrate that combining Lazy<T> instances does not trigger evaluation, let’s write a test that increments a counter for each instance whenever evaluation occurs.

[Test]
public void ShouldDeferExecutionUntilValueCalled()
{
    var counters = new int[3];

    var first = new Lazy<int>(() => { counters[0]++; return 10; });
    var second = new Lazy<string>(() => { counters[1]++; return "nyan"; });
    var third = from x in first
                from str in second
                let temp = counters[2]++
                select str + " cat " + x;

    //No values have been evaluated yet:
    Assert.That(counters, Is.EquivalentTo(new[] { 0, 0, 0 }));

    //Force values to be evaluated. 
    //third.Value will force first and second to evaluate too.
    var eval0 = third.Value;
    var eval1 = third.Value;

    //Each value only evaluated once
    Assert.That(counters, Is.EquivalentTo(new[] { 1, 1, 1 }));
}

This is just like the deferred execution we get using LINQ for IEnumerable<T>, where nothing is evaluated until we traverse the enumerable by using foreach or a method like ToList().

Back to our motivating example

So back to our motivating example. We wanted to take several long-running operations, and calculate them once on a background thread. We’ll specify all these operations using Lazy<T> instances, compose them together using our extension methods, and do the initialisation of the value using an async task (this is thread-safe by default for Lazy<T>).

The Lazy<T> handles caching the calculation value, so subsequent accesses to any of the lazy instances involved in the calculation will now return the pre-calculated value immediately.

[Test]
public void AsyncInitialisation() {
    var key0Lookup = new Lazy<string>(FranticSearchForFirstKey);
    var key1Lookup = new Lazy<string>(SearchingEverywhereForSecondKey);
    var veryTrickyCalculation =
        from key0 in key0Lookup
        from key1 in key1Lookup
        select LongRunningCalculation(key0, key1);

    Task.Factory.StartNew(() => veryTrickyCalculation.Value);

    //Blocks until async initialisation finished:
    Assert.AreEqual(30, veryTrickyCalculation.Value);
    //Returns immediately:
    Assert.AreEqual(30, veryTrickyCalculation.Value);
}

private string FranticSearchForFirstKey() {
    Console.WriteLine("frantic search for first key");
    Thread.Sleep(1000);
    return "10";
}

private string SearchingEverywhereForSecondKey() {
    Console.WriteLine("where's the second key?");
    Thread.Sleep(700);
    return "20";
}

private int LongRunningCalculation(string key0, string key1) {
    Console.WriteLine("stand back, i'm going to try science!");
    Thread.Sleep(2000);
    return int.Parse(key0) + int.Parse(key1);
}

This test is just to show it working; for more realistic use we could try wrapping the task in an IObservable<T> and calling subscribers back with the result. The first result will take some time, but subsequent calls will return the value back to the subscriber straight away.

Conclusion

So we’ve now seen how to combine Lazy<T> instances in interesting ways using LINQ, Select, and SelectMany. This let us control evaluation of several lazy instances by composing them and pushing initialisation into an async task.

More fundamentally, we’ve now got a bit of an idea about monads. A monad for a type is formed by a Select and SelectMany defined for the type, which enables the use of common composition patterns like from .. select. Between this, standard LINQ enumerables, and using F#’s Option type in C# we can start to get an idea of how useful this pattern is.


  1. A monad also needs a function which takes a value of T and returns a new instance of the monad with that value. This is called return in Haskell. We can use the Lazy<T> constructor for this. There are also a few simple rules for how these functions should act and combine to make a valid monad instance, but we won’t go into them here.

  2. This is a key part of how the monad interface works, so you’ll see this pattern crop up all the time when implementing SelectMany.

Comments