Side-effect free programming in C#

How can we do inherently side-effecty things like IO when using functional programming, where each call has to be side-effect free?

In this post we’ll look at side-effects and how we could eliminate them from our code (we’ll use C#, but we can apply the idea to many languages). The aim is not to get something enormously practical that we’ll use every day, but instead to explore some ideas and hopefully work out how it is possible to get any useful work done using functional programming where side-effects are forbidden.

Side-effects

A function’s output depends purely on its input. For example, + is a function because its output will be based entirely on the numbers given as input. If we call it with the same input, we’ll always get the same output.

What about the inc() method of a mutable counter? It is not a function, because if we call inc() once we get a counter set to 1, but if we call inc() again we get a counter set to 2.

In other words inc() has a side-effect; it does something other than produce an output from its input. This has some downsides, so we’d like to avoid it where possible. But how can we avoid side-effects for something like Console.WriteLine(), where we mutate the state of the console with each call?
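As a rough sketch of the difference, compare a pure function with a method that mutates hidden state. The PureExample and Counter types below are hypothetical, made up purely for illustration, and having Inc() return the new count is an assumption:

```csharp
public static class PureExample {
    // Pure: the result depends only on the arguments.
    public static int Add(int a, int b) { return a + b; }
}

public sealed class Counter {
    private int count;

    // Impure: every call mutates the hidden count field, so calling Inc()
    // twice with the same (empty) input gives two different results.
    public int Inc() {
        count = count + 1;
        return count;
    }
}
```

Calling PureExample.Add(1, 2) any number of times always gives the same answer, whereas each call to Inc() on the same Counter gives a different one.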

Haskell does not let us express side-effects at all¹ but still manages to do IO, so it must be possible. Can we get the same effect (hah!) in C#?

Simple IO

Let’s have a look at a simple C# program that reads some input and writes to the console:

public static void SideEffecty() {
    Console.WriteLine("Hi, I'm C#. What's your name?");
    var name = Console.ReadLine();
    Console.WriteLine("Nice to meet you " + name);
}

static void Main(string[] args) {
    SideEffecty();
}

The SideEffecty method has side-effects all over the place. It doesn’t have any output (it is void), so we can hardly say that its output depends purely on its input, and even if it did have output, it doesn’t take any inputs anyway!

How can we make this side-effect free? By cheating: we simply don’t execute the offending bit of code.

public static Action NotSoSideEffecty() {
    return () => {
        Console.WriteLine("Hi, I'm C#. What's your name?");
        var name = Console.ReadLine();
        Console.WriteLine("Nice to meet you " + name);
    };
}

static void Main(string[] args) {
    var program1 = NotSoSideEffecty();
    var program2 = NotSoSideEffecty();
    var program3 = NotSoSideEffecty();

    //Uncomment to run the effect:
    //program3();
}

We can now execute NotSoSideEffecty() without any side-effects at all. We can consider NotSoSideEffecty() to be a function that takes no input (also known as Unit or () in languages like F#, Scala and Haskell), and always returns a program that will prompt for some input and write some output.² We can call NotSoSideEffecty() as often as we like; as far as we’re concerned it will always return the same output for its (empty) input.

If we defer all side-effects by wrapping them up in Action (for things like writing to console) and Func<T> (for things like reading input), we can assemble side-effect free programs that do absolutely nothing until we execute the composed program in our Main method.³
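Here is a minimal sketch of what that assembly might look like. The Deferred class and its method names are hypothetical; the point is only that the inner delegates are invoked solely inside the Action that gets returned:

```csharp
using System;

public static class Deferred {
    // Describes "read a line"; nothing is read until the Func is invoked.
    public static Func<string> ReadName() {
        return Console.ReadLine;
    }

    // Describes "write a greeting"; nothing is written until the Action is invoked.
    public static Action Greet(string name) {
        return () => Console.WriteLine("Nice to meet you " + name);
    }

    // Composes the smaller deferred steps into one larger deferred program.
    // The delegates above are only invoked inside the returned Action.
    public static Action GreetingProgram() {
        var readName = ReadName();
        return () => {
            Console.WriteLine("Hi, I'm C#. What's your name?");
            var name = readName();
            Greet(name)();
        };
    }
}
```

Calling Deferred.GreetingProgram() just builds the program; only invoking the returned Action (say, from Main) touches the console.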

Slightly inconvenient

I think we could actually get fairly far with this approach, but it is not exactly convenient. Any delegate we invoke might represent a side-effect, so we have to be careful never to invoke one directly: every invocation has to be wrapped in another Action (or Func<T>) so that it is deferred too.

What we need is another way of packaging up these IO actions.

An `IO` type

Rather than using delegate types, let’s create an IO<T> type to represent a program that will produce some value of type T. An example is ReadLine(), which when run will produce a string it reads from the console. We’ll use a placeholder Unit type for programs that return void, like WriteLine(string s) which produces a program that prints some output but does not return any value.

public sealed class Unit { } /* Meh, close enough */

public sealed class IO<T> {
    private readonly Func<T> ioAction;
    public IO(Func<T> ioAction) { this.ioAction = ioAction; }
}

public static class IO {
    public static IO<Unit> WriteLine(string s) {
        return new IO<Unit>(() => {
            Console.WriteLine(s);
            return new Unit();
        });
    }

    public static IO<string> ReadLine() {
        return new IO<string>(Console.ReadLine);
    }
}

Just as before, we are packaging up our side-effects rather than running them immediately. We’re not going to be accidentally executing these any time soon though – we haven’t provided any way to run them at all!
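One way to convince ourselves of this is a small probe. The sketch below repeats the IO<T> definition so that it compiles on its own; ConstructionDemo and its RunCount field are hypothetical names that exist only to observe whether the wrapped delegate ever runs:

```csharp
using System;

public sealed class Unit { }

public sealed class IO<T> {
    private readonly Func<T> ioAction;
    public IO(Func<T> ioAction) { this.ioAction = ioAction; }
}

public static class ConstructionDemo {
    // Incremented only if the wrapped delegate is actually executed.
    public static int RunCount = 0;

    // Builds an IO value whose effect is to bump RunCount.
    public static IO<Unit> CountingAction() {
        return new IO<Unit>(() => {
            RunCount++;
            return new Unit();
        });
    }
}
```

However many times we call ConstructionDemo.CountingAction(), RunCount stays at 0, because this version of IO<T> exposes nothing that invokes ioAction.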

We’ve also got a static IO class with a few built-in IO programs: WriteLine() and ReadLine(). But how do we put these together to build a program that works the same as NotSoSideEffecty()?

Combining `IO` programs

The first thing NotSoSideEffecty() does is prompt for input, then read a line of input. We’d like to do this by combining our IO.WriteLine(string s) and IO.ReadLine() programs. We’ll call this combining function Then⁴:

public sealed class IO<T> {
    /* ... snipped ... */

    public IO<TNext> Then<TNext>(IO<TNext> next) {
        // Create a new IO action that runs first action, then runs next action.
        return new IO<TNext>(() => {
            ioAction();
            return next.ioAction();
        });
    }
}

static void Main(string[] args) {
    var program = IO.WriteLine("Hi, I'm C#. What's your name?")
                    .Then(IO.ReadLine());
    /* TODO: run program somehow */
}

The next thing we need to do is construct a new program that takes the value that will be produced from ReadLine() and writes it to console. If we had access to the string from ReadLine() we could just write a function that takes string and returns an IO<Unit>, but we can’t get that string without running our program and performing a side-effect. We’ll have to carefully put this together in the IO<T> class:

public sealed class IO<T> {
    /* ... snipped ... */

    public IO<TNext> Then<TNext>(Func<T, IO<TNext>> useValue) {
        // Create new IO action that gets the value from first action,
        // uses the value to get the next action, then runs that.
        return new IO<TNext>(() => {
            var value = ioAction();
            var nextIO = useValue(value);
            return nextIO.ioAction();
        });
    }
}

Now assuming we sneak in a way of running an IO action, we can replicate our NotSoSideEffecty() program:

/* Shhhh, secret side-effecty method in IO<T> class...
    internal T UnsafePerformIO() { return ioAction(); }
*/

static void Main(string[] args) {
    var program =
        IO.WriteLine("Hi, I'm C#. What's your name?")
          .Then(
              IO.ReadLine()
                .Then(name => IO.WriteLine("Nice to meet you " + name))
          );

    program.UnsafePerformIO();
}

LINQing it up

We can make our side-effect-free code easier to follow by using LINQ. To play nicely with LINQ we need to provide a SelectMany function with the right signature. This will be very similar to our second Then function, but it will have to take an additional selector that combines the values from the first and second IO programs:

// In IO<T> class:

public IO<TB> SelectMany<TA, TB>(Func<T, IO<TA>> useValue,
                                 Func<T, TA, TB> selector) {
    return new IO<TB>(() => {
        var value = ioAction();
        var nextIO = useValue(value);
        var nextValue = nextIO.ioAction();
        return selector(value, nextValue);
    });
}

And now we can use LINQ comprehension syntax to clean up a little:

static void Main(string[] args) {
    var program =
        from a in IO.WriteLine("Hi, I'm C#. What's your name?")
        from name in IO.ReadLine()
        from b in IO.WriteLine("Nice to meet you " + name)
        select new Unit(); // our program doesn't return a value,
                           // so we'll return Unit.

    program.UnsafePerformIO();
}

Aside: The a and b variables are required to play nicely with the LINQ syntax. They will be set to the result of running the IO.WriteLine() programs, which will just be Unit.
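To see where SelectMany comes in, here is roughly what the compiler turns that query into. This is a sketch rather than the exact generated code: the IO types are repeated so it compiles on its own, the Desugared class is a hypothetical name, and UnsafePerformIO is made public here only so the sketch can be run from anywhere:

```csharp
using System;

public sealed class Unit { }

public sealed class IO<T> {
    private readonly Func<T> ioAction;
    public IO(Func<T> ioAction) { this.ioAction = ioAction; }

    public IO<TB> SelectMany<TA, TB>(Func<T, IO<TA>> useValue,
                                     Func<T, TA, TB> selector) {
        return new IO<TB>(() => {
            var value = ioAction();
            var nextIO = useValue(value);
            var nextValue = nextIO.ioAction();
            return selector(value, nextValue);
        });
    }

    // Public only for the purposes of this sketch.
    public T UnsafePerformIO() { return ioAction(); }
}

public static class IO {
    public static IO<Unit> WriteLine(string s) {
        return new IO<Unit>(() => {
            Console.WriteLine(s);
            return new Unit();
        });
    }

    public static IO<string> ReadLine() {
        return new IO<string>(Console.ReadLine);
    }
}

public static class Desugared {
    // Each extra "from" clause becomes a SelectMany call. The intermediate
    // anonymous type carries a and name forward so later clauses can see them.
    public static IO<Unit> Greeting() {
        return IO.WriteLine("Hi, I'm C#. What's your name?")
            .SelectMany(a => IO.ReadLine(),
                        (a, name) => new { a, name })
            .SelectMany(t => IO.WriteLine("Nice to meet you " + t.name),
                        (t, b) => new Unit());
    }
}
```

Desugared.Greeting() still only builds a value; nothing is printed or read until UnsafePerformIO() is called on it.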

A quick Haskell IO diversion

Although I’m probably over-simplifying things, this is how I tend to think about Haskell IO. Haskell has a type IO a which wraps up side-effects, and we have several pure functions available to help us combine them. Haskell’s main is a value of type IO () (IO unit), so the entire program is completely side-effect free. The Haskell runtime then takes care of running that program.⁵

To see the similarities, here’s the equivalent program in Haskell:

main = do
    putStrLn "Hi, I'm Haskell. What's your name?"
    name <- getLine
    putStrLn ("Nice to meet you " ++ name)

Reading and writing to Console ain’t that exciting

We’re not limited to ReadLine and WriteLine – we can wrap any side-effect up in an IO action. We could return IO actions that iterate over files, generate random numbers, make network calls etc.
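As a sketch of what that could look like, here are two more wrapped effects. MoreIO, ReadAllText and RandomInt are hypothetical names; the IO<T> type is repeated (with UnsafePerformIO made public) so the example compiles on its own:

```csharp
using System;

public sealed class IO<T> {
    private readonly Func<T> ioAction;
    public IO(Func<T> ioAction) { this.ioAction = ioAction; }

    // Public only for the purposes of this sketch.
    public T UnsafePerformIO() { return ioAction(); }
}

public static class MoreIO {
    // Nothing touches the file system until the returned action is run,
    // so even a bad path causes no error at construction time.
    public static IO<string> ReadAllText(string path) {
        return new IO<string>(() => System.IO.File.ReadAllText(path));
    }

    // Each run of the returned action may produce a different number in
    // the range [0, maxExclusive); building the action is still pure.
    public static IO<int> RandomInt(int maxExclusive) {
        return new IO<int>(() => new Random().Next(maxExclusive));
    }
}
```

Both methods are ordinary functions from their arguments to an IO value; the file read and the random number generation only happen when something eventually runs that value.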

While we may not want to do our C# IO like this, it could be a useful technique for combining things like async actions or database operations using pure functions, before running them from impure, side-effecting parts of our code.

Conclusion

One way to eliminate side-effects is to not execute side-effecting code. This sounds more than a little like cheating, but it seems just crazy enough to work! :) Instead of running side-effects, we can return values which represent programs with side-effects.

We can use functions like Then() and SelectMany() (and others we didn’t look at in this post) to combine these values into larger programs, all without causing a side-effect.
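As a taste of what those other functions might look like, here is a sketch of two common ones. Neither appears earlier in the post, so treat Select and Return as assumptions about how we might extend the type rather than part of the design above:

```csharp
using System;

public sealed class IO<T> {
    private readonly Func<T> ioAction;
    public IO(Func<T> ioAction) { this.ioAction = ioAction; }

    // Hypothetical addition: transform the eventual result with a pure
    // function. Nothing runs until the resulting action is executed.
    public IO<TResult> Select<TResult>(Func<T, TResult> f) {
        return new IO<TResult>(() => f(ioAction()));
    }

    // Public only for the purposes of this sketch.
    public T UnsafePerformIO() { return ioAction(); }
}

public static class IO {
    // Hypothetical addition: an IO action that performs no effect at all
    // and simply yields the given value when run.
    public static IO<T> Return<T>(T value) {
        return new IO<T>(() => value);
    }
}
```

With these, IO.Return(20).Select(x => x + 1) builds a program that yields 21 when run, and like Then() and SelectMany(), building it causes no side-effect.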

For this post we used this approach to perform very basic IO, but the same idea works for system, file system, network, and database calls.

There’s a whole lot I need to learn before I can start using these concepts⁶, but I think I’m starting to get an idea of how functional programming can be used to construct programs that do useful work, without us having to resort to side-effects.

If you’ve tried pushing side-effect free functions fairly far into your code in languages that generally promote side-effects everywhere, I’d love to hear how it’s worked out for you.

¹ With the exception of a few cheats.
² Another way of thinking about this is as NotSoSideEffecty() returning an effect as a value, rather than a side-effect.
³ We could also write a shim that takes a DLL with an Action Main() function and executes that, so our entire program is pure.
⁴ Then is a valid function: its output depends purely on its input.
⁵ Haskell’s IO is actually implemented as a state transformer (ST), which is more like Func<World, Tuple<World,T>> than the Func<T> we used, but I think these end up being fairly comparable in the case of IO. See Lazy Functional State Threads (1994) by John Launchbury and Simon Peyton Jones for more on this topic.
⁶ Specifically, I need to learn more about iteratee-style structures to effectively use resources with pure FP, and probably some monad transformer stuff.