Reasoning and mutability

One thing I often hear about functional programming is that its requirement for immutability makes programs easier to reason about. To me this seems intuitively true – it’s got to be easier to work out what a program does without having to keep track of changing state while evaluating programs, right?

I wanted to challenge my assumptions about this. Could I convince myself that one is unambiguously easier to reason about? And if so, what is it about the other that makes it more difficult to reason about?

To do this I tried tracing through some examples of mutable and immutable data structures. I tried to use similar, “object” styles for both, so that the characteristic difference between them was the mutability of their internal data (rather than getting thrown off by differences between functional and OO styles).

I’d love to get your thoughts for and against these ideas. I am especially likely to be the victim of confirmation bias on this one, so I’m counting on you to keep me honest. Leave comments or send me email please! :)

Counter example

Let’s start by looking at a simple example of a counter that has an inc() method to increment the counter, and a value() method to return its value. One counter will be a mutable data structure, where calling inc() will modify a field within the counter. The other will be immutable – calling inc() will return a new immutable counter containing the incremented value.

var counter = function() {
    var i = 0;
    return (function() {
        return {
            value: function() { return i; },
            inc: function() { i=i+1; return this; }

var counterI = function() {
    var ctor = function(i) {
        return {
            value: function() { return i; },
            inc: function() { return new ctor(i+1); }
    return ctor(0);

Equational reasoning

In FP names are given to specific, immutable expressions. The idea is whenever we see a name, we can replace it with the expression it refers to without affecting the program’s behaviour (this is known as referential transparency). We can use this fact to treat programs as mathematical equations – if we see an x in an expression, we can replace it with whatever expression x equals.

var f = function (c) {
    return -;

Let’s reason through how f works when given an immutable counterI, by substituting in the meanings of each expression in the function.

// let c{i} = an immutable counter with a value i

    = c{i}.inc().value() - c{i}.inc().value()
    = c{i+1}.value() - c{i+1}.value()           // by
    = (i+1) - (i+1)                             // by counterI.value()
    = 0                                         // by (+), (-)

Here we’ve taken a fairly mechanical approach to evaluating this expression, but we can also apply mathematical/logical principles:

let x =
in f(c)
    = x - x                 // by defn of x
    = 0                     // by (-)

And sure enough:

> f(new counterI().inc().inc())

So by substituting in values for names and vice versa, we can solve our equation to work out the program result, just as we would for any algebraic equation.

Reasoning about evaluation and state

We can’t use the same equality of terms with mutable data, as the meaning of an expression can vary based on when it is called. One expression does not necessarily equal another expression.

> f(new counter().inc().inc())

While mutability means we’ve lost referential transparency and equational reasoning, we can still reason about the code by keeping track of the state as we evaluate each expression. We’ll treat this as a pair of values – the evaluation on the left, and the state on the right.

(f(c), c: i)
    = ( -, c: i)
    = (c.value() -, c: i+1)         // by
    = (i+1 -, c: i+1)               // by counter.value()
    = (i+1 - c.value(), c: i+2)                     // by
    = (i+1 - (i+1), c: i+2)                         // by counter.value()
    = (-1, c: i+2)                                  // by (+), (-)

So we can still reason about our code, we just have to keep track of the evaluation and the state separately. Not a big deal, and something we are very familiar with (and are probably quite adept at).

The story so far

We’ve found a few distinctions between reasoning about mutable and immutable code.

We managed to trace through our immutable example in 2 or 3 steps, while our mutable example took 5. This was because we replaced both occurrences of at once in the immutable case, which we could do because the expressions are equal and referentially transparent. In the mutable case we had to do each step separately, as does not necessarily mean the same thing as another So mutability is making reasoning more difficult for us by forcing us to evaluate more steps to understand our program.

Immutability also lets us apply algebraic laws and the properties of our code to gain an understanding about its result. We can use properties like x - x = 0 for subtraction to help us understand the behaviour of our program, without ever needing to evaluate the expression. So we can understand f without knowing anything about the counter instance other than its immutability.

In other words, immutability is enabling us to reason about our program independently from its execution. We can think logically about the code and only step through the parts we need. Mutability on the other hand inextricably links the meaning of the program with its method of execution, forcing us to go through the same execution steps at the computer will in order to understand it.

Order in the court

To further illustrate this coupling between program meaning and execution, let’s try re-evaluating our mutable example in different orders.

// Left-hand side first (from above):
(f(c), c: i)
    = (-1, c: i+2)

// Right-hand side first:
(f(c), c: i)
    = ( -, c: i)
    = ( - c.value(), c: i+1)        // RHS inc()
    = ( - (i+1), c: i+1)            // by value()
    = (c.value() - (i+1), c: i+2)                  // LHS inc()
    = (i+2 - (i+1), c: i+2)                        // by counter.value()
    = (1, c: i+2)                                  // by (+), (-)

// inc() calls, then value() calls:
(f(c), c: i)
    = ( -, c: i)
    = (c.value() - c.value(), c: i+2)              // both inc() calls
    = ((i+2) - (i+2), c: i+1)                      // by value()
    = (0, c: i+2)                                  // by (+), (-)

Here we have three different results from three equally valid evaluations of the program. It is insufficient to understand what each piece of the program does, we also need to know exactly what way our target language and platform will execute it. The immutable version we can reason about independently of evaluation order.

We tend to work around this source of ordering confusion in the presence of mutability using imperative-style code. This makes ordering obvious (from top to bottom), and makes it easy for us to understand our program by stepping through it an instruction at a time:

function f(c) {                 // let c have initial state i;                    // c: i+1;                    // c: i+2
    var lhs = c.value();        // lhs: i+2
    var rhs = c.value();        // rhs: i+2
    return lhs - rhs;           // lhs - rhs = i+2-i-2 = 0

While this does make our program easy to understand, it still means we have to evaluate every line, and still doesn’t make it apparent when we can use properties like x-x=0 to help us understand the code.


What does this program return if a and b are mutable, and both have an initial counter state of 0?

function g(a, b) {                  // let a: x, b: y;                        // a: x+1;                        // a: x+2;                        // b: y+1
    return a.value() - b.value();   // x+2-y-1 = x-y+1

We increment a twice and b once, then return the difference between the counters, so we should end up with 1. Unless, of course, we get 0:

var a = new counter();
var b = a;
> 0

Here a and b are aliased to the same instance1. If a and b were different instances we’d get 1. The problem is we can not correctly reason about what this code will do without knowing what a and b refer to.

In the immutable case, our output will always follow the formula we arrived at regardless of aliasing, as calls to inc() return new instances rather than mutating existing ones.

function g2(a, b) {
    return -; // = x-y+1

var a = new counterI();
var b = a;
> 1

var c = new counterI();
var d = new counterI();
> 1

Mutability in a larger context

So far we have been dealing with code in a very limited scope. Within that scope we’ve seen that immutability lets us reason about a program independently of its execution, so we can understand it without having evaluate each term and can use algebraic substitution to simplify how we think about it.

In contrast mutability forces us to evaluate each step of a program to understand it, tracking of state in parallel with execution, requires call ordering to be specified explicitly, and also is subject to non-obvious behaviour due to potential aliasing of references.

Extrapolating to larger contexts, mutability means we now potentially need to evaluate every step of a program, keeping careful track of each piece of data’s state for the duration of its scope, as well as tracking state updates via aliased references, in order understand what any single piece of the program is doing.

Due to a lack of side-effects, immutable code seems like it should always be understandable in isolation; its output depends purely on its input.

Based on this, it seems a clear-cut case of immutability leading to code that is easier to reason about, as promised by FP proponents.

Mutability strikes back

Perhaps mutability has other advantages that compensate for making it harder to reason about pieces of code in isolation.

One advantage is familiarity. Most of us learn programming in an imperative manner with mutable data, and this matches our intuition of how the von Neumann machine works2. I don’t think it is fair to understate this point – programming with pure functions and immutable data requires different approaches than those many of us have been applying for the entirety of our careers. My current feeling is that this knowledge is well worth pursuing for the benefits it brings.

There is also a question of time and space performance3, or more specifically, reasoning about this performance. If our understanding of a program with immutable data is independent of its execution, then we need to reason about certain aspects of its performance separately again.4

Another thing to consider is that programming by analogy may be easier to understand than logical reasoning. If we consider a counter as an analogy to an object we interact with in the physical world, then telling it to increase its value seems like a mutable operation, with time and order of execution being an implicit part of the context we are in. If all our mutable objects behave in a way that keeps their state valid, and we control the order and scope of their mutations, then perhaps we can build systems that react reasonably as a whole. Although this does seem to imply we’d end up spending most of our time sorting out threading, scope and context, rather than expressing the solution to the problem unambiguously with immutable data and pure functions.

There are techniques to help us with keeping these mutations in check and for reasoning more logically about mutable code, such as design by contract (DbC), which involves specifying pre- and post-conditions and invariants for mutable objects. If we can tell none of these conditions are violated by a program then we may be able to get some confidence that pieces work in isolation and that a system will work correctly as the sum of its parts.

As a side note, there may be other ways of getting benefits from message-passing analogies, while still maintaining the advantages of immutability and code we can reason about. Perhaps an approach like Erlang’s, where we have components implemented in functional style communicating via message passing, gives us the best of both worlds.


When I first started experimenting with this topic I honestly imagined I would find more grey area between reasoning about mutable and immutable programs. Despite the simplistic example (and completely invalid sample size of the experiment ;)), I think we’ve identified some specific properties of immutable programs that make them easier to reason about than their mutable counterparts.

We can think about immutable programs independently from their execution, meaning we can understand them by evaluating less expressions, and in the context of various algebraic properties using equational reasoning. Mutable programs require us to track state at the same time as evaluating each step of the program, and are affected by ordering and aliasing ambiguities.

To leave myself a small paling of fence to sit on, it is worth noting that the last twenty-odd years of software development has been primarily taught in the context of mutable objects, and we have learned to reason about program execution and time and space use in this context.

Whether this gives rise to more intuitive behaviour in large systems compared to being able to apply sound logic to understand each part of a program is something I just don’t know. My experiences programming using mutable objects have been less than stellar, which has led me to look much more closely at functional programming, but I just don’t yet have enough experience with the latter in large projects to adequately judge the two.

Based on the properties outlined in this post, I definitely feel it is worth getting this experience. To me there is more than enough evidence to suggest a switch to immutable data is worth the potential risk and learning curve for the benefits of more understandable programs. If it turns out I’m wrong then I’m going to be very interested in identifying the characteristics of mutable data I have missed that do in fact make it easier to reason about.

  1. Thanks to Tony Sloane for pointing out some problems caused by aliasing to me. Please attribute mistakes in my interpretation to me because, unlike me, Tony knows what he’s talking about. :)

  2. See “Can Programming Be Liberated from von Neumann Style?” by John Backus for an excellent discussion of this topic. PDF link

  3. See Preview of immutable collections released on NuGet

  4. A counter-argument to this that the prevalence of multi-core machines (even mobile phones have 4 as of 2012) makes reasoning about evaluation difficult even with mutable, imperative code. Linear reasoning is no longer reliable, especially if we want to take advantage of multiple threads.