Probably the hardest thing I ever did explained as simply as possible.

DrBearhands

A while ago, I came across a situation where I needed to know things about some hidden state at compile time. Traditionally, this is done using Hoare logic. In Hoare logic, every statement has a precondition and a postcondition. That is, what must be true before the statement and what will be true after the statement. For the statement “set x to 7”, the precondition is that x exists, and the postcondition is that is x = 7. We can chain multiple statements and get a final postcondition using the rules of Hoare logic.

Hoare logic is limited, which is one reason I strongly prefer (total) functional programming over imperative programming. Nevertheless, even functional programmers must deal with mutable state at some point. Generally we do this with a function of type

s -> s

where s is the type of the state. That is, a function that takes as state as input, and produces a new state as output. A state-changing function may also return some value, so the actual type of the function is

s -> (a, s)

where a is the type of the return value. In functional programming we use tuples to “return” multiple values. When we define how several state-changing functions can be chained together, they are called monads. You don’t need to understand what that means exactly. “Monad” is just a mathematical name for a certain pattern.

It seemed what I wanted was Hoare logic over state monads. I came across a paper about the Hoare state monad. With the knowledge I had, this was one of the most difficult things I have ever implemented. I had already given up on Hoare state monads at least twice. This time I finally managed it after a full week. I feel that not sharing this journey would be a waste. Especially since they will probably turn out to be too complicated for the purpose I had in mind.

The code in this article is what I actually used to deconstruct the problem for myself. Plus a few extras for things that are only obvious to me.

Not every language can express Hoare state monad, we will need a sufficiently powerful type system. I was using Idris 2, so this was not a problem. Don’t worry, I will go through any unusual concepts before using them.

First, let’s make our own state monad. I’m going to use a hardcoded state, usually this would be variable.

data MyState = Foo | Bar

For those unfamiliar with sum-types: the above declares that a value of type MyState can be either Foo or Bar.

We go on to create a state monad for MyState, you might be unfamiliar with the notation so I will explain it afterward.

data MyStateMonad : a -> Type where
  MkMyStateMonad : (MyState -> (a, MyState)) -> MyStateMonad a

Note that a single colon indicates a type declaration in Idris. E.g. x : Int means “x is an integer”.

The first line, data MyStateMonad : a -> Type where declares that the MyStateMonad needs some a as input to make a Type. This is similar to arrays. In statically typed languages, we don’t have just “an array”, we have e.g. “an array of integers”. We must specify what’s “inside” it. Similarly we can have a state monad (think of it as a statement) that returns an integer MyStateMonad Int, or a string MyStateMonad String, or something else.

Below the data declaration are all the type’s constructors, only MkMyStateMonad in this case. The constructor has certain arguments, (MyState -> (a, MyState)) in this case. Recall state-changing functions are s->(a,s), with s=MyState in this case. The constructor also has a return type, MyStateMonad a, which must match the type given in the data declaration header. The a is variable in this context, but every a must match, so the return type of the function defines what the type of MyStateMonad a is. E.g.: for a function f : MyState -> (Int, MyState), we know MkMyStateMonad f : MyStateMonad Int.

But what about input? Idris is a curried language, meaning we can have the function:

someFunction : Int -> Int
someFunction = (+) 1

In other words, a function of type MyState -> (a, MyState) can be a closure of as many arguments as you want.

State monads are useful as is. They let us use do-notation rather than less readable pattern matching:

do
  x <- actionA
  actionB
  actionC x

vs.

\s1 =>
  let
    (x, s2) = actionA s1
    s3 = actionB s2
  in
    actionC x s3

this is not what we set out to do though, so let’s talk about proofs in Idris.

You may have heard that types are propositions and values are proofs of propositions. The value 7 is proof that Ints exist. 7 inhabits Int. But we can do something a bit more interesting. Remember how MyStateMonad needed some a? We can do something similar with some other type over a, say P, so that P a only holds for some a. We do this by restricting what constructors are available.

A good example is the Equal type:

data Equal : a -> b -> Type where
  Refl : Equal x x

Here, we can use the Refl constructor to prove Equal x x, but there is no way to get an Equal x y value. Equal x y is a valid type, but there is no value of that type unless x = y. Therefore, if you can pass a value with type Equal x y to a function, you know within that function that x = y. There are more complex cases, of course, such as restricting inputs to members of a list.

In the above example, Equal is a predicate, Equal x y is a proposition, and Refl is a proof of Equal x x.

We can use proofs in a function without adding runtime overhead:

acceptsOnlyFoo : (s1 : MyState)
              -> {auto 0 prf : Equal Foo s1}
              -> (Int, MyState)
acceptsOnlyFoo inState = (7, Bar)

Let’s dissect the function type line by line:

(s1 : MyState): this is similar to MyState, but we have given the argument a name (s1). In dependently typed programming, types can depend on values, so it’s useful to have value names in the type.
{auto 0 prf : Equal Foo s1}: the curly brackets make this argument implicit. We don’t have to specify it when calling the function, provided the compiler can figure it out itself. The 0 means this argument will not exist at runtime. auto denotes that the compiler will try to find a value for the type itself from the context. prf is again the variable name and Equal Foo s1 is the type. In other words, we ask the compiler to find us some value of type Equal Foo s1, with information known at compile time only. The compiler can only succeed in this by finding Refl if it knows for sure the first argument is Foo.
(Int, MyState) the output type as you are already familiar with.

This means we can call acceptsOnlyFoo as:

acceptsOnlyFoo Foo

but not

\x => acceptsOnlyFoo x

unless we already know Equal Foo x at compile time some other way.

You might be able to see where this is going now. If we have one function that implicitly requires Equal Foo x, and another that “returns” an Equal Foo x, those could be safely chained together.

While acceptsOnlyFoo specifically requires a proof Equal Foo, we do not want to limit ourselves to just proving equality. We need a more generic type for the precondition. A precondition could be any MyState -> Type. A state monad with a precondition is therefore:

data MyPreconditionMonad : (pre : MyState -> Type) -> a -> Type where
  MkMyPreconditionMonad :
    ( ( s1 : MyState) ->
      { auto 0 prf_pre : pre s1 } ->
      (a, MyState)
    ) ->
    MyPreconditionMonad pre a

Notice { auto 0 prf_pre : pre s1 }. The proof that must be found must be of type pre s1. The initial state s1 is passed to it as argument at compile-time. Even though we don’t really know what it will be. In our previous example pre was Equal Foo.

An example of how to construct the state monad with precondition:

testPrecondition1 : MyPreconditionMonad (Equal Foo) Int
testPrecondition1 = MkMyPreconditionMonad acceptsOnlyFoo

But what if we want to chain state monads that don’t have a precondition? We create a Top proposition, which can always be satisfied:

data Top : a -> Type where
  MkTop : Top a

Whatever a we have, MkTop will be proof that Top a.

The name Top comes from formal logic, where it is indicated with the symbol ⊤. It shouldn’t actually take any arguments, but this servers our purpose well.

And this is how you might use it:

testPrecondition2 : MyPreconditionMonad Top Int
testPrecondition2 = MkMyPreconditionMonad f
  where
    f : (s1 : MyState) -> {auto 0 prf : Top s1} -> (Int, MyState)
    f s1 = (7, s1)

So that about covers preconditions. Postconditions are a little bit harder. As far as I know, Idris has no implicit results. Instead, we must return something like a tuple with an implicit argument. To do this we create a special type PostS'. I’ve added the apostrophe because the actual PostS will be a little different.

data PostS' : (s : Type) -> (s -> Type) -> Type where
  MkPostS' : (val : s)
          -> {auto 0 prf : prop val}
          -> PostS' s prop

PostS' is parametrized by a Type s and a predicate about s. Note that we pass val to prop, not s, since s is the type, and we want the proposition to hold about the value. For a concrete example, we want to be able to say PostS' Int (Equal 7), not PostS' Int (Equal Int).

Now let’s use PostS' in a function that increases a counter by 1. The type of the counter will be Nat, a built-in type for natural numbers:

testPostS' : (s1 : Nat) -> PostS' Nat (Equal (s1 + 1))
testPostS' x = MkPostS' (x+1)

As you can see, we don’t need to pass a proof, Idris has figured out Equal (s1+1) (x+1) automatically, by unifying s1 and x. More or less. I think. Honestly I’m not that knowledgeable about how Idris searches for proofs.

But the real PostS needs to be a bit more complex. We don’t just want the postcondition to be about the output state, it also needs to consider the returned value and input state. For instance we might want to declare that s2 = s1 + x.

Luckily, if you understand the previous code, the next step is not particularly complicated:

data PostS : (a : Type) -> (s : Type) -> (a -> s -> Type) -> Type where
  MkPostS  : (val : a)
          -> (state : s)
          -> {auto 0 prf : prop val state}
          -> PostS a s prop

A MkPostS consists of a value, a state, and a proof that a proposition about said state and value holds (at compile time). You may have noticed that prop takes no input state. That is because PostS does not know about input state. If we want a predicate about input state, that input state must already be passed to it outside of PostS. For instance, if we want to express a resulting state is equal to the input state plus the returned values:

testPostS : (s1 : Nat) -> PostS Nat Nat (\n => Equal (s1 + n))
testPostS s1 = MkPostS 7 (s1+7)

With that out of the way, let’s make a Hoare monad for a specific state. We will keep using Nat now instead of MyState, since there’s not much to prove about a MyState:

data MyHoareMonad : (pre : Nat -> Type) -> (a : Type) -> (post : Nat -> a -> Nat -> Type) -> Type where
  MkMyHoareMonad :
    ( ( s1 : Nat) ->
      { auto 0 _ : pre s1 } ->
      PostS a Nat (post s1)
    ) ->
    MyHoareMonad pre a post

And we can use it as follows:

testMyHoareMonad : MyHoareMonad (Equal 0) Nat (\s1 => Equal)
testMyHoareMonad = MkMyHoareMonad f
  where
    f : (s1 : Nat) -> {auto 0 prf : Equal 0 s1} -> PostS Nat Nat (Equal)
    f inState {prf} = MkPostS{prf=Refl} (inState+1) (inState+1) --{prf=Refl} is needed here because of a compiler bug.

Unfortunately there currently appears to be a bug in Idris’ auto strategy. It means that, for now, we won’t be able to rely on auto as much. It’s not a huge deal provided it will get fixed in the future, but a tad annoying.

But it’s time for the real thing, make a Hoare state monad without hardcoded state type:

data HoareStateMonad : (s : Type) -> (pre : s -> Type) -> (a : Type) -> (post : s -> a -> s -> Type) -> Type where
  MkHSMonad :
    ( ( s1 : s ) ->
      { auto 0 prf : pre s1 } ->
      PostS a s (post s1)
    ) ->
    HoareStateMonad s pre a post

it’s not much different, we just replaced Nat with a variable type s.

Unfortunately, the hardest part is still to come. We want to use a Hoare state monad as a monad, or sort-of a monad. Technically we won’t be making a monad. A monad requires the rule m m a -> m a, but since our “m” has been decorated with pre- and postconditions, every “m” in that formula would be different. Doesn’t really matter for us though, it’s close enough.

We need to define a pure and a >>= (bind) function.

A pure function lifts a value to a monadic value, like making a list with a single element, doing a no-op, or, in the case of a regular state monad, leave the state unchanged and return the lifted value:

pure : a -> MyStateMonad a
pure val = MkMyStateMonad ((,) val)

So what should the pre- and postconditions be for pure? Since we’re not using the state in any way, the precondition can just be Top. Anything goes.

We could also use Top as the postcondition, but that would lose some useful information. We know the input state is equal to the output state (Equal s1 s2), and the output value is equal to the lifted value. We can use a tuple as a conjunction to indicate to predicates must both hold. After all, we must be able to provide both values of a tuple to construct it.

With that, pure becomes:

pure : (a1 : a) -> HoareStateMonad s Top a (\s1, a2, s2 => (Equal s1 s2, Equal a1 a2))
pure a1 = MkHSMonad $ \s1 => MkPostS a1 s1

And now for the tricky bit, the bind operator.

For regular state monads, the bind operator chains the state-changing functions together:

(>>=) : MyStateMonad a -> (a -> MyStateMonad b) -> MyStateMonad b
(>>=) (MkMyStateMonad f1) act2 = MkMyStateMonad $ \s1 =>
  let
    (val1, s2) = f1 s1
    (MkMyStateMonad f2) act2 val1
  in
     f2 s2

So what we’re really doing, is first executing action A, the executing action B, potentially using A’s output. I will be using a notation from program algebra for clarity: A ; B means “first do A, then do B”.

But, again, what should the pre- and postconditions be? Let’s first look at the part of bind’s type signature we do know:

(>>=) :  HoareStateMonad s p1 a q1
      -> ((x : a) -> HoareStateMonad s (p2 x) b (q2 x))
      -> HoareStateMonad s ?p3 b ?q3

Here, s is the state type, p1 is the precondition of action A, a is the return type of action A, q1 is the postcondition of action A, and x is the input to action B. p2 and q2 are a bit different than p1 and q1. They are applied not just over input state, return value and output state, but also over the input value x. Why? Consider the following function:

trySetState : (x : Nat) -> (s1 : Nat) -> (Bool, Nat)
trySetState x s1 =
  if (someCondition)
    then (True, x)
    else (False, s1)

The function sets the state based on some condition. This is a common pattern. A reasonable postcondition for trySetState would express that if the return type is True, then Equal x s2 is proven, and if it is False, then Equal s1 s2 is proven. Declaring this proposition requires using x as input.

As the example shows, it is useful to also consider inputs in the pre- and postconidtion for the second argument of bind.

Back to the bind operation. What is p3, the precondition of the “first A then B”? Since A executes first, p3 must contain p1. Furthermore, to satisfy p2, p3 must ensure that p2 holds after A has executed, so the postcondition q1 of A must imply the precondition p2 of B. Remember also that the type of p3 is s -> Type.

We define a function bind_pre that constructs a new precondition from p1, q1 and p2:

bind_pre
  :  {s : Type}
  -> {a : Type}
  -> (p1 : s -> Type)
  -> (q1 : s -> a -> s -> Type)
  -> (p2 : a -> s -> Type)
  -> s -> Type
bind_pre p1 q1 p2 s1 =
  ( p1 s1
  , {x : a} -> {s2 : s} -> q1 s1 x s2 -> p2 x s2
  )

This isn’t easy, so let’s go through it line by line:

bind_pre: the function name
: {s : Type}: the type declaration for an implicit argument. You can read this as “for any s that is a type”
-> {a : Type}: Read as “for any a that is a type”
-> (p1 : s -> Type): first argument to the function (not counting implicits). A predicate over s. The precondition of the first action.
-> (q1 : s -> a -> s -> Type): Postcondition of the first action
-> (p2 : a -> s -> Type): precondition of the second action. Based on the output of the first action.
-> s -> Type: The return type. Remember p3 must be s -> Type. Idris is curried to x -> s -> Type and x -> (s -> Type) are exactly the same thing.
bind_pre p1 q1 p2 s1 =: begin of function body. We give the arguments new names here, that is because the names in the type declaration are only valid within that type declaration. I used matching names for clarity.
( p1 s1: The result is a tuple, with as first value the regular precondition of the first action.
, {x : a} -> {s2 : s} -> q1 s1 x s2 -> p2 x s2: for any x of type a, and any s2 of type s, if we can prove q1 s1 x s2, then we must also be able to prove p2 x s2.

Well, that was hard enough, but the postcondition is worse. q1 and q2 are qualified on intermediate results. These intermediate values won’t appear in the resulting monad’s type declaration. We only know that they exist and that the postconditions hold for them. We’re going to need a function for existential quantification:

exists : {p : _} -> (p -> Type) -> Type
exists f = {b : Type} -> ({a : _} -> f a -> b) -> b

What we did here, is reverse universal quantification to get existential qualification. {a : Type} -> a -> Type is a predicate that works for all as that are Types. Top is such a predicate. Conversely, we want to state that some a exists that satisfies a certain predicate. There is a duality here: if all you know is that something exists, you can only reason about it using predicates that work for all values. In programming, that means an existentially quantified ouput is mathematically the same as a universally quantified continuation. In other words: exists x. x is the same as forall y . (forall x . x -> y) -> y. The latter can be read as “you can prove any statement of type y so long as you can prove it using any statement of type x”. Since we know nothing about y, only that it can be proven using any x, this tells us that some x must exist. The other way works too: if we know some x exists, we know that a y can be proven if it can be proven for any x.

Now finally we can get to the postcondition of A; B.

Remember that q1 and q2 are qualified over intermediate values, of which we only know they exist in the type signature. We therefore have:

bind_post
  :  {s : Type}
  -> {a : Type}
  -> {b : Type}
  -> (q1 : s -> a -> s -> Type)
  -> (q2 : a -> s -> b -> s -> Type)
  -> (s -> b -> s -> Type)
bind_post q1 q2 s1 y s3
  =  exists (\x => exists (\s2 => (q1 s1 x s2, q2 x s2 y s3)))

Which leaves the final declaration of bind, I’ll explain it line-by-line later:

(>>=)  : {a : Type}
      -> {b : Type}
      -> {s : Type}
      -> {p1 : s -> Type}
      -> {q1 : s -> a -> s -> Type}
      -> {p2 : a -> s -> Type}
      -> {q2 : a -> s -> b -> s -> Type}
      -> HoareStateMonad s p1 a q1
      -> ((x : a) -> HoareStateMonad s (p2 x) b (q2 x))
      -> HoareStateMonad s (bind_pre p1 q1 p2) b (bind_post q1 q2)
(>>=) (MkHSMonad f1) act2 = MkHSMonad f
  where
    f :  ( s1 : s )
      -> { auto 0 pre3 : bind_pre p1 q1 p2 s1 }
      -> PostS b s (bind_post q1 q2 s1)
    f s1 {pre3} =
      let
        MkPostS v1 s2 {prf=post1} = f1 s1
        MkHSMonad f2 = act2 v1
        MkPostS v2 s3 {prf=post2} = f2{prf= (snd pre3) post1} s2
      in MkPostS v2 s3 {prf=(\e1 => (e1 $ \e2 => e2 (post1, post2)))}

{a : Type}: implicit argument. Read as “for all as that are types”
{b : Type}: for all bs that are types
{s : Type}: for all ss that are types
{p1 : s -> Type}: for all predicates p1 about an s (this is the type of our first precondition)
{q1 : s -> a -> s -> Type} for all predicates q1 about an s, a and another s (the type of our first postcondition)
{p2 : a -> s -> Type}: for all predicates p2 over a and s (second precondition, extra qualified over the input)
{q2 : a -> s -> b -> s -> Type}: second postcondition
HoareStateMonad s p1 a q1: the first argument, a HoareStateMonad about state s, with precondition p1, return value a, and postcondition q1.
((x : a) -> HoareStateMonad s (p2 x) b (q2 x)): the second argument, a function that takes an a (with value x), and yields a HoareStateMonad about state s, with precondition p2 x, returning a b, and with postcondition q2 x
HoareStateMonad s (bind_pre p1 q1 p2) b (bind_post q1 q2): results in a HoareStateMonad about state s, with precondition bind_pre p1 q1 p2, returning a value b, and with postcondition bind_post q1 q2

And for the function body:

(>>=) (MkHSMonad f1) act2 = MkHSMonad f: unwrapping and wrapping the state-changing functions in monad constructors. f is the new state-changing function we create as the sequence of our inputs.
f : ( s1 : s ): the first argument to the new function is the input state
-> { auto 0 pre3 : bind_pre p1 q1 p2 s1 }: we also require a proof that the combined preconditions hold
-> PostS b s (bind_post q1 q2 s1): and produce a value with suitable postconditions
f s1 {pre3} =: we note the precondition proof pre3 explicitely as we will need to use it manually in the function body.
MkPostS v1 s2 {prf=post1} = f1 s1: “Run” the first action with the input state. This will result in a return value v1, a new state s2 and an proof of the postcondition post1. Note that post1 still has multiplicity 0 so it will not exist at runtime.
MkHSMonad f2 = act2 v1: pass the return value of the first action, this will result in a new state-changing function f2, wrapped in a HoareStateMonad.
MkPostS v2 s3 {prf=post2} = f2{prf= (snd pre3) post1} s2: we “run” f2 with state s2, but must also prove its precondition. The precondition pre3 is ( p1 s1 , {x : a} -> {s2 : s} -> q1 s1 x s2 -> p2 x s2 ) The second argument is of interest here, and we have q1 s1 v1 s2 from post1, and so (snd pre3) post1 is proof of p2 v1 s2.
MkPostS v2 s3 {prf=(\e1 => (e1 $ \e2 => e2 (post1, post2)))}: we create a return value with postcondition, but must provide proof of our new postcondition. Recall existential types are continuations, and so we create a continuation as proof.

Next steps

I still have to use HoareStateMonads in practice. I planned to use them to reduce what mistakes can be made when writing a WebGL program. Those can be very tricky to debug. I fear HoareStateMonads require passing around proofs far too much to be useful in most cases. The extra programming work might not weigh up against the reduction of bugs. But perhaps Idris will be able to find many proofs by itself. Or perhaps I will need a different data type entirely.