Interpreters with
Non-Determinism Using a
Free Monad
NOTE: There are better ways to implement non-determinism than what this blog post summarizes; e.g., by using scoped effects or latent effects. One day I may get around to summarizing that better approach here. If you’d like to help this process along, email me.
Non-determinism can be a challenging language feature to give a definitional interpreter for. In this post we illustrate an approach to overcoming this challenge: a free monad.
This blog post is a literate Agda file. The contents of this post should be accessible to readers familiar with Haskell; and it should be possible to translate the definitions in this post to non-dependently typed languages, such as Haskell or Scala, by using slightly less precise encodings of free monads.
The contents of this post are based on joint work with Arjen Rouvoet.
Typos, comments, or questions? Let me know!
Why Free Monads and This Blog Post?
The purpose of this post is to illustrate the free monads approach to modeling non-determinism as an effect, in a way that does not require a deep understanding of the continuation monad, and in a way that lets us write a definitional interpreter and define its effects separately.
There are many other ways we could define and implement interpreters with non-determinism.1 Why use a free monad? As we will illustrate, free monads allow us to
- define a single definitional interpreter with multiple possible interpretations of effects.
- have fine-grained control over the interpreter state, as is necessary to define, e.g., non-determinism.
The combination of these two properties is attractive for many purposes, and this combination is hard to come by if we use general monads, unless we resort to the continuation monad (sometimes called “the mother of all monads”). On the other hand, free monads are less powerful than regular monads,2 and can be as involved to work with as general monads.
Contents of this Post
Command Trees (a Free Monad)
A free monad defines trees of chained sequences of effectful commands. We illustrate what we mean by this below, but first we define a data type Free
, which defines a free monad that describes the structure of command trees:3
data Free (C : Set) (R : C → Set) (A : Set) : Set where ‵return : A → Free C R A _‵>>=_ : (c : C) → (R c → Free C R A) → Free C R A
Free
is indexed by a set of commands (C : Set
), a set of command-dependent responses (R : C → Set
), and a set of values that the command tree will ultimately produce (A : Set
).
Free
has two data type constructors:
‵return
is a trivial command which simply returns a value of typeA
directly._‵>>=_
defines an infix operator which is used to chain together a command (c : C)
and a continuation which takes a command-response as input and returns the remaining set of commands in the tree as output (R c → Free C R A
).
The inhabitants of the Free
data type are trees in the sense that continuations branch over all possible responses resulting from interpreting a command. In the continuation of _‵>>=_
in Free
, R c
represents the response type of a command c
.
An Example: Coin Flip Command Trees
Say our set of commands consists of a single command for flipping a coin:
data FCmd : Set where !flip : FCmd
And say we expect a Bool
ean as the result of a flip
command. The function below implements this precisely:
FResp : FCmd → Set FResp ‵flip = Bool
Now the following command tree represents a program that returns one of two strings ("heads"
or "tails"
) depending on the result of flipping a coin:4
flipping-example : Free FCmd FResp String flipping-example = !flip ‵>>= λ where true → ‵return "heads" false → ‵return "tails"
The command tree above does not actually perform the flip
action: it is an explicit representation of a program that flips a coin, and branches on the result of performing the flip.
In order to actually perform the flip, we should define an interpretation function for translating command trees of type Free FCmd FResp _
into actual effectful computation. We omit the interpretation function for coin flips here. Below we illustrate how to define such a command tree interpretation function. First, we introduce some convenience functions that we shall make use of in order to use a free monad to write definitional interpreters for languages with effects.
Free
is a Monad
We define return
and _>>=_
(pronounced “bind”) functions for Free
.
We put these definitions in a module because we are going to be defining different return
and _>>=_
functions later in this post:
module FreeMonad where
return
is just an alias for constructing a trivial command tree:5
return : ∀ {C R A} → A → Free C R A return = ‵return
_>>=_
takes a command tree and a continuation as input, and splices the two together:
_>>=_ : ∀ {C R A B} → Free C R A → (A → Free C R B) → Free C R B ‵return x >>= k = k x (c ‵>>= f) >>= k = c ‵>>= (λ r → f r >>= k)
A Language with Non-Deterministic Choice
Our goal is to write interpreters for languages with non-determinism. For the purpose of studying how we can reach this goal, we shall focus on a rather tiny and mostly-uninteresting language that nonetheless holds many of the same challenges for dealing with non-determinism as more interesting languages. The language is a calculator language with numbers (‵num
), addition expressions (_‵+_
), and operations for recalling (‵recall
) and setting (‵set
) the value of a single memory cell. That is, the language runtime tracks the state of a single memory cell which can hold a single number in it, and which can be recalled and mutated during run time.
data Expr : Set where ‵num : ℕ → Expr _‵+_ : Expr → Expr → Expr ‵recall : Expr ‵set : ℕ → Expr
The order of evaluation for addition expressions is undefined for this language. Consider the following expression:
nondet-expr-example : Expr nondet-expr-example = ‵recall ‵+ (‵set 1)
If the initial value of the memory cell is 0
, then nondet-expr-example
should return either 1
or 2
, depending on whether the ‵recall
expression is interpreted before or after (‵set 1)
, and assuming that the result of evaluating a ‵set
expressions is the number that ‵set
is applied to (i.e., 1
in the program above).
An Interpreter with Committed‑Choice Non‑Determinism
We shall define an interpreter with non-deterministic evaluation order between the left and right sub-expression branches of addition expressions. We define the interpreter in three steps:
- We define a set of commands and responses that our interpreter shall make use of to define effectful computations.
- We define an evaluation function which translates an
Expr
ession into a command tree. This step will give us an explicit representation of which effectful actions anExpr
ession gives rise to. - We define the meaning of command trees.
Commands and Responses (Step 1)
Our interpreter will make use of three different commands as defined by the Cmd
data type:6
mutual data Cmd : Set where !set : ℕ → Cmd !recall : Cmd !fork : Free Cmd Resp ℕ → Free Cmd Resp ℕ → Cmd Resp : Cmd → Set Resp (!set n) = ℕ Resp !recall = ℕ Resp (!fork f₁ f₂) = ℕ × ℕ
Here:
!set n
sets the memory cell ton
, and returnsn
as the result of command evaluation.!recall
recalls the current value of the memory cell.!fork f₁ f₂
evaluates two command treesf₁
andf₂
. The order in which the two command trees are evaluated (left-to-right or right-to-left) is decided by flipping a coin.
The Resp
data type tells us what the return type is for each of these commands.
Our interpreter shall make use of the following “liftings” of each command to a command tree:
recall : Free Cmd Resp ℕ recall = !recall ‵>>= ‵return set : ℕ → Free Cmd Resp ℕ set n = !set n ‵>>= ‵return fork : Free Cmd Resp ℕ → Free Cmd Resp ℕ → Free Cmd Resp (ℕ × ℕ) fork f₁ f₂ = !fork f₁ f₂ ‵>>= ‵return
Evaluation (Step 2)
Now let us define an eval
uation function that takes an expression as input, and returns as output a command tree that represents a program with non-deterministic choice (using fork
) and a stateful memory cell (set
and recall
).
The eval
function below makes use of the _>>=_
and return
of Free
, using Agda’s support for do
notation.
module _ where open FreeMonad eval : Expr → Free Cmd Resp ℕ eval (‵num n) = return n eval (e₁ ‵+ e₂) = do (n₁ , n₂) ← fork (eval e₁) (eval e₂) return (n₁ + n₂) eval ‵recall = do recall eval (‵set n) = set n
Here we have put eval
in an anonymous module (named _
) in order to locally open the FreeMonad
module which defines _>>=_
.
An Example
We can now evaluate the nondet-expr-example
program from earlier by turning the object language program into a command tree that tells us which effects the program gives rise to, and in which order:
nondet-free-example : Free Cmd Resp ℕ nondet-free-example = eval (‵recall ‵+ (‵set 1))
Evaluation yields the command tree shown below:78
nondet-example : nondet-free-example ≡ !fork (!recall ‵>>= ‵return) (!set 1 ‵>>= ‵return) ‵>>= (λ{ (n₁ , n₂) → ‵return (n₁ + n₂) }) nondet-example = refl
This program represents that there is a fork in our interpreter: the interpreter may choose to evaluate the recall
branch first; or it may choose to evaluate the ‵set 1
branch first. The result of either order of evaluation is going to be a pair of two natural numbers which are added together in the continuation expression (λ{ (n₁ , n₂) → ‵return (n₁ + n₂) })
.
Command Tree Meaning (Step 3)
We define the meaning of command trees, by translating trees into actual computations. Our notion of computation will have to encode some notion of non-determinism. There are multiple ways we could choose to encode non-determinism, such as accumulating the set of all possible outcomes that a program can have, or by “flipping a coin” to decide which branch to follow at fork points. We opt for the latter, but the former would have been equally fine.
module CommandTreeMeaning where
We translate command trees into computations that consume streams of coin flips and can mutate a stateful memory cell. We represent coin flips as a stream of booleans (e.g., true
for heads
; false
for tails
). The CM
type defines computations:
CM : Set → Set CM A = Stream Bool ∞ → ℕ → A × Stream Bool ∞ × ℕ
A computation of type CM A
expects as input a(n infinite) stream of coin flips (Stream Bool ∞
9), and a memory cell value (ℕ
), and returns as output a value of type A
, as well as the remaining stream of coin flips and the (possibly) updated memory cell value.
CM
is a monad, and we define return
and _>>=_
functions over CM
, to be used by the translation function from command trees to CM
that we define shortly.10
return : ∀ {A} → A → CM A return a cs n = a , cs , n _>>=_ : ∀ {A B} → CM A → (A → CM B) → CM B (f >>= k) cs n = case (f cs n) of λ where (a , cs′ , n′) → k a cs′ n′ _>>_ : ∀ {A B} → CM A → CM B → CM B f >> k = f >>= const k
Our translation function will also make use of the following effectful operations over CM
:
do-set n
mutates the state of the memory cell ton
.do-recall
recalls the state of the memory cell.do-flip
flips a coin and returns the (Bool
ean) result.
do-set : ℕ → CM ℕ do-set n cs _ = n , cs , n do-recall : CM ℕ do-recall cs n = n , cs , n do-flip : CM Bool do-flip cs n = head cs , tail cs , n
We define a handle
function which translates a command into a computation, and a ⟦_⟧
function which translates command trees into computations by handle
ing each command in the tree, thereby turning it into a computation:
mutual handle : (c : Cmd) → CM (Resp c) handle (!set n) = do-set n handle !recall = do-recall handle (!fork f₁ f₂) = do b ← do-flip if b then (do n₁ ← ⟦ f₁ ⟧ n₂ ← ⟦ f₂ ⟧ return (n₁ , n₂)) else (do n₂ ← ⟦ f₂ ⟧ n₁ ← ⟦ f₁ ⟧ return (n₁ , n₂))
The handle
function makes use of the ⟦_⟧
function (below) in the !fork f₁ f₂
case, because the definition of !fork
requires us to turn each of f₁
and f₂
into computations. The !fork
case does so by first flipping a coin to decide which order in which to evaluate f₁
and f₂
. The ⟦_⟧
function uses handle
to turn commands into computations, and chains command tree branch computations by using the _>>=_
of CM
:
⟦_⟧ : Free Cmd Resp ℕ → CM ℕ ⟦ ‵return n ⟧ = return n ⟦ c ‵>>= k ⟧ = handle c >>= λ r → ⟦ k r ⟧
Given eval
and ⟦_⟧
, we can define an interp
function for interpreting expressions to numbers:
interp : Expr → Stream Bool ∞ → ℕ interp e cs = proj₁ (⟦ eval e ⟧ cs 0)
Running It
The semantics lets us interpret the nondet-expr-example
program from above using different orders of evaluation to yield different results:
true-stream = repeat true -- infinite stream of `true`s false-stream = repeat false -- infinite stream of `false`s interp-nondet-expr-example=1 : 1 ≡ interp nondet-expr-example true-stream interp-nondet-expr-example=1 = refl interp-nondet-expr-example=2 : 2 ≡ interp nondet-expr-example false-stream interp-nondet-expr-example=2 = refl
Intermezzo
We have defined a semantics for committed-choice non-determinism for a simple calculator language. We have done so by first translating calculator language expressions into command trees, and subsequently giving meaning to these command trees. Command trees serve as explicit representations of the effect semantics of object language programs. This representation lets us control the order in which semantic actions happen — which is exactly what we need to define an interleaving order of evaluation. In the rest of this post, we reuse evaluator eval
from above, but give new meaning to the !fork
command, which interleaves the order of evaluation of expressions at fork
points.
An Interpreter with Interleaving Non‑Determinism
Consider the following program:
interleave-example : Expr interleave-example = ((‵set 1) ‵+ ‵recall) ‵+ (‵set 2)
Using our interp
reter from above, there are two possible outcomes from evaluation:
3 = 0 + 1 + 2
(corresponding to this sequence of memory effects:‵recall
;(‵set 1)
;(‵set 2)
4 = 1 + 1 + 2
(corresponding to this sequence of memory effects:(‵set 1)
;‵recall
;(‵set 2)
The interp
reter above is non-deterministic, but after deciding which branch to evaluate first, the interp
reter is committed to fully evaluating the first branch before evaluating the second branch. That means that the following outcome is not valid with the interp
reter above:
5 = 1 + 2 + 2
(corresponding to this sequence of memory effects:(‵set 1)
;(‵set 2)
;recall
)
With an interleaving semantics, all of i.-iii. are valid outcomes, because we allow interp
reters to do computation in any branch.
We will write an interpreter with interleaving order of evaluation, which can yield either of the three results above. We follow the same recipe as before. That is, we first define a set of commands and responses, then an eval
uation function that translates source language expressions into command trees, and finally we define the meaning of these command trees separately.
module InterleavingEval where open FreeMonad
Commands, Responses, and Evaluation
The commands, responses, and evaluation function is exactly the same as above. Only the command tree meaning changes. Free monads allow us to define a single definitional interpreter with multiple possible interpretations of effects.
Command Tree Meaning
We define the meaning of commands and command trees, by translating command trees into actual computations.
module InterleavingMonad where open InterleavingEval
The notion of computation we shall use keeps track of a pool of Thread
s, where each Thread
represents an interleavingly executing computation.
running
threads, and blocked threads that are waiting to join
the results of two other threads in the current thread pool:
data Thread (A : Set) : Set where running : Free Cmd Resp A → Thread A join : ℕ → ℕ → (A × A → Free Cmd Resp ℕ) → Thread A
The join
constructor is parameterized by two natural numbers, say n₁
and n₁
, representing two thread identifiers in the current thread pool. The meaning of join
should be to fetch the final results (of type A
) produced by the threads at position n₁
and n₂
, and pass these two results to the continuation of the join
expression (A × A → Free Cmd Resp ℕ
).
IM
type:
IM : Set → Set IM A = List (Thread ℕ) → Stream ℕ ∞ → ℕ → Maybe (A × List (Thread ℕ) × Stream ℕ ∞ × ℕ)
A computation takes as input:
- A list of threads (a thread pool).
- A stream of natural numbers, representing an interleaving order. For example, a stream whose first three numbers are
1
,2
, and0
would first do a single computation step of the thread at position1
in the current thread pool; then a single computation step of the thread at position2
; and then a single computation step of the thread at position0
. - A natural number representing a stateful memory cell (of type
ℕ
).
Unless computations fail (the Maybe
in the return type of IM
indicates possible-failure), a computation returns some value of type A
, an updated thread pool (List (Thread ℕ)
), an updated stream of interleaving choices, and an updated memory cell (ℕ
).
The reason for possible-failure in IM
has to do with the fact that thread pool lookups may fail.11 For example, if the thread pool only has two threads in it, and we try to do a computation step of a thread at position 3
, this will fail.
IM
is a monad:12
return : ∀ {A} → A → IM A return a ts cs n = just (a , ts , cs , n) _>>=_ : ∀ {A B} → IM A → (A → IM B) → IM B (f >>= k) ts cs n = case (f ts cs n) of λ where (just (a , ts′ , cs′ , n′)) → k a ts′ cs′ n′ nothing → nothing _>>_ : ∀ {A B} → IM A → IM B → IM B f >> k = f >>= const k
We define the following effectful operations over IM
:
fail
aborts a computation.get-thread n
tries to get the thread at positionn
in the current thread pool.set-thread n θ
updates the thread at positionn
in the current thread pool to beθ
(whereθ
is of typeThread
).new-thread θ
adds a new thread θ to the current thread pool, and returns a pointer to the newly created thread.do-set
anddo-recall
are as before.do-choice
consumes a thread pool pointer from the stream of natural numbers representing the interleaving order for the currently executing program.
fail : ∀ {A} → IM A fail _ _ _ = nothing get-thread : ℕ → IM (Thread ℕ) get-thread t ts cs n = case (ts ‼ t) of λ where nothing → nothing (just θ) → just (θ , ts , cs , n) set-thread : ℕ → Thread ℕ → IM ⊤ set-thread t θ ts cs n = case (t [ ts ]≔ θ) of λ where nothing → nothing (just ts′) → just (tt , ts′ , cs , n) new-thread : Thread ℕ → IM ℕ new-thread θ ts cs n = let t = length ts in just (t , ts ++ θ ∷ [] , cs , n) do-set : ℕ → IM ⊤ do-set n ts cs _ = just (tt , ts , cs , n) do-recall : IM ℕ do-recall ts cs n = just (n , ts , cs , n) do-choice : IM ℕ do-choice ts cs n = just (head cs , ts , tail cs , n)
We define a step f
function for doing a single step of interpretation for a command tree f
, and yield a thread representing the result of doing the computation step. step
is a function over IM
, so it may alter the underlying thread state; e.g., by allocating new threads in the case of interpreting a !fork
:
step : Free Cmd Resp ℕ → IM (Thread ℕ) step (‵return x) = return (running (‵return x)) step (!set n ‵>>= k) = do do-set n return (running (k n)) step (!recall ‵>>= k) = do n ← do-recall return (running (k n)) step (!fork f₁ f₂ ‵>>= k) = do t₁ ← new-thread (running f₁) t₂ ← new-thread (running f₂) return (join t₁ t₂ k)
step-thread
does a single computation step for a thread. Running threads are interpreted by using the definition of step
from above, whereas join
threads are interpreted by getting the thread states at each of the thread positions that join
indicates that it has forked. Doing a steo of a join
thread only succeeds if both of the forked threads have actually yielded a result.
step-thread : Thread ℕ → IM (Thread ℕ) step-thread (running f) = step f step-thread (join t₁ t₂ k) = do (running (‵return n₁)) ← get-thread t₁ where _ → fail (running (‵return n₂)) ← get-thread t₂ where _ → fail return (running (k (n₁ , n₂)))
step-pool
uses the interleaving order to decide which thread to do a computation step of. After doing a step in a thread, the thread state of the computation IM
is updated to reflect the thread update.
step-pool : IM ⊤ step-pool = do c ← do-choice θ ← get-thread c θ′ ← step-thread θ set-thread c θ′
We can run
threads by continuing to iterate the state of the thread pool by calling step-pool
, until the main thread (given by the thread at position 0
) has yielded a result, or until evaluation fails for some reason:
run : ℕ → IM ℕ run (suc k) = do θ ← get-thread 0 case θ of λ where (running (‵return n)) → return n _ → do step-pool run k run _ = fail
run
is parameterized by a natural number indicating how many steps we should do before giving up and failing.13
Given the functions defined so far, we can define the following meaning function which turns a command tree into a computation that we can run n
steps of, where steps can happen in interleaved order of evaluation:
⟦_⟧ : Free Cmd Resp ℕ → ℕ → IM ℕ ⟦ f ⟧ k ts = run k (running f ∷ ts)
We can now also define an interp
function that uses eval
and ⟦_⟧
to interpret an expression into a number (unless the computation crashes):
interp : Expr → ℕ → Stream ℕ ∞ → Maybe ℕ interp e k cs = case (⟦ eval e ⟧ k [] cs 0) of λ where nothing → nothing (just (n , _)) → just n
Running It
The following interleaving orders (interleaving1
-3
) correspond gives us the three different outputs (i.-iii.) summarized earlier when we discussed the interleave-example
program:
interleaving1 : Stream ℕ ∞ interleaving1 = (0 ∷ 1 ∷ 4 ∷ 3 ∷ 1 ∷ 2 ∷ []) ++ˢ (repeat 0) interleaving2 : Stream ℕ ∞ interleaving2 = (0 ∷ 1 ∷ 3 ∷ 4 ∷ 1 ∷ 2 ∷ []) ++ˢ (repeat 0) interleaving3 : Stream ℕ ∞ interleaving3 = (0 ∷ 1 ∷ 3 ∷ 2 ∷ 4 ∷ 1 ∷ []) ++ˢ (repeat 0) interp-interleave-example=3 : just 3 ≡ interp interleave-example 100 interleaving1 interp-interleave-example=3 = refl interp-interleave-example=4 : just 4 ≡ interp interleave-example 100 interleaving2 interp-interleave-example=4 = refl interp-interleave-example=5 : just 5 ≡ interp interleave-example 100 interleaving3 interp-interleave-example=5 = refl
Conclusion
We have illustrated how to use a free monad for modeling non-determinism as an effect, in a way that does not require a deep understanding of the continuation monad, and in a way that lets us write a definitional interpreter and define its effects, completely separately from the interpreter itself.
For example, we could have used a small-step interpreter, or an interpreter in continuation-passing style.↩
To be more specific, free monads can be used to implement most monads except for the continuation monad.↩
The command tree nomenclature, and (more importantly) the core idea of this post, is due to Hancock and Setzer. This representation of command trees is borrowed from Baanen and Swierstra’s recent work on predicate transformer semantics.↩
The
λ where true → … ; false → …
is Agda notation for a function that pattern matches on the input.↩The syntax
∀ {A B} → …
means that Agda will automatically try to infer which typesA
andB
the function implicitly takes as input.↩The
mutual
keyword tells Agda that the definitions may be mutually recursive. That is, the definition ofCmd
can refer toResp
, andResp
can refer toCmd
.↩nondet-example
is an Agda proof thateval (nondet-expr-example)
is equal to the command tree shown in the type of the program.↩λ{ (n₁ , n₂) → … }
is Agda notation for a pattern matching lambda, similarly toλ where (n₁ , n₂) → …
.↩The
∞
index inStream Bool ∞
is a size annotation, required by theStream
type in the Agda Standard Library.↩For the purpose of this blog post, we are content with claiming that
CM
is a monad, and shall not bother with proving the monad laws for the notion of_>>=_
andreturn
ofCM
.↩Since we are working in Agda, we might avoid this use of
Maybe
by using more precise (dependent) types. We are more concerned with explaining to the programmer how effects are actually intended to be interpreted here than we are with explaining this to Agda’s type checker. (Even if the two tend to be connected in practice.)↩Once again we are content with claiming that
IM
is a monad, and shall not bother with proving the monad laws for the notion of_>>=_
andreturn
ofIM
.↩This parameterization by a natural number is a folklore technique known by many names, such as fueled, petrol-driven, gas-driven, or step-indexed interpretation. We use this technique because Agda insists that function should be obviously terminating or obviously non-terminating. It is not obvious (to Agda) that our notion of thread pool will always be terminating. We could have helped Agda realize that it is by using sized types, but that is beyond the scope of this post.↩