Dependable Types 3: Reductio Sine Absurdum
19 Feb 2018Welcome back! This time, we’re going to write that evaluate
function we
mentioned in the last article… Strictly,
this function will be better named reduce
, as it will perform an operation
called beta reduction (or β-reduction, if you’re so inclined). What
this means is that, any time we see an Abstraction
on the left-hand side
of an Application
, we can simplify by taking the abstraction’s body, and
replacing any mention of its parameter with the argument.
For example:
(λx.x)y
We can reduce this down to y
simply by substituting y
for all
occurrences of x
inside the x
abstraction. Woo!
As a more complicated example:
(λx.(λy.(λz.z)y)x)a
Here, we have a few nested functions, so we handle them one at a time.
First, we substitute a
in place of x
:
(λy.(λz.z)y)a
Now, we’ve eliminated the abstraction for x
(and so the context no
longer has an x
), but we see that a
is now applied to the y
abstraction.
Let’s do the same thing for y
as we did for x
:
(λz.z)a
This is exactly the same as the body of the y
abstraction, but all the
mentions of y
have been replaced by a
, and so y
has been totally
eliminated! Finally, we substitute a
in place of z
:
a
Boom. This, reader mine, is what we’re going to do today. Before we go any further, think about why this might be a bit trickier with De Bruijn indices. Buckle up, friends: we’re gonna need to write some proofs.
Before we get to code, let’s think about the type signature. If we get
everything right, the context of the reduced expression won’t change, and
nor will the type. As our Expression
type is indexed by both of these
things, this helps us to refine the set of possible implementations, and
hence produce fewer bugs!
reduce
: Expression ctx ptype
-> Expression ctx ptype
As usual, to make this code fit on a mobile screen, a lot of the variables have been abbreviated. If you’d prefer to read a more verbose form, the code for this post is a good place to start!
Apart from an Application
in an Abstraction
, there’s not anything we can do
to reduce our expression. So, with that one exception, all this is going to
look pretty uninteresting, and just like any other recursive function you might
imagine for this type:
reduce
: Expression ctx ptype
-> Expression ctx ptype
reduce (Abstraction param body)
= Abstraction param (reduce body)
reduce (Variable ref)
= Variable ref
reduce (Application fn arg)
with (reduce fn, reduce arg)
| (Abstraction param body, arg')
= substitute Here body arg'
| (fn', arg')
= Application fn' arg'
We can see that, when we encounter the magical situation we’re looking for, we
do some kind of substitution of arg'
in place of Here
inside the
body
. There’s a catch, though: we’re using Here
and There
as De Bruijn
indices.
Take the following:
(λx.λy.(λz.x)x)
When we use this format, we can identify all the references to x
simply
because they’re written as x
. However, let’s look at the same expression
when written with De Bruijn indices:
λλ(λ3)2
The variable three lambdas ago and the variable two lambdas ago are actually the same variable! De Bruijn indices change depending on the level of abstraction-nesting, which means that we’ll also have to keep track of that!
Once again, let’s think about the type for substitute
before we write any
code. The body
is in a different context to the arg'
: it has an extra
variable that isn’t present in the arg'
. However, once we finish
substitution, we will have eliminated that variable entirely, so it is also
missing from the result!
substitute
: (ref : Elem atype ctx)
-> Expression
ctx
ptype
-> Expression
(dropElem ctx ref)
atype
-> Expression
(dropElem ctx ref)
ptype
Let’s start with the simplest case: Application
. In this instance, we recurse
down through both sides, and we’re done. We’ll find that Application
is
very much the simple case in this process, as it is the only one of the
three constructors that doesn’t touch the context.
substitute ref (Application fn x) arg
= Application (substitute ref fn arg)
(substitute ref x arg)
Now, let’s have a little look at the case of Abstraction
. We probably want to
increment the index we’re looking for, so we’ll stick a There
on it:
substitute ref (Abstraction param body) arg
= Abstraction
param
(substitute (There ref) body arg)
… Not quite.
Type mismatch between
Expression
(dropElem ctx ref)
atype
(Type of arg)
and
Expression
(dropElem (param :: ctx) (There ref))
atype
(Expected type)
This type error rightly tells us that arg
is the problem here, and
specifically that it is in the wrong context: inside the Abstraction
,
there is an extra parameter in our context to worry about, so we need to
update the arg
context accordingly. Luckily, we know that the arg
works in
the ctx
context, so param
doesn’t get a mention inside. All we need to do
is increment the references.
… Ish. If a variable is introduced within that body
, we should not
bump the reference - we only care about those that come from outside the
body
, which we call the free variables of the body
expression.
Symmetrically, those that come from inside the body
will be referred to
as bound variables.
expandContext
: (bound : Context)
-> Expr (bound ++ free) t
-> Expr (bound ++ a :: free) t
This type describes what we want: expand the context by one more bound
variable. Consequently, we’ll need to increment all free
references, but
none of the bound
references. Sorry if these last few paragraphs take a few
readthroughs…
expandContext ctx (Application fn x)
= Application (expandContext ctx fn)
(expandContext ctx x)
expandContext ctx (Abstraction param body)
= Abstraction
param
(expandContext (param :: ctx) body)
expandContext ctx (Variable ref)
= Variable (expandElemContext ctx ref)
As before, Application
is nice and easy. Abstraction
is a bit ugly, but as
expected: we add this new param
to the bound list, and recurse. The fun
happens in the Variable
case: we want to update the ref
when it’s a
reference to a free variable. Let’s have a look at that function:
expandElemContext
: (bound : Context)
-> Elem t (bound ++ free)
-> Elem t (bound ++ a :: free)
expandElemContext [] ref
= There ref
expandElemContext (_ :: _) Here
= Here
expandElemContext (_ :: xs) (There later)
= There (expandElemContext xs later)
When we have an empty context, all references must be free. If we have
a Here
reference to a non-empty context, we know it must be bound! In
any other case, we can just drop the most recent bound variable, and a layer of
the reference, and recurse! This function isn’t too scary, really: we
ignore a reference within the context, and increment a reference
beyond. Don’t panic!
Now we have all that out the way, we can rewrite our substitute
statement:
substitute ref (Abstraction param body) arg
= Abstraction
param
( substitute
(There ref)
body
(expandContext [] arg)
)
Magnifique.
The only case we haven’t considered for substitute
yet is Variable
. I’ve
deliberately left this one until last because, well, it’s quite scary. To
people unfamiliar with Idris, this is going to look… well, unfamiliar.
Nevertheless, let’s crack on!
When we encounter a Variable
within an expression, one of two things may
be true:
-
The
Variable
’s reference is not the one we’re looking to eliminate, so we just update it if necessary. -
The
Variable
’s reference is the one we’re looking for, and we want to replace thisVariable
with the givenarg
.
Now, if the variable is the one we’re looking for, we’re going to have to
prove a couple things to Idris. First of all, atype
and ptype
are going
to have to be equivalent. If we can’t prove this, Idris won’t let us unify
atype
and ptype
, as they could be different. Secondly, we have to prove
that the context references are indeed the same! The order isn’t especially
important, but both are required to satisfy the compiler.
Enough stalling for time; let’s see the beast:
substitute {atype} {ptype} ref (Variable ref') arg
with (decEq ptype atype)
substitute {atype = ptype} ref (Variable ref') arg
| (Yes Refl)
with (decEq ref' ref)
substitute ref' (Variable ref') arg
| (Yes Refl)
| (Yes Refl)
= arg
substitute ref (Variable ref') arg
| (Yes Refl)
| (No contra)
= Variable (independentRefs ref' ref contra)
substitute ref (Variable ref') arg
| (No contra)
= Variable (independentValues ref' ref contra)
Warned you, right? There are a couple things going on here. Firstly, we’re
pulling terms out of the type. This is another bit of “dependently-typed
magic”: we can reflect things down from the type-level, such as atype
and
ptype
(the ProgramType
indices for the expression and the substitution),
and then use them within our function! Types are terms are types are
terms. It really is magical.
Now we have access to those values, we can use decEq
(decidable equality)
to compare them. The result of doing so is either Yes prf
or No contra
,
where prf
and contra
are proofs of one way or the other. We’ll see more
of these in a moment.
I’m omitting the accompanying
DecEq ProgramType
implementation because it’s quite mechanical, and doesn’t contain anything we’re not about to see. However, if you’re enjoying this, I’d encourage you to go take a look!
If we successfully prove both, the result is simply arg
: we know that
we’re safe to make the substitution, so we replace the Variable
with the
arg
we’ve been carrying. Job done! If that isn’t, the case, however, we
have a little more work to do. Specifically, we need to update the
Variable
index to account for the now-eliminated parameter. To avoid a
Maybe
, let’s have a go at proving that our reference can be adjusted
safely. You’ll notice that the above code had two proofs: independentRefs
and
independentValues
are the holes that we’re going to have to fill.
Don’t think there was any magic in the selection of proofs. Most of this code was written with a lot of help from Idris’ editor plugin. Most of the time with proofs, the obvious proof search will do most (if not all) the work for you!
Firstly, let’s look at the type of independentRefs
:
independentRefs
: (l : Elem x xs)
-> (r : Elem x xs)
-> Not (l = r)
-> Elem x (dropElem xs r)
Here, we have to prove that, if the l
and r
values are not equal, x
will exist in the list without the element to which r
refers. In other,
simpler words: if l
and r
are both references to a value of some type, and
they’re not the same, that type must be in there at least twice, and we can
remove one and provide a reference to another!
independentRefs Here Here prf
= absurd (prf Refl)
We first deal with the case in which the two references are the same.
However, we already have a proof that this can’t be the case, so this is…
absurd
! The proof is Refl
(we can think of this as “obvious” for now),
given the proof that Not (l = r)
.
I won’t go into too much detail, but
absurd
is a function fromVoid
to anything. In other words, there’s no way this can happen!
independentRefs Here (There later) prf
= Here
If l
points to the first element, and r
to any other, then we know
that dropping r
will make no difference - the element will be in first
place!
independentRefs (There later) Here prf
= later
Similarly, if r
points to the first, and l
to something later, the
element is simply one place closer to the start than before.
independentRefs (There this) (There that) prf
= There (independentRefs this that (prf . cong))
Here’s our recursive step. If both references are later on, we recurse.
Notice the cong
function here:
cong : a = b -> f a = f b
“If a
and b
are the same, applying f
to both will give the same
answer”. The f
here is There
: if we can remove a There
from each and
still prove that they’re the same, we’re good! Now, what about that
independentValues
function? Well, it’s almost identical to the above. The
only reason it has to exist is that, this time, we want to prove that, if the
referenced values are different, removing one won’t remove the other:
independentValues
: (l : Elem x xs)
-> (r : Elem y xs)
-> Not (x = y)
-> Elem x (dropElem xs r)
This one is even simpler, as we don’t have to carry the cong
proof: we only
care about the proof at the point that it fails. Voila:
independentValues Here Here prf
= absurd (prf Refl)
independentValues (There later) Here prf
= later
independentValues Here (There later) prf
= Here
independentValues (There x) (There later) prf
= There (independentValues x later prf)
Now, we have everything we need. We’ve made it, friends: the above is almost everything from this article’s code. Still, while we’re here, why don’t we try one last party trick?
Earlier, we had this expression:
(λx.(λy.(λz.z)y)x)a
If we convert this to our format, we get something rather more ugly:
Test : Expression (x :: context) x
Test {x} -- x is just a type!
= Application
( Abstraction
x
( Application
( Abstraction
x
( Application
( Abstraction
x
( Variable Here
)
)
( Variable Here
)
)
)
( Variable Here
)
)
)
( Variable Here
)
Here, we’re saying that Test
is an expression in any context with an
x
at the head, and it has a return type of x
. Given that we already
know the result of reducing this expression is just Variable Here
, why
don’t we write a test?
Dependent types to the rescue once more. Thanks to the magic of Idris, where terms are types and types are terms, we can write our tests at the type-level, and have them checked whenever we compile! On top of that, the compiler will recognise them as dead code after type-checking, so they won’t appear in the output binary! Here’s how we’d write the type of our test:
allClear
: reduce (Test {context = []} {x = PInt})
= Variable Here
We’re giving the compiler a couple hints here: to avoid polymorphism confusion,
we specialise our context
and x
values to make it easier for the
type-checker. In actual fact, this is required for the next step. What this
type says is that reducing our expression is equivalent to the reduced
form we expect. What do we write for our body, though?
allClear
= Refl
The most beautiful line in this whole post. The type-checker will do the
reduction, arrive at the result of Variable Here
, and then the proof of this
test becomes obvious. Now, whenever we change our code, this reduction
will be carried out, and the result will be checked. If the result does not
match the intended output, compilation will fail.
Failing tests are now type errors. How awesome is that?
Ok, this has been the longest post I’ve written by quite a way, but there was a lot to cover! I hope this has given you some sort of idea about how we write proofs for the type system, and how they allow us to expand contexts, update variable references, and all sorts of things we haven’t covered here!
Next time, we’ll be looking at some serious dependently-typed magic. We’ll
be turning our Expression
values into actual Idris functions that we can
call with arguments and receive results, all while remaining totally
type-safe. Until then, feel free to contact me on
Twitter with any suggestions, questions,
criticisms, and so on. I’d love to hear from you, and see what you’ve been
concocting!
Until next time, take care ♥
As always, a huge thanks to Liam for teaching me everything I know about Idris. Thanks also to Gabe for proofreading this article (among many)!