I studied a proof today that I had been putting off for some time now; that a simply typed lambda calculus is normalizing — every well-typed expression eventually reduces to a value. The proof is quite clever and that the exact wording of the proof in *Types and Programming Languages* is a bit terse so I think it will make for a good, juicy blog post.

In case you haven’t already, you need to first read up (a little bit) on untyped and typed lambda calculus. Try reading to the point where you can comfortably state why $$ lambda x cdot x x $$ can not be well-typed. I’ll wait right here, I promise. 🙂

I’ll use the symbol $$ [x mapsto s] t $$ to mean “replace instances of $$ x $$ by $$ s $$ in the term $$ t $$”. We’ll talk about a typed lambda calculus which has exactly one type $$ A $$ and only one value of that type, $$ a $$. We’ll evaluate in normal order.

Firstly, note (informally) that all types will look like $$ ((A to A) to (A to A)) to (A to A) $$. It is not hard to see (or prove) that the set of types is isomorphic to a set of binary trees with every internal node a $$ to $$ and every leaf an $$ A $$. This is important since we’ll do structural induction on the binary tree corresponding to a type.

We now define a set of what are usually called reducibility candidates. They are a predicate on terms of the lambda calculus with the following conditions

- $$ R_{A}(t) $$ if $$ t $$ halts.
- $$ R_{T_{1} to T_{2}}(t) $$ $$ iff $$ $$ t $$ halts and $$ R_{T_{1}}(s) Rightarrow R_{T_{2}}(t s) $$

Since $$ R_T $$ is a predicate, we can safely say things like $$ x $$ is *in* $$ R_A $$ etc.

We aim to prove the following

- Every term in a set $$ R_{T} $$ (for any type $$ T $$) is normalizable. Note that this follows from the definition of the reducibility candidates
- Every term of type $$ T $$ is in $$ R_{T} $$.

These two, combined, will allow us to draw the conclusion that every well-typed term in our simply typed lambda calculus halts.

**Lemma 0: Every term in a reducibility candidate halts**

This follows directly from the definition of a reducibility candidate.

**Lemma 1: Reduction does not change the reducibility status of a term**

Formally, we try to prove $$ R_T(t) wedge (t to t’) iff R_T(t’) $$. For this, we use the structural induction on binary trees isomorphic to each type we talked about above. There are two cases.

__T is the atom type__

This is the *base case* for our induction and also pretty obvious. From $$ t to t’ $$ we conclude $$ t $$ halts iff $$ t’ $$ halts. Since the only condition to be in $$ R_A $$ is halting, $$ R_A(t) iff R_A(t’) $$.

__T is a function type__

Let $$ T equiv T_1 to T_2 $$. In this case, each $$ t’ $$ has to satisfy the additional condition $$ forall s in R_{T_1} cdot R_{T_2}(t’;s) $$. We begin by taking an arbitrary $$ s $$ in $$ R_{T_1} $$.

Notice how $$ t ; s to t’ ; s $$ [1]? Also notice how $$ T_2 $$ is a substructure of $$ T equiv (T_1 to T_2) $$? We can invoke structural induction here and say we already know $$ R_{T_2}(t ; s) Rightarrow R_{T_2}(t’ ; s) $$ (*statement 0*). Also, since we’re given $$ R_T(t) $$, we know for all $$ s $$ in $$ R_{T_1} $$, $$ t ; s $$ is in $$ R_{T_2} $$ (property of $$ R_T $$) (*statement 1*). We can conclude from *statement 0* and *statement 1* that for any arbitrary $$ s $$ in $$ R_{T_1} $$, we have $$ R_{T_2} (t’ ; s) $$. This is the extra condition we needed to show $$ t’ $$ is in $$ R_T $$.

**Lemma 2: Every well typed term of type T is in $$ R_T $$ **

This is proved by proving a stronger assertion: if $$ x_1:T_1, …, x_n:T_n vdash t ; : ; T $$ and $$ emptyset vdash v_i : T_i $$ for all $$ i $$ then $$ R_T([x_1 mapsto v_1] … [x_n mapsto v_n]t) $$. This is a stronger assertion because this also applies to terms with free variables while, at the *top level*, we’ll need to consider only closed terms. The assertion will be invoked on open terms only via induction.

To prove this assertion, we’ll invoke induction on the *size* of a term. So, informally, $$ t_1 ; t_2 $$ is greater than either $$ t_1 $$ or $$ t_2 $$ and $$ lambda a cdot x $$ is greater than $$ x $$. We make a case by case analysis on the various forms $$ t $$ can be in

__A variable__

This is obvious, since $$ t = x_i $$ for some $$ i $$.

__An application__

This too is easy. Since $$ t equiv t_1 ; t_2 $$, we invoke induction on $$ t_1 $$ and $$ t_2 $$ separately and then use $$ ([a mapsto b]c) ([a mapsto b]d) equiv ([a mapsto b] (c ; d)) $$ to get our result.

__An abstraction__

We know, from its form that $$ t $$ is a value. The other thing left to prove is that, keeping types compatible, $$ forall s in R_S cdot R_M(t ; s) $$. $$ t $$ is of the form $$ lambda x : S cdot b $$. Let $$ s in R_S $$. We know, from lemma 0 that $$ s $$ will reduce to a value $$ v $$. From lemma 1, we get $$ R_S(v) $$. The condition we’re trying to prove is $$ ([x_1 mapsto v_1]…[x_n mapsto v_n]t)s in R_P $$. But this beta-reduces to $$ ([x_1 mapsto v_1]…[x_n mapsto v_n](t ; s)) $$ $$ to $$ $$ ([x_1 mapsto v_1]…[x_n mapsto v_n](t ; v)) $$ $$ to $$ $$ ([x_1 mapsto v_1]…[x_n mapsto v_n]([x mapsto v]b) $$. $$ b $$ is a *smaller* term satisfying the requirements for this proposition, and we can say, via induction, $$ R_P([x_1 mapsto v_1]…[x_n mapsto v_n]([x mapsto v]b) $$. From lemma 1 we get $$ R_P(([x_1 mapsto v_1]…[x_n mapsto v_n]t)s) $$, which is what we wanted to prove.

—

- In normal order. I haven’t tried proving this in other evaluation strategies but I think I can go around by proving that the evaluation strategy does not matter in deciding the behaviour of a program in this lambda calculus.