Lambda Calculus for the Imperative Programmer I

This post is first in a series of articles on lambda calculus. You can see the whole series by looking up the tag lcip.

The Untyped Lambda Calculus

At the very core, the untyped lambda calculus (referred to as λ from this point on) is a model of computation. In other words, it formalizes the notion of computability: since it is Turing complete, any algorithm can be implemented using just the primitives λ gives us (the claim that these primitives capture everything we would intuitively call computable is the Church-Turing thesis).

The definition of λ gives us two things: rules for constructing "lambda terms" and rules for manipulating the terms we construct. The surprise is not in the rules themselves, which are all very simple and intuitive, but in how they come together to create a powerful model of computation.

λ Terms

A lambda term is either a variable, an application or an abstraction. λ terms are defined inductively: anything built by finitely many applications of the following rules is a lambda term:

  • (Variable) A textual symbol or token (like x) is a λ term. A variable is usually printed as itself.
  • (Application) Two λ terms can be combined into a new λ term by invoking this rule. An application is usually written as the representation of the first λ term followed by that of the second.
  • (Abstraction) A λ term can be combined with a variable to give a new λ term. An abstraction is usually written as λ x. T, where x is the variable and T is the term being abstracted over.

In C:

enum {
  TERM_VARIABLE,
  TERM_ABSTRACTION,
  TERM_APPLICATION
};

typedef struct term {
  int type;
  union {
    struct {
      /* variables  */
      const char *var_name;
    };
    struct {
      /* applications  */
      struct term *left, *right;
    };
    struct {
      /* abstractions  */
      const char *arg_name;
      struct term *body;
    };
  };
} term_t;

void term_print(FILE *fptr, term_t *term) {
  switch (term->type) {
    case TERM_VARIABLE:
      fprintf(fptr, "%s", term->var_name);
      break;

    case TERM_APPLICATION:
      fprintf(fptr, "( ");
      term_print(fptr, term->left);
      fprintf(fptr, " ");
      term_print(fptr, term->right);
      fprintf(fptr, " )");
      break;

    case TERM_ABSTRACTION:
      fprintf(fptr, "( \ %s. ", term->var_name);
      term_print(fptr, term->body);
      fprintf(fptr, " )");
      break;
  }
}

For an abstraction λ x. T, x is said to be bound inside T. Variables (or symbols) that aren’t bound this way are said to be free. In the expression λ x. ((λ y. (((λ t. x) y) z)) y), z and the last y are free variables (the parentheses exist to denote the order in which the terms are combined). That z is free is obvious: since the expression doesn’t have a sub-term of the form λ z. T, z can’t be bound anywhere. The last y is also free, since none of its enclosing abstractions (there is only one, the outermost λ x) abstract over y. The second to last y, however, is bound, since its enclosing λ y abstracts over y.

A direct (but rather slow) way to list the free variables in an expression in C:

static void list_free_recursive(stack_t *stack, term_t *term) {
  switch (term->type) {
    case TERM_VARIABLE: {
      if (!stack_contains(stack, term->var_name))
        printf("%sn", term->var_name);
      break;
    }

    case TERM_APPLICATION: {
      list_free_recursive(stack, term->left);
      list_free_recursive(stack, term->right);
      break;
    }

    case TERM_ABSTRACTION: {
      stack_push(stack, term->arg_name);
      list_free_recursive(stack, term->body);
      stack_pop(stack);
      break;
    }
  }
}

void term_list_free_variables(term_t *term) {
  stack_t stack;
  stack_init(&stack);
  list_free_recursive(&stack, term);
}

The exact symbols used to represent bound variables don’t have any semantic meaning, in the same way x does not have any semantic meaning in int foo(int x) { return x; } (if you think about C compilation in terms of source files and compilation units, it isn’t hard to convince yourself that only external identifiers have any real meaning). In other words, you can always rename bound variables as long as you are consistent about it: λ x. (λ y. x) is exactly the same term as λ t. (λ y. t). This property, called Alpha Equivalence, means that testing two terms for equality isn’t a simple recursive comparison: λ x. x is equal to λ y. y semantically but not textually.
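
To make this concrete, here is a sketch of an equality check that works up to such renaming, using the term_t definition above (it assumes <string.h> for strcmp and, for brevity, a fixed cap on nesting depth). A bound variable is identified by the position of the λ that introduced it; free variables are compared by name.

#define MAX_BINDERS 256

static int alpha_eq_recursive(term_t *a, term_t *b,
                              const char *env_a[], const char *env_b[],
                              int depth) {
  if (a->type != b->type) return 0;

  switch (a->type) {
    case TERM_VARIABLE: {
      int i;
      /* Walk outwards from the innermost binder; the first level at
         which either name is bound decides the answer.  */
      for (i = depth - 1; i >= 0; i--) {
        int bound_a = strcmp(env_a[i], a->var_name) == 0;
        int bound_b = strcmp(env_b[i], b->var_name) == 0;
        if (bound_a || bound_b) return bound_a && bound_b;
      }
      /* Both variables are free: they must be the same symbol.  */
      return strcmp(a->var_name, b->var_name) == 0;
    }

    case TERM_APPLICATION:
      return alpha_eq_recursive(a->left, b->left, env_a, env_b, depth) &&
             alpha_eq_recursive(a->right, b->right, env_a, env_b, depth);

    case TERM_ABSTRACTION:
      /* Record which name each side binds at this depth, then compare
         the bodies.  */
      env_a[depth] = a->arg_name;
      env_b[depth] = b->arg_name;
      return alpha_eq_recursive(a->body, b->body, env_a, env_b, depth + 1);
  }

  return 0;
}

int term_alpha_equal(term_t *a, term_t *b) {
  const char *env_a[MAX_BINDERS], *env_b[MAX_BINDERS];
  return alpha_eq_recursive(a, b, env_a, env_b, 0);
}

With this, term_alpha_equal applied to λ x. x and λ y. y returns 1, while a plain recursive string comparison of the two would not.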

Sometimes De Bruijn indices are used to indicate bound variables. Instead of a textual name, each bound variable is written as an integer denoting how far out the abstraction introducing it is. In λ x. (λ y. y x), for example, the y in (λ y. y x) is introduced by the first λ when moving outwards, while the x is introduced by the second. This makes its De Bruijn form λ. (λ. 1 2). Note that since the "variables" now directly reference the λ they’re interested in, the λ terms themselves no longer need to explicitly name a symbol. De Bruijn indices ensure that alpha equivalent terms also look equal textually.
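
A sketch of a printer that emits this form, again working off the term_t definition above and using the same 1-based indexing as the example: bound variables are printed as the distance to their λ, free variables keep their names.

static void debruijn_print_recursive(FILE *fptr, term_t *term,
                                     const char *binders[], int depth) {
  switch (term->type) {
    case TERM_VARIABLE: {
      int i;
      /* Find the innermost λ binding this name; 1 means the nearest one.  */
      for (i = depth - 1; i >= 0; i--) {
        if (strcmp(binders[i], term->var_name) == 0) {
          fprintf(fptr, "%d", depth - i);
          return;
        }
      }
      fprintf(fptr, "%s", term->var_name);  /* free variable */
      break;
    }

    case TERM_APPLICATION:
      fprintf(fptr, "( ");
      debruijn_print_recursive(fptr, term->left, binders, depth);
      fprintf(fptr, " ");
      debruijn_print_recursive(fptr, term->right, binders, depth);
      fprintf(fptr, " )");
      break;

    case TERM_ABSTRACTION:
      /* The λ itself no longer needs to name its variable.  */
      fprintf(fptr, "( \\. ");
      binders[depth] = term->arg_name;
      debruijn_print_recursive(fptr, term->body, binders, depth + 1);
      fprintf(fptr, " )");
      break;
  }
}

void term_print_debruijn(FILE *fptr, term_t *term) {
  const char *binders[256];  /* fixed cap on nesting depth, for brevity */
  debruijn_print_recursive(fptr, term, binders, 0);
}

Printing λ x. (λ y. y x) this way gives ( \. ( \. ( 1 2 ) ) ).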

λ terms with no free variables are called closed terms or combinators and terms with at least one free variable are called open terms. We need to be a little careful when dealing with open terms since we have to ensure we don’t end up accidentally binding variables. For instance, if T is closed, we can always use it in an abstraction like λ x. (λ y. (y (T x))). However, if T is open because it has x as a free variable, we may end up doing something unintentional (why?).
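
A quick way to test for closedness, along the same lines as the free-variable listing above (and assuming the same hypothetical stack helpers):

static int is_closed_recursive(stack_t *stack, term_t *term) {
  switch (term->type) {
    case TERM_VARIABLE:
      /* A variable is fine only if some enclosing λ binds it.  */
      return stack_contains(stack, term->var_name);

    case TERM_APPLICATION:
      return is_closed_recursive(stack, term->left) &&
             is_closed_recursive(stack, term->right);

    case TERM_ABSTRACTION: {
      int closed;
      stack_push(stack, term->arg_name);
      closed = is_closed_recursive(stack, term->body);
      stack_pop(stack);
      return closed;
    }
  }

  return 0;
}

int term_is_closed(term_t *term) {
  stack_t stack;
  stack_init(&stack);
  return is_closed_recursive(&stack, term);
}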

β Reductions

If anything, this is the rule that should make it intuitively clear how λ models functional programming languages. It gives us the semantics of a single step, or transition, in the evaluation (reduction) of a given λ term.

A β reduction reduces (λ x. T) A to [x := A] T, where [a := B] C means "replace all open (free) occurrences of a in C by B". Consider (λ x. (λ x. x) x) y as an example: mechanically applying the rule tells us that this term reduces to [x := y] ((λ x. x) x). There is only one open x in the term (λ x. x) x, which when replaced by y gives us (λ x. x) y.

A naive implementation of [var := new_term] term in C (naive, among other things, because it makes no attempt to avoid accidentally capturing free variables of new_term under an abstraction):

term_t *term_replace_open(term_t *term, const char *var,
                          term_t *new_term) {
  switch (term->type) {
    case TERM_VARIABLE:
      /* Replace the variable only if it is the one we're substituting for. */
      if (strcmp(term->var_name, var) == 0) return new_term;
      return term;

    case TERM_APPLICATION: {
      /* Substitute in both sides of the application. */
      term_t *left =
        term_replace_open(term->left, var, new_term);
      term_t *right =
        term_replace_open(term->right, var, new_term);
      return term_make_application(left, right);
    }

    case TERM_ABSTRACTION:
      /* If this abstraction rebinds var, the occurrences inside its body
         are no longer open, so the substitution stops here. */
      if (strcmp(term->arg_name, var) == 0) return term;
      return term_make_abstraction(
          term->arg_name,
          term_replace_open(term->body, var, new_term));
  }

  return NULL;  /* unreachable for well-formed terms */
}

A sequence of β reductions is the λ equivalent of "executing" a program.
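
As a sketch of what a single step looks like in code, the following contracts a top-level redex using the (still capture-naive) term_replace_open above, and returns anything that isn't a redex unchanged:

/* One top-level β step: (λ x. T) A becomes [x := A] T.  */
term_t *term_beta_step(term_t *term) {
  if (term->type == TERM_APPLICATION &&
      term->left->type == TERM_ABSTRACTION) {
    return term_replace_open(term->left->body,
                             term->left->arg_name,
                             term->right);
  }
  return term;  /* not a redex at the top level */
}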

To make sure you’ve understood how β reductions work, try to (mentally or with pen and paper) intuitively guess what the following expressions reduce to after a β reduction:

  • (λ x. (λ y. (x y))) (λ x. x)
  • (λ x. (x x)) (λ x. (x x))
  • (λ x. (λ x. (x x))) (λ x. x)

What can you infer from the second example?

A good way to develop intuition and ease in manipulating λ terms is to have a look at the various ways to encode "real world values" like integers and booleans in λ. One way to do this is Church encoding (a small sketch follows the list below). Before that, however, you need to take a look at the conventions generally used when writing λ terms:

  • Outermost parentheses are dropped: M N instead of (M N)
  • Applications are assumed to be left associative: M N P may be written instead of ((M N) P)
  • The body of an abstraction extends as far right as possible: λx. M N means λx. (M N) and not (λx. M) N.
  • A sequence of abstractions is contracted: λx. λy. λz. N is abbreviated as λxyz. N
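
As a small taste of Church encoding, here is a sketch of the Church booleans TRUE = λtf. t and FALSE = λtf. f built as term_t values. It assumes a term_make_variable constructor alongside the term_make_abstraction and term_make_application constructors used earlier:

/* TRUE = λt. λf. t  */
term_t *church_true(void) {
  return term_make_abstraction("t",
           term_make_abstraction("f", term_make_variable("t")));
}

/* FALSE = λt. λf. f  */
term_t *church_false(void) {
  return term_make_abstraction("t",
           term_make_abstraction("f", term_make_variable("f")));
}

/* Builds the application p a b, the body of IF = λp. λa. λb. p a b.  */
term_t *church_if(term_t *p, term_t *a, term_t *b) {
  return term_make_application(term_make_application(p, a), b);
}

church_if(church_true(), a, b) β-reduces to a, and church_if(church_false(), a, b) to b, which is exactly what makes these terms behave like booleans.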

Evaluation Strategies

The β reduction rule tells us how to turn a redex (a reducible sub-term of the form (λ x. T) A) into its reduct (the result of applying the rule), but it says nothing about how to choose which redex to reduce next. For example, the expression (λ xy. x y) ((λ x. x) (λ x. x)) (from now on we’ll follow the notation specified above) has two reduction sites. We can see the expression either as (λ xy. x y) T (with ((λ x. x) (λ x. x)) as T) and reduce it to [x := T] (λ y. x y); or as (λ xy. x y) ((λ x. x) S) (with λ x. x as S) and reduce the inner redex to get (λ xy. x y) ([x := S] x).

For a more convincing case, consider (λ xy. y) ((λ x. x x) (λ x. x x)) (λ x. x); can you find a policy for choosing redexes that will put the evaluator in an infinite loop? Can you find one that doesn’t?

The process of choosing the next redex in a λ term is dictated by the reduction strategy a particular implementation adopts.
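
One well-known strategy is normal order: always contract the leftmost, outermost redex. Here is a sketch of a single normal-order step, built on term_replace_open and the constructors used earlier (and therefore still capture-naive); it returns NULL when the term is already in normal form:

term_t *term_reduce_normal_order(term_t *term) {
  switch (term->type) {
    case TERM_VARIABLE:
      return NULL;  /* nothing to reduce */

    case TERM_APPLICATION: {
      term_t *next;
      /* Outermost first: if the whole application is a redex, contract it
         before looking at the subterms.  */
      if (term->left->type == TERM_ABSTRACTION)
        return term_replace_open(term->left->body,
                                 term->left->arg_name, term->right);
      /* Otherwise prefer a redex in the left subterm over the right.  */
      if ((next = term_reduce_normal_order(term->left)) != NULL)
        return term_make_application(next, term->right);
      if ((next = term_reduce_normal_order(term->right)) != NULL)
        return term_make_application(term->left, next);
      return NULL;
    }

    case TERM_ABSTRACTION: {
      term_t *next = term_reduce_normal_order(term->body);
      if (next != NULL)
        return term_make_abstraction(term->arg_name, next);
      return NULL;
    }
  }

  return NULL;
}

This is the policy that avoids the infinite loop in the example above: the argument ((λ x. x x) (λ x. x x)) is discarded by (λ xy. y) before it is ever reduced.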

Next posts

In the next posts we’ll discuss an implementation for λ, and then move on to typed variants.
