lambda calculus
1. syntax
\[ E := x \mid \lambda x. E \mid E E \] On the RHS: variable, abstraction, application
1.1. application
\(E_1 E_2\) where \(E_1\) is a function and \(E_2\) is an argument
1.2. abstraction
\(\lambda x. E\) where \(x\) is a bound variable and \(E\) is an expression
2. notation
\[ \lambda s (\lambda z. s z) \] is written \(\lambda sz.sz\)
We assume that function application is left-associative.
3. free and bound variables
- \(\lambda\) -calculus is lexically scoped: variables in expression are bound by the tightest containing function
- A free variable in an expression is not bound by any containing function
- A combinator is a \(\lambda\) expression without free variables
4. \(\beta\) -substitution
- the reduction rule for the \(\lambda\) -calculus
- How do we apply functions to arguments?
- This is the way that \(\lambda\) -calculus programs get run
\[ (\lambda x.E) E_a \rightarrow E[E_a/x] \]
- "substitute \(E_a\) for all occurences of \(x\) in \(E\)"
- variable
- \(y[E_a / x] = E_a\) if \(x \equiv y\)
- \(y[E_a / x] = y\) otherwise
- application
- \((E_1 E_2)[E_a / x] = (E_1[E_a / x] E_2[E_a / x])\)
- abstraction
- \((\lambda y. E_1)[E_a / x] = \lambda y. E_1\) if \(x\equiv y\) – "if we hit a bound variable, stop substituting deeper into that term"
- \((\lambda y. E_1)[E_a / x] = \lambda y. E_1(E_a / x)\) otherwise – "if \(x\) is not bound in the body, we keep substituting" CAVEAT BELOW
For this last bullet, now we have a problem! What if \(E_a\) contains a variable that is also called \(y\)? Now that \(y\) in the \(E_a\) is going to be bound. This is called the problem of variable capture
5. \(\alpha\) -renaming
- we should rename all the variables \(y\) in \(E_1\) to be something else so there is no conflict
- why is this justified? Doesn't change the behavior of the function under application, which is really what we're concerned with
- why rename in \(E_1\) and not in \(E_a\)? \(E_a\) could be part of a larger context – e.g. bound by an even larger scope – so we don't want to mess with it. But in \(E_1\), everything that is a \(y\) is bound, so we can change it however we want
6. \(\eta\) -rule
\((\lambda x. E x) \rightarrow E\) if \(x\not\in FV(E)\)
- "the only thing that matters about lambda terms is the behavior on the outside"
- \(E\) behaves the same when being applied as \((\lambda x. E x)\)
7. how is computation done?
- so far, with everything discussed, we are treating the \(\lambda\) -calculus as a completely syntactic system
- we haven't said what the meaning of programs is
- normal form – an expression that cannot be reduced further
- this is how we run the \(\lambda\) -calculus: reduce the expression until it can't be reduced further
- see program semantics
8. church numerals
8.1. Zero
\[ \lambda s z. z \] Zero is defined simply as the function that takes a function \(s\) and an argument \(z\) and returns \(z\)
We choose the abbreviation \(\mathsf{0}\) to stand in for \(\lambda s z . z\): \[ \mathsf{0} \equiv \lambda s z.z \]
At any point, we may replace \(\mathsf{0}\) with the function.
8.2. Natural numbers
We define the rest of the natural numbers in a similar way:
\[\begin{align*} \mathsf{1} &\equiv \lambda sz. s(z)\\ \mathsf{2} &\equiv \lambda sz. s(s(z))\\ \mathsf{3} &\equiv \lambda sz. s(s(s(z)))\\ &\vdots \end{align*}\]
8.3. Succ
We now define a function such that, given a representation for the number \(n\), it returns the representation for the number \(n+1\).
\[ \mathsf{S} \equiv \lambda w y x. (y ( (w y) x ) ) \equiv \lambda w y x. y (wyx) \]
Question: are we sure that this type checks? That is, in the above, how do we know that \(w\) is able to take 2 arguments? What if it only takes 1 argument? Does something break. Conjecture: I think it's ok provided we assume that all the terms involved are combinators. Then, each term in the body was introduced through binding. And there is no case in which this binding will go through if the arguments themselves are not combinators (check).
How should we read this? Let's see it in action for a given number \(\mathsf{N}\). Then,
\[\begin{align*} \mathsf{S N} &= \lambda y x . (y ( (\mathsf{N} y ) x ) )\\ &= \lambda s z . s (\mathsf{N} s z) \\ &= \lambda s z . s ( \underbrace{s ( s ( \cdots }_{n} (z) \cdots )) )\\ &= \lambda s z . \underbrace{s ( s ( s ( \cdots }_{n+1} (z) \cdots )) )\\ &\equiv \mathsf{N} + \mathsf{1} \end{align*}\] The second line follows from the fact that we can rename the bound variables without changing the function.
We say that \(\mathsf{S N}\) and \(\mathsf{N + 1}\) are \(\beta\) -equivalent up to an \(\alpha\) -equivalence. That is, one can be transformed into the other using \(\beta\) -reductions, if we ignore functions that are the same thing but just use different variable names (equivalent up to \(\alpha\) -conversion).
8.3.1. some other intuitions
We want our successor to take a number and then two other arguments: \(y\) and \(x\). Our successor function is curried, so given a number, it returns a new function that take \(y\) and \(x\) as arguments. We want our successor function to return a Church numeral. What is a Church numeral for \(n+1\)? It's something that takes the first argument and applies it to the second argument \(n+1\) times. So, given \(y\) and \(x\), let's write our function body. Well, we have been given the Church numeral for \(n\), so applying it to \(y\) and \(x\) yields an \(n\) -fold application of \(y\) to \(x\). Then, we are almost there! Let our function return the application of \(y\) to this \(n\) -fold application. Then, we have an \(n+1\) -fold application of \(y\) to \(x\). And this is the Church numeral for \(n+1\).
8.4. addition
To add two Church numerals \(\mathsf{M}\) and \(\mathsf{N}\) using: \[ \mathsf{MSN} \]
8.4.1. intuitions
We can see this is true, because \(\mathsf{MS} = \lambda z . \underbrace{\mathsf{S}(\mathsf{S}(\cdots \mathsf{S}(}_{m} z )\cdots))\) is a function that takes an argument \(z\) and applies the successor function to it \(m\) times. What will that \(z\) be? It is another Church numeral \(\mathsf{N}\). And the \(m\) -th successor of \(\mathsf{M}\) is \(\mathsf{N}\).
9. Logical operators
Define True and False as: \[ \mathsf{T} = \lambda x y. x \] and \[ \mathsf{F} = \lambda x y. y \]
These are the functions that take two arguments and return, respectively, return the first and second argument.
9.1. AND
Then, we define: \[ \wedge \equiv \lambda p q. p q \mathsf{F} \]
9.1.1. intuitions
What does it take for this function to return true? If \(p\) is true, then, it will return the \(q\), in which case \(q\) had better be true. If \(p\) is false, it will return \(\mathsf{F}\).
9.2. OR
\[ \vee \equiv \lambda p q. p \mathsf{T} q \]
9.2.1. intuitions
What will it take for this function to return true? If \(p\) is true, it will return \(\mathsf{T}\). If it false, then it will return \(q\), in which case \(q\) had better be true.
10. recursion
See the note on Y Combinators.
11. computability
Let \(\mathbf{N}\) be the Church numerals. We say that a function \(F: \mathbb{N} \rightarrow \mathbb{N}\) is computable if and only if there is a lambda function \(f\) such that \(F(x) = y\) if and only if \(f(\mathbf{x}) =_{\beta} \mathbf{y}\). Where \(\mathbf{x},\mathbf{y} \in \mathbf{N}\) and \(=_{\beta}\) denotes \(\beta\) -equivalence.
Question: what if there is no finite set of \(\beta\) -reductions that does what we want? Then the function is un-computable I think.
It turns out that simply typed lambda calculus is strongly normalizing, but doesn't allow for the Y-combinator.