(λblog. blog blog) (λblog. blog blog)

diverging since 2008

Some words on lazy evaluation and sharing

2009-01-03 23:58

Prequel: lazy evaluation

The book The Implementation of Functional Programming Languages discusses at length the graph reduction implementation of a non-strict functional language (see esp. chapters 10-12). This technique is what is usually called lazy evaluation.

The point about lazy evaluation is that each function argument is evaluated at most once (when it is first needed). This also applies to names bound with lets as lets are (in the book) transformed into function applications.

The picture is a bit more complicated for GHC, but in essence: lazy evaluation guarantees that a monomorphic (non-polymorphic) expression with a name is evaluated only once.

The beef: fix

Now I'll continue where I left off in a previous posting about fix. We are dissecting the standard definition of fix:

> fix f = let x = f x in x

I previously explored one alternative definition, and now I'll use another as a springboard. So: why is fix not defined as

> myfix f = f (myfix f)

which is obviously equivalent to the previous one? The oft-heard answer is that the former definition has "better sharing characteristics".

Let's have a look at an infinite list of ones.

> ones = 1:ones :: [Integer]
> ones_myfix = myfix (1:) :: [Integer]
> ones_fix = fix (1:) :: [Integer] 

The typesigs are here to keep us monomorphic.

First up, ones. The name ones gets bound to a cons-cell (constructed by (:)), that has 1 as its head and ones as its tail. Only one cons-cell is ever allocated. Now let's have a look at what ones_myfix does.

ones_myfix ==> myfix (1:) ==> 1:myfix (1:) ==> 1:1:myfix (1:) ==> ...

This equivalent to ones_myfix = 1:ones_myfix, but only up to naming, and naming is crucial when dealing with lazy evaluation. The call myfix (1:) is not memoised so it is re-evaluated when we progress through the list, and each call allocates a new cons cell. The solution is to add an intermediate name, which is exactly what (the real) fix does:

ones_fix ==> fix (1:) ==> let x = 1:x in x

Which creates one cons cell exactly like ones, except the cell is called x. This is why the standard definition of fix has those "better sharing characteristics".

Afterword

The Implementation of Functional Programming Languages takes an approach where fix (AKA the Y combinator) is taken as a built-in and recursive lets are implemented with it. In GHC, recursive lets are elementary and thus fix can be defined without any special tricks.

The book actually mentions that there are two distinct ways for implementing the built-in Y: either cyclically (by sharing) or non-cyclically (re-evaluating at each step). This is exactly the same design decision we encounter with haskell's fix.

Updated 01:40

Tags: haskell.