(define (foo x) ...)are converted to defines with the essential syntax
(define foo (lambda (x) ...))Non-top-level defines are converted into equivalent letrec-s.
(define (foo x y)
(let ((x y)
(z x))
(let* ((x (+ z x)))
x)))
is converted to
(define foo
(lambda (x y)
(let* ((x__1 y)
(z x)
(x__2 (+ z x__1)))
x__2)))
cond, case, or,
and into equivalent terms using a small set of primitives. New
variables may be introduced in this phase.
In case a procedure like or or and occurs in the place
where its value is treated as a boolean (eg. first argument of
if), it is converted into an analogous boolean-returning
procedure, which will finally be represented by an analogous C
procedure (eg. || or &&).
Associative procedures are converted into structures of corresponding
nonassociative procedures. List is converted to a structure of cons-s.
Map and for-each with more than two arguments are converted into an
equivalent do-cycle. map-s and for-each-s with two arguments are
treated as if they were defined in the compiled file -- the definitions
map1 and for-each1 are automatically included, if needed.
There is an option in `hobbit.scm' to make all map-s and
for-each-s be converted into equivalent do-loops, avoiding the use
of map1 and/or for-each1 altogether.
Here Hobbit produces code for creating real closures for all the lambda-terms which are not marked as being liftable by the previous liftability analysis.
Global variables (eg x-glob) are translated as pointers (locations) to SCM objects and used via a fetch: *x_glob (or a fetch macro GLOBAL(x-glob) which translates to *x_glob).
While producing closures hobbit tries to minimize the indirection levels necessary. Generally a local variable x may have to be translated to an element of a vector of local variables built in the procedure creating x. If x occurs in a non-liftable closure, the whole vector of local variables is passed to a closure.
Such a translation using a local vector will only take place if either x is set! inside a non-liftable lambda-term or x is a name of a recursively defined non-liftable function, and the definition of x is irregular. The definition of x is irregular if x is given the non-liftable recursive value t by extra computation, eg as
(set! x (let ((u 1)) (lambda (y) (display u) (x (+ u 1)))))
and not as a simple lambdaterm:
(set! x (lambda (y) (display x) (x (+ y 1))))
In all the other cases a local scheme variable x is translated directly to a local C variable x having the type SCM (a 32-bit integer). If such an x occurs in a non-liftable closure, then only its value is passed to a closure via the closure-vector. In case the directly-translated variable x is passed to a liftable lambda-term where it is set!, then x is passed indirectly by using its address &x. In the lifted lambda-term it is then accessed via *.
If all the variables x1, ..., xn created in a procedure can be translated directly as C variables, then the procedure does not create a special vector for (a subset of) local variables.
An application (foo ...) is generally translated to C by an internal apply of the SCM interpreter: apply(GLOBAL(foo), ...). Using an internal apply is much slower than using direct a C function call, since:
However, in case foo is either a name of a non-redefined primitive or a name of a non-redefined liftable procedure, the application is translated to C directly without the extra layer of calling apply: foo(...).
Sometimes lambda-lifting generates the case that some variable x is accessed not directly, but by *x. See the next section.
Undefined procedures are assumed to be defined via interpreter and are called using an internal apply.
When this pass starts, all the real (nonliftable) closures have been translated to closure-creating code. The remaining lambda-terms are all liftable.
Lambda-lifting is performed. That is, all procedures defined inside some other procedure (eg. in letrec) and unnamed lambda-terms are made top-level procedure definitions. Any N variables not bound in such procedures which were bound in the surrounding procedure are given as extra N first parameters of the procedure, and whenever the procedure is called, the values of these variables are given as extra N first arguments.
For example:
(define foo
(lambda (x y)
(letrec ((bar (lambda (u) (+ u x))))
(bar y) )))
is converted to
(define foo
(lambda (x y)
(foo-fn1 x y) ))
(define foo-fn1
(lambda (x u)
(+ u x) ))
The case of mutually recursive definitions in letrec needs special treatment -- all free variables in mutually recursive funs have, in general, to be passed to each of those funs. For example, in
(define (foo x y z i)
(letrec ((f1 (lambda (u) (if x (+ (f2 u) 1))))
(f2 (lambda (v) (if (zero? v) 1 (f1 z)))) )
(f2 i) ))
the procedure f1 contains a free variable x and the procedure f2 contains a free variable z. Lambda-lifted f1 and f2 must each get both of these variables:
(define (foo x y z i) (foo-fn2 x z i) ) (define foo-fn1 (lambda (x z u) (if x (+ (foo-fn2 x z u) 1))) ) (define foo-fn2 (lambda (x z v) (if (zero? v) 1 (foo-fn1 x z z))) )
Recall that hobbit has already done dependency analysis and has split the original letrec into smaller chunks according to this analysis: see pass 1.
Whenever the value of some free variable is modified by set! in the procedure, this variable is passed by reference instead. This is not directly possible in scheme, but it is possible in C.
(define foo
(lambda (x y z)
(letrec ((bar (lambda (u) (set! z (+ u x z)))))
(bar y)
z)))
is converted to incorrect scheme:
(define foo
(lambda (x y z)
(foo-fn1 x (**c-adr** z) y)
z))
(define foo-fn1
(lambda (x (**c-adr** z) u)
(set! (**c-fetch** z) (+ u x (**c-fetch** z))) ))
The last two will finally be compiled into correct C as:
SCM foo(x, y, z)
SCM x, y, z;
{
foo_fn1(x, &z, y);
return z;
}
SCM foo_fn1(x, z, u)
SCM x, u;
SCM *z;
{
return (*z = (u + x) + *z);
}
As the scheme do-construction is compiled into C for, but for cannot occur in all places in C (it is a statement), then if the do in a scheme procedure occurs in a place which will not be a statement in C, the whole do-term is lifted out into a new top-level procedure analogously to lambda-lifting. Any statement-lifted parts of some procedure foo are called foo_auxn, where n is a number.
The special C-ish procedure **return** is pushed into a scheme term as far as possible to extend the scope of statements in the resulting C program. For example,
(define foo
(lambda (x y)
(if x (+ 1 y) (+ 2 y)) ))
is converted to
(define foo
(lambda (x y)
(if x (**return** (+ 1 y)) (**return** (+ 2 y))) ))
Immediate tailrecursion (foo calling foo tailrecursively) is recognized and converted into an assignment of new values to args and a jump to the beginning of the procedure body.
All procedures taking a list argument are converted into ordinary non-list taking procedures and they are called with the list-making calls inserted. For example,
(define foo
(lambda (x . y)
(cons x (reverse y)) ))
is converted to
(define foo
(lambda (x y)
(cons x (reverse y)) ))
and any call to foo will make a list for a variable y. For example,
(foo 1 2 3)
is converted to
(foo 1 (cons 2 (cons 3 '()))).
All higher-order procedure calls where an argument-term contains unbound variables will generate a new instance (provided it has not been created already) of this higher-order procedure, carrying the right amount of free variables inside to right places.
For example, if there is a following definition:
(define (member-if fn lst)
(if (fn (car lst))
lst
(member-if fn (cdr lst)) ))
and a call
(member-if (lambda (x) (eq? x y)) lst),
a new instance of member-if is created (if an analogous one has not been created before):
(define (member-if_inst1 tmp fn lst)
(if (fn tmp (car lst))
lst
(member-if_inst1 tmp fn (cdr lst)) ))
and the call is converted to
(member-if_inst1 y foo lst)
and a top-level define
(define (foo y x) (eq? x y))
In addition, if the higher-order procedure is to be exported, an additional instance is created, which uses apply to call all argument-procedures, assuming they are defined via interpreter. The exportable higher-order procedure will have a name fun_exporthof, where fun is the name of the original procedure.
All C<->Scheme conversions for immediate objects like numbers, booleans and characters are introduced. Internal apply is used for undefined procedures. Some optimizations are performed to decrease the amount of C<->Scheme object conversions.
All vector, pair and string constants are replaced by new variables. These variables are instantiated to the right values by init_foo*.
Procedures foo which are to be exported (made accesible to the interpreter), and which have an arity different from one of the following five templates: x, (), (x), (x y), (x y z), are made accessible via an additional procedure foo_wrapper taking a single list argument.
More or less straightforward.
The type conversion between C objects and immediate Scheme objects of the type boolean, char and num is performed by macros. The scheme object '() is represented by the macro object EOL.
Experiment yourself by defining:
(define *build-intermediate-files* #t)
instead of the default:
(define *build-intermediate-files* #f).