Forum: Compiler
Issue: COMPILE-FILE-SYMBOL-HANDLING
References: CLtL p. 182
Issue IN-PACKAGE-FUNCTIONALITY (passed)
Issue CONSTANT-COMPILABLE-TYPES (passed)
Issue DEFPACKAGE (passed)
Category: CHANGE/CLARIFICATION
Edit History: V1, 01 Feb 1989, Sandra Loosemore
V2, 12 Mar 1989, Sandra Loosemore (discussion, error terms)
V3, 18 Apr 1989, Sandra Loosemore (new proposal)
V4, 15 Jun 1989, Sandra Loosemore (minor wording changes)
V5, 04 Jul 1989, Sandra Loosemore (incorporate amendments)
Status: Passed, as amended, June 89
Recommendation to drafting committee: in item 1b, clarify
that the "first" top-level form may appear as the first
form inside a top-level PROGN, as the result of macro
expansion, etc.
Problem Description:
It is not clear how COMPILE-FILE is supposed to specify to LOAD how
symbols in the compiled file should be interned. In particular, what
happens if the value of *PACKAGE* is different at load-time than it
was at compile-time, or if any of the packages referenced in the file
are defined differently?
There are two models currently being used to implement this behavior:
(1) Symbols appearing in the output file produced by COMPILE-FILE
are qualified with package prefixes in the same way that PRINT
would qualify them. Symbols that are accessible in *PACKAGE*
at compile-time are looked up in *PACKAGE* at load-time. (This
is the "current-package" model.)
(2) Symbols appearing in the output file produced by COMPILE-FILE
are always qualified with the name of their home package,
regardless of the value of *PACKAGE*. (This is the
"home-package" model.)
Proposal COMPILE-FILE-SYMBOL-HANDLING:NEW-REQUIRE-CONSISTENCY:
In order to guarantee that compiled files can be loaded correctly,
users must ensure that the packages referenced in the file are defined
consistently at compile and load time. Conforming Common Lisp programs
must satisfy the following requirements:
(1) The value of *PACKAGE* when a top-level form in the file is processed
by COMPILE-FILE must be the same as the value of *PACKAGE* when the
code corresponding to that top-level form in the compiled file is
executed by the loader. In particular:
(a) Any top-level form in a file which alters the value of *PACKAGE*
must change it to a package of the same name at both compile and
load time.
(b) If the first nonatomic top-level form in the file is not a call to
IN-PACKAGE, then the value of *PACKAGE* at the time LOAD is
called must be a package with the same name as the package that
was the value of *PACKAGE* at the time COMPILE-FILE was called.
(2) For all symbols appearing lexically within a top-level form that
were accessible in the package that was the value of *PACKAGE*
during processing of that top-level form at compile time, but
whose home package was another package, at load time there must
be a symbol with the same name that is accessible in both the
load-time *PACKAGE* and in the package with the same name as the
compile-time home package.
(3) For all symbols in the compiled file that were external symbols in
their home package at compile time, there must be a symbol with the
same name that is an external symbol in the package with the same name
at load time.
If any of these conditions does not hold, the package in which LOAD looks
for the affected symbols is unspecified. Implementations are permitted
to signal an error or otherwise define this behavior.
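As an illustration of requirement (3), a loader or a user-level checking tool could verify that symbols recorded as external at compile time are still external at load time. The following is only a sketch: the function name MISSING-EXTERNALS and the record format (a list of package-name and symbol-name strings) are assumptions made for this example and are not part of the proposal or of any implementation mentioned here.

```lisp
;; Hypothetical helper, not part of the proposal.
;; RECORDS is a list of (package-name symbol-name) pairs, each naming a
;; symbol that was external in its home package at compile time.
;; Returns the pairs for which requirement (3) fails at load time:
;; either the package does not exist, or no external symbol with that
;; name is present in it.  FIND-SYMBOL is used so nothing is interned.
(defun missing-externals (records)
  (remove-if
   (lambda (record)
     (destructuring-bind (package-name symbol-name) record
       (let ((package (find-package package-name)))
         (and package
              (eq (nth-value 1 (find-symbol symbol-name package))
                  :external)))))
   records))
```

An empty result means the recorded symbols satisfy requirement (3). What an implementation does with a non-empty result is left open by the proposal; signalling an error is one permitted choice.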
A symbol S appearing in the source code is similar as a constant to
a symbol S' in the compiled code if:
o Their print names are similar as constants
And, either
o S is accessible in *PACKAGE* at compile time, and S' is accessible in
*PACKAGE* at load time
Or
o S' is accessible in the package that is similar as a constant to the
home package of symbol S.
The "similar as constants" relationship for interned symbols has nothing
to do with *READTABLE* or how the function READ would parse the
characters in the print name of the symbol.
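The similarity rule above can be sketched as a predicate. This is an approximation under stated assumptions: the names ACCESSIBLE-P and SYMBOLS-SIMILAR-P are invented for this example, S is assumed to be interned, "similar as constants" for print names is modelled with STRING=, and the load-time package similar to the home package of S is modelled as the package currently found under the same name. Compile time and load time are represented by explicit package arguments, since a single Lisp image cannot show both at once.

```lisp
;; Hypothetical helper: true if SYMBOL itself is accessible in PACKAGE.
(defun accessible-p (symbol package)
  (multiple-value-bind (found status)
      (find-symbol (symbol-name symbol) package)
    (and status (eq found symbol))))

;; Hypothetical sketch of the similarity rule for interned symbols.
;; S is the source symbol, S-PRIME the symbol in the compiled code.
(defun symbols-similar-p (s compile-package s-prime load-package)
  (and (string= (symbol-name s) (symbol-name s-prime))
       (or
        ;; First alternative: accessible in *PACKAGE* at both times.
        (and (accessible-p s compile-package)
             (accessible-p s-prime load-package))
        ;; Second alternative: S' is accessible in the package with the
        ;; same name as the home package of S.
        (let ((home (find-package (package-name (symbol-package s)))))
          (and home (accessible-p s-prime home))))))
```

Note that the predicate never consults *READTABLE* or the reader; it compares print names and package accessibility only.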
Rationale:
This proposal is merely an explicit statement of the status quo,
namely that users cannot depend on any particular behavior if the
package environment at load time is inconsistent with what existed
at compile time.
This proposal supports both the current-package and home-package
models as implementation techniques.
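The difference between the two models can be sketched as the decision a file dumper makes for each symbol. The function below is a simplified illustration, not the code of any implementation named in this issue: the name DUMPED-PACKAGE-DESIGNATOR, the :CURRENT marker, and the MODEL keywords are assumptions made for this example. It assumes SYMBOL is interned and ignores the distinction PRINT draws between internal and external qualification.

```lisp
;; Hypothetical sketch.  Returns what a dumper would record for SYMBOL:
;; a package name string meaning "look the name up in the package of
;; this name at load time", or :CURRENT meaning "look the name up in
;; the load-time value of *PACKAGE*".
(defun dumped-package-designator (symbol model &optional (package *package*))
  (let ((home (symbol-package symbol)))
    (ecase model
      ;; Home-package model: always qualify with the home package.
      (:home-package (package-name home))
      ;; Current-package model: leave symbols accessible in the
      ;; compile-time *PACKAGE* unqualified; qualify the rest.
      (:current-package
       (multiple-value-bind (found status)
           (find-symbol (symbol-name symbol) package)
         (if (and status (eq found symbol))
             :current
             (package-name home)))))))
```

When the requirements of the proposal hold, both choices lead LOAD to the same symbol, which is why a conforming program cannot tell the two models apart.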
Current Practice:
PSL/PCLS implements the home-package model, as does A-Lisp. Utah
Common Lisp implements the current-package model, but the chief
compiler hacker says he thinks that the home-package model
actually makes more sense, and agrees that any program that behaves
differently under the two models is broken.
The TI Explorer currently implements the home-package model, after
trying it both ways.
KCL also implements the home-package model.
Lucid Lisp appears to implement the current-package model.
Symbolics Genera implements the current-package model. Symbolics
Cloe probably does also.
Coral also implements the current-package model.
Cost to implementors:
Proposal NEW-REQUIRE-CONSISTENCY is intended to be compatible with either
of the two models, but it may not be entirely compatible with the
details of current implementations.
In particular, some implementations that use the current-package
model appear to restrict IN-PACKAGE to being the first top-level
form in the file and dump all symbols referenced in the file after
the entire file has been processed (so that the value of *PACKAGE*
used to determine whether to qualify symbols in the output file is
the same for all symbols in the file). Under this proposal, these
implementations would have to note when the value of *PACKAGE*
changes during processing of a top-level form.
Cost to users:
Any user program that would break under proposal NEW-REQUIRE-CONSISTENCY
is probably already nonportable, since this proposal is intended to
leave the behavior unspecified where it would differ under the
two implementation models.
For a discussion of how the two models treat nonportable or erroneous
programs, see the "Analysis" section below.
Benefits:
COMPILE-FILE's treatment of symbols is made explicit in the standard.
Analysis:
The two implementation models differ in the following situations.
Proposal NEW-REQUIRE-CONSISTENCY, in effect, says that valid programs do
not cause any of these situations to occur, and the behavior in such
cases is unspecified (allowing both models to be used as valid
implementation techniques).
(1) The situation where the file does not contain an IN-PACKAGE form
and where the compile-time value of *PACKAGE* is a package with a
different name than the load-time value of *PACKAGE*.
The current-package model would intern the names of symbols that
were accessible in *PACKAGE* at compile time in *PACKAGE* at load time.
The home-package model would intern the names of symbols that
were accessible in *PACKAGE* at compile time in the package with
the same name as their compile-time home package.
In general, programs must be compiled in the "right" package, so
that the compiler can find and apply the correct macro expansions,
type definitions, and so on; see issue COMPILE-ENVIRONMENT-CONSISTENCY.
As a result of macroexpansion or other transformations applied by
the compiler, the compiled file may contain symbol references that
were not present in the source file. The current-package model may
cause problems because these references may be resolved to be
symbols other than the ones that were intended. The home-package
model is more likely to find the correct symbols at load time.
(2) The situation where there is a symbol accessible in the
compile-time value of *PACKAGE* but with another home package, and
where at load time there is not a symbol with the same name that
is accessible in both packages. This situation might occur, for
example, if at compile time there is a symbol that is external in
its home package and that package is used by *PACKAGE*, but where
there is no such external symbol in that package at load time, or
the load-time *PACKAGE* does not use the other package.
The current-package model would find or create a symbol accessible
in *PACKAGE*.
The home-package model would find or create a symbol accessible in
a package with the same name as the symbol's compile-time home
package.
Some people feel that the behavior of the current-package model is
more intuitive in this situation, and that it is more forgiving of
differences between the compile-time and load-time package
structures. Others feel that the behavior of the home-package
model is more intuitive, and that if there have been significant
changes to the package structures, it is probably an indication
that the file needs to be recompiled anyway, since the compiler
might have picked up macro definitions and the like from the
wrong package.
(3) The situation where a symbol is external in its home package
and where there is no such external symbol in that package at load
time.
The current-package model would quietly find or create the symbol
in *PACKAGE* if the symbol were accessible in *PACKAGE* at compile
time. Otherwise, it would signal an error.
The home-package model would always just quietly find or create the
symbol as internal in its home package.
Not complaining when a symbol that is supposed to be external
isn't can be seen as a violation of modularity. However, it seems
that this argument should apply equally to symbols whose home
package is *PACKAGE* and to symbols whose home package is somewhere
else.
Discussion:
There has been some further and lengthy discussion on the question of
whether this proposal overspecifies the behavior of COMPILE-FILE and
LOAD. At least one person would like the standard to say nothing on
this issue beyond a statement of the goal that loading a compiled file
should exhibit the same behavior as loading its source file. We have
also considered another approach to the problem that would place more
stringent requirements on conforming programs and fewer requirements
on implementations. Neither of these alternatives seems to have much
support, though.