.SH
Expression Optimization
.PP
Each expression tree, as it is read in, is subjected to
a fairly comprehensive
analysis.
This is performed
by the
.II optim
routine and a number of subroutines;
the major things done are
.IP 1.
Modifications and simplifications
of the tree so its value may be computed more efficiently
and conveniently by the code generator.
.RT
.IP 2.
Marking each interior node with an estimate of the number of
registers required to evaluate it.
This register count is needed to guide the code generation algorithm.
.PP
One thing that is definitely not done is
discovery or exploitation of common subexpressions, nor is this done anywhere in the
compiler.
.PP
The basic organization is simple: a depth-first scan of the tree.
.II Optim
does nothing for leaf nodes (except for automatics; see below),
and calls
.II unoptim
to handle unary operators.
For binary operators,
it calls itself to process the operands,
then treats each operator separately.
One important case is
commutative and associative operators, which are handled
by
.II acommute.
.PP
Here is a brief catalog of the transformations carried out by
by
.II optim
itself.
It is not intended to be complete.
Some of the transformations are machine-dependent,
although they may well be useful on machines other than the
PDP-11.
.IP 1.
As indicated in the discussion of
.II unoptim
below, the optimizer can create a node type corresponding
to the location addressed by a register plus a constant offset.
Since this is precisely the implementation of automatic variables
and arguments, where the register is fixed by convention,
such variables are changed to the new form to simplify
later processing.
.RT
.IP 2.
Associative and commutative operators are processed by the
special routine
.II acommute.
.RT
.IP 3.
After processing by
.II acommute,
the bitwise & operator is turned into a new
.II andn
operator; `a & b' becomes
`a
.II andn
~b'.
This is done because the PDP-11 provides
no
.II and
operator, but only
.II andn.
A similar transformation takes place for
`=&'.
.RT
.IP 4.
Relationals are turned around so the
more complicated expression is on the left.
(So that `2 > f(x)' becomes `f(x) < 2').
This improves code generation since
the algorithm prefers to have the right operand
require fewer registers than the left.
.RT
.IP 5.
An expression minus a constant is turned into
the expression plus the negative constant,
and the
.II acommute
routine is called
to take advantage of the properties of addition.
.RT
.IP 6.
Operators with constant operands are evaluated.
.RT
.IP 7.
Right shifts (unless by 1)
are turned into left shifts with a negated right operand,
since the PDP-11 lacks a general right-shift operator.
.RT
.IP 8.
A number of special cases are simplified, such as division or
multiplication by 1,
and shifts by 0.
.LP
The
.II unoptim
routine performs the same sort of processing for unary operators.
.IP 1.
`*&x' and `&*x' are simplified to `x'.
.RT
.IP 2.
If
.II r
is a register and
.II c
is a constant or the address of a static or external
variable,
the expressions `*(r+c)'
and `*r' are turned into a special kind of name node
which expresses
the name itself and the offset.
This simplifies subsequent processing
because such constructions can appear as
the the address of a PDP-11 instruction.
.RT
.IP 3.
When the unary `&' operator is applied to
a name node of the special kind just discussed,
it is reworked to make the addition
explicit again;
this is done because the PDP-11 has no `load address' instruction.
.RT
.IP 4.
Constructions
like
`*r++' and
`*\-\-r'
where
.II r
is a register are discovered and marked
as being implementable using the PDP-11
auto-increment and -decrement modes.
.RT
.IP 5.
If `!' is applied to a relational,
the `!' is discarded
and the sense of the relational is reversed.
.RT
.IP 6.
Special cases involving reflexive
use of negation and complementation are discovered.
.RT
.IP 7.
Operations applying to constants are evaluated.
.PP
The
.II acommute
routine, called for associative and commutative operators,
discovers clusters of the same operator at the top levels
of the current tree, and arranges them in a list:
for `a+((b+c)+(d+f))'
the list would be`a,b,c,d,e,f'.
After each subtree is optimized, the list is sorted in
decreasing difficulty of computation;
as mentioned above,
the code generation algorithm works best when left operands
are the difficult ones.
The `degree of difficulty'
computed is actually finer than
the mere number of registers required;
a constant is considered simpler
than the address of a static or external, which is simpler
than reference to a variable.
This makes it easy to fold all the constants
together,
and also to merge together the sum of a constant and the address of
a static
or external (since in such nodes there is space for
an `offset' value).
There are also special cases, like multiplication by 1 and addition of 0.
.II
A special routine is invoked to handle sums of products.
.II Distrib
is based on the fact that it is better
to compute `c1*c2*x + c1*y' as `c1*(c2*x + y)'
and makes the divisibility tests required to assure the
correctness of the transformation.
This transformation is rarely
possible with code directly written by the user,
but it invariably occurs as a result of the
implementation of multi-dimensional arrays.
.PP
Finally,
.II acommute
reconstructs a tree from the list
of expressions which result.