VT Internals Documentation


Description
-----------

This file documents some of the methods used by VT to implement its
features.


Organization
------------

The following topics are covered: memory management, primitive
functions, the interpreter, the compiler, and VTC efficiency notes.


Memory management
-----------------

The majority of VT's automatic memory management is done with
reference counting.  Leaks are possible through self-referential array
structures, so there is also a garbage collector.  VT code that
manipulates arrays, regexps, strings or programs must maintain the
reference counts for these structures so that they are not deleted
prematurely or leaked.

Data structures that are deleted at a predetermined time are
"backlisted"-- that is, they store a list of VTC data frames that
refer to them.  User nodes (remotes, windows, key bindings, files) and
some arrays are backlisted.  Pointers to user nodes and backlisted
arrays in the VT program should not be left lying around unless they
are the primary pointers used to find those objects or are encased in
Dframe structures, because the backlisting scheme is designed to
handle Dframes used by VTC code.  When backlisted objects are deleted,
the frames that refer to them are reset to NULL values.  Backlists are
stored in dynamically allocated linked chains.

Arrays are reference-counted if they are alloc() arrays.  The gvars
array is never deleted, so it is not reference-counted.  Parameter and
local variable arrays are deleted at a predetermined time (when the
function exits) and are backlisted.

Strings are reference-counted at two levels.  The Rstr structure
contains a dynamically-allocated string and a reference count, and is
used to keep track of the physical representations of VTC strings.  An
Istr structure contains a pointer to an Rstr and a reference count;
each Istr represents a copy of the string from the point of view of
the interpreter.  Although several Istrs can point to a single Rstr,
the interpreter and primitive functions must make sure that changes to
one Istr do not affect the other Istrs pointing to the same Rstr.
This is accomplished by passing the Istr to isolate() before making
modifications to it.  isolate() makes a separate copy of the string (a
new Rstr) if the current Rstr has more than one Istr pointing to it.
When the reference count for an Rstr drops to zero (no copies of the
string are referenced by VTC pointers), the string is freed; when the
reference count for an Istr drops to zero (that particular copy of a
string is no longer referenced by any VTC pointers) the Istr structure
is freed.

Most of the memory management in the VT program is handled through the
ref_frame() and deref_frame() functions and their companion functions
ref_frames() and deref_frames().  These examine the types of their
arguments, which are data frames or arrays of data frames, and do the
appropriate referencing or dereferencing.

One sometimes annoying constraint of the memory management system is
that the programmer must be careful to increment the reference count
of data frames before decrementing the counts of any other data frames
that might point to the same objects.  As an example, when a primitive
function returns, the interpreter must do a ref_frame() on the return
value before calling deref_frames() on the arguments that it must
remove to make room to place the return value on the data stack.  If
it called deref_frames() first, the object that the return value
pointed to might be deleted.

Another constraint is that, because backlists store the locations of
data frames that point to them, the memory manager must be informed
when data frames move.  After the interpreter calls ref_frame() on the
return value of a primitive function and deletes the arguments that
were passed to the primitive, it adds the return value to the top of
the stack.  Since it has just moved the value from one Dframe
structure to another, it must call move_frame_refs() to update
backlists that contained the location of the return value.
Fortunately, move_frame_refs() takes essentially zero time unless it
is dealing with a backlisted object, and even then in the case of the
return value from primitive functions, the reference will always be at
the front of the backlist chain since it has just been added.

Some organized memory management is done in the compiler.  All
identifiers read by the tokenizer are indexed in a resizable table and
freed as a group when the parser is finished.  Other than that, the
parser allocates and frees memory on an as-needed basis.

In general, VT's memory management makes intensive use of dynamic 
allocation, and in particular repeated allocation and freeing of 
small data structures.  For this reason, VT uses a tray malloc.  
Tray malloc simply allocates structures of 16 bytes or less from 
trays.  This saves lots of time, and probably some space as well.


Primitive functions
-------------------

Primitive functions and operator functions communicate with the
interpreter through Dframes.  They generally do not need to worry
about reference counts unless they are modifying data structures that
hold other Dframes.

A primitive function's definition begins with:

void pr_name(rf, argc, argv)
        Dframe *rf, *argv;
        int argc;

In prmt.h this is abbreviated "PDECL(pr_name)".  rf is the return
frame, and is initialized to type F_NULL before the primitive or
operator function is called.  So a primitive need do nothing to return
a value of NULL.  argc and argv are the arguments-- if the primitive
is specified as accepting a certain number of arguments (the argc
element of the primitive structure is positive) in the table in
prmtab.c, argc is guaranteed to be that number.

Primitives can access up to four arguments easily through the Dp1,
Dp2, Dp3 and Dp4 macros, which stand for argv[0], argv[1], argv[2] and
argv[3] respectively.  In addition, Dp is an abbreviation for Dp1 for
primitives which take only one argument.  Tp and Tp1..Tp4 are
abbreviations for the types of Dp1..Dp4, Int and Int1..Int4 for the
integer values for integer data frames, and Un and Un1..Un4 for the
user node values.

The first step of most primitive functions is to type-check its
arguments.  This is not done systematically because a significant
number of primitives accept a variety of different argument types.
The macro Terror("name") displays an error message in the current
window and returns a message to the interpreter to abort.

Primitive functions need not deal with adding rf to backlists or
incrementing reference counts of objects it is returning, because the
interpreter automatically calls ref_frame() on rf after the primitive
function returns.  However, if they are assigning a data frame to a
data structure such as an array element or a window/remote object,
they must call ref_frame() on the data frame they assigned to.


Interpreter
-----------

The VTC interpreter is a fairly simple stack machine that operates on
a data stack (dstack) and a call stack (cstack).  The interpreter is
reentrant, so that the data stack and call stack may have data and
call frames from several different programs at the same time.  A call
frame that is the starting frame of a program is called a "detached"
frame, and a postcondition of the interp() function is that the
detached frame closest to the top of the call stack will have been
removed, along with all the call frames above it, when interp() exits.
The call frames in the call stack contain enough information to
determine which data frames are associated with which programs on the
call stack.

The programs the interpreter reads are a sequence of one-word
instructions, usually an instruction type followed by some
information.  For instance, a VTC function call is stored as three
instructions: a type instruction I_FCALL, an integer containing the
argument count, and a pointer to the Func structure containing the
function.

Although the interpreter is reentrant, each call to the interpreter is
treated as a separate program running.  This is important for the
primitives read(), getch(), sleep(), and abort(), which suspend or
abort the currently-running program.  For instance, parse() often
results in a call to interp(), and if a read() occurs at some point
while this program is executing, only the data and call stack entries
for this new program (i.e. down to the topmost detached call frame)
will be suspended, and the function that called parse() will continue.
If interp() is called without a new detached call frame being placed
on the call stack, VT will shortly crash.

According to this design, the callv() primitive does not call
interp() but rather modifies the top of the call and data stacks and
then returns with a message to the interpreter not to execute its
normal cleanup sequence.  If it did call interp() instead, callv()
would act like detach(), which is intended to create an independent
thread.

The call frames on the call stack contain pointers and indices into
the data stack, which can be visualized like so:

            .
            .
        |   .   |
        |-------| <-- dpos
        |       |
        |-------|   (Frames used by program algorithm)
        |       |
        |-------| <-- lvars
        |       |
        |-------|   (Local variable values)
        |       |
        |-------| <-- lvars->vals
        |       |
        |-------|   (Parameter variable values)
        |       |
        |-------| <-- avars->vals (indexed by dstart)
        |   .   |
            .
            .

The dstart element of the call frame is the index of the frame pointed
to be avars->vals, or the start of the data frames used by the call
frame.  The cpush() call initializes a new call frame.  It assumes
that the parameters have already been pushed on the stack.  It extends
the stack by prog->lvarc entries to make room for the local variable
values.  The "progress" argument to the cpush() call is non-zero only
for programs generated by the prog_pcall() call in prmt4.c.  These
pseudo-programs have no local or argument variables, and their only
instruction is to call a primitive.  Thus, they act like a program
with no local or parameter variables that has already pushed several
values onto the stack and is ready to pop them off with a primitive
call.  All other functions begin with dpos indexing the data frame
immediately after the local variables.  If cpush() is called with a
non-zero progress argument, the program should have no local variables
or parameter variables.

When a call frame exits normally, the program has left one data frame
on the stack on top of its argument and local variables.
cpop_normal() removes the variables and moves the top frame down to
the first frame of the argument space.  The net effect of interpreting
a call frame is thus to replace its parameter variables with a return
value.  This is the same effect as calling a primitive.

If the program aborts preternaturally, the interpreter simply cleans
up pointers to the arrays in the call frames and makes a single call
to deref_frames() to remove all the data frames from the dstart
element of the most recent detached data frame up to the top of the
stack.


Parser
------

The parser has to overcome several difficult aspects of the VT design:
first, it must produce linear code rather than parse trees; second, it
must be able to determine what types of expressions can be used as
lvalues (most C-ish extension language compilers do not have pointers,
which greatly simplifies this problem); third, at the tokenizing level
it must be able to accept its input in multiple chunks spaced out in
time, and determine before it starts compiling when a parser directive
is finished.

The primary difficulty in producing linear code is dealing with jumps.
The parser maintains a jump table and two stacks, one for conditional
flow control and one for loops, which are more complicated.  When the
parser comes to a point in the code where loops are required, it
allocates space in the jump table and pushes the current position in
the jump table onto one of the stacks.  The macro interface handles
these operations fairly transparently.  For conditionals, we call
'Jmp(t);', where t is the type of jump (e.g. I_JMPF), and then we call
'Dest;' when we arrive at the destination of the jump, or 'Destnext;'
if it's more convenient to pop the conditional stack one jump
instruction ahead of time.  For loops, we call Incloop(n), where n is
the number of jump destinations required for the loop.  Then we call
Ldest(k) to set destination #k, and Ljmp(k, t) to jump to destination
#k (t is, again, something like I_JMPF).

Because we produce linear code, it is most expedient to handle lvalues
in the syntax.  This results in a second shift-reduce conflict (beyond
the usual if-then-else conflict) resulting from the rule "postlval:
'(' lval ')'", since according to the grammar, the lval could be
reduced to an expr and then to an atom with the parenthesees.  Another
unpleasant side-effect of handling this in the grammar is that yacc's
explicit precedence rules lose a lot of their effectiveness, since
they do not operate once a group of tokens has been reduced to an
lval.  This results in a more complicated expression syntax, but the
linear design of the code precludes handling lvalue-checking in the
semantics.

Since we do not receive our text input all at once, we keep a token
buffer.  The parsing routine, rather than passing tokens directly to
yyparse(), is called directly by the parse() primitive or from the
input window in VTC mode, and places the resulting tokens in the
buffer.  At the end of each line the parser makes an educate guess at
whether a parser directive is complete based on the balance of the '{'
and '}' tokens and whether the "func" keyword is at the beginning of
the token buffer.  The yylex() function simply reads out of the token
buffer until it is empty.

The reduction rules writes the information for some instructions in a
temporary form that is different from the final form of instructions.
They write jump instructions as integer offsets instead of the Instr *
pointers used in the final form, because they do not know while it is
compiling where the instructions will be stored, they store all
references to identifiers as strings in order to avoid adding to the
function and variable dictionaries in case of error.  The function
scan() converts these intermediate forms into the final forms.

The compiler makes a simple optimization for a[0] so that it 
produces code for *a instead of *(a+0).  This results in a third 
shift-reduce conflict because it is done in the grammar.


Efficiency notes for VTC code
-----------------------------

String lengths are stored in memory along with strings, so strlen() is
not required to manually count string elements.  Also, separate copies
of strings are stored in the same physical location in memory as often
as possible.  For instance, the strdup() primitive does not
immediately copy its string argument into another location in memory;
it waits until one of the copies of the strings are modified.  Thus,
avoiding strlen() or strdup() calls will not save a significant amount
of time or space.

If we pass strdup() a pointer to the middle of a string, it will still
make a copy of the whole string if the string is later modified.  If
we are going to store a copy of a substring, this could result in a
considerable space overhead.  So it is better to use something like:
	substring = strcpy("", s, n);
where s is the pointer into the middle of the string and n is the
length of the substring.