Design notes Foundation

2/7/86

Problem:  References to mutable objects get screwed up when realloc
makes them bigger through copying.

Solution: Refer to mutable objects through one known place which will
get the new pointer.  Or else signal relocation up through the tree.
Hmmmm.  That may work even when offsets change.  But it may take a
while for relocations to get propagated...
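
A minimal sketch of the "one known place" idea, assuming a small
handle struct that every client holds instead of the raw pointer.  The
names Handle, handle_create, handle_grow and handle_destroy are made
up for illustration and are not part of the design.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handle: the one known place that holds the current pointer. */
typedef struct {
    char  *data;   /* current location of the block; may change on grow */
    size_t size;   /* bytes currently usable */
} Handle;

/* Create a handle owning `size` zeroed bytes; returns NULL on failure. */
Handle *handle_create(size_t size)
{
    Handle *h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;
    h->data = calloc(size ? size : 1, 1);
    if (h->data == NULL) {
        free(h);
        return NULL;
    }
    h->size = size;
    return h;
}

/* Grow the block.  realloc may copy it elsewhere, but only h->data has
 * to be updated; every client holding `h` sees the new location.
 * Returns 0 on success, -1 on failure (the old block stays valid). */
int handle_grow(Handle *h, size_t new_size)
{
    char *p;

    assert(h != NULL);
    if (new_size <= h->size)
        return 0;
    p = realloc(h->data, new_size);
    if (p == NULL)
        return -1;
    memset(p + h->size, 0, new_size - h->size);  /* zero the new tail */
    h->data = p;
    h->size = new_size;
    return 0;
}

void handle_destroy(Handle *h)
{
    if (h != NULL) {
        free(h->data);
        free(h);
    }
}
```

The cost is one extra indirection on every access, and clients must
not cache h->data across a call that might grow the block.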

----

2/10/86

It occurs to me that arrays should be mutable, AND they should be
typed.

In this way we can deal with mutations in the following way:

When something in an array moves, the owner of the array is signaled.
The owner is responsible for taking action on the change depending on
the type of array it is.

I think that perhaps a RAW should actually go into an array of raw.

Types of arrays:

Raw characters
Pointers to nodes
Pointers to procedures

(so far)
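
A rough sketch of typed arrays that signal their owner, assuming the
signal is a callback stored in the array header.  Array, MovedFn,
array_grow, TextOwner and text_owner_moved are hypothetical names; the
notes do not say how the signal is delivered.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* The three array types listed so far. */
typedef enum { ARR_RAW_CHARS, ARR_NODE_PTRS, ARR_PROC_PTRS } ArrayType;

typedef struct Array Array;

/* Hypothetical signal: called after the storage has moved. */
typedef void (*MovedFn)(Array *a, void *owner);

struct Array {
    ArrayType type;
    void     *base;       /* storage; may move when the array grows */
    size_t    elem_size;  /* bytes per element */
    size_t    capacity;   /* elements allocated */
    MovedFn   moved;      /* owner's callback, may be NULL */
    void     *owner;      /* passed back to the callback */
};

/* Grow the array and signal the owner if the storage moved.
 * Returns 0 on success, -1 if more space could not be had. */
int array_grow(Array *a, size_t new_capacity)
{
    uintptr_t old_addr;
    void *p;

    assert(a != NULL && a->elem_size > 0);
    if (new_capacity <= a->capacity)
        return 0;
    old_addr = (uintptr_t)a->base;   /* remember the address as a number */
    p = realloc(a->base, new_capacity * a->elem_size);
    if (p == NULL)
        return -1;
    a->base = p;
    a->capacity = new_capacity;
    if ((uintptr_t)p != old_addr && a->moved != NULL)
        a->moved(a, a->owner);
    return 0;
}

/* Example owner that keeps a cached pointer into a raw character array. */
typedef struct {
    char *cached_base;
    int   relocations;
} TextOwner;

/* The owner decides what to do based on the type of array. */
void text_owner_moved(Array *a, void *owner)
{
    TextOwner *o = owner;

    if (a->type == ARR_RAW_CHARS) {
        o->cached_base = a->base;
        o->relocations++;
    }
}
```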

----

2/12/86

I REALLY don't like the way Regions are shaping up.  When one little
Array gets a new first element, Every Region that references every
element in the Array must be changed.  It is easier to walk up the
tree and find all the parents in such a case.  Provided Raw text can
only have one parent Node!  

Here is an idea for enforcing the hierarchy:  Require that a
particular TYPE be defined to have numerical significance.
A Document is 10, a Chapter is 19, a Section is 18, a Paragraph is 17,
and so on.  When nesting is attempted, compare the tens digit (or
whatever) for equality so that nesting can occur at all, and then test
to see if the units digit (or whatever) is smaller than the parent.
If not, the nesting is invalid.
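
The test described above can be sketched as follows, reading "tens
digit" and "units digit" literally.  The function name can_nest and
the enum names are assumptions; only the Chapter, Section and
Paragraph values are taken from the numbers given above.

```c
#include <assert.h>
#include <stdbool.h>

/* Values from the notes; the enum names are hypothetical. */
enum { TYPE_CHAPTER = 19, TYPE_SECTION = 18, TYPE_PARAGRAPH = 17 };

/* Return true if a node of child_type may nest inside parent_type. */
bool can_nest(int parent_type, int child_type)
{
    assert(parent_type >= 0 && child_type >= 0);

    /* Tens digit must match for nesting to be possible at all. */
    if (parent_type / 10 != child_type / 10)
        return false;

    /* Units digit of the child must be smaller than the parent's. */
    return child_type % 10 < parent_type % 10;
}
```

With these values a Section nests in a Chapter, but a Chapter does not
nest in a Section, and nothing nests in a node of its own type.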

----

2/13/86

I noticed the confusion I got into in the design document talking
about raw data and the Raw abstract data type.  I think it is best to
describe raw character data as char * and get rid of the typedef to
Raw.

Graphics, bitmaps and text have fundamentally different ways of
being viewed, being changed, and being utilized.  Therefore, although
there will be a conceptual similarity between the three owing to their
being raw data, the system should be designed to make the DIFFERENCES
between the three most clear.  This means that the old idea of generic
procedures manipulating Raw data of all types is not valid.  In fact,
just the opposite happens.

I am re-thinking raws and nodes.  I am trying to simplify the nesting
and the operations of nodes.  How about:

Region of text
  Array of blocks of raw text
  symbolic name for region
  Generic type of region
  Specific type of region
  Other Regions that care about this Region
  Operations  Each operation can run on one or n characters at a time.
    insert
    delete
    copyout (to another block of raw or another node or another region?)
    transform
      like a pipe operation?
      replace text in this region, or copyout
    Position cursor (for insert, etc) offset in region
    find (match a string and position cursor there)

The insert operation might want an optional transformation.
The delete operation might want an optional transformation.
Copyout might want an optional transformation.
There might want to be a replace operation.

These four ideas are just different ways of organizing the basic
operations given.

----

2/13/86

I am beginning to think that having a float in the middle of a block of
text is a bad idea.  It makes all the accessing code more complex.
Special routines need to be called to read across the hole, and the
information on where the hole is isn't kept in the array.  Putting the
float at the end would make things simpler:  anything that knows how
to scan a C string would work unmodified.  That is a considerable
savings in the programmer effort and understanding needed to use the system.
Putting the float at the end of the array of characters can be made
arbitrarily efficient.  If, whenever we insert new text in a new place
we open a new array, and then consolidate arrays when the system is
not doing anything, we will probably end up winning.  This
consolidation and new application of float requires more thought...
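
A small sketch of the float-at-the-end layout, assuming the text is
kept NUL-terminated with the slack sitting after the NUL.  TextBlock,
text_init and text_append are hypothetical names, and the
open-a-new-array-and-consolidate-later part is not shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *text;  /* always NUL-terminated; the float sits after the NUL */
    size_t len;   /* characters in use, not counting the NUL */
    size_t cap;   /* bytes allocated, including room for the NUL */
} TextBlock;

/* Start an empty block with `cap` bytes of room.  Returns 0 or -1. */
int text_init(TextBlock *b, size_t cap)
{
    assert(b != NULL);
    if (cap == 0)
        cap = 1;                      /* always room for the NUL */
    b->text = malloc(cap);
    if (b->text == NULL)
        return -1;
    b->text[0] = '\0';
    b->len = 0;
    b->cap = cap;
    return 0;
}

/* Append s, using up the float and growing the block only when the
 * float runs out.  Returns 0 on success, -1 on failure. */
int text_append(TextBlock *b, const char *s)
{
    size_t n = strlen(s);

    if (b->len + n + 1 > b->cap) {
        size_t new_cap = (b->len + n + 1) * 2;   /* leave a fresh float */
        char *p = realloc(b->text, new_cap);
        if (p == NULL)
            return -1;
        b->text = p;
        b->cap = new_cap;
    }
    memcpy(b->text + b->len, s, n + 1);          /* copies the NUL too */
    b->len += n;
    return 0;
}
```

Because there is no hole in the middle, strlen, strchr and friends can
be handed b->text directly.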

There is no good reason to use the zeroth element of the Array as a
size parameter.  It is a bad idea for the following reason:  Some
routine responsible for filling arrays needs to know both how many
elements there are and how filled the array is.  Knowing one without
the other is almost useless.  The only time it helps is when iterating
over all elements of the array, it tells when to stop.  Either both
must be maintained in a user visible way, or routines for iterating
and accessing must be provided.  What should the effect be of
accessing beyond the end of an array?  What cost are we willing to
bear in preventing the accessing beyond the end?  In Clu, we would
just write routines to do it.  In C, we would just let people access
beyond the end of the array.  I think the solution may be to use
macros which expand to in-line code.  It affords protection, and it is
simple to use, and it is fast.
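
One way the macro approach might look, assuming the element count and
the fill level both live in a small header rather than in element
zero.  IntArray and the ARRAY_* macro names are made up for this
sketch, and the check is done with the standard assert macro, so it
vanishes when NDEBUG is defined.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int   *elems;
    size_t used;      /* how filled the array is */
    size_t capacity;  /* how many elements there are room for */
} IntArray;

#define ARRAY_USED(a)        ((a)->used)
#define ARRAY_CAPACITY(a)    ((a)->capacity)

/* True if index i refers to a filled element. */
#define ARRAY_IN_RANGE(a, i) ((size_t)(i) < (a)->used)

/* Checked access that expands in line and can be read or assigned to.
 * Note that the arguments are evaluated more than once. */
#define ARRAY_AT(a, i) \
    (*(assert(ARRAY_IN_RANGE(a, i)), &(a)->elems[(i)]))

/* Iterate over the filled elements only; i must be a size_t variable. */
#define ARRAY_FOREACH(a, i) \
    for ((i) = 0; (i) < (a)->used; (i)++)
```

Access beyond the filled part aborts in a debug build and costs
nothing in a production build, which is about the trade-off asked for
above.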

More on operations on low level data:

Funny
  Copyout - needs destination and optional transformation.
    (source, destination, offset)  (insert backwards)
  Transform - in insert, in-line, to other area???
    (source, destination, offset, transform)

	onto itself? 
		do we check s/d (or have a flag for overwrite)
		or allocate (or let the proc do it)
		or assume (if overwrite is needed, the proc will do it)
	is order important? forw/back through characters?
		most things go forward, but...
	

Specify:
  Source
  Destination
  Transformation if any
  Blocksize -   number of chars in a unit
  block count
    ALL
    First
    Last
    One

Cases:
  Insert
  Overwrite
  Source and Destination same
  Source and Destination different

--------

Final decision:

Primops:
  Find  (source) -> offset
	(direction)? or range?
  Delete (destination, offset) 
  Insert (source, destination, offset)
  Size
  Create
  Destroy

The primops are hooked into the storage allocator for each mutable
data type.  The user is free to mung the data as she sees fit.  A
special debugging mode will be added to the storage allocator which
will write a magic constant into ALL the "unused" words of a block,
and then check to see if they are touched.  This is the only range
checking you will get!
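
A sketch of the magic-constant debugging mode, assuming a block knows
how many of its words are in use and how many were allocated.
DbgBlock, dbg_alloc, dbg_check and the value of DBG_MAGIC are all
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define DBG_MAGIC 0xDEADBEEFu   /* hypothetical fill pattern */

typedef struct {
    unsigned *words;
    size_t    used;    /* words the client is entitled to touch */
    size_t    total;   /* words actually allocated */
} DbgBlock;

/* Allocate `total` words, zero the used part, and write the magic
 * constant into every unused word.  Returns 0 or -1. */
int dbg_alloc(DbgBlock *b, size_t used, size_t total)
{
    size_t i;

    assert(b != NULL && used <= total);
    b->words = malloc((total ? total : 1) * sizeof *b->words);
    if (b->words == NULL)
        return -1;
    for (i = 0; i < used; i++)
        b->words[i] = 0;
    for (i = used; i < total; i++)
        b->words[i] = DBG_MAGIC;
    b->used = used;
    b->total = total;
    return 0;
}

/* Return the index of the first unused word that has been touched,
 * or -1 if the unused part is intact. */
long dbg_check(const DbgBlock *b)
{
    size_t i;

    for (i = b->used; i < b->total; i++)
        if (b->words[i] != DBG_MAGIC)
            return (long)i;
    return -1;
}
```

A stray write that happens to store the magic value itself goes
unnoticed, so this catches most overruns but not all of them.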

Low level data:  Array, char *, graph *, bitmap *.

----

2/14/86

In doing the design on the generic primop procedures for the dynamic
data I decided that Insert was too complicated to use, and not in
keeping with the C philosophy.  Instead there should be a widen command
which puts zeros into a hole it opens in a block:

Widen	Enlarge a data block shifting data down if necessary.
	Takes pointer to block pointer, offset to begin opening new
	space, and count of space to add.  Returns -1 if offset is
	past the end of the block, or if the widen fails to get more
	space. 
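
A minimal sketch of Widen under stated assumptions: the block's size
is passed alongside the block pointer (the description above does not
say where the size is kept), and an offset equal to the size is taken
to mean "append" rather than "past the end".  The sizep parameter is
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Open a zero-filled hole of `count` bytes at `offset`, shifting the
 * data after the offset down.  Returns 0 on success, -1 if offset is
 * past the end of the block or more space cannot be had. */
int widen(char **blockp, size_t *sizep, size_t offset, size_t count)
{
    char *p;

    assert(blockp != NULL && sizep != NULL);
    if (offset > *sizep)
        return -1;                       /* past the end of the block */
    if (count == 0)
        return 0;
    if (count > (size_t)-1 - *sizep)
        return -1;                       /* size would overflow */

    p = realloc(*blockp, *sizep + count);
    if (p == NULL)
        return -1;                       /* old block is left untouched */

    /* Shift the tail down, then zero the hole. */
    memmove(p + offset + count, p + offset, *sizep - offset);
    memset(p + offset, 0, count);

    *blockp = p;                         /* the block may have moved */
    *sizep += count;
    return 0;
}
```

Taking a pointer to the block pointer lets the caller's copy be
updated when realloc moves the block.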

----

2/19/86

Additional debugging mode for allocator:  Check freed pointer against
array of allocated pointers.  Report any free of a non-allocated block.
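
A sketch of this check, assuming a fixed-size table of live pointers
and a linear search.  dbg_malloc, dbg_free and MAX_TRACKED are
hypothetical names; the real allocator interface is not given here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRACKED 1024          /* hypothetical fixed table size */

static void  *tracked[MAX_TRACKED];   /* pointers currently allocated */
static size_t tracked_count;

/* Allocate n bytes and record the pointer.  Returns NULL on failure
 * or when the table is full. */
void *dbg_malloc(size_t n)
{
    void *p;

    if (tracked_count == MAX_TRACKED)
        return NULL;
    p = malloc(n);
    if (p != NULL)
        tracked[tracked_count++] = p;
    return p;
}

/* Free p only if it is in the table.  Returns 0 on success, or -1
 * (after reporting to stderr) if p is not an allocated block. */
int dbg_free(void *p)
{
    size_t i;

    assert(tracked_count <= MAX_TRACKED);
    for (i = 0; i < tracked_count; i++) {
        if (tracked[i] == p) {
            tracked[i] = tracked[--tracked_count];  /* fill the slot */
            free(p);
            return 0;
        }
    }
    fprintf(stderr, "dbg_free: %p is not an allocated block\n", p);
    return -1;
}
```

The linear search is slow with many live blocks, which seems
acceptable for a debugging mode.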

----

2/20/86

With the basic text storage management out of the way, it is time to
think of I/O.  What will we want to do with I/O?  We shall want a
central control over File descriptors at least for input.  We shall
want to read text into buffers.  We shall want to write text from
buffers.  We shall want to re-format text into buffers.

Can a Region function as a buffer? hmmmmmmmmm...

----

2/27/86

When designing re-display, additional things had to be added to
regions to create blocks of copied text derived from a region.  This
addition seems to work ok for redisplay, but right now seems a bit too
ad-hoc.  It may turn out to help for outline and annotation modes and
for diff.  If ever fancy diff is added...