From: aybee@Athena.MIT.EDU
Message-Id: <9302132248.AA10158@auditorium>
To: Judson Harward <jud@ceci.mit.edu>
Cc: aybee@Athena.MIT.EDU
Subject: Re: AM2 memo 
In-Reply-To: Your message of Wed, 10 Feb 93 17:57:18 -0500.
             <9302102253.AA09266@Athena.MIT.EDU> 
Date: Sat, 13 Feb 93 17:48:00 EST


-------------------------------------
I)	Brief minutes of the networking meeting with Tomi

	We discussed publishing the interface of objects.  I asked why
publish it.  Tomi said so that you could check for errors in arguments
at the receiver end.  I said that I understood why that was good in
principle, and why when you rely on compiler techniques like rpcgen +
cc it's straightforward.  In the AM2 messaging scheme, we are proposing
checking the arguments of messages at the receiver end within a
process.  What happens if we do the same over the network?

An argument mismatch is a fatal error, no?  It would be if you were
using RPC.  Does leaving the check to the receiver increase the chance
of corruption?  No, because it will still be caught before the
receiver does anything, and the sender can't have acted on the
nonexistent return yet.  Upshot was Tomi will think about just doing
the checking on the receiver side and using a version of the argument
list object for marshalling.

> Q: what return object? are you talking about allowing round-trip
network msgs again?
> I think an argument mismatch should only kill the sender, not the
receiver ('cept if in same process...both die); that way we don't kill
off servers when clients are stupid.
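
[Editorial sketch, not part of the original exchange.]  The receiver-side
check discussed above could look roughly like the following.  The names
Arg, ArgType, Message and checkArgs are hypothetical, and the four value
types are an assumption, since the memo does not list ADL's types.  The
check returns a result instead of exiting, so a server can report the
mismatch back to the sender rather than die itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical argument value; ADL's real type set is not given in the memo.
using Arg = std::variant<long, double, bool, std::string>;

// Tags in the same order as the variant alternatives above.
enum class ArgType { Integer, Real, Boolean, String };

// Map a held value to its type tag via the variant index.
inline ArgType typeOf(const Arg& a) {
    assert(!a.valueless_by_exception());
    return static_cast<ArgType>(a.index());
}

// The "argument list object": the same structure could be marshalled.
struct Message {
    std::string selector;
    std::vector<Arg> args;
};

enum class CheckResult { Ok, WrongCount, WrongType };

// Receiver-side check, run before any handler code touches the arguments.
inline CheckResult checkArgs(const Message& m,
                             const std::vector<ArgType>& signature) {
    if (m.args.size() != signature.size()) return CheckResult::WrongCount;
    for (std::size_t i = 0; i < signature.size(); ++i)
        if (typeOf(m.args[i]) != signature[i]) return CheckResult::WrongType;
    return CheckResult::Ok;
}
```

Anything other than Ok would be sent back as a "die" to the sender; the
receiver keeps running unless both ends share a process.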

	He was also worried about the sender knowing whether a message is
synchronous or asynchronous.  With the new changes to the ADL, it is now
specified in the ADL.  It will be a syntax error to use an asynchronous
message as part of an rvalue, but I wonder if it should matter otherwise.
Thoughts?

> I agree with you that the ADL will just have to keep an asynch from
being assigned from.  I think the real question is what happens when
you assign from a ONE-WAY msg? (with no return value).  You should
die.  We need to be sure that when a message is sent a flag is set
saying whether a return value is expected or not. That way if one is
expected, and the target wasn't gonna provide one, he can say DIE!!
This may be simple for local msgs (if RetObj in frame is null, not
expecting a return, else are expecting return), but may need to be
more explicit if we allow remote round-trip msgs, where the return
might be a msg sent on the socket that wouldnt be sent unless
otherwise needed & where the RetObj is only visible to the local post
office not the remote handler. (I dont know how the return msg will
work, im kinda guessing here).
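
[Editorial sketch, not part of the original exchange.]  One way to make
the "return expected" flag explicit, so that it works the same for local
and remote messages, is to carry it in the message itself.  ReplyMode,
Envelope, Verdict and checkReturn are made-up names, and modelling the
handler's return as an optional string is an assumption for illustration
only.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical flag: the sender states up front whether it will wait for
// a return value, so the far side never has to infer it from a RetObj.
enum class ReplyMode { OneWay, ReturnExpected };

struct Envelope {
    std::string selector;
    ReplyMode reply;
};

enum class Verdict { Ok, SenderMustDie };

// Rule from the memo: if a return is expected and the target was not going
// to provide one, the sender is told to die.  `handlerReturn` stands in
// for whatever the target's handler produced (empty means no return).
inline Verdict checkReturn(const Envelope& e,
                           const std::optional<std::string>& handlerReturn) {
    if (e.reply == ReplyMode::ReturnExpected && !handlerReturn.has_value())
        return Verdict::SenderMustDie;
    return Verdict::Ok;  // a one-way message ignores any value produced
}
```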

---------------------------------------------
II)	We also discussed error handling, and I made a first pass on an
error module.  I suggest that we have a general error object that can deal
with three kinds of errors:

	Fatal) Source of the error detects error and supplies error message.
Error object just logs message and exits.  Example: syntax error.

	Critical but fixable) Source of error detects it, calls an error
method with string, error method puts up a blocking popup with OK and Die
buttons.  On OK, return to source for source initiated retry.  On Die, ....
Example: videodisc won't respond (is it on?).

	Warning) Source selects default action and calls error method with
string.  Handler logs the message and returns.  Example: can't find font.

	This extends to network messages fairly straightforwardly.

	Fatal) Sender must die so receiver sends the die message to sender.

	Critical but fixable) Well, in a network context that probably means
you have sent me a message I can't perform because of something wrong on my
system.  Popup goes up on receiver, not sender, but I'm open to suggestions.

	Warning) Log message on both.

	This suggests that the error handlers should shake hands as soon as
two AM2 processes contact each other, and that they probably want their own
socket pair.

> Sounds ok, but I'm not sure that syntax errors are fatal.  They
certainly aren't fatal in an editor situation; no one would use an
editor for ADL which puked on every syntax error.  Because we want the
error module to react differently for syntax errors on startup parsing
and for syntax errors during editing, I have been considering a more
context sensitive error module that could change behavior thru time.
I was also considering three types of errors: FATAL errors, errors,
and warnings.  These are very similar to your divisions.  My error
module/object/set of functions (prolly obj) is more sophisticated than
yours, however.

Error object:  it understands the distinctions between error types
explained below and responds appropriately.  it also maintains lists
of error handlers for each type of error or warning.  when an error or
warning is signaled, a handler is searched for.  if one exists, it
will be invoked (should prolly allow either c++ handlers or adl
handlers, each with own callback signature).  The handler should be
provided with the error type (bad volume name, can't open file, media
device not responding, etc) and some info specific to the error (vol.
name, filename, device name, etc).  The handler can take whatever
steps it likes, trying to correct the error (pop-up, query-user,
ping device, etc).  When the handler returns it should tell whether or
not it thinks it handled the error.  Here we have some choices about
convention: what kinda result? True/False? the number of times to
retry? a real # showing prob. that it's fixed?  Does the error module
believe it?  Return right away if it says ok? (prolly).  Then the
object who signaled the error to begin with can retry, choosing
whether or not to signal an error or a FATAL error if it fails again.
In all cases the error object should log the error.  And of course, if
there are no handlers registered for errors, they become fatal.  If
none for warnings, they are just dumped to cerr (in addition to normal
warning logging).

Here's where my model gets more complicated than yours.  I think that
users (C++ or ADL) should be able to register new handlers at runtime.
This would allow editors to register more friendly handlers for syntax
errors and runtime messaging errors.  Another sophistication (which
you may really decide we dont need, but which sounds useful) would be
if the error module maintained stacks of error handlers for each
error, only calling the top of each stack at any time.  This way, a
section of code can decide how to deal with its errors, and register a
handler without having to worry about what happens when it unregisters
the handler.  It will just go back to its previous handler.

1) FATAL errors cause death with no reprieve.  when a module signals
this it will not expect to have control returned to it.

2) errors can be fixed.  the module signalling this is promising to
retry or at least work properly if control is returned to it (as
described above).

3) warnings are for things we dont want to die on, but would like to
notify users about.  Real coerced to Integer, etc...
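
[Editorial sketch, not part of the original exchange.]  The error object
with per-error handler stacks described above could be sketched as
follows.  ErrorModule, Severity, Outcome and Handler are hypothetical
names.  The memo leaves the handler's return convention open; this sketch
assumes a plain true/false.  It also assumes that an error whose handler
reports failure is treated as fatal, and it returns Outcome::Die instead
of exiting so the behaviour can be exercised.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

enum class Severity { Fatal, Error, Warning };
enum class Outcome { Retry, Continue, Die };

// Handler receives error-specific info (file name, device name, ...) and
// returns true if it believes the problem is fixed (assumed convention).
using Handler = std::function<bool(const std::string& info)>;

class ErrorModule {
public:
    // Register a handler; it shadows any earlier one for the same kind.
    void push(const std::string& kind, Handler h) {
        stacks_[kind].push_back(std::move(h));
    }
    // Unregister the top handler; the previous one becomes active again.
    void pop(const std::string& kind) {
        auto it = stacks_.find(kind);
        if (it != stacks_.end() && !it->second.empty()) it->second.pop_back();
    }
    Outcome signal(Severity sev, const std::string& kind,
                   const std::string& info) {
        log_.push_back(kind + ": " + info);  // every error is logged
        if (sev == Severity::Fatal) return Outcome::Die;
        auto it = stacks_.find(kind);
        bool haveHandler = it != stacks_.end() && !it->second.empty();
        if (sev == Severity::Warning) {
            if (haveHandler) it->second.back()(info);
            else std::cerr << "warning: " << kind << ": " << info << '\n';
            return Outcome::Continue;
        }
        if (!haveHandler) return Outcome::Die;  // unhandled error is fatal
        return it->second.back()(info) ? Outcome::Retry : Outcome::Die;
    }
    const std::vector<std::string>& log() const { return log_; }

private:
    std::map<std::string, std::vector<Handler>> stacks_;
    std::vector<std::string> log_;
};
```

An editor would push a friendlier syntax-error handler on entry and pop it
on exit, without needing to know which handler was active before.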

i hadnt really thought about the network case, but i think as much as
possible the sender should die, not receiver.  and i guess error
modules should prolly shake hands as you suggested.  One thing i think
my model allows for that yours doesn't is that on an error, the
response is flexible, not just a pop-up.  And a server might be smart
enough to put the pop-up on the client's screen instead/in addition to
on the server.

-------------------------------------------------------
III)	Issues I have been brooding on.

A)	Scope

	The problem is that according to our current definition, if a method is
a scope, then the class members are not visible in the method without an extern
declaration.  I now distinguish two kinds of scopes: transparent and opaque.
The identifiers of the closest enclosing scope are visible in a transparent
scope but not in an opaque one.  Class definitions form opaque scopes, and
methods and instance initialization blocks form transparent ones.

*) sounds good
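
[Editorial sketch, not part of the original exchange.]  The
transparent/opaque rule can be read as a lookup that climbs to the
enclosing scope only while the current scope is transparent.  Scope and
visible are hypothetical names, and the sketch assumes that lookup keeps
climbing through a chain of transparent scopes, which the memo does not
state.

```cpp
#include <cassert>
#include <set>
#include <string>

struct Scope {
    bool transparent;             // true: method or instance init block
    const Scope* parent;          // closest enclosing scope, or nullptr
    std::set<std::string> names;  // identifiers declared in this scope
};

// An identifier is visible if it is declared here, or in an enclosing
// scope reachable without leaving an opaque scope.
inline bool visible(const Scope& s, const std::string& id) {
    for (const Scope* p = &s; p != nullptr;
         p = p->transparent ? p->parent : nullptr) {
        if (p->names.count(id) != 0) return true;
    }
    return false;
}
```

With a method (transparent) inside a class (opaque), the method sees the
class members but not the identifiers outside the class, which is where an
extern declaration would still be needed.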

B)	Minor changes in method return declaration.  I suggest that the
prototype look like

	on selector : [Type1 arg1, [... , Typen argn]] [ returns Rtype ]

rather than

	on selector : [Type1 arg1, [... , Typen argn]] [ return Rtype rval]

It creates another keyword, but I don't see any reason to have a dummy variable
for the return value in the prototype.

*) i dont currently have a dummy variable for it in the yacc code
=====>    on Doit: Integer i, Real r return Boolean
is a valid header (maybe i didnt do it the way the book says, but
thats what i thought it should be).  I even had returns for a while but
then thought about it from an English point of view: "on Doit ...
return Boolean" sounds better than "returns" i thought.  I think i
would like a separator such as another colon between the args and the
"return" part, tho.

C)	Overloading methods in ADL classes and redefining members.  I am
thinking of outright forbidding it.  Yes, it is a real restriction on the
object model, but it is one that would save us a lot of heartache, and would
very seldom be missed.
*) what happens when you inherit from two classes which have a method
or a member with the same name?
*) yes it would be a real time saver, but i dont think we can do it
without thinking for reasons pointed to by my previous question.

D)	The problem of coordinating constraint processing when there are
multiple inputs to the right hand side.  I propose a third message queue along
with the UI and ADL queues for constraint Set_value messages.  It is
checked after the ADL queue but before the UI queue, and it collapses multiple
messages to the same object so only one update is sent.  Details remain to be
worked out, but it looks as if it may do the job, and ain't too ugly.  Let me
know what you think.

*) it is kinda ugly, and im afraid that it isnt sufficient. consider
the following diagram where words represent kinda-logical gates (ie
constraints in adl) (propagates from left to right).

a --------------\
		  \
		    xor --- c 
		  /
b ---- not -----/

lets say that a=false and b=false .... so c=true.  now let's say that a
and b both change to true.  the XOR gate will get on the third queue as
will the NOT gate.  when the adl queue clears, the XOR will get evaluated
and c=false (true XOR true). then the NOT will get eval'd and its output
will change to false, causing the XOR to get back on the third queue
and eventually changing c to true again.  As you probably know, in
digital design, this is called a glitch -- the output of our circuit
switched to false incorrectly even though in the steady state it was
gonna be true again.  In digital design they realize that glitches are
unavoidable, and settle for making sure that primitive gates are
glitch free.  Im not really sure where im headed with this except to
show you that we cant fix all glitches.  They clock things in digital
design to create points at which glitches are stopped.  I think that was
the purpose behind the third q you were suggesting.  That q also
raises the question of how you are gonna collapse the messages.  If
the constraint builds up some kinda memory (that's not possible in our
constraints is it?) how would you collapse the msgs and keep the
memory right?  Im starting to think that "clocking" the output devices
(ie windows, etc) might be the best (if ugliest) option; then we would
be clocking in a single, meaningful place.  (of course, that's ugly
too, cuz then the xf's need to cache possible state changes and need
to register and wait for a clock signal....arg!)
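
[Editorial sketch, not part of the original exchange.]  The glitch in the
diagram can be reproduced with a tiny model of the third queue.  Circuit,
Gate, enqueue, setInputs and settle are hypothetical names; the queue
collapses duplicate entries, as proposed, yet c is still written twice
because the XOR has already left the queue when the NOT re-queues it.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

enum Gate { NOT_B, XOR_AN };

struct Circuit {
    bool a = false, b = false;
    bool n = true;               // n = not b
    bool c = true;               // c = a xor n
    std::deque<Gate> pending;    // the proposed "third queue"
    std::vector<bool> cWrites;   // every value written to c, in order

    // Collapse: a gate that is already waiting is not queued twice.
    void enqueue(Gate g) {
        if (std::find(pending.begin(), pending.end(), g) == pending.end())
            pending.push_back(g);
    }
    // Each changed input queues the gate it feeds.
    void setInputs(bool newA, bool newB) {
        if (newA != a) { a = newA; enqueue(XOR_AN); }
        if (newB != b) { b = newB; enqueue(NOT_B); }
    }
    // Drain the queue and return the settled value of c.
    bool settle() {
        while (!pending.empty()) {
            Gate g = pending.front();
            pending.pop_front();
            if (g == NOT_B) {
                bool newN = !b;
                if (newN != n) { n = newN; enqueue(XOR_AN); }
            } else {
                c = (a != n);    // xor on booleans
                cWrites.push_back(c);
            }
        }
        return c;
    }
};
```

Starting from a=false, b=false and calling setInputs(true, true), cWrites
ends up holding false then true, while settle() returns true.  An output
that only samples the value returned by settle() is in effect "clocked"
and never shows the transient false.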

	Hope you're feeling better.  Keep your feet up anyway.
*) thanks..im doing much better, but prolly wont leave the house until
tuesday.

*) i hope this ranting and raving was useful.....

ab