Received: from SOUTH-STATION-ANNEX.MIT.EDU by po9.MIT.EDU (5.61/4.7) id AA00812; Wed, 31 Jan 96 13:16:48 EST
Received: from slip-bal.lcs.mit.edu by MIT.EDU with SMTP
	id AA14395; Wed, 31 Jan 96 13:16:13 EST
Received: (from bal@localhost) by slip-bal.lcs.mit.edu (8.6.12/8.6.12) id NAA00181; Wed, 31 Jan 1996 13:15:24 -0500
Date: Wed, 31 Jan 1996 13:15:24 -0500
From: "Brian A. LaMacchia" <bal@slip-bal.lcs.mit.edu>
Message-Id: <199601311815.NAA00181@slip-bal.lcs.mit.edu>
To: marc@MIT.EDU
Subject: thesis comments
Reply-To: bal@zurich.ai.mit.edu
X-Address: MIT AI Lab, NE43-431, 545 Technology Square, Cambridge, MA 02139
X-Phone: (617) 253-0290

Marc--

OK, I've read all the way through your paper.  Let me give you the
general comments first:

1) Abstract.  Most of the people who pick up a copy of your paper are
only going to read the abstract, so you've got to make this really strong
and put your points across.

Here's what you had originally:

   Existing key servers for PGP are based on the PGP program as a key
   management mechanism.  PGP is not designed for this kind of heavy-duty
   use, and the servers' performance is suffering.  This paper describes
   the specification and design for a new public key server.  This key
   server uses a set of hash tables, indexing the PGP keys in the
   database by key ID, user ID, and database add time.  It implements
   add, get, index, and last operations via interactive HTTP queries and
   batch mail queries.  The daemon which serves these requests is
   persistent, which means it can use caching to improve performance
   further.

An attempted rewrite:

Existing public key servers for Pretty Good Privacy(TM) (PGP) utilize
PGP itself for key management.  PGP's key management routines were not
designed to handle large keyrings, and this dependence upon PGP for key
management is severely degrading key server performance.  In this paper
we describe the specification and design for a new public key server
that provides the same functionality as current servers but with much
higher performance.  Our key server is a ``drop-in'' replacement for the
current server, providing both e-mail and HTTP interfaces.  Preliminary
testing of this server under real-world conditions has shown at least an
order of magnitude performance increase when compared to the old
key server.

2) You need to use present tense throughout the thesis.  In a lot of
places you say things like, "the server will be able to handle..." etc.
Say "the server handles..." because it really does.  Remember this isn't
the proposal, it's a statement of what actually works.

3) I think a lot of your paragraphs are too short.  You're not writing a
news story; make sure you have plenty of "meat" in each paragraph.  In a
number of places you can safely combine a couple of short paragraphs
into longer ones.

4) You should reference the Appendices at least once in the main body
of the paper.  If you never reference them, why should I even look at
them?  A throwaway sentence somewhere, say at the end of a paragraph or
subsection, like "See Appendix <foo> for details on..."  is fine.

5) You're missing two very important sections in the paper, both of
which come at the end.  I was kind of surprised as I was reading
through the technical details to have hit the end of the paper.  You
need a "Performance" section and a "Conclusions" section.  In
Performance talk about how the code works in practice.  Put some of your
statistics in there, and compare the two servers.  Comment on how you
think the server will scale; where are the bottlenecks in *your* server?
What are the upper limits?

In "Conclusions" (or you can call it "Future Work", that's OK too) give
the reader some idea of where the code should go next.  What would you
implement next if you had a chance?  In particular, how would you
support substring searches (or even just prefix searches)?  If you look
at the requests the new server has gotten in /usr/adm/syslog, it looks
like a significant number of people try substring matches.  Now they
probably don't need to, but talk about how supporting that option
conflicts with fast hashed database lookups.  You can also talk about
how disk space scales with respect to the number of keys in the
database.  How fast do the indices grow?  When the keyring hits 40000
keys, how big will the databases be?
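
To make the substring conflict concrete, here's a rough sketch.  It is
not your code: the dict standing in for the user ID hash table, the
function names, and the sample keys are all made up for illustration.
The point is just that an exact match is a single hash probe, while a
substring match has to look at every stored user ID.

```python
# Hypothetical sketch: a dict stands in for the hash-table index on user ID.
from collections import defaultdict


def build_index(keys):
    """Map each exact user ID string to the list of key IDs that carry it."""
    index = defaultdict(list)
    for key_id, user_id in keys:
        index[user_id].append(key_id)
    return index


def exact_lookup(index, user_id):
    """One hash probe; the cost does not depend on the size of the database."""
    return index.get(user_id, [])


def substring_lookup(index, fragment):
    """The hash is no help here: every stored user ID has to be examined."""
    fragment = fragment.lower()
    return sorted(
        key_id
        for user_id, key_ids in index.items()
        if fragment in user_id.lower()
        for key_id in key_ids
    )


# Made-up sample data, for illustration only.
keys = [
    ("0x11111111", "Alice Example <alice@example.org>"),
    ("0x22222222", "Bob Example <bob@example.org>"),
    ("0x33333333", "Alice Example <alice@example.org>"),
]
index = build_index(keys)
```

Something along these lines in the paper would let you say exactly
where the cost of substring support comes from, and what extra index
(if any) it would take to avoid the full scan.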

6) Don't "sell yourself short".  I guess what I'm trying to say is that
you shouldn't downplay what you've done, since you hit all your target
goals and the server flies.  Talk about the tricky parts, or what
constraints on the implementation were dictated by the interface.
You've got to really highlight the "neat ideas".  In fact, if you can
isolate the "neatest idea" you should put a sentence about it in the end
of the abstract.  That way someone who only reads the abstract will
still get the kernel of what you did.

7) You need to be careful about jargon & abbreviations.  For example,
you use "root privileges" and "regexp" without any previous definition
or footnotes.  Use "regular expression" instead of regexp; that's a
term of art and you can expect your audience to know it.  

8) You've got all this great pseudocode in Appendix A; should any of it
show up in the main body of the paper?  Again, highlight the "neat
ideas" or cool tricks: if you've got a particularly clever solution to
some sticky problem, talk about it.  Throw in the pseudocode and explain
how it works.  Or even talk about the pseudocode that isn't great, or
that can't be great.  I'd also highlight those things that made your
life difficult, for example the fact that you have to depend on key
creation date as well as KeyID and UserID to distinguish keys.  And even
that is no guarantee, because a denial-of-service attack on a particular
key could duplicate that, too.  How was your code impacted by the fact
that you can never guarantee uniqueness of KeyID, UserID, creation date,
or any combination?
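
As an illustration of the kind of thing I mean, here's a sketch of one
way an index can cope with non-unique identifiers.  I'm assuming a
design here, not describing yours: the record fields, the KeyStore
class, and the choice to compare the full key material are all
hypothetical.  Each key ID maps to a list of records, and a record is
treated as a duplicate only if everything about it matches.

```python
# Hypothetical sketch: an index that never assumes KeyID, UserID, or
# creation date (alone or combined) identifies a key uniquely.
from collections import defaultdict
from typing import NamedTuple


class KeyRecord(NamedTuple):
    key_id: str
    user_id: str
    created: int      # creation timestamp, in seconds
    material: bytes   # stand-in for the full key data


class KeyStore:
    def __init__(self):
        # Each key ID maps to a list, since several distinct keys may share it.
        self._by_key_id = defaultdict(list)

    def add(self, record):
        """Store the record unless an identical one exists; return True if stored."""
        bucket = self._by_key_id[record.key_id]
        if record in bucket:  # compares every field, including the key data
            return False
        bucket.append(record)
        return True

    def get(self, key_id):
        """Return every record with this key ID; callers must expect several."""
        return list(self._by_key_id.get(key_id, []))


store = KeyStore()
real = KeyRecord("0xDEADBEEF", "Alice <alice@example.org>", 822000000, b"real key")
fake = KeyRecord("0xDEADBEEF", "Alice <alice@example.org>", 822000000, b"forged key")
```

If your code does something like this (or something quite different),
say so in the main body and explain why.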

9) Finally, at some point go through the paper and look just at the
grammar/word choice/structure, etc.  Read for clarity of thought, remove
unnecessary words, "tighten" up the text.  

OK, that's about it for now.  I did make some specific editing changes,
too, and I'll try to write those up shortly.  Or we can go over them by
phone later this evening, that might work OK.  I've got class from
5:30-7:30 tonight, but I should be home after 8:30pm.

						--bal

