BABYL OPTIONS:
Version: 5
Labels:
Note:   This is the header of an rmail file.
Note:   If you are seeing it in rmail,
Note:    it means the file has no messages in it.

1,answered,,
Received: by E40-PO.MIT.EDU (5.45/4.7) id AA11556; Tue, 2 Apr 91 10:33:36 EST
Received: from postman.osf.org by ATHENA.MIT.EDU with SMTP
	id AA14558; Tue, 2 Apr 91 10:33:21 EST
Received: from salmon.osf.org by postman.osf.org (5.64+/OSF 1.0)
	id AA22444; Tue, 2 Apr 91 07:48:39 -0500
Received: by salmon.osf.org (5.64/4.7) id AA15251; Tue, 2 Apr 91 07:48:37 -0500
Date: Tue, 2 Apr 91 07:48:37 -0500
From: walt@osf.org
Message-Id: <9104021248.AA15251@salmon.osf.org>
To: tytso@ATHENA.MIT.EDU
In-Reply-To: Theodore Ts'o's message of Mon, 1 Apr 91 20:55:49 -0500 <9104020155.AA20757@tsx-11.MIT.EDU>
Subject: DCE's directory structure

*** EOOH ***
Date: Tue, 2 Apr 91 07:48:37 -0500
From: walt@osf.org
To: tytso@ATHENA.MIT.EDU
In-Reply-To: Theodore Ts'o's message of Mon, 1 Apr 91 20:55:49 -0500 <9104020155.AA20757@tsx-11.MIT.EDU>
Subject: DCE's directory structure

The names you cite are usable interactively.  The names `/cell' and
`/file' are synonyms that should make your software happy.  By the way,
why do you assume that anyone who doesn't agree with you is "stupid" and
makes "monumental design flaws" as you say in your letter?  The matter
will be resolved at POSIX.  You may be interested that Dennis Ritchie
thinks it's the right thing to do:


From walt Thu Mar 21 23:22:19 1991
Received: by salmon.osf.org (5.64/4.7) id AA06783; Thu, 21 Mar 91 23:22:17 -0500
Date: Thu, 21 Mar 91 23:22:17 -0500
From: walt
Message-Id: <9103220422.AA06783@salmon.osf.org>
To: dmr@research.att.com
Subject: History question
Status: R

Hi, Dennis -

I had occasion to exchange email with you a couple of times when I
worked at Summit and Murray Hill.  For reasons that we need not go into
here, I now work at OSF.  I enjoyed working with Peter Weinberger on the
DCE RFT, and maybe someday I'll have an opportunity to work with you,
too (I think you and others at Murray Hill have a lot to contribute, and
it would be a shame if that were not allowed to happen).  But that has
nothing to do with why I'm writing to you now.

I was looking at POSIX the other day, and noticed an ambiguity in its
specifications of the filesystem namespace and the $PATH variable.
Namely, `/' is reserved for use as a metacharacter in the filesystem
naming syntax, but otherwise any character is legal in a filesystem name
except for the NUL character.  On the other hand, PATH says that `:' is
a separator between path-prefixes (= names of directories).  The
ambiguity comes when the path-prefix has a `:' in it.  How do you tell
the difference between the use of `:' as an ordinary character for the
filesystem naming syntax and its use as a metacharacter for the PATH
syntax?  POSIX doesn't say (and neither does any other UNIX
specification that I know of).
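
A minimal sketch of the ambiguity, assuming the conventional behavior of
splitting on every colon.  The directory /opt/a:b and the function name
naive_split are hypothetical, chosen only for illustration:

```python
def naive_split(path):
    """Split a PATH-like string on every colon, with no escaping."""
    return path.split(":")


# Hypothetical PATH that is meant to name three directories:
# /bin, /opt/a:b and /usr/bin.
prefixes = naive_split("/bin:/opt/a:b:/usr/bin")

# The splitter cannot tell the two uses of ':' apart, so it reports
# four prefixes and /opt/a:b is lost.
print(prefixes)
```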

The four basic ways I can think of out of this dilemma are:

	1. Remain ambiguous.  Don't specify what is supposed to happen.
	   This means that implementations could do any sort of wild
	   things in a situation where, say, $PATH==/bin:/foo:bar:/usr/bin
	   and the directories /foo:bar and ./bar both exist.

	2. Make the specifications conform to implementations, and allow
	   `:' in directory names but not in PATH.  This means that it
	   is possible to create directories that cannot occur in a PATH.
	   Thus, /foo:bar could not be embedded in a PATH.

	3. Disallow `:' in directory names.  This way, every directory
	   that can be created can be put in a PATH, but it removes a
	   perfectly good character from the filesystem naming syntax.
	   Thus, a directory named /foo:bar could not exist.

	4. Allow `:' in directory names, but add an escaping mechanism
	   to the PATH syntax.  Thus, /foo:bar could be embedded in a PATH
	   by typing PATH='/bin:/foo\:bar:/usr/bin'.  This preserves the
	   full filesystem namespace as it currently exists, at the
	   expense of introducing a new feature into the PATH syntax.
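
Option 4 could be sketched as follows.  This is only an illustration
under stated assumptions: the function name split_path is hypothetical,
and treating a backslash as "take the next character literally" (so that
a doubled backslash yields one backslash) is an assumption that goes
beyond the `\:' case described in the list above.

```python
def split_path(path, sep=":", esc="\\"):
    """Split a PATH-like string on separators that are not escaped."""
    prefixes, current = [], []
    chars = iter(path)
    for ch in chars:
        if ch == esc:
            # Take the next character literally.  A trailing escape
            # character with nothing after it is kept as itself.
            current.append(next(chars, esc))
        elif ch == sep:
            prefixes.append("".join(current))
            current = []
        else:
            current.append(ch)
    prefixes.append("".join(current))
    return prefixes


print(split_path(r"/bin:/foo\:bar:/usr/bin"))
```

A PATH without any backslashes splits exactly as it does today, which
is the compatibility property this option relies on.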

Naturally, there are tradeoffs for all these alternatives, including
purity of design, migration of implementations and users, etc.  But it's
pretty easy to figure out what the issues are, so I don't want to spend
time on them here.

Now, I assume you were the one who originally invented these things, so
I thought you could give me some insight into what you had in mind at
the time.  (Saying you didn't even think about the problem at the time
is a perfectly good response, if that's what happened.)  Also, I'd
be interested in what you think about this little matter at the present
time, if it's different from what you thought then.

Needless to say, it's easy to become religious about something like
this, and qualified people can have divergent views on it, but it's nice
to know the original intent; maybe we can learn from history.

Thanks for taking the time to read this.  Take care.

- Walt

From dmr@research.att.com Fri Mar 22 03:30:02 1991
Date: Fri, 22 Mar 91 03:29:48 EST
To: walt@osf.org
Subject: : in $PATH
Status: R

Actually, Steve Bourne invented $PATH; before him it was
wired into the shell.  Evidently neither he nor I considered
the problem serious.

$PATH is a construct unique to the shell, so it seems logical
to deal with its syntax within the shell.  And even within
the shell, it is a fairly minor notion.  So my judgment is
that if there is an overwhelming desire to have directories with :
in their names be part of a search path, then the shell
should invent syntax to make it possible.

This could either be an explicit quote mechanism to protect
literal : characters, or better, some more general mechanism.
E.g. the plan 9 shell has an explicit notion of a list.
If the path is a list of directories, then any escaping
of the normal syntactic list-separator is done by more
general mechanisms.
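
A rough sketch of the list idea, written in Python rather than in any
shell.  The name join_path and the backslash escaping rule are
assumptions made for illustration; they are not Plan 9 syntax.  The
search path is held as a list, and escaping happens only at the point
where the list has to be flattened into a single string:

```python
def join_path(dirs, sep=":", esc="\\"):
    """Flatten a list of directories into one PATH-like string."""
    escaped = []
    for d in dirs:
        # Escape the escape character first, then the separator, so
        # that the result can be split again without ambiguity.
        escaped.append(d.replace(esc, esc + esc).replace(sep, esc + sep))
    return sep.join(escaped)


# As a list, a directory name containing ':' needs no special care.
search_path = ["/bin", "/foo:bar", "/usr/bin"]
print(join_path(search_path))
```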

In the meantime, it is probably sufficient to augment
the shell manual with an observation that directories
with : in their names may not be part of a path.

	Dennis

From walt Fri Mar 22 10:46:02 1991
Received: by salmon.osf.org (5.64/4.7) id AA06897; Fri, 22 Mar 91 10:45:59 -0500
Date: Fri, 22 Mar 91 10:45:59 -0500
From: walt
Message-Id: <9103221545.AA06897@salmon.osf.org>
To: dmr@research.att.com
Subject: Re:  : in $PATH
Status: R

Dennis -

Thanks for the quick response.  I didn't tip my hand to you in my letter
because I didn't want to bias your reply, but I agree with your analysis
that `:' should be allowed in directory names, and that path-searching
should change to accommodate it.  I am also sympathetic to a more
general mechanism such as the Plan 9 scheme, but since POSIX has
standardized on PATH it looks like that will be around for a while.

I'm not so sure I agree with your characterization of PATH as a
shell-specific construct.  The PATH environment variable is used not
only by the shell, but also by execlp() and execvp(), so it is available
to any program that wants to do a path search.  This still doesn't
change the fact that PATH is a "fairly minor notion" as you put it (and
I agree), but it does point out that PATH is here to stay for a good
long time.  It's also the case that `:' is used in various syntaxes
other than PATH, such as in /etc/passwd, /etc/group, NLS setlocale()
stuff, etc., but those are either irrelevant or are more minor and
easier to change than PATH, so let's not worry about them here.
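
The kind of search that any such program performs can be sketched
roughly as below.  This is an approximation for illustration, not the
actual execvp() implementation; the name find_executable is
hypothetical.

```python
import os


def find_executable(name, path):
    """Return the first executable file matching name, or None."""
    if "/" in name:
        # A name containing a slash is used as given; PATH is not searched.
        return name if os.access(name, os.X_OK) else None
    for prefix in path.split(":"):
        # An empty prefix conventionally means the current directory.
        candidate = os.path.join(prefix or ".", name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


print(find_executable("sh", "/bin:/usr/bin"))
```

Because the split on `:' happens inside the search routine itself, any
change to the PATH syntax has to be made there as well as in the shell.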

For both of the above two reasons (namely, the fact of POSIX
standardization, and the use of PATH in exec*p() as well as the shell),
I think it will be necessary and easier to fix PATH first before going
on to a new mechanism to replace it.  Thus, I favor adopting the
escaping mechanism in the PATH syntax (option #4 in my first letter).
As far as the migration/compatibility issues go, it appears that
implementations will have to change to support the new syntax but that
the change would be totally invisible to users (except for those who
currently have `\' embedded in their PATHs, of which there is probably
not a single one).

BTW, this is not just an academic discussion; I do have an "overwhelming
desire" (as you put it) to use `:' in PATH, other than just fixing an
underspecified place in UNIX.  Namely, in DCE we're going to have the
notion of a "cell" (~ "domain" or "realm"), and all cells worldwide are
tied together ("federated") by means of a global (X.500 & Internet
Domain) namespace.  The filesystem namespace is grafted onto the
namespace, as well.  From a pure naming viewpoint (i.e., ignoring the
fact that UNIX implementations don't currently like `:' in PATH), I like
using the following names for the major roots in DCE:

	/...	= root of global namespace
	/.:	= root of cell namespace
	/:	= root of cell filesystem space

(The latter two are shorthand, symbolic links to longer names beginning
with `/...' and not containing `:', so these longer names could be put
into PATH until the PATH syntax changed to allow `:' in PATH.)  These
names seem to fit well into a classical UNIX-like progression:

	/	/.	/..	/:	/.:	/...

Or at least that's the best I can come up with, given such
considerations as not wanting to use a culture-specific construct such
as an English word or abbreviation.  Note that `/:' is not the same as
`/..', not only because the parent of local root may not be the cell
filesystem root, but because the local root may not even appear in the
cell filesystem.

Of course, there's much more going on than the few things I've been able
to mention here.  If you're interested in more details we can exchange
more mail, or PJW may be able to help.

- Walt

From dmr@research.att.com Sat Mar 23 02:08:06 1991
Date: Sat, 23 Mar 91 02:03:20 EST
To: walt@osf.org
Subject: Re: : in $PATH
Status: R

Well, I'm glad I partially passed the test!

You're right that $PATH syntax isn't solely in the shell (there
can be several different shells, of course, a fact I carefully
didn't mention).

In any event I don't have any philosophical objection to
refining the interpretation of $PATH so that, say, \:
is taken as a literal colon instead of a separator.
I suppose I am in the position of the doctor with
a patient who says `it hurts when I do this' (making
some absurd gesture); I am tempted to say, `then
don't do that' while being aware that this is not
a fully defensible position.

It may be too late, but have you considered more general
and attractive schemes for extending the namespace?

One is the idea from the Newcastle Connection, in which /..
is a superroot, containing names of places you are connected
to in a larger domain.  (This can be extended to /../..).

About the ... notation, here's an anecdote.  A very clever
but unix-naive mathematician once asked me: if . means
the current directory, and .. the parent, why isn't ...
the grandparent and so on?

	Dennis


From walt Sat Mar 23 12:40:17 1991
Date: Sat, 23 Mar 91 12:36:06 -0500
From: walt
To: dmr@research.att.com
In-Reply-To: dmr@research.att.com's message of Sat, 23 Mar 91 02:03:20 EST
Subject: : in $PATH


> Well, I'm glad I partially passed the test!

Hmmm, yes, I guess maybe my wording did come across like I was trying to
catch you in a game of "Stump the Dummy."  Sorry, it was unintended.

> In any event I don't have any philosophical objection to
> refining the interpretation of $PATH so that, say, \:
> is taken as a literal colon instead of a separator.
> I suppose I am in the position of the doctor with
> a patient who says `it hurts when I do this' (making
> some absurd gesture); I am tempted to say, `then
> don't do that' while being aware that this is not
> a fully defensible position.

I'm reasonably aware of the ramifications, but I think the disadvantages
are outweighed by the consistency of the use of all the escaping/quoting
mechanisms that are already ubiquitous in all the minilanguages in UNIX.
I.e., people already know how to handle such mechanisms, so there is no
intellectual burden being added.  In fact, I am hard pressed to come up
with other examples of UNIX syntaxes that don't have a way to neutralize
metacharacters, though I suspect there are a few others ("make" is a
prime candidate), and such cases must certainly be oversights, not
design goals.  Else you wouldn't be able to hang yourself, so it
wouldn't be UNIX, would it?  A mitigating circumstance in the case of
PATH is that it is essentially a write-only variable -- you set it once
and forget it thereafter.  But let's cease beating this poor horse any
deader.

> It may be too late, but have you considered more general
> and attractive schemes for extending the namespace?
> One is the idea from the Newcastle Connection, in which /..
> is a superroot, containing names of places you are connected
> to in a larger domain.  (This can be extended to /../..).

Now here's a topic we could flail away on 'til the cows come home!  FYI,
the FreedomNet folks (the commercial instantiation of Newcastle) were
submitters to the DCE RFT, but we chose to go with AFS instead.  I agree
we haven't heard the last word on namespace topologies, OSF and Plan 9
notwithstanding.  In our scheme (~ AFS 4.0), flexibility of
configuration is physically supported at the granularity of "filesets"
(formerly, "AFS volumes"), and a logical view such as that provided by a
shell-like creature could be imposed on top of that.  Some sites will
choose to configure their cells as you suggest, but at least initially I
suspect most sites will be more comfortable with a `/:' (~ `/net', but
without the machine as the next component) type of layout (also
sometimes called a "superroot," though some would say mistakenly).  And
all our cells are tied together through the global namespace rooted at
`/...' (~ `/afs').

The specific technology we use to splice together all these nameservices
is called "junctions" (yes, borrowed from Birrell).  For example, the
global nameservice (X.500 & Domain) junctions with the cell nameservice
(DECnet naming), which in turn junctions with filesystem-nameservice
(AFS) and also with security-space-nameservice (Kerberos).  The actual
implementation is a little crufty in the first release, but we've got
some nicer soup on the stove.  Anyway, whatever we do will be better
than NFS ("sometimes, when you fill a vacuum, it still sucks").

- Walt



1,,
Received: by E40-PO.MIT.EDU (5.45/4.7) id AA12313; Tue, 2 Apr 91 12:00:43 EST
Received: from MIT.MIT.EDU by ATHENA.MIT.EDU with SMTP
	id AA18945; Tue, 2 Apr 91 12:00:40 EST
Received: from TSX-11.MIT.EDU by MIT.EDU with SMTP
	id AA16029; Tue, 2 Apr 91 12:00:34 EST
Received: by tsx-11.MIT.EDU 
	with sendmail-5.61/1.2, id AA25942; Tue, 2 Apr 91 11:59:25 -0500
Date: Tue, 2 Apr 91 11:59:25 -0500
From: tytso@ATHENA.MIT.EDU (Theodore Ts'o)
Message-Id: <9104021659.AA25942@tsx-11.MIT.EDU>
To: walt@osf.org
In-Reply-To: walt@osf.org's message of Tue, 2 Apr 91 07:48:37 -0500,
	<9104021248.AA15251@salmon.osf.org>
Subject: Re: DCE's directory structure
Reply-To: tytso@ATHENA.MIT.EDU
Address: 308 High Street, Medford, MA 02155
Phone: (617) 395-0154

*** EOOH ***
Date: Tue, 2 Apr 91 11:59:25 -0500
From: tytso@ATHENA.MIT.EDU (Theodore Ts'o)
To: walt@osf.org
In-Reply-To: walt@osf.org's message of Tue, 2 Apr 91 07:48:37 -0500,
	<9104021248.AA15251@salmon.osf.org>
Subject: Re: DCE's directory structure
Reply-To: tytso@ATHENA.MIT.EDU
Address: 308 High Street, Medford, MA 02155
Phone: (617) 395-0154

   Date: Tue, 2 Apr 91 07:48:37 -0500
   From: walt@osf.org

   The names you cite are usable interactively.  The names `/cell' and
   `/file' are synonyms that should make your software happy.  By the way,
   why do you assume that anyone who doesn't agree with you is "stupid" and
   makes "monumental design flaws" as you say in your letter?  The matter
   will be resolved at POSIX.  You may be interested that Dennis Ritchie
   thinks it's the right thing to do.

Mayhap; after reading your enclosed messages (thank you for sending
them!), I will agree that it becomes essentially a religious issue.  (I
could point out that many AT&T folks, including I think dmr, also
thought symbolic links were a bad thing, but that's something of a low
blow and it's definitely another religious issue in any case.)

I still think it's a bad idea because all sorts of software will
gratuitously break; there are software packages all over the place that
will just break.  Besides /etc/passwd, /etc/group, setlocale(), which
you mentioned, I can think of shell environment MANPATH, TEXFONTS,
TEXINPUTS, and a host of others.  There are probably also many programs
which read configuration files that use ':', and depending on how
their parsers are written, a colon in a filename may throw those
programs completely for a loop.  Agreed, those programs may be "broken".
But the users who depend on those programs may not have the resources
or the ability to "fix" them.  It seems rude to deliberately make them
lose.
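
A minimal sketch of the kind of breakage meant here, assuming a parser
that splits an /etc/passwd-style line on every colon.  The account
entries and the name parse_passwd_line are hypothetical:

```python
PASSWD_KEYS = ("name", "passwd", "uid", "gid", "gecos", "home", "shell")


def parse_passwd_line(line):
    """Parse one colon-separated passwd-style line into a dict."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != len(PASSWD_KEYS):
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    return dict(zip(PASSWD_KEYS, fields))


# An ordinary home directory parses as expected.
print(parse_passwd_line("ted:x:1001:100:Ted:/home/ted:/bin/sh")["home"])

# A home directory under /.: adds a colon, so the same parser now sees
# eight fields and rejects the line.
try:
    parse_passwd_line("ted:x:1001:100:Ted:/.:/home/ted:/bin/sh")
except ValueError as err:
    print("parse failed:", err)
```

A parser without the field-count check would fail more quietly, by
reading `/.' as the home directory and `/home/ted' as the shell.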

From a Vendor's point of view, I can see where they might not care; they
have the programmer resources to fix all of the programs that will break
under this scheme.  But when you consider all of the programs which are
being used at sites which are not developed under a Vendor, stuff in
comp.sources.unix, custom-written programs, etc., I would think this
would be an unreasonable change to make.  I believe the costs of making
this change are much larger than you stated in your mail messages to
Dennis Ritchie.

And if the costs are large, or potentially large, I have to ask at what
price?  I fail to see why /.: is superior to /../.. or /.../.. or even
/@ (my favorite, since it brings back memories of Todd Brunhoff's RFS).
In my opinion, the benefits do not justify the potential costs.  

As far as the quoting argument goes, I have to disagree; one of the most
confusing things when writing a shell script which uses Unix tools ---
the combination of /bin/sh and awk or sed is really gruesome --- is
figuring out how many backslashes you need to add so that it passes the
n different quoting and escaping mechanisms.  (Do I need three or four
backslashes!?!)  It's also a problem that, since there are no library
routines to do this, the various quoting conventions are subtly
different.  Adding one more place where quoting is necessary seems to
me to be a bug, not a feature.
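
The backslash arithmetic can be sketched as follows, assuming every
layer uses the same simple rule (escape the backslash, then escape the
colon).  Real tools such as /bin/sh, awk and sed each have their own
rules, so the function names and the single shared rule here are
illustrative assumptions only:

```python
def escape(text, sep=":", esc="\\"):
    """Apply one layer of escaping to text."""
    return text.replace(esc, esc + esc).replace(sep, esc + sep)


def escape_layers(text, layers):
    """Apply the same escaping rule once per quoting layer."""
    for _ in range(layers):
        text = escape(text)
    return text


# Count the backslashes needed to protect a single ':' through n layers.
for n in range(1, 4):
    print(n, escape_layers(":", n).count("\\"))
```

Under this rule one colon needs 1, 3 and then 7 backslashes for one,
two and three layers, since each layer doubles the existing
backslashes and adds one more.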

I suppose it's not the end of the world.  People will be able to adapt;
I suspect that there will be many, many symlinks in / for /.: and /:,
(/afs, /cell, /remote, /@, etc.) and different sites and different
programs will use different conventions, and users who try to use /.: in
a program that was written before the advent of DCE may lose in subtle
ways, and users will curse Unix (excuse me, Posix), and wish they could
go back to MS-Loss, and Unix consultants will clean up and make large
amounts of money.

I had expected better from the Unix standardization effort, but I
suppose I'm naive.

Thank you for your time in sending me a response.  I suspect that my
views won't make a difference in the grand view of things, but that's
life in the Standards world, I guess.

						- Ted

