This file contains notes about the care and feeding of the Athena
source repository.  It is intended primarily for the administrators of
the source tree, not for developers (except perhaps for the first
section, "mailing lists").  See the file "procedures" in this
directory for information about procedures relevant to developers.

The areas covered in this file are:

	Mailing lists
	Permissions
	The wash process
	Imake templates
	Release notes
	Release cycles
	Patch releases
	Rel-eng machines
	Clusters

Mailing lists
-------------

Here are descriptions of the mailing lists related to the source tree:

	* source-developers

		For discussion of the policy and day-to-day
		maintenance of the repository.  This is a public list,
		and there is a public discuss archive on menelaus.

	* source-reviewers

		For review of changes to be checked into the
		repository.  To be a member of this mailing list, you
		must have read access to the non-public parts of the
		source tree, but you do not need to be a staff member.
		There is a non-public discuss archive on menelaus.

	* source-commits

		This mailing list receives commit logs for all
		commits to the repository.  This is a public mailing
		list.  There is a public discuss archive on menelaus.

	* source-diffs

		This mailing list receives commit logs with diffs for
		all commits to the repository.  To be on this mailing
		list, you must have read access to the non-public
		parts of the source tree.  There is no discuss archive
		for this list.

	* source-wash

		This mailing list receives mail when the wash process
		blows out.  This is a public mailing list.  There is
		no discuss archive for this list.

	* rel-eng

		The release engineering mailing list.  Mail goes here
		about patch releases and other release details.  There
		is a public archive on menelaus.

	* release-team

		The mailing list for the release team, which sets
		policy for releases.  There is a public archive on
		menelaus (currently, it has the name "release-77").

Permissions
-----------

Following are descriptions of the various groups found on the acls of
the source tree:

	* read:source
	  read:staff

		These two groups have identical permissions in the
		repository, but read:source contains artificial
		constructs (the builder user and service principals)
		while read:staff contains people.  In the future,
		highly restricted source could have access for
		read:source and not read:staff.

		Both of these groups have read access to non-public
		areas of the source tree.

	* write:staff

		Contains developers with commit access to the source
		tree.  This group has write access to the repository,
		but not to the checked-out copy of the mainline
		(/mit/source).

	* write:update

		Contains the service principal responsible for
		updating /mit/source.  This group has write access to
		/mit/source but not to the repository.

	* adm:source

		This group has administrative access to the repository
		and to /mit/source.

system:anyuser has read access to public areas of the source tree and
list access to the rest.  system:authuser occasionally has read access
to areas that system:anyuser does not (synctree is the only current
example).

The script CVSROOT/afs-protections.sh in the repository makes sure the
permissions are correct in the repository or in a working directory.
Run it from the top level of the repository or of /mit/source, giving
it the argument "repository" or "wd".

The wash process
----------------

The wash process is a nightly rebuild of the source repository from
scratch, intended to alert the source tree maintainers when someone
checks in a change which causes the source tree to stop building.  The
general architecture of the wash process is:

	* Each night at midnight, a machine (currently small-gods)
	  performs a cvs update of the checked-out tree in
	  /afs/dev.mit.edu/source/src-current.  If the cvs update
	  fails, the update script sends mail to source-wash@mit.edu.
	  This machine is on read:source and write:update.

	* Each night at 4:30am, a machine of each architecture
	  (currently whirlpool and snuggle) recreates empty /build and
	  /localsrvd filesystems and performs a build of the tree with
	  /srvd pointed at /localsrvd.  If the build fails, the update
	  script sends mail to source-wash@mit.edu with the last few
	  lines of the wash log, and saves the wash log in /var/wash
	  on the local disk.
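
As a rough illustration of the failure report described above, here
is a small Python sketch that pulls the last few lines out of a wash
log.  It is not the actual wash script (see the source location
below); the function name and the default of 20 lines are assumptions
made for this example.

```python
from collections import deque


def tail_of_log(path, count=20):
    """Return the last `count` lines of the log at `path` as one string.

    Hypothetical helper: the real wash scripts may extract the tail of
    the log differently.
    """
    # A bounded deque keeps only the most recent `count` lines, so the
    # whole log never has to be held in memory at once.
    with open(path, errors="replace") as log:
        return "".join(deque(log, maxlen=count))
```

The returned text could then be used as the body of the mail sent to
source-wash@mit.edu.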

Source for the wash scripts lives in /afs/dev.mit.edu/service/wash.
They are installed in /usr/local on the wash machines, along with a
copy of krbtgp from the net-tools locker in /usr/local/bin.  Logs of
the start and end times of the wash processes on each machine live in
/afs/dev.mit.edu/service/wash/status/`hostname`.

Imake templates
---------------

We don't like imake, but we maintain two sets of imake templates:

	* packs/build/config

		These templates are the legacy Athena build system.
		They are specific to software in the athena hierarchy,
		and one glorious day in the future they will no longer
		be necessary.

		For these templates, you should define TOPDIR to the
		top-level source directory.

	* packs/build/xconfig

		These templates are used for building software which
		uses X-style Imakefiles.  They may need periodic
		updating as new versions of X are released.  These
		templates are full of a lot of hacks, mostly because
		the imake model isn't really adequate for dealing with
		third-party software and local site customizations.

		For these templates, you should define TOPDIR to "."
		and SRCDIR to the top-level source directory.

Release notes
-------------

There are two kinds of release notes, the system release notes and the
user release notes.  The system release notes are more comprehensive
and assume a higher level of technical knowledge, and are used in the
construction of the user release notes.  It is the job of the release
engineer to produce a set of system release notes for every release,
with early versions towards the beginning of the release cycle.  The
best way to make sure this happens is to maintain the system release
notes throughout the entire development cycle.

Thus, it is the job of the release engineer to watch the checkins to
the source tree and enter a note about all user-visible changes in the
system release notes, which live in /afs/dev.mit.edu/project/relnotes.
Highly visible changes should appear near the beginning of the file,
and less visible changes should appear towards the end.  Changes to
particular subsystems should be grouped together when possible.

Release cycles
--------------

Release cycles have five phases: crash and burn, alpha, beta, early,
and the public release.  The release team has a set of criteria for
entering and exiting each phase, which won't be covered here.  The
following guidelines should help the release go smoothly:

	* Crash and burn

	  This phase is for rel-eng internal testing.  The crash and
	  burn machines should be identified and used to test the
	  install and update.  System packs may be generated at will
	  by taking snapshots from the wash machine.  The system packs
	  volume does not need any replication.

	  System release notes should be prepared during this phase.

	  Before the transition from crash and burn to alpha, the
	  release engineer should do a sanity check on the new packs
	  by comparing a file listing of the new packs to a file
	  listing of the previous release's packs.  The release
	  engineer should also check the list of configuration files
	  for each platform (in packs/update/platform/*/configfiles)
	  and make sure that any configuration files which have
	  changed are listed as changed in the version script.
	  Finally, the release should be checked to make sure it won't
	  overflow partitions on any client machines; currently, SGIs
	  are not a problem (because they have one big partition) and
	  the most restrictive sizes on Solaris clients are 27713K and
	  51903K of usable space for the root and /usr partitions.

	* Alpha

	  The alpha phase is for internal testing by the release team.
	  System packs may still be regenerated at will by taking
	  snapshots, but the system packs volume (and os volume)
	  should be read-only so it can be updated by a vos release.
	  Changes to the packs do not need to be propagated in patch
	  releases; testers are expected to be able to ensure
	  consistency by forcing repeat updates or reinstalling their
	  machines.

	  A draft of the system release notes should be ready by the
	  beginning of this phase.  User release notes should be
	  prepared during this phase.

	  Before the transition from alpha to beta, doc/third-party
	  should be checked to see if miscellaneous third-party files
	  (the ones not under the "third" hierarchy) should be
	  updated.

	  At the end of the alpha phase, a release branch should
	  be created with a name of the form athena-8_1, and tagged
	  with athena-8_1-beta.  A checked-out tree should be made in
	  /afs/dev.mit.edu/source for the release branch, with a name
	  of the form src-8.1.  A final snapshot of the system packs
	  should be constructed from the release branch, and the build
	  tree copied into /afs/dev.mit.edu/project/release.  Build
	  machines for the new release should be set up.

	* Beta

	  The beta phase involves outside testers.  System packs and
	  os volumes should be replicated on multiple servers, and
	  permissions should be set to avoid accidental changes
	  (traditionally this means giving write access to
	  system:packs, a normally empty group).  Changes to the packs
	  must be propagated by patch releases.

	  User release notes should be essentially finished by the end
	  of this phase.  System release notes may continue to be
	  updated as bug fixes occur.

	* Early

	  The early release involves more outside testers and some
	  cluster machines.  The release should be considered ready
	  for public consumption.

	  The release branch should be tagged with a name of the form
	  athena-8_1-early.

	* Release

	  The release branch should be tagged with a name of the form
	  athena-8_1-release.
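
The branch, tag, and tree names used in the phases above all follow
from the release's major and minor numbers.  The following Python
sketch spells out the pattern, using the forms given above
(athena-8_1, athena-8_1-beta, src-8.1, and so on).  The function name
and the dictionary keys are made up for this example; nothing in the
source tree defines them.

```python
def release_names(major, minor):
    """Return the names associated with release `major`.`minor`.

    Hypothetical helper; it only encodes the naming forms described in
    the release cycle notes.
    """
    branch = f"athena-{major}_{minor}"
    return {
        # Release branch, created at the end of the alpha phase.
        "branch": branch,
        # Tags applied to the release branch as each phase is reached.
        "beta_tag": f"{branch}-beta",
        "early_tag": f"{branch}-early",
        "release_tag": f"{branch}-release",
        # Checked-out tree for the release branch.
        "tree": f"/afs/dev.mit.edu/source/src-{major}.{minor}",
    }
```

For example, release_names(8, 1)["beta_tag"] is "athena-8_1-beta".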

One thing that needs to happen externally during a release cycle, if
there is an OS upgrade involved, is the addition of compatibility
symlinks under the arch directories of various lockers.  All of the
lockers listed in packs/glue/specs definitely need to be hit, and the
popular software lockers need to be hit as well.  Here is a reasonable
list of popular lockers to get in addition to the glue ones:

	consult
	games
	gnu
	graphics
	outland
	sipb
	tcl
	watchmaker
	windowmanagers
	/afs/sipb/project/tcsh

In addition, the third-party software lockers need to be updated; the
third-party software group keeps their own list.

Patch releases
--------------

Once a release has hit beta test, all changes to the release must be
propagated through patch releases.  The steps to performing a patch
release are:

	* Check in the changes on the mainline (if they apply) and on
	  the release branch and update the relevant sections of the
	  source tree in /afs/dev.mit.edu/source.

	* If the update needs to do anything other than track against
	  the system packs, you must prepare a version script which
	  deals with any transition issues, specifies whether to track
	  the OS volume, specifies whether to deal with a kernel
	  update, and specifies which if any configuration files need
	  to be updated.  See the update script
	  (packs/update/do-update.sh) for details.  See
	  packs/build/update/platform/*/configfiles for a list of
	  configuration files for a given platform.  The version
	  script should be checked in on the mainline and on the
	  release branch.

	* Make sure to add symlinks in the build tree for any files
	  you have added.  Note that you probably added a version script
	  if the update needs to do anything other than track against
	  the system packs.

	* In the build tree, bump the version number in
	  packs/build/version (the symlink should be broken for this
	  file to avoid having to change it in the source tree).

	* If you are going to need to update binaries that users run
	  from the packs, go into the packs and move (don't copy) them
	  into a .deleted directory at the root of the packs.  This is
	  especially important for binaries like emacs and dash which
	  people run for long periods of time, to avoid making the
	  running processes dump core when the packs are released.

	* Update the read-write volume of the packs to reflect the
	  changes you've made.  You can use the build.sh script to
	  build and install specific packages, or you can use the
	  do.sh script to build the package and then install specific
	  files (cutting and pasting from the output of "make -n
	  install DESTDIR=/srvd" is the safest way); updating the
	  fewest number of files is preferable.  Remember to install
	  the version script.

	* Use the build.sh script to build and install
	  packs/build/finish.  This will fix ownerships and update the
	  track lists and the like.

	* It's a good idea to test the update from the read-write
	  packs by symlinking the read-write packs to /srvd on a test
	  machine and taking the update.  Note that when the machine
	  comes back up with the new version, it will probably
	  re-attach the read-write packs, so you may have to re-make
	  the symlink if you want to test stuff that's on the packs.

	* At some non-offensive time, release the packs in the dev
	  cell.

	* Send mail to rel-eng saying that the patch release went out,
	  and what was in it.  (You can find many example pieces of
	  mail in the discuss archive.)  Include instructions
	  explaining how to propagate the release to the athena cell.
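
The "move (don't copy)" step above is the easiest one to get wrong,
so here is a Python sketch of it.  A rename keeps the original file
itself and only changes its name, whereas a copy followed by an
overwrite would not.  The function name, the flat layout inside
.deleted, and the refusal to overwrite an existing entry are all
assumptions made for this example, and it assumes .deleted is on the
same volume as the binary.

```python
import os


def retire_binary(packs_root, relative_path):
    """Move a binary into <packs_root>/.deleted without copying it.

    Hypothetical helper illustrating the step above; returns the new
    path of the retired binary.
    """
    deleted_dir = os.path.join(packs_root, ".deleted")
    os.makedirs(deleted_dir, exist_ok=True)

    source = os.path.join(packs_root, relative_path)
    target = os.path.join(deleted_dir, os.path.basename(relative_path))

    # os.rename would silently replace an existing target on Unix, so
    # refuse rather than clobber a binary retired by an earlier patch.
    if os.path.exists(target):
        raise FileExistsError(target)

    # A rename within one filesystem keeps the same underlying file;
    # only the directory entry changes.
    os.rename(source, target)
    return target
```

After this, the new binary can be installed under the original name.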

Rel-eng machines
----------------

There are six roles for rel-eng machines for each platform:

	* A wash machine, for nightly rebuilds of the source tree
	  during the development cycle.

	* A crash and burn machine, for testing the release.

	* A current release build machine, for doing incremental
	  updates to the last public release.

	* A new release build machine, for doing incremental updates
	  to the new release during the beta and early phases.

	* A current release developer machine, for other developers to
	  build and test software on under the current release.

	* A new release developer machine, for other developers to
	  build and test software on under the next release.

Six machines for each platform is a lot, especially when two of them
are only needed during a release cycle.  The following modifications
can collapse the number of required machines to three:

	* During the beta and early phases of a release cycle, the
	  wash can be shut down and the wash machines used as new
	  release build engines.

	* The new release build machine can be used as a new release
	  developer machine during the beta and early phases.  Having
	  a separate machine is preferable since it is useful to have
	  a new release developer machine during the entire
	  development cycle, not just during the last two phases of
	  the release cycle.  Sometimes the crash and burn machine may
	  be useful for developers, although it cannot be treated as a
	  reliable resource.

	* The current release build machine can be used as a current
	  release developer machine.

Here is a list of the rel-eng machines for each platform, with repeat
machine names listed in parentheses:

				Sun		Indy		O2

Wash				whirlpool	kenmore		maytag
Current release build		downy		snuggle		bounce
Crash and burn			sourcery	pyramids	reaper-man
New release build		(whirlpool)	(kenmore)	(maytag)
Current release developer	(downy)		(snuggle)	(bounce)
New release developer		(whirlpool)	(kenmore)	(maytag)

For reference, here are some names that fit various laundry and
construction naming schemes:

	* Washing machines: kenmore, whirlpool, ge, maytag
	* Laundry detergents: fab, calgon, era, cheer, woolite,
		tide, ultra-tide
	* Bleaches: clorox, ajax
	* Fabric softeners: downy, final-touch, snuggle, bounce
	* Heavy machinery: steam-shovel, pile-driver, dump-truck,
		wrecking-ball, crane
	* Construction kits: lego, capsela, technics, k-nex, playdoh,
		construx
	* Construction materials: rebar, two-by-four, plywood,
		sheetrock
	* Heavy machinery companies: caterpillar, daewoo, john-deere,
		sumitomo
	* Buildings: empire-state, prudential, chrysler

Clusters
--------

The getcluster(8) man page explains how clients interpret cluster
information.  This section documents the clusters related to the
release cycle, and how they should be managed.

There are five clusters for each platform, each of the form
PHASE-PLATFORM, where PHASE is a phase of the release cycle (crash,
alpha, beta, early, public) and PLATFORM is the machtype name of the
platform.  There are two filsys entries for each platform and release
pointing to the athena cell and dev cell system packs for the release;
they have the form athena-PLATFORMsys-XY and dev-PLATFORMsys-XY, where
X and Y are the major and minor numbers of the release.  For the SGI,
we currently also have athena-sgi-inst-XY and dev-sgi-inst-XY.
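
The naming scheme above can be written out as a short Python sketch.
The function names are invented for this example, and the platform
string passed in is assumed to be whatever machtype reports for the
platform.

```python
# Release cycle phases, in order, as used in cluster names.
PHASES = ("crash", "alpha", "beta", "early", "public")


def cluster_names(platform):
    """Return the five PHASE-PLATFORM cluster names for a platform."""
    return [f"{phase}-{platform}" for phase in PHASES]


def filsys_names(platform, major, minor):
    """Return the athena cell and dev cell filsys entry names.

    Both have the form CELL-PLATFORMsys-XY, where X and Y are the
    major and minor numbers of the release.
    """
    version = f"{major}{minor}"
    return [f"athena-{platform}sys-{version}",
            f"dev-{platform}sys-{version}"]
```

For a platform whose machtype name is "sgi", cluster_names("sgi")
starts with "crash-sgi", and filsys_names("sgi", 8, 1) gives
"athena-sgisys-81" and "dev-sgisys-81".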

At the crash and burn, alpha, and beta phases of the release cycle,
the appropriate cluster (PHASE-PLATFORM) should be updated to include
data records of the form:

	Label: syslib		Data: dev-PLATFORMsys-XY X.Y t
(SGI)	Label: instlib		Data: dev-sgi-inst-XY X.Y t

This change will cause console messages to appear on the appropriate
machines informing their maintainers of a new testing release which
they can take manually.

At the early and public phases of the release cycle, the 't' should be
removed from the new syslib records in the crash, alpha, and beta
clusters, and the appropriate cluster (early-PLATFORM or
public-PLATFORM) should be updated to include data records:

	Label: syslib		Data: athena-PLATFORMsys-XY X.Y
(SGI)	Label: instlib		Data: athena-sgi-inst-XY X.Y

This change will cause AUTOUPDATE machines in the appropriate cluster
(as well as the crash, alpha, and beta clusters) to take the new
release; console messages will appear on non-AUTOUPDATE machines.
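
To summarize the two cases, here is a Python sketch that produces the
data records to add to the PHASE-PLATFORM cluster when a release
enters a given phase.  The function name is invented for this
example, and it assumes "sgi" is the machtype name of the SGI
platform.  It does not model the separate step of removing the 't'
from the records already in the crash, alpha, and beta clusters.

```python
TESTING_PHASES = ("crash", "alpha", "beta")
RELEASE_PHASES = ("early", "public")


def cluster_records(phase, platform, major, minor):
    """Return (label, data) records for the PHASE-PLATFORM cluster."""
    if phase in TESTING_PHASES:
        # Testing releases point at the dev cell and carry the 't' flag.
        cell, flag = "dev", " t"
    elif phase in RELEASE_PHASES:
        # Early and public releases point at the athena cell, no flag.
        cell, flag = "athena", ""
    else:
        raise ValueError(f"unknown phase: {phase}")

    version = f"{major}.{minor}"
    suffix = f"{major}{minor}"
    records = [("syslib", f"{cell}-{platform}sys-{suffix} {version}{flag}")]
    if platform == "sgi":
        # The SGI also has an instlib record (assumed machtype "sgi").
        records.append(("instlib", f"{cell}-sgi-inst-{suffix} {version}{flag}"))
    return records
```

For example, cluster_records("beta", "sgi", 8, 1) yields a syslib
record with data "dev-sgisys-81 8.1 t" and an instlib record with
data "dev-sgi-inst-81 8.1 t".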