.BG
.FN audit
.TL
audit: Explore the Data Analysis Process
.CS
!S audit [auditfile]     # UNIX command
.PP
.AG auditfile
the name (unquoted) of a ".Audit" file
or an S diary file.
This file must contain lines of the form 
.IP
#~get ...
.IP
#~put ...
.PP
in order for the auditing procedure to work.
This means that old diary files may not be auditable.
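.PP
As a rough illustration of the requirement above, the following Python sketch
counts the marker lines in a candidate file before it is handed to `audit'.
It assumes the markers appear at the very start of a line.
The function name and the sample lines (including the text after each marker)
are made up for the example and do not reflect the real file layout.

```python
def count_audit_markers(lines):
    """Count the lines that begin with the #~get and #~put markers."""
    counts = {"get": 0, "put": 0}
    for line in lines:
        if line.startswith("#~get"):
            counts["get"] += 1
        elif line.startswith("#~put"):
            counts["put"] += 1
    return counts


# Hypothetical sample; the text after each marker is invented.
sample = [
    "y <- x + 1",
    "#~get x",
    "#~put y",
    "#~put z",
]
marker_counts = count_audit_markers(sample)
print(marker_counts)  # {'get': 1, 'put': 2}
```

A file for which both counts are zero is probably an old diary that cannot be audited.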
.PP
The `audit' utility allows the user to review what has
happened during a set of S sessions.
To use `audit', a diary must have been on during all of the sessions
to be reviewed.
When the `audit' utility is invoked, it reads the file `auditfile',
finding all expressions that were executed and which datasets were
read and written by each expression.
It then allows the user (through an arcane syntax, see below)
to inquire about which expressions read or
wrote a specific dataset, to backtrack from a specific expression, or
to create a source file that will recreate an expression.
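.PP
The idea of backtracking can be sketched in Python as follows.
This is not the `audit' implementation: the in-memory table, the function names,
and the rule that a predecessor is the most recent earlier statement that wrote
a dataset the current statement reads are all assumptions made for illustration.

```python
def backtrack(statements, start):
    """Return the statement numbers that `start` depends on, newest first.

    `statements` maps a statement number to a (gets, puts) pair of sets
    holding the dataset names read and written by that statement.
    """
    last_put = {}  # dataset name -> most recent statement that wrote it
    preds = {}     # statement -> direct predecessor statements
    for num in sorted(statements):
        gets, puts = statements[num]
        # Look up writers before recording this statement's own puts.
        preds[num] = {last_put[name] for name in gets if name in last_put}
        for name in puts:
            last_put[name] = num

    seen = set()
    stack = [start]
    while stack:  # follow predecessor links transitively
        num = stack.pop()
        for p in preds[num]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return sorted(seen, reverse=True)


def backtrack_name(statements, name):
    """Backtrack from the statement that most recently wrote `name`."""
    setters = [n for n in sorted(statements) if name in statements[n][1]]
    if not setters:
        return []
    latest = setters[-1]
    return [latest] + backtrack(statements, latest)


# Hypothetical session: statement number -> (datasets read, datasets written)
session = {
    1: (set(), {"x"}),
    2: ({"x"}, {"y"}),
    3: (set(), {"z"}),
    4: ({"y"}, {"w"}),
}
print(backtrack(session, 4))         # [2, 1]
print(backtrack_name(session, "w"))  # [4, 2, 1]
```

Running the collected statements in increasing order is one plausible way to
build a script that recreates an expression.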
.PP
The `audit' utility can be executed in parallel with a copy of
S that is writing the `auditfile'.  The `audit' utility continuously
attempts to read new information that may be written to the file
by another process and updates its tables accordingly.
Thus, `audit' is very useful on a multi-window workstation where
one window can be used for running S and another for asking audit
questions about the S session.
.EX
$ S audit diary   # audit the file "diary"
G x	# shows which statements "got" or read dataset x
S A	# show (in reverse order) all statements
E 100	# create script to recreate statement 100

.ft 1
Commands to audit program:

E n1 n2 n3 n4 ... x
	generates an Executable script on stdout that will
	incorporate all computations including statements n1 n2 ...

[L|G|P|B] name
	will Lookup, show Gets, show Puts, or Backtrack
	L will show the name from its symbol
		table.  The name can be the trailing portion of
		a name, e.g. abc will match xyz.abc
	G will show the statements in which
		the name is used
	P will show the statements in which
		the name is assigned
	B will backtrack, showing the statement
		that most recently set a value for the name
		and all statements that were predecessors of
		that statement

[L|G|P|B] number
	will Lookup, show Gets, show Puts, or Backtrack
	L will lookup a particular statement
	G will show the datasets used (gotten) within the statement
	P will show the datasets assigned (put) within the statement
	B will backtrack from the statement showing all predecessor
		statements

N [N|G|P|A|GP]
	will show names that were
	- N not assigned or used
	- G used but not assigned
	- P assigned but not used
	- GP assigned and used
	- A All names

S [N|G|P|A|GP]
	will show statements that contained various Gets or Puts
	of datasets, along with the dataset names that were
	read or written

	- N no assignments or use of datasets
	- G use of datasets but no assignments
	- P assignments but no use of datasets
	- GP assignments and use of datasets
	- A All statements

	If you want all statements that created a dataset,
	you need to use both S P and S GP.
q
	quit
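.PP
The N, G, P, and GP selectors used by the N and S commands can be pictured with
a small Python sketch.
The table layout and function name are hypothetical; the sketch only mirrors
the categories listed above and shows why finding every statement that created
a dataset takes both the P and the GP categories.

```python
def classify(gets, puts):
    """Return the category code for an item with the given gets and puts."""
    if gets and puts:
        return "GP"  # assigned and used
    if gets:
        return "G"   # used but not assigned
    if puts:
        return "P"   # assigned but not used
    return "N"       # neither assigned nor used


# Hypothetical session: statement number -> (datasets read, datasets written)
table = {
    1: (set(), {"x"}),
    2: ({"x"}, {"y"}),
    3: ({"y"}, set()),
    4: (set(), set()),
}
codes = {num: classify(gets, puts) for num, (gets, puts) in table.items()}
print(codes)  # {1: 'P', 2: 'GP', 3: 'G', 4: 'N'}

# Statements that created a dataset fall in either P or GP.
creators = sorted(num for num, code in codes.items() if code in ("P", "GP"))
print(creators)  # [1, 2]
```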
.PP
If you interrupt in the middle of a printout, you will be brought back
to the "Command: " prompt.
.PP
Please be nice to the parser -- it isn't all that robust.
At some time in the future, the system will probably use a completely different
user interface.
.PP
Processing of the `auditfile' takes about 1 cpu minute per 1500 statements
(roughly 25 to 30 statements per second).
Once the audit program has read the audit file, user commands are
processed very quickly; interactive users notice no delay.
.SH Reference
Richard A. Becker and John M. Chambers,
"Auditing of Data Analyses,"
AT&T Bell Laboratories Statistical Research Report No. 25,
January, 1986.
.KW data management
.KW utilities
.KW file
.WR
