An Introduction to the UNIX Shell

S. R. Bourne

(Updated for 4.3BSD by Mark Seiden)

August 7, 1987

ABSTRACT

The shell‡ is a command programming language that provides an interface to the UNIX† operating system. Its features include control-flow primitives, parameter passing, variables and string substitution. Constructs such as while, if then else, case and for are available. Two-way communication is possible between the shell and commands. String-valued parameters, typically file names or flags, may be passed to a command. A return code is set by commands that may be used to determine control-flow, and the standard output from a command may be used as shell input. The shell can modify the environment in which commands run. Input and output can be redirected to files, and processes that communicate through `pipes' can be invoked. Commands are found by searching directories in the file system in a sequence that can be defined by the user. Commands can be read either from the terminal or from a file, which allows command procedures to be stored for later use.

_________________________
‡ This paper describes sh(1). If it's the c shell (csh) you're interested in, a good place to begin is William Joy's paper "An Introduction to the C shell" (USD:4).
† UNIX is a trademark of Bell Laboratories.

1.0 Introduction

The shell is both a command language and a programming language that provides an interface to the UNIX operating system. This memorandum describes, with examples, the UNIX shell. The first section covers most of the everyday requirements of terminal users. Some familiarity with UNIX is an advantage when reading this section; see, for example, "UNIX for beginners" (Kernighan, 1978). Section 2 describes those features of the shell primarily intended for use within shell procedures.
These include the control-flow primitives and string-valued variables provided by the shell. A knowledge of a programming language would be a help when reading this section. The last section describes the more advanced features of the shell. References of the form "see pipe (2)" are to a section of the UNIX manual (Seventh Edition; Ritchie and Thompson, 1978).

1.1 Simple commands

Simple commands consist of one or more words separated by blanks. The first word is the name of the command to be executed; any remaining words are passed as arguments to the command. For example,

     who

is a command that prints the names of users logged in. The command

     ls -l

prints a list of files in the current directory. The argument -l tells ls to print status information, size and the last modification date for each file.

1.2 Background commands

To execute a command the shell normally creates a new process and waits for it to finish. A command may be run without waiting for it to finish. For example,

     cc pgm.c &

calls the C compiler to compile the file pgm.c. The trailing & is an operator that instructs the shell not to wait for the command to finish. To help keep track of such a process the shell reports its process number following its creation. A list of currently active processes may be obtained using the ps command.

1.3 Input output redirection

Most commands produce output on the standard output that is initially connected to the terminal. This output may be sent to a file by writing, for example,

     ls -l >file

The notation >file is interpreted by the shell and is not passed as an argument to ls. If file does not exist then the shell creates it; otherwise the original contents of file are replaced with the output from ls. Output may be appended to a file using the notation

     ls -l >>file

In this case file is also created if it does not already exist.
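
The difference between > and >> can be checked with a short experiment. What follows is an illustrative sketch and not part of the original memorandum: the scratch directory /tmp/redir$$ and the file names out.txt and new.txt are invented for the purpose, and echo stands in for ls so that the contents are predictable. ($$ is the process number of the shell, described in section 2.4.)

```shell
# Scratch directory so that no existing file is disturbed.
# The names redir$$, out.txt and new.txt are made up for this sketch.
dir=/tmp/redir$$
mkdir $dir
cd $dir

echo first >out.txt      # out.txt does not exist, so the shell creates it
echo second >out.txt     # > replaces the previous contents
echo third >>out.txt     # >> appends to what is already there
echo fourth >>new.txt    # >> also creates a missing file

cat out.txt new.txt      # show the result
```

After these commands out.txt should hold the two lines second and third, since the line first was overwritten, and new.txt should hold the single line fourth. The scratch directory may be removed afterwards with rm -r.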
The standard input of a command may be taken from a file instead of the terminal by writing, for example,

     wc <file

[...] * ? | &, are called metacharacters. A complete list of metacharacters is given in appendix B. Any character preceded by a \ is quoted and loses its special meaning, if any. The \ is elided so that

     echo \?

will echo a single ?, and

     echo \\

will echo a single \. To allow long strings to be continued over more than one line the sequence \newline is ignored.

\ is convenient for quoting single characters. When more than one character needs quoting the above mechanism is clumsy and error prone. A string of characters may be quoted by enclosing the string between single quotes. For example,

     echo xx'****'xx

will echo

     xx****xx

The quoted string may not contain a single quote but may contain newlines, which are preserved. This quoting mechanism is the most simple and is recommended for casual use.

A third quoting mechanism using double quotes is also available that prevents interpretation of some but not all metacharacters. Discussion of the details is deferred to section 3.4.

1.7 Prompting

When the shell is used from a terminal it will issue a prompt before reading a command. By default this prompt is `$ '. It may be changed by saying, for example,

     PS1=yesdear

that sets the prompt to be the string yesdear. If a newline is typed and further input is needed then the shell will issue the prompt `> '. Sometimes this can be caused by mistyping a quote mark. If it is unexpected then an interrupt (DEL) will return the shell to read another command. This prompt may be changed by saying, for example,

     PS2=more

1.8 The shell and login

Following login (1) the shell is called to read and execute commands typed at the terminal.
If the user's login directory contains the file .profile then it is assumed to contain commands and is read by the shell before reading any commands from the terminal.

1.9 Summary

     o  ls
        Print the names of files in the current directory.

     o  ls >file
        Put the output from ls into file.

     o  ls | wc -l
        Print the number of files in the current directory.

     o  ls | grep old
        Print those file names containing the string old.

     o  ls | grep old | wc -l
        Print the number of files whose name contains the string old.

     o  cc pgm.c &
        Run cc in the background.

2.0 Shell procedures

The shell may be used to read and execute commands contained in a file. For example,

     sh file [ args ... ]

calls the shell to read commands from file. Such a file is called a command procedure or shell procedure. Arguments may be supplied with the call and are referred to in file using the positional parameters $1, $2, .... For example, if the file wg contains

     who | grep $1

then

     sh wg fred

is equivalent to

     who | grep fred

UNIX files have three independent attributes, read, write and execute. The UNIX command chmod (1) may be used to make a file executable. For example,

     chmod +x wg

will ensure that the file wg has execute status. Following this, the command

     wg fred

is equivalent to

     sh wg fred

This allows shell procedures and programs to be used interchangeably. In either case a new process is created to run the command.

As well as providing names for the positional parameters, the number of positional parameters in the call is available as $#. The name of the file being executed is available as $0.

A special shell parameter $* is used to substitute for all positional parameters except $0.
A typical use of this is to provide some default arguments, as in,

     nroff -T450 -ms $*

which simply prepends some arguments to those already given.

2.1 Control flow - for

A frequent use of shell procedures is to loop through the arguments ($1, $2, ...) executing commands once for each argument. An example of such a procedure is tel that searches the file /usr/lib/telnos that contains lines of the form

     ...
     fred mh0123
     bert mh0789
     ...

The text of tel is

     for i
     do grep $i /usr/lib/telnos; done

The command

     tel fred

prints those lines in /usr/lib/telnos that contain the string fred.

     tel fred bert

prints those lines containing fred followed by those for bert.

The for loop notation is recognized by the shell and has the general form

     for name in w1 w2 ...
     do command-list
     done

A command-list is a sequence of one or more simple commands separated or terminated by a newline or semicolon. Furthermore, reserved words like do and done are only recognized following a newline or semicolon. name is a shell variable that is set to the words w1 w2 ... in turn each time the command-list following do is executed. If in w1 w2 ... is omitted then the loop is executed once for each positional parameter; that is, in $* is assumed.

Another example of the use of the for loop is the create command whose text is

     for i do >$i; done

The command

     create alpha beta

ensures that the two files alpha and beta exist and are empty. The notation >file may be used on its own to create or clear the contents of a file. Notice also that a semicolon (or newline) is required before done.

2.2 Control flow - case

A multiple way branch is provided for by the case notation. For example,

     case $# in
         1) cat >>$1 ;;
         2) cat >>$2 <$1 ;;
         *) echo 'usage: append [ from ] to' ;;
     esac

is an append command.
When called with one argument as

     append file

$# is the string 1 and the standard input is copied onto the end of file using the cat command.

     append file1 file2

appends the contents of file1 onto file2. If the number of arguments supplied to append is other than 1 or 2 then a message is printed indicating proper usage.

The general form of the case command is

     case word in
         pattern) command-list ;;
         ...
     esac

The shell attempts to match word with each pattern, in the order in which the patterns appear. If a match is found the associated command-list is executed and execution of the case is complete. Since * is the pattern that matches any string it can be used for the default case.

A word of caution: no check is made to ensure that only one pattern matches the case argument. The first match found defines the set of commands to be executed. In the example below the commands following the second * will never be executed.

     case $# in
         *) ... ;;
         *) ... ;;
     esac

Another example of the use of the case construction is to distinguish between different forms of an argument. The following example is a fragment of a cc command.

     for i
     do case $i in
            -[ocs]) ... ;;
            -*)     echo unknown flag \'$i\' ;;
            *.c)    /lib/c0 $i ... ;;
            *)      echo unexpected argument \'$i\' ;;
        esac
     done

To allow the same commands to be associated with more than one pattern the case command provides for alternative patterns separated by a |. For example,

     case $i in
         -x|-y) ...
     esac

is equivalent to

     case $i in
         -[xy]) ...
     esac

The usual quoting conventions apply so that

     case $i in
         \?) ...

will match the character ?.

2.3 Here documents

The shell procedure tel in section 2.1 uses the file /usr/lib/telnos to supply the data for grep.
An alternative is to include this data within the shell procedure as a here document, as in,

     for i
     do grep $i [...]

[...]

     ps a >${tmp}a

will direct the output of ps to the file /tmp/psa, whereas,

     ps a >$tmpa

would cause the value of the variable tmpa to be substituted.

Except for $? the following are set initially by the shell. $? is set after executing each command.

     $?
          The exit status (return code) of the last command executed as a decimal string. Most commands return a zero exit status if they complete successfully, otherwise a non-zero exit status is returned. Testing the value of return codes is dealt with later under if and while commands.

     $#
          The number of positional parameters (in decimal). Used, for example, in the append command to check the number of parameters.

     $$
          The process number of this shell (in decimal). Since process numbers are unique among all existing processes, this string is frequently used to generate unique temporary file names. For example,

               ps a >/tmp/ps$$
               ...
               rm /tmp/ps$$

     $!
          The process number of the last process run in the background (in decimal).

     $-
          The current shell flags, such as -x and -v.

Some variables have a special meaning to the shell and should be avoided for general use.

     $MAIL
          When used interactively the shell looks at the file specified by this variable before it issues a prompt. If the specified file has been modified since it was last looked at the shell prints the message you have mail before prompting for the next command. This variable is typically set in the file .profile, in the user's login directory. For example,

               MAIL=/usr/spool/mail/fred

     $HOME
          The default argument for the cd command. The current directory is used to resolve file name references that do not begin with a /, and is changed using the cd command. For example,

               cd /usr/fred/bin

          makes the current directory /usr/fred/bin.

               cat wn

          will print on the terminal the file wn in this directory. The command cd with no argument is equivalent to

               cd $HOME

          This variable is also typically set in the user's login profile.

     $PATH
          A list of directories that contain commands (the search path). Each time a command is executed by the shell a list of directories is searched for an executable file. If $PATH is not set then the current directory, /bin, and /usr/bin are searched by default. Otherwise $PATH consists of directory names separated by :. For example,

               PATH=:/usr/fred/bin:/bin:/usr/bin

          specifies that the current directory (the null string before the first :), /usr/fred/bin, /bin and /usr/bin are to be searched in that order. In this way individual users can have their own `private' commands that are accessible independently of the current directory. If the command name contains a / then this directory search is not used; a single attempt is made to execute the command.

     $PS1
          The primary shell prompt string, by default, `$ '.

     $PS2
          The shell prompt when further input is needed, by default, `> '.

     $IFS
          The set of characters used by blank interpretation (see section 3.4).

2.5 The test command

The test command, although not part of the shell, is intended for use by shell programs. For example,

     test -f file

returns zero exit status if file exists and non-zero exit status otherwise. In general test evaluates a predicate and returns the result as its exit status. Some of the more frequently used test arguments are given here; see test (1) for a complete specification.

     test s          true if the argument s is not the null string
     test -f file    true if file exists
     test -r file    true if file is readable
     test -w file    true if file is writable
     test -d file    true if file is a directory

2.6 Control flow - while

The actions of the for loop and the case branch are determined by data available to the shell. A while or until loop and an if then else branch are also provided whose actions are determined by the exit status returned by commands. A while loop has the general form

     while command-list1
     do command-list2
     done

The value tested by the while command is the exit status of the last simple command following while. Each time round the loop command-list1 is executed; if a zero exit status is returned then command-list2 is executed; otherwise, the loop terminates. For example,

     while test $1
     do ...
        shift
     done

is equivalent to

     for i
     do ...
     done

shift is a shell command that renames the positional parameters $2, $3, ... as $1, $2, ... and loses $1.

Another kind of use for the while/until loop is to wait until some external event occurs and then run some commands. In an until loop the termination condition is reversed. For example,

     until test -f file
     do sleep 300; done
     commands

will loop until file exists. Each time round the loop it waits for 5 minutes before trying again. (Presumably another process will eventually create the file.)

2.7 Control flow - if

Also available is a general conditional branch of the form,

     if command-list
     then command-list
     else command-list
     fi

that tests the value returned by the last simple command following if.
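
The equivalence between the while loop with shift and the for loop, described in section 2.6, can be checked directly, and the if command can report the outcome. The following is an illustrative sketch rather than part of the original memorandum: the variable names from_for, from_while and same are invented here, and the set command (described in section 3.0) is used only to supply three sample arguments.

```shell
# Supply three sample positional parameters; a real procedure
# would receive these from its caller.
set alpha beta gamma

# Collect the arguments with a for loop (in $* is assumed).
from_for=
for i
do from_for="$from_for $i"
done

# Collect them again with while and shift; the loop ends when
# $1 is null, because test then returns a non-zero exit status.
from_while=
while test $1
do from_while="$from_while $1"
   shift
done

# if tests the exit status of the last simple command after it.
if test "$from_for" = "$from_while"
then same=yes
else same=no
fi
echo $same
```

Both loops visit the arguments in the same order, so the procedure should print yes. Note that the for loop leaves the positional parameters untouched, whereas the while loop consumes them with shift and leaves $# equal to 0.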
The if command may be used in conjunction with the test command to test for the existence of a file as in

     if test -f file
     then process file
     else do something else
     fi

An example of the use of if, case and for constructions is given in section 2.10.

A multiple test if command of the form

     if ...
     then ...
     else if ...
          then ...
          else if ...
               ...
               fi
          fi
     fi

may be written using an extension of the if notation as,

     if ...
     then ...
     elif ...
     then ...
     elif ...
     ...
     fi

The following example is the touch command which changes the `last modified' time for a list of files. The command may be used in conjunction with make (1) to force recompilation of a list of files.

     flag=
     for i
     do case $i in
        -c) flag=N ;;
        *)  if test -f $i
            then ln $i junk$$; rm junk$$
            elif test $flag
            then echo file \'$i\' does not exist
            else >$i
            fi
        esac
     done

As written, the -c flag prevents subsequent files from being created if they do not already exist; an error message is printed instead. Without -c, a file that does not exist is created. The shell variable flag is set to some non-null string if the -c argument is encountered. The commands

     ln ...; rm ...

make a link to the file and then remove it thus causing the last modified date to be updated.

The sequence

     if command1
     then command2
     fi

may be written

     command1 && command2

Conversely,

     command1 || command2

executes command2 only if command1 fails. In each case the value returned is that of the last simple command executed.

2.8 Command grouping

Commands may be grouped in two ways,

     { command-list ; }

and

     ( command-list )

In the first command-list is simply executed. The second form executes command-list as a separate process.
For example,

     (cd x; rm junk )

executes rm junk in the directory x without changing the current directory of the invoking shell.

The commands

     cd x; rm junk

have the same effect but leave the invoking shell in the directory x.

2.9 Debugging shell procedures

The shell provides two tracing mechanisms to help when debugging shell procedures. The first is invoked within the procedure as

     set -v

(v for verbose) and causes lines of the procedure to be printed as they are read. It is useful to help isolate syntax errors. It may be invoked without modifying the procedure by saying

     sh -v proc

where proc is the name of the shell procedure. This flag may be used in conjunction with the -n flag which prevents execution of subsequent commands. (Note that saying set -n at a terminal will render the terminal useless until an end-of-file is typed.)

The command

     set -x

will produce an execution trace. Following parameter substitution each command is printed as it is executed. (Try these at the terminal to see what effect they have.) Both flags may be turned off by saying

     set -

and the current setting of the shell flags is available as $-.

2.10 The man command

The following is the man command which is used to display sections of the UNIX manual on your terminal. It is called, for example, as

     man sh
     man -t ed
     man 2 fork

In the first the manual section for sh is displayed. Since no section is specified, section 1 is used. The second example will typeset (-t option) the manual section for ed. The last prints the fork manual page from section 2, which covers system calls.

     cd /usr/man

     : 'colon is the comment command'
     : 'default is nroff ($N), section 1 ($s)'
     N=n s=1

     for i
     do case $i in
        [1-9]*) s=$i ;;
        -t)     N=t ;;
        -n)     N=n ;;
        -*)     echo unknown flag \'$i\' ;;
        *)      if test -f man$s/$i.$s
                then ${N}roff man0/${N}aa man$s/$i.$s
                else : 'look through all manual sections'
                     found=no
                     for j in 1 2 3 4 5 6 7 8 9
                     do if test -f man$j/$i.$j
                        then man $j $i
                             found=yes
                        fi
                     done
                     case $found in
                          no) echo $i: manual page not found
                     esac
                fi
        esac
     done

          Figure 1. A version of the man command

3.0 Keyword parameters

Shell variables may be given values by assignment or when a shell procedure is invoked. An argument to a shell procedure of the form name=value that precedes the command name causes value to be assigned to name before execution of the procedure begins. The value of name in the invoking shell is not affected. For example,

     user=fred command

will execute command with user set to fred. The -k flag causes arguments of the form name=value to be interpreted in this way anywhere in the argument list. Such names are sometimes called keyword parameters. If any arguments remain they are available as positional parameters $1, $2, ....

The set command may also be used to set positional parameters from within a procedure. For example,

     set - *

will set $1 to the first file name in the current directory, $2 to the next, and so on. Note that the first argument, -, ensures correct treatment when the first file name begins with a -.

3.1 Parameter transmission

When a shell procedure is invoked both positional and keyword parameters may be supplied with the call. Keyword parameters are also made available implicitly to a shell procedure by specifying in advance that such parameters are to be exported. For example,

     export user box

marks the variables user and box for export.
When a shell procedure is invoked copies are made of all exportable variables for use within the invoked procedure. Modification of such variables within the procedure does not affect the values in the invoking shell. It is generally true of a shell procedure that it may not modify the state of its caller without explicit request on the part of the caller. (Shared file descriptors are an exception to this rule.)

Names whose value is intended to remain constant may be declared readonly. The form of this command is the same as that of the export command,

     readonly name ...

Subsequent attempts to set readonly variables are illegal.

3.2 Parameter substitution

If a shell parameter is not set then the null string is substituted for it. For example, if the variable d is not set

     echo $d

or

     echo ${d}

will echo nothing. A default string may be given as in

     echo ${d-.}

which will echo the value of the variable d if it is set and `.' otherwise. The default string is evaluated using the usual quoting conventions so that

     echo ${d-'*'}

will echo * if the variable d is not set. Similarly

     echo ${d-$1}

will echo the value of d if it is set and the value (if any) of $1 otherwise. A variable may be assigned a default value using the notation

     echo ${d=.}

which substitutes the same string as

     echo ${d-.}

and if d were not previously set then it will be set to the string `.'. (The notation ${...=...} is not available for positional parameters.)

If there is no sensible default then the notation

     echo ${d?message}

will echo the value of the variable d if it has one, otherwise message is printed by the shell and execution of the shell procedure is abandoned. If message is absent then a standard message is printed. A shell procedure that requires some parameters to be set might start as follows.

     : ${user?} ${acct?} ${bin?}
     ...

Colon (:) is a command that is built in to the shell and does nothing once its arguments have been evaluated. If any of the variables user, acct or bin are not set then the shell will abandon execution of the procedure.

3.3 Command substitution

The standard output from a command can be substituted in a similar way to parameters. The command pwd prints on its standard output the name of the current directory. For example, if the current directory is /usr/fred/bin then the command

     d=`pwd`

is equivalent to

     d=/usr/fred/bin

The entire string between grave accents (`...`) is taken as the command to be executed and is replaced with the output from the command. The command is written using the usual quoting conventions except that a ` must be escaped using a \. For example,

     ls `echo "$1"`

is equivalent to

     ls $1

Command substitution occurs in all contexts where parameter substitution occurs (including here documents) and the treatment of the resulting text is the same in both cases. This mechanism allows string processing commands to be used within shell procedures. An example of such a command is basename which removes a specified suffix from a string. For example,

     basename main.c .c

will print the string main. Its use is illustrated by the following fragment from a cc command.

     case $A in
         ...
         *.c) B=`basename $A .c`
         ...
     esac

that sets B to the part of $A with the suffix .c stripped.

Here are some composite examples.

     o  for i in `ls -t`; do ...
        The variable i is set to the names of files in time order, most recent first.

     o  set `date`; echo $6 $2 $3, $4
        will print, e.g., 1977 Nov 1, 23:59:59

3.4 Evaluation and quoting

The shell is a macro processor that provides parameter substitution, command substitution and file name generation for the arguments to commands.
This section discusses the order in which these evaluations occur and the effects of the various quoting mechanisms.

Commands are parsed initially according to the grammar given in appendix A. Before a command is executed the following substitutions occur.

     o  parameter substitution, e.g. $user

     o  command substitution, e.g. `pwd`

        Only one evaluation occurs so that if, for example, the value of the variable X is the string $y then

             echo $X

        will echo $y.

     o  blank interpretation

        Following the above substitutions the resulting characters are broken into non-blank words (blank interpretation). For this purpose `blanks' are the characters of the string $IFS. By default, this string consists of blank, tab and newline. The null string is not regarded as a word unless it is quoted. For example,

             echo ''

        will pass on the null string as the first argument to echo, whereas

             echo $null

        will call echo with no arguments if the variable null is not set or set to the null string.

     o  file name generation

        Each word is then scanned for the file pattern characters *, ? and [...] and an alphabetical list of file names is generated to replace the word. Each such file name is a separate argument.

The evaluations just described also occur in the list of words associated with a for loop. Only substitution occurs in the word used for a case branch.

As well as the quoting mechanisms described earlier using \ and '...' a third quoting mechanism is provided using double quotes. Within double quotes parameter and command substitution occurs but file name generation and the interpretation of blanks does not. The following characters have a special meaning within double quotes and may be quoted using \.

     $    parameter substitution
     `    command substitution
     "    ends the quoted string
     \    quotes the special characters $ ` " \

For example,

     echo "$x"

will pass the value of the variable x as a single argument to echo. Similarly,

     echo "$*"

will pass the positional parameters as a single argument and is equivalent to

     echo "$1 $2 ..."

The notation $@ is the same as $* except when it is quoted.

     echo "$@"

will pass the positional parameters, unevaluated, to echo and is equivalent to

     echo "$1" "$2" ...

The following table gives, for each quoting mechanism, the shell metacharacters that are evaluated.

                 metacharacter
               \    $    *    `    "    '
          '    n    n    n    n    n    t
          `    y    n    n    t    n    n
          "    y    y    n    y    t    n

               t    terminator
               y    interpreted
               n    not interpreted

          Figure 2. Quoting mechanisms

In cases where more than one evaluation of a string is required the built-in command eval may be used. For example, if the variable X has the value $y, and if y has the value pqr then

     eval echo $X

will echo the string pqr.

In general the eval command evaluates its arguments (as do all commands) and treats the result as input to the shell. The input is read and the resulting command(s) executed. For example,

     wg='eval who|grep'
     $wg fred

is equivalent to

     who|grep fred

In this example, eval is required since there is no interpretation of metacharacters, such as |, following substitution.

3.5 Error handling

The treatment of errors detected by the shell depends on the type of error and on whether the shell is being used interactively. An interactive shell is one whose input and output are connected to a terminal (as determined by gtty (2)). A shell invoked with the -i flag is also interactive.

Execution of a command (see also 3.7) may fail for any of the following reasons.

     o  Input output redirection may fail. For example, if a file does not exist or cannot be created.

     o  The command itself does not exist or cannot be executed.

     o  The command terminates abnormally, for example, with a "bus error" or "memory fault". See Figure 3 below for a complete list of UNIX signals.

     o  The command terminates normally but returns a non-zero exit status.

In all of these cases the shell will go on to execute the next command. Except for the last case an error message will be printed by the shell. All remaining errors cause the shell to exit from a command procedure. An interactive shell will return to read another command from the terminal. Such errors include the following.

     o  Syntax errors. e.g., if ... then ... done

     o  A signal such as interrupt. The shell waits for the current command, if any, to finish execution and then either exits or returns to the terminal.

     o  Failure of any of the built-in commands such as cd.

The shell flag -e causes the shell to terminate if any error is detected.

      1    hangup
      2    interrupt
      3*   quit
      4*   illegal instruction
      5*   trace trap
      6*   IOT instruction
      7*   EMT instruction
      8*   floating point exception
      9    kill (cannot be caught or ignored)
     10*   bus error
     11*   segmentation violation
     12*   bad argument to system call
     13    write on a pipe with no one to read it
     14    alarm clock
     15    software termination (from kill (1))

          Figure 3. UNIX signals†

Those signals marked with an asterisk produce a core dump if not caught. However, the shell itself ignores quit which is the only external signal that can cause a dump. The signals in this list of potential interest to shell programs are 1, 2, 3, 14 and 15.

_________________________
† Additional signals have been added in Berkeley Unix. See sigvec(2) or signal(3C) for an up-to-date list.

3.6 Fault handling

Shell procedures normally terminate when an interrupt is received from the terminal.
The trap command is used if some cleaning up is required, such as removing temporary files. For example,

     trap 'rm /tmp/ps$$; exit' 2

sets a trap for signal 2 (terminal interrupt), and if this signal is received will execute the commands

     rm /tmp/ps$$; exit

exit is another built-in command that terminates execution of a shell procedure. The exit is required; otherwise, after the trap has been taken, the shell will resume executing the procedure at the place where it was interrupted.

UNIX signals can be handled in one of three ways. They can be ignored, in which case the signal is never sent to the process. They can be caught, in which case the process must decide what action to take when the signal is received. Lastly, they can be left to cause termination of the process without it having to take any further action. If a signal is being ignored on entry to the shell procedure, for example, by invoking it in the background (see 3.7) then trap commands (and the signal) are ignored.

The use of trap is illustrated by this modified version of the touch command (Figure 4). The cleanup action is to remove the file junk$$.

     flag=
     trap 'rm -f junk$$; exit' 1 2 3 15
     for i
     do case $i in
        -c)  flag=N ;;
        *)   if test -f $i
             then ln $i junk$$; rm junk$$
             elif test $flag
             then echo file \'$i\' does not exist
             else >$i
             fi
        esac
     done

          Figure 4. The touch command

The trap command appears before the creation of the temporary file; otherwise it would be possible for the process to die without removing the file.

Since there is no signal 0 in UNIX it is used by the shell to indicate the commands to be executed on exit from the shell procedure.

A procedure may, itself, elect to ignore signals by specifying the null string as the argument to trap. The following fragment is taken from the nohup command.

     trap '' 1 2 3 15

which causes hangup, interrupt, quit and software termination (the signal sent by kill (1)) to be ignored both by the procedure and by invoked commands. Traps may be reset by saying

     trap 2 3

which resets the traps for signals 2 and 3 to their default values. A list of the current values of traps may be obtained by writing

     trap

The procedure scan (Figure 5) is an example of the use of trap where there is no exit in the trap command. scan takes each directory in the current directory, prompts with its name, and then executes commands typed at the terminal until an end of file or an interrupt is received. Interrupts are ignored while executing the requested commands but cause termination when scan is waiting for input.

     d=`pwd`
     for i in *
     do if test -d $d/$i
        then cd $d/$i
             while echo "$i:"
                   trap exit 2
                   read x
             do trap : 2; eval $x; done
        fi
     done

          Figure 5. The scan command

read x is a built-in command that reads one line from the standard input and places the result in the variable x. It returns a non-zero exit status if either an end-of-file is read or an interrupt is received.

3.7 Command execution

To run a command (other than a built-in) the shell first creates a new process using the system call fork. The execution environment for the command includes input, output and the states of signals, and is established in the child process before the command is executed. The built-in command exec is used in the rare cases when no fork is required and simply replaces the shell with a new command. For example, a simple version of the nohup command looks like

     trap '' 1 2 3 15
     exec $*

The trap turns off the signals specified so that they are ignored by subsequently created commands and exec replaces the shell by the command specified.

Most forms of input output redirection have already been described.
In the following, word is only subject to parameter and command substitution. No file name generation or blank interpretation takes place so that, for example,

     echo ... >*.c

will write its output into a file whose name is *.c. Input output specifications are evaluated left to right as they appear in the command.

> word
     The standard output (file descriptor 1) is sent to the file word which is created if it does not already exist.

>> word
     The standard output is sent to file word. If the file exists then output is appended (by seeking to the end); otherwise the file is created.

< word
     The standard input (file descriptor 0) is taken from the file word.

<< word
     The standard input is taken from the lines of shell input that follow up to but not including a line consisting only of word. If word is quoted then no interpretation of the document occurs. If word is not quoted then parameter and command substitution occur and \ is used to quote the characters \ $ ` and the first character of word. In the latter case \newline is ignored (cf. quoted strings).

>& digit
     The file descriptor digit is duplicated using the system call dup (2) and the result is used as the standard output.

<& digit
     The standard input is duplicated from file descriptor digit.

<&-
     The standard input is closed.

>&-
     The standard output is closed.

Any of the above may be preceded by a digit in which case the file descriptor created is that specified by the digit instead of the default 0 or 1. For example,

     ... 2>file

runs a command with message output (file descriptor 2) directed to file.

     ... 2>&1

runs a command with its standard output and message output merged. (Strictly speaking file descriptor 2 is created by duplicating file descriptor 1 but the effect is usually to merge the two streams.)
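
The redirection forms described above can be tried out with a short fragment such as the following. It is only a sketch: the scratch directory /tmp/redir$$ and the variable names (tmp, out, err, both, here) are chosen for illustration and do not come from the original text.

```shell
# Scratch directory; the name redir$$ is an arbitrary choice for this sketch.
tmp=/tmp/redir$$
mkdir "$tmp" || exit 1

# > creates (or truncates) the file; >> appends to it.
echo first  >"$tmp/out"
echo second >>"$tmp/out"
out=`cat "$tmp/out"`

# 2>file sends message output (file descriptor 2) to a file.
sh -c 'echo complaint 1>&2' 2>"$tmp/err"
err=`cat "$tmp/err"`

# Specifications are evaluated left to right: standard output is sent
# to the file first, then descriptor 2 is made a duplicate of descriptor 1.
( echo to-stdout; echo to-stderr 1>&2 ) >"$tmp/both" 2>&1
both=`cat "$tmp/both"`

# A quoted here-document word suppresses parameter substitution.
cat >"$tmp/here" <<'EOF'
$HOME is not expanded here
EOF
here=`cat "$tmp/here"`

# Remove the scratch files now that their contents are held in variables.
rm -rf "$tmp"
```

Writing the specifications in the other order, 2>&1 >file, would duplicate descriptor 2 from the original standard output before that output is sent to the file, so the two streams would not be merged.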

The environment for a command run in the background such as

     list *.c | lpr &

is modified in two ways. Firstly, the default standard input for such a command is the empty file /dev/null. This prevents two processes (the shell and the command), which are running in parallel, from trying to read the same input. Chaos would ensue if this were not the case. For example,

     ed file &

would allow both the editor and the shell to read from the same input at the same time.

The other modification to the environment of a background command is to turn off the QUIT and INTERRUPT signals so that they are ignored by the command. This allows these signals to be used at the terminal without causing background commands to terminate. For this reason the UNIX convention for a signal is that if it is set to 1 (ignored) then it is never changed even for a short time. Note that the shell command trap has no effect for an ignored signal.

3.8 Invoking the shell

The following flags are interpreted by the shell when it is invoked. If the first character of argument zero is a minus, then commands are read from the file .profile.

-c string
     If the -c flag is present then commands are read from string.

-s   If the -s flag is present or if no arguments remain then commands are read from the standard input. Shell output is written to file descriptor 2.

-i   If the -i flag is present or if the shell input and output are attached to a terminal (as told by gtty) then this shell is interactive. In this case TERMINATE is ignored (so that kill 0 does not kill an interactive shell) and INTERRUPT is caught and ignored (so that wait is interruptible). In all cases QUIT is ignored by the shell.
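
As a rough illustration of the -c and -s flags, the fragment below starts a second shell in each way and captures what it prints. It assumes that the shell is installed under the name sh and can be found through the search path; the variable names are hypothetical. The last step relies on a fact not stated in this section, namely that the exit status of the invoked shell is that of the last command it ran.

```shell
# -c: commands are read from the string argument.
from_string=`sh -c 'echo read from a string'`

# -s: commands are read from the standard input (here, a pipe).
from_stdin=`echo 'echo read from standard input' | sh -s`

# The status returned by the invoked shell can be tested like that of
# any other command; in the else branch $? still holds that status.
if sh -c 'exit 3'
then status=0
else status=$?
fi
```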

Acknowledgements

The design of the shell is based in part on the original UNIX shell [Thompson] and the PWB/UNIX shell [Mashey], some features having been taken from both. Similarities also exist with the command interpreters of the Cambridge Multiple Access System [Hartley] and of CTSS.

I would like to thank Dennis Ritchie and John Mashey for many discussions during the design of the shell. I am also grateful to the members of the Computing Science Research Center and to Joe Maranzano for their comments on drafts of this document.

Appendix A - Grammar

item:           word
                input-output
                name = value

simple-command: item
                simple-command item

command:        simple-command
                ( command-list )
                { command-list }
                for name do command-list done
                for name in word ... do command-list done
                while command-list do command-list done
                until command-list do command-list done
                case word in case-part ...
                     esac
                if command-list then command-list else-part fi

pipeline:       command
                pipeline | command

andor:          pipeline
                andor && pipeline
                andor || pipeline

command-list:   andor
                command-list ;
                command-list &
                command-list ; andor
                command-list & andor

input-output:   > file
                < file
                >> word
                << word

file:           word
                & digit
                & -

case-part:      pattern ) command-list ;;

pattern:        word
                pattern | word

else-part:      elif command-list then command-list else-part
                else command-list
                empty

empty:

word:           a sequence of non-blank characters

name:           a sequence of letters, digits or underscores starting with a letter

digit:          0 1 2 3 4 5 6 7 8 9

Appendix B - Meta-characters and Reserved Words

a) syntactic

     |       pipe symbol
     &&      `andf' symbol
     ||      `orf' symbol
     ;       command separator
     ;;      case delimiter
     &       background commands
     ( )     command grouping
     <       input redirection
     <<      input from a here document
     >       output creation
     >>      output append

b) patterns

     *       match any character(s) including none
     ?       match any single character
     [...]   match any of the enclosed characters

c) substitution

     ${...}  substitute shell variable
     `...`   substitute command output

d) quoting

     \       quote the next character
     '...'   quote the enclosed characters except for '
     "..."   quote the enclosed characters except for $ ` \ "

e) reserved words

     if then else elif fi
     case in esac
     for while until do done
     { }
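
The fragment below is a small sketch, not part of the original appendix, that exercises several of the metacharacters and reserved words listed above: the pattern characters inside a case, the && and || symbols, and ( ) grouping. The words being classified and all variable names are made up for the example.

```shell
# Patterns: classify a few words with *, ? and [...] inside a case.
result=
for w in 7up x main.c notes
do
    case $w in
    [0-9]*)  kind=number ;;
    ?)       kind=single ;;
    *.c|*.h) kind=source ;;
    *)       kind=other ;;
    esac
    result="$result$w:$kind "
done

# && runs the next command only on success, || only on failure.
test -d / && root=yes || root=no

# ( ) groups commands in a sub-shell, so the cd does not affect
# the directory of the shell running this fragment.
before=`pwd`
there=`(cd / ; pwd)`
after=`pwd`
```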