LEARN - Computer-Aided Instruction on UNIX
(Second Edition)

Brian W. Kernighan
Michael E. Lesk

Bell Laboratories
Murray Hill, New Jersey 07974

_A_B_S_T_R_A_C_T

This paper describes the second version of the _l_e_a_r_n program for interpreting CAI scripts on the UNIX* operating system, and a set of scripts that provide a computerized introduction to the system.

Six current scripts cover basic commands and file handling, the editor, additional file handling commands, the _e_q_n program for mathematical typing, the ``-ms'' package of formatting macros, and an introduction to the C programming language. These scripts now include a total of about 530 lessons.

Many users from a wide variety of backgrounds have used _l_e_a_r_n to acquire basic UNIX skills. Most usage involves the first two scripts, an introduction to UNIX files and commands, and the UNIX editor.

The second version of _l_e_a_r_n is about four times faster than the previous one in CPU utilization, and much faster in perceived time because of better overlap of computing and printing. It also requires less file space than the first version. Many of the lessons have been revised; new material has been added to reflect changes and enhancements in UNIX itself. Script-writing is also easier because of revisions to the script language.

__________________________
* UNIX is a Trademark of Bell Laboratories.

September 27, 1987 - 2 -

_1. _I_n_t_r_o_d_u_c_t_i_o_n.

_L_e_a_r_n is a driver for CAI scripts. It is intended to permit the easy composition of lessons and lesson fragments to teach people computer skills. Since it is teaching the same system on which it is implemented, it makes direct use of UNIX facilities to create a controlled UNIX environment. The system includes two main parts: (1) a driver that interprets the lesson scripts; and (2) the lesson scripts themselves.
At present there are six scripts:
- basic file handling commands
- the UNIX text editor _e_d
- advanced file handling
- the _e_q_n language for typing mathematics
- the ``ms'' macro package for document formatting
- the C programming language

The purported advantages of CAI scripts for training in computer skills include the following:
(a) students are forced to perform the exercises that are in fact the basis of training in any case;
(b) students receive immediate feedback and confirmation of progress;
(c) students may progress at their own rate;
(d) no schedule requirements are imposed; students may study at any time convenient for them;
(e) the lessons may be improved individually and the improvements are immediately available to new users;
(f) since the student has access to a computer for the CAI script there is a place to do exercises;
(g) the use of high technology will improve student motivation and the interest of their management.

Opposed to this, of course, is the absence of anyone to whom the student may direct questions. If CAI is used without a ``counselor'' or other assistance, it should properly be compared to a textbook, lecture series, or taped course, rather than to a seminar. CAI has been used for many years in a variety of educational areas. [ bitzer plato 1970 ] [ gray coala 1977 ] [ suppes individualize 1967 ] The use of a computer to teach computer use itself, however, offers unique advantages. The skills developed to get through the script are exactly those needed to use the computer; there is no wasted effort.

The scripts written so far are based on some familiar assumptions about education; these assumptions are outlined in the next section. The remaining sections describe the operation of the script driver and the particular scripts now available.
The driver puts few restrictions on the script writer, but the current scripts are of a rather rigid and stereotyped form in accordance with the theory in the next section and practical limitations.

_2. _E_d_u_c_a_t_i_o_n_a_l _A_s_s_u_m_p_t_i_o_n_s _a_n_d _D_e_s_i_g_n.

First, the way to teach people how to do something is to have them do it. Scripts should not contain long pieces of explanation; they should instead frequently ask the student to do some task. So teaching is always by example: the typical script fragment shows a small example of some technique and then asks the user to either repeat that example or produce a variation on it. All are intended to be easy enough that most students will get most questions right, reinforcing the desired behavior.

Most lessons fall into one of three types. The simplest presents a lesson and asks for a yes or no answer to a question. The student is given a chance to experiment before replying. The script checks for the correct reply. Problems of this form are sparingly used.

The second type asks for a word or number as an answer. For example, a lesson on files might say

     _H_o_w _m_a_n_y _f_i_l_e_s _a_r_e _t_h_e_r_e _i_n _t_h_e _c_u_r_r_e_n_t _d_i_r_e_c_t_o_r_y? _T_y_p_e ``_a_n_s_w_e_r _N'', _w_h_e_r_e _N _i_s _t_h_e _n_u_m_b_e_r _o_f _f_i_l_e_s.

The student is expected to respond (perhaps after experimenting) with

     _a_n_s_w_e_r _1_7

or whatever. Surprisingly often, however, the idea of a substitutable argument (i.e., replacing _N by 17) is difficult for non-programmer students, so the first few such lessons need real care.

The third type of lesson is open-ended - a task is set for the student, appropriate parts of the input or output are monitored, and the student types _r_e_a_d_y when the task is done.
Figure 1 shows a sample dialog that illustrates the last of these, using two lessons about the _c_a_t (concatenate, i.e., print) command taken from early in the script that teaches file handling. Most _l_e_a_r_n lessons are of this form.

_________________________________________________________
Figure 1: Sample dialog from basic files script
(Student responses in italics; `$' is the prompt)

A file can be printed on your terminal
by using the "cat" command. Just say
"cat file" where "file" is the file name.
For example, there is a file named
"food" in this directory. List it
by saying "cat food"; then type "ready".
$ _c_a_t _f_o_o_d
  this is the file
  named food.
$ _r_e_a_d_y

Good. Lesson 3.3a (1)

Of course, you can print any file with "cat".
In particular, it is common to first use
"ls" to find the name of a file and then "cat"
to print it. Note the difference between
"ls", which tells you the name of the file,
and "cat", which tells you the contents.
One file in the current directory is named for
a President. Print the file, then type "ready".
$ _c_a_t _P_r_e_s_i_d_e_n_t
cat: can't open President
$ _r_e_a_d_y

Sorry, that's not right. Do you want to try again? _y_e_s
Try the problem again.
$ _l_s
.ocopy
X1
roosevelt
$ _c_a_t _r_o_o_s_e_v_e_l_t
  this file is named roosevelt
  and contains three lines of
  text.
$ _r_e_a_d_y

Good. Lesson 3.3b (0)

The "cat" command can also print several files
at once. In fact, it is named "cat" as an abbreviation
for "concatenate"....
_________________________________________________________

After each correct response the computer congratulates the student and indicates the lesson number that has just been completed, permitting the student to restart the script after that lesson.
If the answer is wrong, the student is offered a chance to repeat the lesson. The ``speed'' rating of the student (explained in section 5) is given after the lesson number when the lesson is completed successfully; it is printed only for the aid of script authors checking out possible errors in the lessons.

It is assumed that there is no foolproof way to determine if the student truly ``understands'' what he or she is doing; accordingly, the current _l_e_a_r_n scripts only measure performance, not comprehension. If the student can perform a given task, that is deemed to be ``learning.'' [ skinner teaching 1961 ]

The main point of using the computer is that what the student does is checked for correctness immediately. Unlike many CAI scripts, however, these scripts provide few facilities for dealing with wrong answers. In practice, if most of the answers are not right the script is a failure; the universal solution to student error is to provide a new, easier script. Anticipating possible wrong answers is an endless job, and it is really easier as well as better to provide a simpler script.

Along with this goes the assumption that anything can be taught to anybody if it can be broken into sufficiently small pieces. Anything not absorbed in a single chunk is just subdivided.

To avoid boring the faster students, however, an effort is made in the files and editor scripts to provide three tracks of different difficulty. The fastest sequence of lessons is aimed at roughly the bulk and speed of a typical tutorial manual and should be adequate for review and for well-prepared students. The next track is intended for most users and is roughly twice as long. Typically, for example, the fast track might present an idea and ask for a variation on the example shown; the normal track will first ask the student to repeat the example that was shown before attempting a variation.
The third and slowest track, which is often three or four times the length of the fast track, is intended to be adequate for anyone. (The lessons of Figure 1 are from the third track.) The multiple tracks also mean that a student repeating a course is unlikely to hit the same series of lessons; this makes it profitable for a shaky user to back up and try again, and many students have done so.

The tracks are not completely distinct, however. Depending on the number of correct answers the student has given for the last few lessons, the program may switch tracks. The driver is actually capable of following an arbitrary directed graph of lesson sequences, as discussed in section 5. Some more structured arrangement, however, is used in all current scripts to aid the script writer in organizing the material into lessons. It is sufficiently difficult to write lessons that the three-track theory is not followed very closely except in the files and editor scripts. Accordingly, in some cases, the fast track is produced merely by skipping lessons from the slower track. In others, there is essentially only one track.

The main reason for using the _l_e_a_r_n program rather than simply writing the same material as a workbook is not the selection of tracks, but actual hands-on experience. Learning by doing is much more effective than pencil and paper exercises.

_L_e_a_r_n also provides a mechanical check on performance. The first version in fact would not let the student proceed unless it received correct answers to the questions it set and it would not tell a student the right answer. This somewhat Draconian approach has been moderated in version 2. Lessons are sometimes badly worded or even just plain wrong; in such cases, the student has no recourse. But if a student is simply unable to complete one lesson, that should not prevent access to the rest.
Accordingly, the current version of _l_e_a_r_n allows the student to skip a lesson that he or she cannot pass; a ``no'' answer to the ``Do you want to try again?'' question in Figure 1 will pass to the next lesson. It is still true that _l_e_a_r_n will not tell the student the right answer.

Of course, there are valid objections to the assumptions above. In particular, some students may object to not understanding what they are doing; and the procedure of smashing everything into small pieces may provoke the retort ``you can't cross a ditch in two jumps.'' Since writing CAI scripts is considerably more tedious than ordinary manuals, however, it is safe to assume that there will always be alternatives to the scripts as a way of learning. In fact, for a reference manual of 3 or 4 pages it would not be surprising to have a tutorial manual of 20 pages and a (multi-track) script of 100 pages. Thus the reference manual will exist long before the scripts.

_3. _S_c_r_i_p_t_s.

As mentioned above, the present scripts try at most to follow a three-track theory. Thus little of the potential complexity of the possible directed graph is employed, since care must be taken in lesson construction to see that every necessary fact is presented in every possible path through the units. In addition, it is desirable that every unit have alternate successors to deal with student errors.

In most existing courses, the first few lessons are devoted to checking prerequisites. For example, before the student is allowed to proceed through the editor script the script verifies that the student understands files and is able to type. It is felt that the sooner lack of student preparation is detected, the easier it will be on the student.
Anyone proceeding through the scripts should be getting mostly correct answers; otherwise, the system will be unsatisfactory both because the wrong habits are being learned and because the scripts make little effort to deal with wrong answers. Unprepared students should not be encouraged to continue with scripts.

There are some preliminary items which the student must know before any scripts can be tried. In particular, the student must know how to connect to a UNIX system, set the terminal properly, log in, and execute simple commands (e.g., _l_e_a_r_n itself). In addition, the character erase and line kill conventions (# and @) should be known. It is hard to see how this much could be taught by computer-aided instruction, since a student who does not know these basic skills will not be able to run the learning program. A brief description on paper is provided (see Appendix A), although assistance will be needed for the first few minutes. This assistance, however, need not be highly skilled.

The first script in the current set deals with files. It assumes the basic knowledge above and teaches the student about the _l_s, _c_a_t, _m_v, _r_m, _c_p and _d_i_f_f commands. It also deals with the abbreviation characters *, ?, and [ ] in file names. It does not cover pipes or I/O redirection, nor does it present the many options on the _l_s command.

This script contains 31 lessons in the fast track; two are intended as prerequisite checks, seven are review exercises. There are a total of 75 lessons in all three tracks, and the instructional passages typed at the student to begin each lesson total 4,476 words. The average lesson thus begins with a 60-word message. In general, the fast track lessons have somewhat longer introductions, and the slow tracks somewhat shorter ones. The longest message is 144 words and the shortest 14.

The second script trains students in the use of the UNIX context editor _e_d, a sophisticated editor using regular expressions for searching.
[ ritchie thompson unix seventh edition 1978 %O See section _e_d (1). ] All editor features except encryption, mark names and `;' in addressing are covered. The fast track contains 2 prerequisite checks, 93 lessons, and a review lesson. It is supplemented by 146 additional lessons in other tracks.

A comparison of sizes may be of interest. The _e_d description in the reference manual is 2,572 words long. The _e_d tutorial [ kernighan editor tutorial 1974 ] is 6,138 words long. The fast track through the _e_d script is 7,407 words of explanatory messages, and the total _e_d script, 242 lessons, has 15,615 words. The average _e_d lesson is thus also about 60 words; the largest is 171 words and the smallest 10. The original _e_d script represents about three man-weeks of effort.

The advanced file handling script deals with _l_s options, I/O diversion, pipes, and supporting programs like _p_r, _w_c, _t_a_i_l, _s_p_e_l_l and _g_r_e_p. (The basic file handling script is a prerequisite.) It is not as refined as the first two scripts; this is reflected at least partly in the fact that it provides much less of a full three-track sequence than they do. On the other hand, since it is perceived as ``advanced,'' it is hoped that the student will have somewhat more sophistication and be better able to cope with it at a reasonably high level of performance.

A fourth script covers the _e_q_n language for typing mathematics. This script must be run on a terminal capable of printing mathematics, for instance the DASI 300 and similar Diablo-based terminals, or the nearly extinct Model 37 teletype. Again, this script is relatively short of tracks: of 76 lessons, only 17 are in the second track and 2 in the third track. Most of these provide additional practice for students who are having trouble in the first track.

The -_m_s script for formatting macros is a short one-track only script.
The macro package it describes is no longer the standard, so this script will undoubtedly be superseded in the future. Furthermore, the linear style of a single learn script is somewhat inappropriate for the macros, since the macro package is composed of many independent features, and few users need all of them. It would be better to have a selection of short lesson sequences dealing with the features independently.

The script on C is in a state of transition. It was originally designed to follow a tutorial on C, but that document has since become obsolete. The current script has been partially converted to follow the order of presentation in _T_h_e _C _P_r_o_g_r_a_m_m_i_n_g _L_a_n_g_u_a_g_e, [ ritchie kernighan programming 1978 prentice hall ] but this job is not complete. The C script was never intended to teach C; rather it is supposed to be a series of exercises for which the computer provides checking and (upon success) a suggested solution.

This combination of scripts covers much of the material which any UNIX user will need to know to make effective use of the system. With enlargement of the advanced files course to include more on the command interpreter, there will be a relatively complete introduction to UNIX available via _l_e_a_r_n. Although we make no pretense that _l_e_a_r_n will replace other instructional materials, it should provide a useful supplement to existing tutorials and reference manuals.

_4. _E_x_p_e_r_i_e_n_c_e _w_i_t_h _S_t_u_d_e_n_t_s.

_L_e_a_r_n has been installed on many different UNIX systems. Most of the usage is on the first two scripts, so these are more thoroughly debugged and polished. As a (random) sample of user experience, the _l_e_a_r_n program has been used at Bell Labs at Indian Hill for 10,500 lessons in a four month period. About 3600 of these are in the files script, 4100 in the editor, and 1400 in advanced files.
The passing rate is about 80%, that is, about 4 lessons are passed for every one failed. There have been 86 distinct users of the files script, and 58 of the editor. On our system at Murray Hill, there have been nearly 2000 lessons over two weeks that include Christmas and New Year. Users have ranged in age from six up.

It is difficult to characterize typical sessions with the scripts; many instances exist of someone doing one or two lessons and then logging out, as do instances of someone pausing in a script for twenty minutes or more. In the earlier version of _l_e_a_r_n, the average session in the files course took 32 minutes and covered 23 lessons. The distribution is quite broad and skewed, however; the longest session was 130 minutes and there were five sessions shorter than five minutes. The average lesson took about 80 seconds. These numbers are roughly typical for non-programmers; a UNIX expert can do the scripts at approximately 30 seconds per lesson, most of which is the system printing.

At present, working through a section in the middle of the files script took about 1.4 seconds of processor time per lesson, and a system expert typing quickly took 15 seconds of real time per lesson. A novice would probably take at least a minute. Thus a UNIX system could support ten students working simultaneously with some spare capacity.

_5. _T_h_e _S_c_r_i_p_t _I_n_t_e_r_p_r_e_t_e_r.

The _l_e_a_r_n program itself merely interprets scripts. It provides facilities for the script writer to capture student responses and their effects, and simplifies the job of passing control to and recovering control from the student. This section describes the operation and usage of the driver program, and indicates what is required to produce a new script. Readers only interested in the existing scripts may skip this section.

The file structure used by _l_e_a_r_n is shown in Figure 2.
There is one parent directory (named _l_i_b) containing the script data. Within this directory are subdirectories, one for each subject in which a course is available, one for logging (named _l_o_g), and one in which user sub-directories are created (named _p_l_a_y). The subject directory contains master copies of all lessons, plus any supporting material for that subject. In a given subdirectory, each lesson is a single text file. Lessons are usually named systematically; the file that contains lesson _n is called _L_n.

_____________________________________________________________
Figure 2: Directory structure for _l_e_a_r_n

lib
    play
        student1
            files for student1...
        student2
            files for student2...
    files
        L0.1a     lessons for files course
        L0.1b
        ...
    editor
        ...
    (other courses)
    log
_____________________________________________________________

When _l_e_a_r_n is executed, it makes a private directory for the user to work in, within the _l_e_a_r_n portion of the file system. A fresh copy of all the files used in each lesson (mostly data for the student to operate upon) is made each time a student starts a lesson, so the script writer may assume that everything is reinitialized each time a lesson is entered. The student directory is deleted after each session; any permanent records must be kept elsewhere.

The script writer must provide certain basic items in each lesson:
(1) the text of the lesson;
(2) the set-up commands to be executed before the user gets control;
(3) the data, if any, which the user is supposed to edit, transform, or otherwise process;
(4) the evaluating commands to be executed after the user has finished the lesson, to decide whether the answer is right; and
(5) a list of possible successor lessons.
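As a reading aid only, the five items just listed can be pictured as one record per lesson. The sketch below is not taken from the _l_e_a_r_n driver; the class name `Lesson` and every field name are hypothetical, and the sample values are transcribed by hand from the lesson shown in Figure 3.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Lesson:
    """Hypothetical in-memory record for one lesson; all names are illustrative."""
    name: str
    text: str = ""                                       # (1) text of the lesson
    setup: List[str] = field(default_factory=list)       # (2) set-up commands
    data: Dict[str, str] = field(default_factory=dict)   # (3) working files: name -> contents
    checks: List[str] = field(default_factory=list)      # (4) evaluating commands
    # (5) successor lessons, each with an optional speed rating
    successors: List[Tuple[str, Optional[int]]] = field(default_factory=list)


# The "roosevelt" lesson of Figure 3, entered by hand (assumed to be lesson 3.3b,
# the number printed when it is passed in Figure 1).
roosevelt = Lesson(
    name="3.3b",
    text='Of course, you can print any file with "cat".',
    data={"roosevelt": "this file is named roosevelt\nand contains three lines of\ntext.\n"},
    checks=["tail -3 .ocopy >X1", "#cmp X1 roosevelt"],
    successors=[("3.2b", 2)],
)
```

In the real system these items are not separate fields; they are interleaved in a single text file per lesson, as the following paragraphs describe.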
_L_e_a_r_n tries to minimize the work of bookkeeping and installation, so that most of the effort involved in script production is in planning lessons, writing tutorial paragraphs, and coding tests of student performance.

The basic sequence of events is as follows. First, _l_e_a_r_n creates the working directory. Then, for each lesson, _l_e_a_r_n reads the script for the lesson and processes it a line at a time. The lines in the script are: (1) commands to the script interpreter to print something, to create a file, to test something, etc.; (2) text to be printed or put in a file; (3) other lines, which are sent to the shell to be executed. One line in each lesson turns control over to the user; the user can run any UNIX commands. The user mode terminates when the user types _y_e_s, _n_o, _r_e_a_d_y, or _a_n_s_w_e_r. At this point, the user's work is tested; if the lesson is passed, a new lesson is selected, and if not the old one is repeated.

Let us illustrate this with the script for the second lesson of Figure 1; this is shown in Figure 3.

__________________________________________________
Figure 3: Sample Lesson

#print
Of course, you can print any file with "cat".
In particular, it is common to first use
"ls" to find the name of a file and then "cat"
to print it. Note the difference between
"ls", which tells you the name of the file,
and "cat", which tells you the contents.
One file in the current directory is named for
a President. Print the file, then type "ready".
#create roosevelt
  this file is named roosevelt
  and contains three lines of
  text.
#copyout
#user
#uncopyout
tail -3 .ocopy >X1
#cmp X1 roosevelt
#log
#next
3.2b 2
__________________________________________________

Lines which begin with # are commands to the _l_e_a_r_n script interpreter.
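The three kinds of script line described above (interpreter commands, text belonging to a command, and lines for the shell) can be sketched as a small classifier. This is an illustration under stated assumptions, not the driver's actual code: the function name `parse_lesson`, the dictionary layout, and the set of commands assumed to be followed by text are all guesses made from the descriptions in this section.

```python
# Commands assumed to be followed by lines of text (a reading of the
# descriptions in this section, not a list taken from the driver source).
BODY_COMMANDS = {"print", "create", "match", "bad", "succeed", "fail", "next"}


def parse_lesson(script):
    """Classify each script line as an interpreter command, text, or shell line."""
    steps = []
    current = None  # command that is currently collecting text lines, if any
    for line in script.splitlines():
        if line.startswith("#"):
            name, _, arg = line[1:].partition(" ")
            step = {"name": name, "arg": arg.strip(), "body": []}
            steps.append(step)
            # "#print file" names a file, so no in-line text follows it.
            takes_body = name in BODY_COMMANDS and not (name == "print" and arg)
            current = step if takes_body else None
        elif current is not None:
            current["body"].append(line)   # text to be printed or put in a file
        else:
            steps.append({"name": "sh", "arg": line, "body": []})  # shell line
    return steps


# An abbreviated copy of the Figure 3 lesson.
FIGURE_3 = """\
#print
One file in the current directory is named for
a President.  Print the file, then type "ready".
#create roosevelt
  this file is named roosevelt
  and contains three lines of
  text.
#copyout
#user
#uncopyout
tail -3 .ocopy >X1
#cmp X1 roosevelt
#log
#next
3.2b 2
"""

steps = parse_lesson(FIGURE_3)
names = [s["name"] for s in steps]
```

Here `names` lists the steps in script order, with the `tail` line classified as a shell line (`"sh"`) because it follows `#uncopyout`, which takes no text.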
For example,

     #_p_r_i_n_t

causes printing of any text that follows, up to the next line that begins with a sharp (#).

     #_p_r_i_n_t _f_i_l_e

prints the contents of _f_i_l_e; it is the same as _c_a_t _f_i_l_e but has less overhead. Both forms of #_p_r_i_n_t have the added property that if a lesson is failed, the #_p_r_i_n_t will not be executed the second time through; this avoids annoying the student by repeating the preamble to a lesson.

     #_c_r_e_a_t_e _f_i_l_e_n_a_m_e

creates a file of the specified name, and copies any subsequent text up to a # to the file. This is used for creating and initializing working files and reference data for the lessons.

     #_u_s_e_r

gives control to the student; each line he or she types is passed to the shell for execution. The #_u_s_e_r mode is terminated when the student types one of _y_e_s, _n_o, _r_e_a_d_y or _a_n_s_w_e_r. At that time, the driver resumes interpretation of the script.

     #_c_o_p_y_i_n
     #_u_n_c_o_p_y_i_n

Anything the student types between these commands is copied onto a file called ._c_o_p_y. This lets the script writer interrogate the student's responses upon regaining control.

     #_c_o_p_y_o_u_t
     #_u_n_c_o_p_y_o_u_t

Between these commands, any material typed at the student by any program is copied to the file ._o_c_o_p_y. This lets the script writer interrogate the effect of what the student typed, which true believers in the performance theory of learning usually prefer to the student's actual input.

     #_p_i_p_e
     #_u_n_p_i_p_e

Normally the student input and the script commands are fed to the UNIX command interpreter (the ``shell'') one line at a time. This won't do if, for example, a sequence of editor commands is provided, since the input to the editor must be handed to the editor, not to the shell. Accordingly, the material between #_p_i_p_e and #_u_n_p_i_p_e commands is fed continuously through a pipe so that such sequences work.
If _c_o_p_y_o_u_t is also desired the _c_o_p_y_o_u_t brackets must include the _p_i_p_e brackets.

There are several commands for setting status after the student has attempted the lesson.

     #_c_m_p _f_i_l_e_1 _f_i_l_e_2

is an in-line implementation of _c_m_p, which compares two files for identity.

     #_m_a_t_c_h _s_t_u_f_f

The last line of the student's input is compared to _s_t_u_f_f, and the success or fail status is set according to it. Extraneous things like the word _a_n_s_w_e_r are stripped before the comparison is made. There may be several #_m_a_t_c_h lines; this provides a convenient mechanism for handling multiple ``right'' answers. Any text up to a # on subsequent lines after a successful #_m_a_t_c_h is printed; this is illustrated in Figure 4, another sample lesson.

_________________________________________________
Figure 4: Another Sample Lesson

#print
What command will move the current line
to the end of the file? Type
"answer COMMAND", where COMMAND is the command.
#copyin
#user
#uncopyin
#match m$
#match .m$
"m$" is easier.
#log
#next
63.1d 10
_________________________________________________

     #_b_a_d _s_t_u_f_f

This is similar to #_m_a_t_c_h, except that it corresponds to specific failure answers; this can be used to produce hints for particular wrong answers that have been anticipated by the script writer.

     #_s_u_c_c_e_e_d
     #_f_a_i_l

print a message upon success or failure (as determined by some previous mechanism).

When the student types one of the ``commands'' _y_e_s, _n_o, _r_e_a_d_y, or _a_n_s_w_e_r, the driver terminates the #_u_s_e_r command, and evaluation of the student's work can begin. This can be done either by the built-in commands above, such as #_m_a_t_c_h and #_c_m_p, or by status returned by normal UNIX commands, typically _g_r_e_p and _t_e_s_t.
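The #_m_a_t_c_h and #_b_a_d behaviour described above might be sketched as follows. The function `check_answer` and its argument layout are hypothetical, and the use of exact string equality after stripping the word ``answer'' is an assumption; the text does not say how the driver performs the comparison.

```python
def check_answer(last_line, matches, bads=()):
    """Return (passed, message) for the last line the student typed.

    matches and bads are sequences of (stuff, message) pairs in script order,
    where message is the text that followed the #match or #bad line.
    """
    words = last_line.split()
    # Strip the extraneous leading word "answer" before comparing.
    if words and words[0] == "answer":
        words = words[1:]
    reply = " ".join(words)
    for stuff, message in matches:      # any one of several "right" answers passes
        if reply == stuff:
            return True, message
    for stuff, message in bads:         # anticipated wrong answers can carry a hint
        if reply == stuff:
            return False, message
    return False, ""


# The two #match lines of Figure 4.
FIGURE_4_MATCHES = [("m$", ""), (".m$", '"m$" is easier.')]
```

With the Figure 4 data, `check_answer("answer .m$", FIGURE_4_MATCHES)` passes and carries the remark that "m$" is easier, while `check_answer("answer m$", FIGURE_4_MATCHES)` passes with no message.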
The last command should return status true (0) if the task was done successfully and false (non-zero) otherwise; this status return tells the driver whether or not the student has successfully passed the lesson.

Performance can be logged:

     #_l_o_g _f_i_l_e

writes the date, lesson, user name and speed rating, and a success/failure indication on _f_i_l_e. The command

     #_l_o_g

by itself writes the logging information in the logging directory within the _l_e_a_r_n hierarchy, and is the normal form.

     #_n_e_x_t

is followed by a few lines, each with a successor lesson name and an optional speed rating on it. A typical set might read

     25.1a 10
     25.2a 5
     25.3a 2

indicating that unit 25.1a is a suitable follow-on lesson for students with a speed rating of 10 units, 25.2a for students with speed near 5, and 25.3a for speed near 2. Speed ratings are maintained for each session with a student; the rating is increased by one each time the student gets a lesson right and decreased by four each time the student gets a lesson wrong. Thus the driver tries to maintain a level such that the users get 80% right answers. The maximum rating is limited to 10 and the minimum to 0. The initial rating is zero unless the student specifies a different rating when starting a session.

If the student passes a lesson, a new lesson is selected and the process repeats. If the student fails, a false status is returned and the program reverts to the previous lesson and tries another alternative. If it can not find another alternative, it skips forward a lesson. The student may also type _b_y_e, which causes a graceful exit from the _l_e_a_r_n system. Hanging up is the usual novice's way out.

The lessons may form an arbitrary directed graph, although the present program imposes a limitation on cycles in that it will not present a lesson twice in the same session.
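The speed-rating rule and the choice among #_n_e_x_t successors described above can be illustrated with a short sketch. The function names are hypothetical, and the "nearest rating wins" rule in `pick_successor` is an assumption: the text says only that each successor suits students with a speed ``near'' its rating. Successors without a rating are not handled here.

```python
def update_rating(rating, passed):
    """Add one for a lesson passed, subtract four for a lesson failed, keep within 0..10."""
    rating += 1 if passed else -4
    return max(0, min(10, rating))


def pick_successor(successors, rating):
    """Choose the successor whose speed rating is closest to the student's (assumed rule).

    successors is a list of (lesson_name, speed) pairs from the #next lines;
    on a tie the entry listed first is chosen.
    """
    return min(successors, key=lambda s: abs(s[1] - rating))[0]


NEXT_LINES = [("25.1a", 10), ("25.2a", 5), ("25.3a", 2)]

# Four passes and one failure leave the rating where it started, which is
# why the rule settles at about 80% right answers.
r = 5
for passed in (True, True, True, True, False):
    r = update_rating(r, passed)
```

Under these assumptions a student with rating 4 would be sent to 25.2a, and a new student starting at zero would be sent to 25.3a.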
If the student is unable to answer one of the exercises correctly, the driver searches for a previous lesson with a set of alternatives as successors (following the #_n_e_x_t line). From the previous lesson with alternatives one route was taken earlier; the program simply tries a different one.

It is perfectly possible to write sophisticated scripts that evaluate the student's speed of response, or try to estimate the elegance of the answer, or provide detailed analysis of wrong answers. Lesson writing is so tedious already, however, that most of these abilities are likely to go unused.

The driver program depends heavily on features of UNIX that are not available on many other operating systems. These include the ease of manipulating files and directories, file redirection, the ability to use the command interpreter as just another program (even in a pipeline), command status testing and branching, the ability to catch signals like interrupts, and of course the pipeline mechanism itself. Although some parts of _l_e_a_r_n might be transferable to other systems, some generality will probably be lost.

A bit of history: The first version of _l_e_a_r_n had fewer built-in words in the driver program, and made more use of the facilities of UNIX. For example, file comparison was done by creating a _c_m_p process, rather than comparing the two files within _l_e_a_r_n. Lessons were not stored as text files, but as archives. There was no concept of the in-line document; even #_p_r_i_n_t had to be followed by a file name. Thus the initialization for each lesson was to extract the archive into the working directory (typically 4-8 files), then #_p_r_i_n_t the lesson text.

The combination of such things made _l_e_a_r_n slower. The new version is about 4 or 5 times faster.
Furthermore, it appears even faster to the user because in a typical lesson, the printing of the message comes first, and file setup with #_c_r_e_a_t_e can be overlapped with the printing, so that when the program finishes printing, it is really ready for the user to type at it.

It is also a great advantage to the script maintainer that lessons are now just ordinary text files. They can be edited without any difficulty, and UNIX text manipulation tools can be applied to them. The result has been that there is much less resistance to going in and fixing substandard lessons.

_6. _C_o_n_c_l_u_s_i_o_n_s

The following observations can be made about secretaries, typists, and other non-programmers who have used _l_e_a_r_n:

(a) A novice must have assistance with the mechanics of communicating with the computer to get through to the first lesson or two; once the first few lessons are passed people can proceed on their own.

(b) The terminology used in the first few lessons is obscure to those inexperienced with computers. It would help if there were a low-level reference card for UNIX to supplement the existing bulky, programmer-oriented manual and bulky reference card.

(c) The concept of ``substitutable argument'' is hard to grasp, and requires help.

(d) They enjoy the system for the most part. Motivation matters a great deal, however.

It takes an hour or two for a novice to get through the script on file handling. The total time for a reasonably intelligent and motivated novice to proceed from ignorance to a reasonable ability to create new files and manipulate old ones seems to be a few days, with perhaps half of each day spent on the machine.

The normal way of proceeding has been to have students in the same room with someone who knows UNIX and the scripts. Thus the student is not brought to a halt by difficult questions. The burden on the counselor, however, is much lower than that on a teacher of a course.
Ideally, the students should be encouraged to proceed with instruction immediately prior to their actual use of the computer. They should exercise the scripts on the same computer and the same kind of terminal that they will later use for their real work, and their first few jobs for the computer should be relatively easy ones. Also, both training and initial work should take place on days when the UNIX hardware and software are working reliably. Rarely is all of this possible, but the closer one comes the better the result. For example, if it is known that the hardware is shaky one day, it is better to attempt to reschedule training for another one. Students are very frustrated by machine downtime; when nothing is happening, it takes some sophistication and experience to distinguish an infinite loop, a slow but functioning program, a program waiting for the user, and a broken machine.*

One disadvantage of training with _l_e_a_r_n is that students come to depend completely on the CAI system, and do not try to read manuals or use other learning aids. This is unfortunate, not only because of the increased demands for completeness and accuracy of the scripts, but because the scripts do not cover all of the UNIX system. New users should have manuals (appropriate for their level) and read them; the scripts ought to be altered to recommend suitable documents and urge students to read them.

There are several other difficulties which are clearly evident. From the student's viewpoint, the most serious is that lessons still crop up which simply can't be passed. Sometimes this is due to poor explanations, but just as often it is some error in the lesson itself - a botched setup, a missing file, an invalid test for correctness, or some system facility that doesn't work on the local system in the same way it did on the development system.
It takes knowledge and a certain healthy arrogance on the part of the user to recognize that the fault is not his or hers, but the script writer's. Permitting the student to get on with the next lesson regardless does alleviate this somewhat, and the logging facilities make it easy to watch for lessons that no one can pass, but it is still a problem.

The biggest problem with the previous _l_e_a_r_n was speed (or lack thereof) - it was often excruciatingly slow and made a significant drain on the system. The current version so far does not seem to have that difficulty, although some scripts, notably _e_q_n, are intrinsically slow. _e_q_n, for example, must do a lot of work even to print its introductions, let alone check the student responses, but delay is perceptible in all scripts from time to time.

Another potential problem is that it is possible to break _l_e_a_r_n inadvertently, by pushing interrupt at the wrong time, or by removing critical files, or any number of similar slips. The defenses against such problems have steadily been improved, to the point where most students should not notice difficulties. Of course, it will always be possible to break _l_e_a_r_n maliciously, but this is not likely to be a problem.

One area is more fundamental - some UNIX commands are sufficiently global in their effect that _l_e_a_r_n currently does not allow them to be executed at all. The most obvious is _c_d, which changes to another directory. The prospect of a student who is learning about directories inadvertently moving to some random directory and removing files has deterred us from even writing lessons on _c_d, but ultimately lessons on such topics probably should be added.

__________________________
* We have even known an expert programmer to decide the computer was broken when he had simply left his terminal in local mode. Novices have great difficulties with such problems.

_7. _A_c_k_n_o_w_l_e_d_g_m_e_n_t_s

We are grateful to all those who have tried _l_e_a_r_n, for we have benefited greatly from their suggestions and criticisms. In particular, M. E. Bittrich, J. L. Blue, S. I. Feldman, P. A. Fox, and M. J. McAlpin have provided substantial feedback. Conversations with E. Z. Rothkopf also provided many of the ideas in the system. We are also indebted to Don Jackowski for serving as a guinea pig for the second version, and to Tom Plum for his efforts to improve the C script.