DC - An Interactive Desk Calculator Robert Morris Lorinda Cherry _A_B_S_T_R_A_C_T DC is an interactive desk calculator program implemented on the UNIX|- time-sharing system to do arbitrary-precision integer arithmetic. It has provision for manipulating scaled fixed-point numbers and for input and output in bases other than decimal. The size of numbers that can be manipulated is limited only by available core storage. On typical implementations of UNIX, the size of numbers that can be handled varies from several hundred digits on the smallest systems to several thousand on the largest. DC is an arbitrary precision arithmetic package imple- mented on the UNIX time-sharing system in the form of an interactive desk calculator. It works like a stacking cal- culator using reverse Polish notation. Ordinarily DC operates on decimal integers, but one may specify an input base, output base, and a number of fractional digits to be maintained. A language called BC [1] has been developed which accepts programs written in the familiar style of higher- level programming languages and compiles output which is interpreted by DC. Some of the commands described below were designed for the compiler interface and are not easy for a human user to manipulate. Numbers that are typed into DC are put on a push-down stack. DC commands work by taking the top number or two off the stack, performing the desired operation, and pushing the result on the stack. If an argument is given, input is _________________________ |- UNIX is a trademark of Bell Laboratories. USD:5-2 DC - An Interactive Desk Calculator taken from that file until its end, then from the standard input. _S_Y_N_O_P_T_I_C _D_E_S_C_R_I_P_T_I_O_N Here we describe the DC commands that are intended for use by people. The additional commands that are intended to be invoked by compiled output are described in the detailed description. Any number of commands are permitted on a line. Blanks and new-line characters are ignored except within numbers and in places where a register name is expected. The following constructions are recognized: _n_u_m_b_e_r The value of the number is pushed onto the main stack. A number is an unbroken string of the digits 0-9 and the capital letters A-F which are treated as digits with values 10-15 respectively. The number may be pre- ceded by an underscore to input a negative number. Numbers may contain decimal points. + - * % ^ The top two values on the stack are added (+), sub- tracted (-), multiplied (*), divided (/), remaindered (%), or exponentiated (^). The two entries are popped off the stack; the result is pushed on the stack in their place. The result of a division is an integer truncated toward zero. See the detailed description below for the treatment of numbers with decimal points. An exponent must not have any digits after the decimal point. _s_x The top of the main stack is popped and stored into a register named _x, where _x may be any character. If the s is capitalized, _x is treated as a stack and the value is pushed onto it. Any character, even blank or new- line, is a valid register name. _l_x The value in register _x is pushed onto the stack. The register _x is not altered. If the l is capitalized, register _x is treated as a stack and its top value is popped onto the main stack. All registers start with empty value which is treated as a zero by the command l and is treated as an error by the DC - An Interactive Desk Calculator USD:5-3 command L. _d The top value on the stack is duplicated. _p The top value on the stack is printed. The top value remains unchanged. _f All values on the stack and in registers are printed. _x treats the top element of the stack as a character string, removes it from the stack, and executes it as a string of DC commands. [ ... ] puts the bracketed character string onto the top of the stack. _q exits the program. If executing a string, the recur- sion level is popped by two. If q is capitalized, the top value on the stack is popped and the string execu- tion level is popped by that value. <_x >_x =_x !<_x !>_x !=_x The top two elements of the stack are popped and com- pared. Register _x is executed if they obey the stated relation. Exclamation point is negation. _v replaces the top element on the stack by its square root. The square root of an integer is truncated to an integer. For the treatment of numbers with decimal points, see the detailed description below. ! interprets the rest of the line as a UNIX command. Control returns to DC when the UNIX command terminates. USD:5-4 DC - An Interactive Desk Calculator _c All values on the stack are popped; the stack becomes empty. _i The top value on the stack is popped and used as the number radix for further input. If i is capitalized, the value of the input base is pushed onto the stack. No mechanism has been provided for the input of arbi- trary numbers in bases less than 1 or greater than 16. _o The top value on the stack is popped and used as the number radix for further output. If o is capitalized, the value of the output base is pushed onto the stack. _k The top of the stack is popped, and that value is used as a scale factor that influences the number of decimal places that are maintained during multiplication, divi- sion, and exponentiation. The scale factor must be greater than or equal to zero and less than 100. If k is capitalized, the value of the scale factor is pushed onto the stack. _z The value of the stack level is pushed onto the stack. ? A line of input is taken from the input source (usually the console) and executed. _D_E_T_A_I_L_E_D _D_E_S_C_R_I_P_T_I_O_N _I_n_t_e_r_n_a_l _R_e_p_r_e_s_e_n_t_a_t_i_o_n _o_f _N_u_m_b_e_r_s Numbers are stored internally using a dynamic storage allocator. Numbers are kept in the form of a string of digits to the base 100 stored one digit per byte (centennial digits). The string is stored with the low-order digit at the beginning of the string. For example, the representa- tion of 157 is 57,1. After any arithmetic operation on a number, care is taken that all digits are in the range 0-99 and that the number has no leading zeros. The number zero is represented by the empty string. Negative numbers are represented in the 100's comple- ment notation, which is analogous to two's complement DC - An Interactive Desk Calculator USD:5-5 notation for binary numbers. The high order digit of a negative number is always -1 and all other digits are in the range 0-99. The digit preceding the high order -1 digit is never a 99. The representation of -157 is 43,98,-1. We shall call this the canonical form of a number. The advan- tage of this kind of representation of negative numbers is ease of addition. When addition is performed digit by digit, the result is formally correct. The result need only be modified, if necessary, to put it into canonical form. Because the largest valid digit is 99 and the byte can hold numbers twice that large, addition can be carried out and the handling of carries done later when that is con- venient, as it sometimes is. An additional byte is stored with each number beyond the high order digit to indicate the number of assumed decimal digits after the decimal point. The representation of .001 is 1,_3 where the scale has been italicized to emphasize the fact that it is not the high order digit. The value of this extra byte is called the scale factor of the number. _T_h_e _A_l_l_o_c_a_t_o_r DC uses a dynamic string storage allocator for all of its internal storage. All reading and writing of numbers internally is done through the allocator. Associated with each string in the allocator is a four-word header contain- ing pointers to the beginning of the string, the end of the string, the next place to write, and the next place to read. Communication between the allocator and DC is done via pointers to these headers. The allocator initially has one large string on a list of free strings. All headers except the one pointing to this string are on a list of free headers. Requests for strings are made by size. The size of the string actually supplied is the next higher power of 2. When a request for a string is made, the allocator first checks the free list to see if there is a string of the desired size. If none is found, the allocator finds the next larger free string and splits it repeatedly until it has a string of the right size. Left-over strings are put on the free list. If there are no larger strings, the allocator tries to coalesce smaller free strings into larger ones. Since all strings are the result of splitting large strings, each string has a neighbor that is next to it in core and, if free, can be combined with it to make a string twice as long. This is an implementation of the `buddy system' of allocation described in [2]. Failing to find a string of the proper length after coalescing, the allocator asks the system for more space. USD:5-6 DC - An Interactive Desk Calculator The amount of space on the system is the only limitation on the size and number of strings in DC. If at any time in the process of trying to allocate a string, the allocator runs out of headers, it also asks the system for more space. There are routines in the allocator for reading, writ- ing, copying, rewinding, forward-spacing, and backspacing strings. All string manipulation is done using these rou- tines. The reading and writing routines increment the read pointer or write pointer so that the characters of a string are read or written in succession by a series of read or write calls. The write pointer is interpreted as the end of the information-containing portion of a string and a call to read beyond that point returns an end-of-string indication. An attempt to write beyond the end of a string causes the allocator to allocate a larger space and then copy the old string into the larger block. _I_n_t_e_r_n_a_l _A_r_i_t_h_m_e_t_i_c All arithmetic operations are done on integers. The operands (or operand) needed for the operation are popped from the main stack and their scale factors stripped off. Zeros are added or digits removed as necessary to get a properly scaled result from the internal arithmetic routine. For example, if the scale of the operands is different and decimal alignment is required, as it is for addition, zeros are appended to the operand with the smaller scale. After performing the required arithmetic operation, the proper scale factor is appended to the end of the number before it is pushed on the stack. A register called scale plays a part in the results of most arithmetic operations. scale is the bound on the number of decimal places retained in arithmetic computa- tions. scale may be set to the number on the top of the stack truncated to an integer with the k command. K may be used to push the value of scale on the stack. scale must be greater than or equal to 0 and less than 100. The descrip- tions of the individual arithmetic operations will include the exact effect of scale on the computations. _A_d_d_i_t_i_o_n _a_n_d _S_u_b_t_r_a_c_t_i_o_n The scales of the two numbers are compared and trailing zeros are supplied to the number with the lower scale to give both numbers the same scale. The number with the smaller scale is multiplied by 10 if the difference of the scales is odd. The scale of the result is then set to the larger of the scales of the two operands. Subtraction is performed by negating the number to be DC - An Interactive Desk Calculator USD:5-7 subtracted and proceeding as in addition. Finally, the addition is performed digit by digit from the low order end of the number. The carries are propagated in the usual way. The resulting number is brought into canonical form, which may require stripping of leading zeros, or for negative numbers replacing the high-order con- figuration 99,-1 by the digit -1. In any case, digits which are not in the range 0-99 must be brought into that range, propagating any carries or borrows that result. _M_u_l_t_i_p_l_i_c_a_t_i_o_n The scales are removed from the two operands and saved. The operands are both made positive. Then multiplication is performed in a digit by digit manner that exactly mimics the hand method of multiplying. The first number is multiplied by each digit of the second number, beginning with its low order digit. The intermediate products are accumulated into a partial sum which becomes the final product. The product is put into the canonical form and its sign is computed from the signs of the original operands. The scale of the result is set equal to the sum of the scales of the two operands. If that scale is larger than the internal register scale and also larger than both of the scales of the two operands, then the scale of the result is set equal to the largest of these three last quantities. _D_i_v_i_s_i_o_n The scales are removed from the two operands. Zeros are appended or digits removed from the dividend to make the scale of the result of the integer division equal to the internal quantity scale. The signs are removed and saved. Division is performed much as it would be done by hand. The difference of the lengths of the two numbers is com- puted. If the divisor is longer than the dividend, zero is returned. Otherwise the top digit of the divisor is divided into the top two digits of the dividend. The result is used as the first (high-order) digit of the quotient. It may turn out be one unit too low, but if it is, the next trial quotient will be larger than 99 and this will be adjusted at the end of the process. The trial digit is multiplied by the divisor and the result subtracted from the dividend and the process is repeated to get additional quotient digits until the remaining dividend is smaller than the divisor. At the end, the digits of the quotient are put into the canonical form, with propagation of carry as needed. The sign is set from the sign of the operands. USD:5-8 DC - An Interactive Desk Calculator _R_e_m_a_i_n_d_e_r The division routine is called and division is per- formed exactly as described. The quantity returned is the remains of the dividend at the end of the divide process. Since division truncates toward zero, remainders have the same sign as the dividend. The scale of the remainder is set to the maximum of the scale of the dividend and the scale of the quotient plus the scale of the divisor. _S_q_u_a_r_e _R_o_o_t The scale is stripped from the operand. Zeros are added if necessary to make the integer result have a scale that is the larger of the internal quantity scale and the scale of the operand. The method used to compute sqrt(y) is Newton's method with successive approximations by the rule The initial guess is found by taking the integer square root of the top two digits. _E_x_p_o_n_e_n_t_i_a_t_i_o_n Only exponents with zero scale factor are handled. If the exponent is zero, then the result is 1. If the exponent is negative, then it is made positive and the base is divided into one. The scale of the base is removed. The integer exponent is viewed as a binary number. The base is repeatedly squared and the result is obtained as a product of those powers of the base that correspond to the positions of the one-bits in the binary representation of the exponent. Enough digits of the result are removed to make the scale of the result the same as if the indicated multiplication had been performed. _I_n_p_u_t _C_o_n_v_e_r_s_i_o_n _a_n_d _B_a_s_e Numbers are converted to the internal representation as they are read in. The scale stored with a number is simply the number of fractional digits input. Negative numbers are indicated by preceding the number with a _ (an underscore). The hexadecimal digits A-F correspond to the numbers 10-15 regardless of input base. The i command can be used to change the base of the input numbers. This command pops the stack, truncates the resulting number to an integer, and uses it as the input base for all further input. The input base is initialized to 10 but may, for example be changed to 8 or 16 to do octal or hexadecimal to decimal conversions. The command I will push the value of the input base on the stack. DC - An Interactive Desk Calculator USD:5-9 _O_u_t_p_u_t _C_o_m_m_a_n_d_s The command p causes the top of the stack to be printed. It does not remove the top of the stack. All of the stack and internal registers can be output by typing the command f. The o command can be used to change the output base. This command uses the top of the stack, truncated to an integer as the base for all further output. The output base in initialized to 10. It will work correctly for any base. The command O pushes the value of the output base on the stack. _O_u_t_p_u_t _F_o_r_m_a_t _a_n_d _B_a_s_e The input and output bases only affect the interpreta- tion of numbers on input and output; they have no effect on arithmetic computations. Large numbers are output with 70 characters per line; a \ indicates a continued line. All choices of input and output bases work correctly, although not all are useful. A particularly useful output base is 100000, which has the effect of grouping digits in fives. Bases of 8 and 16 can be used for decimal-octal or decimal- hexadecimal conversions. _I_n_t_e_r_n_a_l _R_e_g_i_s_t_e_r_s Numbers or strings may be stored in internal registers or loaded on the stack from registers with the commands s and l. The command s_x pops the top of the stack and stores the result in register x. _x can be any character. l_x puts the contents of register x on the top of the stack. The l command has no effect on the contents of register _x. The s command, however, is destructive. _S_t_a_c_k _C_o_m_m_a_n_d_s The command c clears the stack. The command d pushes a duplicate of the number on the top of the stack on the stack. The command z pushes the stack size on the stack. The command X replaces the number on the top of the stack with its scale factor. The command Z replaces the top of the stack with its length. _S_u_b_r_o_u_t_i_n_e _D_e_f_i_n_i_t_i_o_n_s _a_n_d _C_a_l_l_s Enclosing a string in [ ] pushes the ascii string on the stack. The q command quits or in executing a string, pops the recursion levels by two. _I_n_t_e_r_n_a_l _R_e_g_i_s_t_e_r_s - _P_r_o_g_r_a_m_m_i_n_g _D_C The load and store commands together with [ ] to store strings, x to execute and the testing commands `<', `>', `=', `!<', `!>', `!=' can be used to program DC. The x USD:5-10 DC - An Interactive Desk Calculator command assumes the top of the stack is an string of DC com- mands and executes it. The testing commands compare the top two elements on the stack and if the relation holds, execute the register that follows the relation. For example, to print the numbers 0-9, [lip1+ si li10>a]sa 0si lax _P_u_s_h-_D_o_w_n _R_e_g_i_s_t_e_r_s _a_n_d _A_r_r_a_y_s These commands were designed for used by a compiler, not by people. They involve push-down registers and arrays. In addition to the stack that commands work on, DC can be thought of as having individual stacks for each register. These registers are operated on by the commands S and L. S_x pushes the top value of the main stack onto the stack for the register _x. L_x pops the stack for register _x and puts the result on the main stack. The commands s and l also work on registers but not as push-down stacks. l doesn't effect the top of the register stack, and s destroys what was there before. The commands to work on arrays are : and ;. :_x pops the stack and uses this value as an index into the array _x. The next element on the stack is stored at this index in _x. An index must be greater than or equal to 0 and less than 2048. ;_x is the command to load the main stack from the array _x. The value on the top of the stack is the index into the array _x of the value to be loaded. _M_i_s_c_e_l_l_a_n_e_o_u_s _C_o_m_m_a_n_d_s The command ! interprets the rest of the line as a UNIX command and passes it to UNIX to execute. One other com- piler command is Q. This command uses the top of the stack as the number of levels of recursion to skip. _D_E_S_I_G_N _C_H_O_I_C_E_S The real reason for the use of a dynamic storage allo- cator was that a general purpose program could be (and in fact has been) used for a variety of other tasks. The allo- cator has some value for input and for compiling (i.e. the bracket [...] commands) where it cannot be known in advance how long a string will be. The result was that at a modest cost in execution time, all considerations of string alloca- tion and sizes of strings were removed from the remainder of the program and debugging was made easier. The allocation method used wastes approximately 25% of available space. The choice of 100 as a base for internal arithmetic seemingly has no compelling advantage. Yet the base cannot DC - An Interactive Desk Calculator USD:5-11 exceed 127 because of hardware limitations and at the cost of 5% in space, debugging was made a great deal easier and decimal output was made much faster. The reason for a stack-type arithmetic design was to permit all DC commands from addition to subroutine execution to be implemented in essentially the same way. The result was a considerable degree of logical separation of the final program into modules with very little communication between modules. The rationale for the lack of interaction between the scale and the bases was to provide an understandable means of proceeding after a change of base or scale when numbers had already been entered. An earlier implementation which had global notions of scale and base did not work out well. If the value of scale were to be interpreted in the current input or output base, then a change of base or scale in the midst of a computation would cause great confusion in the interpretation of the results. The current scheme has the advantage that the value of the input and output bases are only used for input and output, respectively, and they are ignored in all other operations. The value of scale is not used for any essential purpose by any part of the program and it is used only to prevent the number of decimal places resulting from the arithmetic operations from growing beyond all bounds. The design rationale for the choices for the scales of the results of arithmetic were that in no case should any significant digits be thrown away if, on appearances, the user actually wanted them. Thus, if the user wants to add the numbers 1.5 and 3.517, it seemed reasonable to give him the result 5.017 without requiring him to unnecessarily specify his rather obvious requirements for precision. On the other hand, multiplication and exponentiation produce results with many more digits than their operands and it seemed reasonable to give as a minimum the number of decimal places in the operands but not to give more than that number of digits unless the user asked for them by specifying a value for scale. Square root can be handled in just the same way as multiplication. The operation of divi- sion gives arbitrarily many decimal places and there is sim- ply no way to guess how many places the user wants. In this case only, the user must specify a scale to get any decimal places at all. The scale of remainder was chosen to make it possible to recreate the dividend from the quotient and remainder. This is easy to implement; no digits are thrown away. USD:5-12 DC - An Interactive Desk Calculator _R_e_f_e_r_e_n_c_e_s [1] L. L. Cherry, R. Morris, _B_C - _A_n _A_r_b_i_t_r_a_r_y _P_r_e_c_i_s_i_o_n _D_e_s_k-_C_a_l_c_u_l_a_t_o_r _L_a_n_g_u_a_g_e. [2] K. C. Knowlton, _A _F_a_s_t _S_t_o_r_a_g_e _A_l_l_o_c_a_t_o_r, Comm. ACM 8, pp. 623-625 (Oct. 1965).