DC - An Interactive Desk Calculator


                       Robert Morris

                       Lorinda Cherry


                          _A_B_S_T_R_A_C_T

          DC is an interactive desk calculator  program
     implemented on the UNIX|- time-sharing system to do
     arbitrary-precision  integer  arithmetic.   It has
     provision  for  manipulating  scaled   fixed-point
     numbers  and  for  input and output in bases other
     than decimal.

          The size of numbers that can  be  manipulated
     is  limited  only  by  available core storage.  On
     typical  implementations  of  UNIX,  the  size  of
     numbers  that  can  be handled varies from several
     hundred digits on the smallest systems to  several
     thousand on the largest.


     DC is an arbitrary precision arithmetic package  imple-
mented  on  the  UNIX  time-sharing system in the form of an
interactive desk calculator.  It works like a stacking  cal-
culator   using  reverse  Polish  notation.   Ordinarily  DC
operates on decimal integers, but one may specify  an  input
base,  output  base, and a number of fractional digits to be
maintained.

     A language called  BC  [1]  has  been  developed  which
accepts  programs  written  in the familiar style of higher-
level programming languages and  compiles  output  which  is
interpreted  by  DC.   Some  of the commands described below
were designed for the compiler interface and  are  not  easy
for a human user to manipulate.

     Numbers that are typed into DC are put on  a  push-down
stack.  DC commands work by taking the top number or two off
the stack, performing the desired operation, and pushing the
result  on  the  stack.   If  an argument is given, input is
_________________________
|- UNIX is a trademark of Bell Laboratories.


USD:5-2                  DC - An Interactive Desk Calculator


taken from that file until its end, then from  the  standard
input.

_S_Y_N_O_P_T_I_C _D_E_S_C_R_I_P_T_I_O_N

     Here we describe the DC commands that are intended  for
use by people.  The additional commands that are intended to
be invoked by compiled output are described in the  detailed
description.

     Any number of commands are permitted on a line.  Blanks
and  new-line  characters  are ignored except within numbers
and in places where a register name is expected.

     The following constructions are recognized:

_n_u_m_b_e_r

     The value of the number is pushed onto the main  stack.
     A  number  is  an unbroken string of the digits 0-9 and
     the capital letters A-F which  are  treated  as  digits
     with values 10-15 respectively.  The number may be pre-
     ceded by an underscore  to  input  a  negative  number.
     Numbers may contain decimal points.

+  -  *  %  ^

     The top two values on the stack  are  added  (+),  sub-
     tracted  (-),  multiplied (*), divided (/), remaindered
     (%), or exponentiated (^).  The two entries are  popped
     off  the  stack;  the  result is pushed on the stack in
     their place.  The result of a division  is  an  integer
     truncated  toward  zero.   See the detailed description
     below for the treatment of numbers with decimal points.
     An  exponent must not have any digits after the decimal
     point.

_s_x

     The top of the main stack is popped and stored  into  a
     register named _x, where _x may be any character.  If the
     s is capitalized, _x is treated as a stack and the value
     is  pushed  onto it.  Any character, even blank or new-
     line, is a valid register name.

_l_x

     The value in register _x is pushed onto the stack.   The
     register  _x  is  not altered.  If the l is capitalized,
     register _x is treated as a stack and its top  value  is
     popped onto the main stack.

All registers start with empty value which is treated  as  a
zero  by  the  command  l  and is treated as an error by the


DC - An Interactive Desk Calculator                  USD:5-3


command L.


_d

     The top value on the stack is duplicated.

_p

     The top value on the stack is printed.  The  top  value
     remains unchanged.

_f

     All values on the stack and in registers are printed.

_x

     treats the top element of  the  stack  as  a  character
     string, removes it from the stack, and executes it as a
     string of DC commands.

[ ... ]

     puts the bracketed character string onto the top of the
     stack.

_q

     exits the program.  If executing a string,  the  recur-
     sion  level is popped by two.  If q is capitalized, the
     top value on the stack is popped and the string  execu-
     tion level is popped by that value.

<_x  >_x  =_x  !<_x  !>_x  !=_x

     The top two elements of the stack are popped  and  com-
     pared.   Register _x is executed if they obey the stated
     relation.  Exclamation point is negation.

_v

     replaces the top element on the  stack  by  its  square
     root.  The square root of an integer is truncated to an
     integer.  For the treatment  of  numbers  with  decimal
     points, see the detailed description below.

!

     interprets the rest of the  line  as  a  UNIX  command.
     Control returns to DC when the UNIX command terminates.


USD:5-4                  DC - An Interactive Desk Calculator


_c

     All values on the stack are popped; the  stack  becomes
     empty.

_i

     The top value on the stack is popped and  used  as  the
     number  radix  for further input.  If i is capitalized,
     the value of the input base is pushed onto  the  stack.
     No  mechanism  has been provided for the input of arbi-
     trary numbers in bases less than 1 or greater than 16.

_o

     The top value on the stack is popped and  used  as  the
     number  radix for further output.  If o is capitalized,
     the value of the output base is pushed onto the stack.

_k

     The top of the stack is popped, and that value is  used
     as a scale factor that influences the number of decimal
     places that are maintained during multiplication, divi-
     sion,  and  exponentiation.   The  scale factor must be
     greater than or equal to zero and less than 100.  If  k
     is capitalized, the value of the scale factor is pushed
     onto the stack.

_z

     The value of the stack level is pushed onto the stack.

?

     A line of input is taken from the input source (usually
     the console) and executed.

_D_E_T_A_I_L_E_D _D_E_S_C_R_I_P_T_I_O_N

_I_n_t_e_r_n_a_l _R_e_p_r_e_s_e_n_t_a_t_i_o_n _o_f _N_u_m_b_e_r_s

     Numbers are stored internally using a  dynamic  storage
allocator.   Numbers  are  kept  in  the form of a string of
digits to the base 100 stored one digit per byte (centennial
digits).   The  string is stored with the low-order digit at
the beginning of the string.  For example,  the  representa-
tion  of  157  is 57,1.  After any arithmetic operation on a
number, care is taken that all digits are in the range  0-99
and  that  the number has no leading zeros.  The number zero
is represented by the empty string.

     Negative numbers are represented in the  100's  comple-
ment  notation,  which  is  analogous  to  two's  complement


DC - An Interactive Desk Calculator                  USD:5-5


notation for binary numbers.  The  high  order  digit  of  a
negative number is always -1 and all other digits are in the
range 0-99.  The digit preceding the high order -1 digit  is
never  a  99.   The  representation of -157 is 43,98,-1.  We
shall call this the canonical form of a number.  The  advan-
tage  of  this kind of representation of negative numbers is
ease of addition.   When  addition  is  performed  digit  by
digit, the result is formally correct.  The result need only
be modified, if necessary, to put it into canonical form.

     Because the largest valid digit is 99 and the byte  can
hold  numbers  twice that large, addition can be carried out
and the handling of carries done later  when  that  is  con-
venient, as it sometimes is.

     An additional byte is stored with  each  number  beyond
the  high  order  digit  to  indicate  the number of assumed
decimal digits after the decimal point.  The  representation
of  .001  is  1,_3  where  the  scale  has been italicized to
emphasize the fact that it is not the high order digit.  The
value  of  this extra byte is called the scale factor of the
number.

_T_h_e _A_l_l_o_c_a_t_o_r

     DC uses a dynamic string storage allocator for  all  of
its  internal  storage.   All reading and writing of numbers
internally is done through the allocator.   Associated  with
each  string in the allocator is a four-word header contain-
ing pointers to the beginning of the string, the end of  the
string, the next place to write, and the next place to read.
Communication between the  allocator  and  DC  is  done  via
pointers to these headers.

     The allocator initially has one large string on a  list
of  free  strings.   All  headers except the one pointing to
this string are on a list of  free  headers.   Requests  for
strings  are  made by size.  The size of the string actually
supplied is the next higher power of 2.  When a request  for
a  string  is made, the allocator first checks the free list
to see if there is a string of the desired size.  If none is
found,  the  allocator finds the next larger free string and
splits it repeatedly until it has  a  string  of  the  right
size.  Left-over strings are put on the free list.  If there
are no larger  strings,  the  allocator  tries  to  coalesce
smaller  free  strings  into larger ones.  Since all strings
are the result of splitting large strings, each string has a
neighbor  that  is  next  to it in core and, if free, can be
combined with it to make a string twice as long.  This is an
implementation of the `buddy system' of allocation described
in [2].

     Failing to find a string of  the  proper  length  after
coalescing,  the  allocator  asks the system for more space.


USD:5-6                  DC - An Interactive Desk Calculator


The amount of space on the system is the only limitation  on
the size and number of strings in DC.  If at any time in the
process of trying to allocate a string, the  allocator  runs
out of headers, it also asks the system for more space.

     There are routines in the allocator for reading,  writ-
ing,  copying,  rewinding,  forward-spacing, and backspacing
strings.  All string manipulation is done using  these  rou-
tines.

     The reading and writing  routines  increment  the  read
pointer  or write pointer so that the characters of a string
are read or written in succession by a  series  of  read  or
write calls.  The write pointer is interpreted as the end of
the information-containing portion of a string and a call to
read  beyond that point returns an end-of-string indication.
An attempt to write beyond the end of a  string  causes  the
allocator  to  allocate a larger space and then copy the old
string into the larger block.

_I_n_t_e_r_n_a_l _A_r_i_t_h_m_e_t_i_c

     All arithmetic operations are done  on  integers.   The
operands  (or  operand)  needed for the operation are popped
from the main stack and their scale  factors  stripped  off.
Zeros  are  added  or  digits  removed as necessary to get a
properly scaled result from the internal arithmetic routine.
For  example,  if the scale of the operands is different and
decimal alignment is required, as it is for addition,  zeros
are  appended  to the operand with the smaller scale.  After
performing the required  arithmetic  operation,  the  proper
scale  factor is appended to the end of the number before it
is pushed on the stack.

     A register called scale plays a part in the results  of
most  arithmetic  operations.   scale  is  the  bound on the
number of decimal places  retained  in  arithmetic  computa-
tions.   scale  may  be  set to the number on the top of the
stack truncated to an integer with the k command.  K may  be
used to push the value of scale on the stack.  scale must be
greater than or equal to 0 and less than 100.  The  descrip-
tions  of  the individual arithmetic operations will include
the exact effect of scale on the computations.

_A_d_d_i_t_i_o_n _a_n_d _S_u_b_t_r_a_c_t_i_o_n

     The scales of the two numbers are compared and trailing
zeros  are  supplied  to  the number with the lower scale to
give both numbers the  same  scale.   The  number  with  the
smaller  scale  is multiplied by 10 if the difference of the
scales is odd.  The scale of the result is then set  to  the
larger of the scales of the two operands.

     Subtraction is performed by negating the number  to  be


DC - An Interactive Desk Calculator                  USD:5-7


subtracted and proceeding as in addition.

     Finally, the addition is performed digit by digit  from
the low order end of the number.  The carries are propagated
in the usual way.  The  resulting  number  is  brought  into
canonical  form,  which  may  require  stripping  of leading
zeros, or for negative numbers replacing the high-order con-
figuration 99,-1 by the digit -1.  In any case, digits which
are not in the range 0-99 must be brought into  that  range,
propagating any carries or borrows that result.

_M_u_l_t_i_p_l_i_c_a_t_i_o_n

     The scales are removed from the two operands and saved.
The operands are both made positive.  Then multiplication is
performed in a digit by digit manner that exactly mimics the
hand  method of multiplying.  The first number is multiplied
by each digit of the second number, beginning with  its  low
order digit.  The intermediate products are accumulated into
a partial sum which becomes the final product.  The  product
is put into the canonical form and its sign is computed from
the signs of the original operands.

     The scale of the result is set equal to the sum of  the
scales  of  the  two operands.  If that scale is larger than
the internal register scale and also larger than both of the
scales  of the two operands, then the scale of the result is
set equal to the largest of these three last quantities.

_D_i_v_i_s_i_o_n

     The scales are removed from the  two  operands.   Zeros
are appended or digits removed from the dividend to make the
scale of the result of the integer  division  equal  to  the
internal quantity scale.  The signs are removed and saved.

     Division is performed much as it would be done by hand.
The  difference  of  the  lengths of the two numbers is com-
puted.  If the divisor is longer than the dividend, zero  is
returned.  Otherwise the top digit of the divisor is divided
into the top two digits of the dividend.  The result is used
as  the  first  (high-order)  digit of the quotient.  It may
turn out be one unit too low, but if it is, the  next  trial
quotient will be larger than 99 and this will be adjusted at
the end of the process.  The trial digit  is  multiplied  by
the  divisor and the result subtracted from the dividend and
the process is repeated to get  additional  quotient  digits
until  the  remaining  dividend is smaller than the divisor.
At the end, the digits of the  quotient  are  put  into  the
canonical  form,  with  propagation of carry as needed.  The
sign is set from the sign of the operands.


USD:5-8                  DC - An Interactive Desk Calculator


_R_e_m_a_i_n_d_e_r

     The division routine is called  and  division  is  per-
formed  exactly  as described.  The quantity returned is the
remains of the dividend at the end of  the  divide  process.
Since  division  truncates  toward zero, remainders have the
same sign as the dividend.  The scale of  the  remainder  is
set  to  the  maximum  of  the scale of the dividend and the
scale of the quotient plus the scale of the divisor.

_S_q_u_a_r_e _R_o_o_t

     The scale is stripped  from  the  operand.   Zeros  are
added  if  necessary to make the integer result have a scale
that is the larger of the internal quantity  scale  and  the
scale of the operand.

     The method used to compute sqrt(y) is  Newton's  method
with successive approximations by the rule


The initial guess is found by taking the integer square root
of the top two digits.

_E_x_p_o_n_e_n_t_i_a_t_i_o_n

     Only exponents with zero scale factor are handled.   If
the exponent is zero, then the result is 1.  If the exponent
is negative, then it  is  made  positive  and  the  base  is
divided into one.  The scale of the base is removed.

     The integer exponent is viewed as a binary number.  The
base  is  repeatedly squared and the result is obtained as a
product of those powers of the base that correspond  to  the
positions  of  the  one-bits in the binary representation of
the exponent.  Enough digits of the result  are  removed  to
make  the  scale  of the result the same as if the indicated
multiplication had been performed.

_I_n_p_u_t _C_o_n_v_e_r_s_i_o_n _a_n_d _B_a_s_e

     Numbers are converted to the internal representation as
they  are read in.  The scale stored with a number is simply
the number of fractional digits input.  Negative numbers are
indicated  by preceding the number with a _ (an underscore).
The hexadecimal digits A-F correspond to the  numbers  10-15
regardless  of  input  base.   The  i command can be used to
change the base of the input numbers.  This command pops the
stack,  truncates  the  resulting  number to an integer, and
uses it as the input base for all further input.  The  input
base is initialized to 10 but may, for example be changed to
8 or 16 to do octal or hexadecimal to  decimal  conversions.
The  command  I will push the value of the input base on the
stack.


DC - An Interactive Desk Calculator                  USD:5-9


_O_u_t_p_u_t _C_o_m_m_a_n_d_s

     The command p  causes  the  top  of  the  stack  to  be
printed.   It  does not remove the top of the stack.  All of
the stack and internal registers can be output by typing the
command  f.   The o command can be used to change the output
base.  This command uses the top of the stack, truncated  to
an  integer  as the base for all further output.  The output
base in initialized to 10.  It will work correctly  for  any
base.   The command O pushes the value of the output base on
the stack.

_O_u_t_p_u_t _F_o_r_m_a_t _a_n_d _B_a_s_e

     The input and output bases only affect the  interpreta-
tion  of numbers on input and output; they have no effect on
arithmetic computations.  Large numbers are output  with  70
characters  per  line;  a \ indicates a continued line.  All
choices of input and output bases work  correctly,  although
not  all  are  useful.  A particularly useful output base is
100000, which has the effect of grouping  digits  in  fives.
Bases  of 8 and 16 can be used for decimal-octal or decimal-
hexadecimal conversions.

_I_n_t_e_r_n_a_l _R_e_g_i_s_t_e_r_s

     Numbers or strings may be stored in internal  registers
or  loaded  on  the stack from registers with the commands s
and l.  The command s_x pops the top of the stack and  stores
the  result in register x.  _x can be any character.  l_x puts
the contents of register x on the top of the stack.   The  l
command  has no effect on the contents of register _x.  The s
command, however, is destructive.

_S_t_a_c_k _C_o_m_m_a_n_d_s

     The command c clears the stack.  The command d pushes a
duplicate  of  the  number  on  the  top of the stack on the
stack.  The command z pushes the stack size  on  the  stack.
The  command  X  replaces the number on the top of the stack
with its scale factor.  The command Z replaces  the  top  of
the stack with its length.

_S_u_b_r_o_u_t_i_n_e _D_e_f_i_n_i_t_i_o_n_s _a_n_d _C_a_l_l_s

     Enclosing a string in [ ] pushes the  ascii  string  on
the  stack.   The  q command quits or in executing a string,
pops the recursion levels by two.

_I_n_t_e_r_n_a_l _R_e_g_i_s_t_e_r_s - _P_r_o_g_r_a_m_m_i_n_g _D_C

     The load and store commands together with [ ] to  store
strings,  x  to  execute  and the testing commands `<', `>',
`=', `!<', `!>', `!=' can be used  to  program  DC.   The  x


USD:5-10                 DC - An Interactive Desk Calculator


command assumes the top of the stack is an string of DC com-
mands and executes it.  The testing commands compare the top
two elements on the stack and if the relation holds, execute
the register that follows the  relation.   For  example,  to
print the numbers 0-9,

        [lip1+  si  li10>a]sa
        0si  lax


_P_u_s_h-_D_o_w_n _R_e_g_i_s_t_e_r_s _a_n_d _A_r_r_a_y_s

     These commands were designed for used  by  a  compiler,
not by people.  They involve push-down registers and arrays.
In addition to the stack that commands work on,  DC  can  be
thought  of  as  having individual stacks for each register.
These registers are operated on by the commands S and L.  S_x
pushes  the  top  value of the main stack onto the stack for
the register _x.  L_x pops the stack for register _x  and  puts
the  result  on  the  main stack.  The commands s and l also
work on registers but not as push-down  stacks.   l  doesn't
effect  the  top  of the register stack, and s destroys what
was there before.

     The commands to work on arrays are : and  ;.   :_x  pops
the  stack and uses this value as an index into the array _x.
The next element on the stack is stored at this index in  _x.
An  index  must  be greater than or equal to 0 and less than
2048.  ;_x is the command to load the  main  stack  from  the
array  _x.   The  value  on the top of the stack is the index
into the array _x of the value to be loaded.

_M_i_s_c_e_l_l_a_n_e_o_u_s _C_o_m_m_a_n_d_s

     The command ! interprets the rest of the line as a UNIX
command  and  passes  it to UNIX to execute.  One other com-
piler command is Q.  This command uses the top of the  stack
as the number of levels of recursion to skip.

_D_E_S_I_G_N _C_H_O_I_C_E_S

     The real reason for the use of a dynamic storage  allo-
cator  was  that  a general purpose program could be (and in
fact has been) used for a variety of other tasks.  The allo-
cator  has some value for input and for compiling (i.e.  the
bracket [...] commands) where it cannot be known in  advance
how  long a string will be.  The result was that at a modest
cost in execution time, all considerations of string alloca-
tion and sizes of strings were removed from the remainder of
the program and debugging was made easier.   The  allocation
method used wastes approximately 25% of available space.

     The choice of 100 as a  base  for  internal  arithmetic
seemingly  has no compelling advantage.  Yet the base cannot


DC - An Interactive Desk Calculator                 USD:5-11


exceed 127 because of hardware limitations and at  the  cost
of  5%  in space, debugging was made a great deal easier and
decimal output was made much faster.

     The reason for a stack-type arithmetic  design  was  to
permit all DC commands from addition to subroutine execution
to be implemented in essentially the same way.   The  result
was a considerable degree of logical separation of the final
program into modules with very little communication  between
modules.

     The rationale for the lack of interaction  between  the
scale  and  the bases was to provide an understandable means
of proceeding after a change of base or scale  when  numbers
had  already  been entered.  An earlier implementation which
had global notions of scale and base did not work out  well.
If  the value of scale were to be interpreted in the current
input or output base, then a change of base or scale in  the
midst  of  a  computation would cause great confusion in the
interpretation of the results.  The current scheme  has  the
advantage  that  the value of the input and output bases are
only used for input and output, respectively, and  they  are
ignored  in all other operations.  The value of scale is not
used for any essential purpose by any part  of  the  program
and  it is used only to prevent the number of decimal places
resulting from the arithmetic operations from growing beyond
all bounds.

     The design rationale for the choices for the scales  of
the  results  of  arithmetic were that in no case should any
significant digits be thrown away if,  on  appearances,  the
user  actually  wanted them.  Thus, if the user wants to add
the numbers 1.5 and 3.517, it seemed reasonable to give  him
the  result  5.017  without  requiring  him to unnecessarily
specify his rather obvious requirements for precision.

     On the other hand,  multiplication  and  exponentiation
produce  results  with  many more digits than their operands
and it seemed reasonable to give as a minimum the number  of
decimal  places  in  the  operands but not to give more than
that number of digits unless the  user  asked  for  them  by
specifying a value for scale.  Square root can be handled in
just the same way as multiplication.  The operation of divi-
sion gives arbitrarily many decimal places and there is sim-
ply no way to guess how many places the user wants.  In this
case  only, the user must specify a scale to get any decimal
places at all.

     The scale of remainder was chosen to make  it  possible
to  recreate  the  dividend from the quotient and remainder.
This is easy to implement; no digits are thrown away.


USD:5-12                 DC - An Interactive Desk Calculator


_R_e_f_e_r_e_n_c_e_s

[1]  L. L. Cherry, R. Morris, _B_C -  _A_n  _A_r_b_i_t_r_a_r_y  _P_r_e_c_i_s_i_o_n
     _D_e_s_k-_C_a_l_c_u_l_a_t_o_r _L_a_n_g_u_a_g_e.

[2]  K. C. Knowlton, _A _F_a_s_t _S_t_o_r_a_g_e _A_l_l_o_c_a_t_o_r, Comm. ACM  8,
     pp. 623-625 (Oct. 1965).