It is surprising that the names are included, and astonishing that they are meaningful. Some simple techniques, such as randomizing the procedure names, would have removed a number of clues to the function of the virus.
sh
. Thus, no
matter how the program was invoked, it would show up in the process
table with the same name as the Bourne Shell, a program which often
runs legitimately.
The program also sets the maximum core dump size to zero blocks. If the program crashed it would not leave a core dump behind to help investigators. It also turns off handling of write errors on pipes, which normally cause the program to exit.
The next step is to read the clock, store the current time in a local variable, and use that value to seed the random number generator.
-p
which must
be followed by a decimal number, which seems to be a process id of the
parent which spawned it. It uses this number later to kill that
process, probably to ``close the door'' behind it.
The rest of the command line arguments are ``object names''. These are
names of files it tries to load into its address space. If it can't
load one of them, it quits. If the -p
argument is given, it
also deletes the object files, and later tries to remove the disk
image of running virus, as well as the file
/tmp/.dumb
. (This file is not referenced anywhere else in
the virus, so it is unclear why it is deleted.)
The program then tried a few further steps, exiting (``bailing out'') if any of them failed:
l1.c
.
If the ``-p'' argument was given, the program closes all file descriptors, in case there are any connections open to the parent.
The program then erases the text of the argument array, to further obscure how it was started (perhaps to hide anything if one were to get a core image of the running virus.)
It scans all of the network interfaces on the machine, gets the flags and addresses of each interface. It tries to get the point-to-point address of the interface, skipping the loopback address. It also stores the netmask for that network [rfc950].
Finally, it kills off the process id given with the ``-p'' option. It
also changes the current process group, so that it doesn't die when
the parent exits. Once this is cleaned up, it falls into the
doit
routine which performs the rest of the work.
It then tries hg
. If that fails, it tries
hl
. If that fails, it tries ha
.
It then tries to check if there is already a copy of the virus running on this machine. Errors in this code contributed to the large amounts of computer time taken up by the virus. Specifically:
The remainder of the initialization routine seems designed to send a
single byte to address 128.32.137.13, which is ernie.berkeley.edu
,
on port 11357. This never happens, since the author used the
sendto
function on a TCP stream connection, instead of a
UDP datagram socket. We have no explanation for this; it only tries to send this
packet with a one in fifteen random chance.
cracksome
routine which tries to find some
hosts that it can break in to. Then it waits 30 seconds,
listening for other virus programs attempting to break in, and tries
to break into another batch of machines.
After this round of attacks, it forks, creating two copies of the virus; the original (parent) dies, leaving the fresh copy. The child copy has all of the information the parent had, while not having the accumulated CPU usage of the parent. It also has a new process id, making it hard to find.
Next, the hg
, hl
, and ha
routines search for machines to infect (see Appendix [hroutines]). The
program sleeps for 2 minutes, and then checks to see if it has been
running for more than 12 hours, cleaning up some of the entries in the
host list if it has.
Finally, before repeating, it checks the global variable
pleasequit
. If it is set, and if it has tried more
than 10 words from its own dictionary against existing passwords, it
quits. Thus forcing pleasequit
to be set in the system
libraries will do very little to stem the progress of this
virus.
cracksome
, the main switch, chooses which of four
strategies to execute. It is would be the central point for adding
new strategies if the virus were to be further extended. The virus
works each strategy through completely, then switches to the next
one. Each pass through the cracking routines only performs a small
amount of work, but enough state is remembered in each pass to
continue the next time around.
cracksome
routine is the central switching routine of the
cracking code. It decides which of the cracking strategies is actually
exercised next. Again, note that this routine was named in the global
symbol table. It could have been given a confusing or random name, but
it was actually clearly labelled, which lends some credence to the
idea that the virus was released prematurely.
cracksome
routines reads through the /etc/hosts.equiv
file
to find machine names that would be likely targets. While this file
indicates what hosts the current machine trusts, it is fairly common
to find systems where all machines in a cluster trust each other, and
at the very least it is likely that people with accounts on this machine will
have accounts on the other machines mentioned in /etc/hosts.equiv
.
It also reads the /.rhosts
file, which lists the set of machines
that this machine trusts root access from. Note that it does not take
advantage of the trust itself [baldwin] but merely
uses the names as a list of additional machines to attack. Often,
system managers will deny read access to this file to any user other
than root itself, to avoid providing any easy list of secondary
targets that could be used to subvert the machine; this practice would
have prevented the virus from discovering those names, although
/.rhosts
is very often a subset of /etc/hosts.equiv
.
The program then reads the entire local password file,
/etc/passwd
. It uses this to find personal .forward
files, and reads them in search of names of other machines it can
attack. It also records the user name, encrypted password, and
GECOS
information string, all of which are
stored in the /etc/passwd
file. Once the program
scanned the entire file, it advanced to Phase 1.
/etc/passwd
:
All of these attacks are applied to fifty passwords at a time from those collected in Phase 0. Once it had tried to guess the passwords for all local accounts, it advanced to Phase 2.
nextw
as an index into this table. The main loop uses this to
prevent pleasequit
from causing the virus to exit until
at least ten of the words have been checked against all of the
encryptions in the collected list.
Again, when the word list is exhausted the virus advances to Phase 3.
/usr/dict/words
file, a 24474
word list distributed with 4.3BSD (and other UNIX systems) as a
spelling dictionary. The words are stored in this file one word per
line. One word at a time is tried against all encrypted passwords. If
the word begins with an upper case letter, the letter is converted to
lower case and the word is tried again.
When the dictionary runs out, the phase counter is again advanced to 4 (thus no more password cracking is attempted).
hg
, ha
,
hi
, and hl
, which search for other hosts to attack.
hg
routine calls rt_init
(if it has not
already been called) to scan the routing table, and records all
gateways except the loopback address in a special list. It then tries
a generic attack routine to attack via rsh
,
finger
, and SMTP. It returns after the first successful attack.
ha
routine goes through the gateway list and
connects to TCP port 23, the telnet port, looking for gateways which are
running telnet listeners. It randomizes the order of such gateways
and calls
hn
(our name) with the network number of each gateway.
The ha
returns after hn
reports that it has
succeeded broken into a host.
hl
routine iterates through all the addresses for the
local machine calling hn
with the network number for
each one. It returns if hn
indicates success in breaking
into a host.
hi
routine goes through the internal host list (see
section [phase0]) and tries to attack each host via rsh
,
finger
, and SMTP. It returns if when one host is infected.
hn
routine (our name) followed hi
takes a
network number as an argument. Surprisingly it returns if the network
number supplied is the same as the network number of any of the
interfaces on the local machine. For Class A addresses it uses the
Arpanet IMP convention to create possible addresses to attack
(net.[1-8].0.[1-255]). For all other networks it guesses hosts number
one through 255 on that network. It randomizes the order of this list of possible
hosts and tries to attack up to twenty of them using rsh
,
finger
, and SMTP. If a host does not accept connections on
TCP port 514, the rsh
port, hn
will not try to
attack it. If a host is successfully attacked hn
returns.
hg
routine is always called first, which indicates the virus
really wanted to infect gateway machines. Next comes hi
which tried to infect normal hosts found via cracksome
.
If hi
fails, ha
is called, which seemed to try
breaking into hosts with randomly guessed addresses on the far side of
gateways. This assumes that all the addresses for gateways had been
obtained (which is not trivial to verify from the convoluted code in
{rt init
}),
and implies that the virus would prefer to infect a gateway and from
there reach out to the gateway's connected networks, rather than
trying to hop the gateway directly. If hg
,
hi
, and ha
all failed to infect a host, then
hl
is called which is similar to ha
but uses
for local interfaces for a source of networks.
It is not clear that ha
and hl
worked.
Because hn
returns if the address is local,
hl
appears to have no chance of succeeding. If alternate
addresses for gateways are indeed obtained by other parts of the virus
then ha
could work. But if only the addresses in the
routing table were used it could not work, since by definition these
addresses must be on a directly connected network. Also, in our
monitoring we never detected an attack on a randomly generated
address. These routines do not seem to have been functional.
l1.c
program, used by all the attack routines.
hu1
routine is called by the Phase 1 and Phase 3
cracksome
subroutines. Once a password for user name
guessed correctly, this routine is called with a host name read from
either the user's
.forward
or .rhosts
files. In order to assume
the user's id it then tries to
connect to the local machine's rexec
server using the guessed
name and password. If successful it runs an rsh
to
the target machine, trying to execute a Bourne Shell, which it uses to
send over and compile the
l1.c
infection program.
sendmail
run a
command (the ``recipient'' of the message), which transfers the
l1.c
program included in the body of the message.
Note that the piece of code is VAX code, and the stack frame is a VAX frame, in the wrong order for the Sun. Thus, although the Sun finger daemon has the same bug as the VAX one, this piece of code cannot exploit it.
pushl $68732f push '/sh' pushl $6e69622f push '/bin' movl sp,r10 save address of start of string pushl $0 push 0 (arg 3 to execve) pushl $0 push 0 (arg 2 to execve) pushl r10 push string addr (arg 1 to execve) pushl $3 push argument count movl sp,ap set argument pointer chmk $3b do "execve" kernel call.
finger
attack.
The attack on the finger daemon is clearly a lysogenetic ``viral'' attack (see Section [rose]), since although a worm doesn't modify the host machine at all, the finger attack does modify the running finger daemon process. The ``injected DNA'' component of the virus contained the VAX instructions shown in Figure [vaxcode].
The execve
system call causes the current process to be
replaced with an invocation of the named program; /bin/sh
is the Bourne Shell, a UNIX command interpreter. In this case, the
shell winds up
running with its input coming from, and its output going to, the
network connection. The virus then sends over the l1.c
bootstrap program.
rsh
to the
target host (assuming it can get in as the current user). It tries
three different names for the rsh
binary,
/usr/ucb/rsh
/usr/bin/rsh
/bin/rsh
l1.c
``grappling
hook'' program at the victim.
Note that this infection method doesn't specify a user name to attack; if it gets into the remote account, it is because the user that the virus is running as also has an account on the other machine which trusts the originating machine.
hit rexec
routine uses the remote
execution system which is similar to rsh
, but designed for use by
programs. It connects and sends the user name, the password, and
/bin/sh
as the command to execute.
If it succeedes in reaching the host, it creates a TCP listener on a random port number which the infected machine would eventually connect back to.
l1.c
, is the common grappling
hook
that all of
the attack routines use to pull over the rest of the virus. It is robustly
written, and fairly portable. It ran on a number of machines which
were neither VAX or Sun, loading them down as well, but only making
them peripheral victims of the virus.
The first thing it does is delete the binary it was running from. It checks that it has three arguments (exiting if there aren't three of them). It closes all file descriptors and then forks, exiting if the fork fails. If it succeeds, the parent exits; this leaves no connection from the child to the infection route.
Next, it creates a TCP connection back to the address given as the first argument, and the port given as the second. Then it sends over the magic number given as the third. The text of each argument is erased immediately after it is used. The stream connection is then reused as the program's standard input and output.
A loop reads in a length (as a network byte order 32-bit integer) and
then a filename. The file is unlinked and opened for write, and then
the file itself is read in (using the number of bytes read in
earlier.) On any error, all of the files are unlinked. If the length
read in is -1, the loop exits, and a Bourne Shell is executed
(replacing the l1
program, and getting its input from the
same source.)
l1.c
injected program or with the shell
that it spawns.
resynch
routine sends commands to a remote shell,
requesting that it echo back a specific randomly chosen number. It
then waits a certain amount of time for a response. This routine is
used to indicate when the various subprograms of the infection procedure
have compiled or executed and a Bourne Shell prompt is available
again.
l1.c
injected into it.) It then tries to read a magic
number (which had been previously sent to that victim as a command
line argument to the l1
program) and gives up after ten
seconds.
After the connection is established, all of the current ``objects'' in storage in the virus are fed down the connection into the victim. Then it tries to resynchronize, and if it succeeds, sends down commands to
sh
in the current directory (/usr/tmp
)
.o
Thus, to add another machine type, the virus merely needs to be started with a new object binary as a command line option, which will then be propagated to the next infected host and tried.
Note that the path used here was
PATH=
bin:
/usr/bin:
/usr/ucb
which is certainly reasonable on most systems. This protects
systems with ``unusual'' filesystem layouts, and suggests that complete
consistency among systems makes them more vulnerable.
/usr/tmp/sh
stoped the virus[spaf2]. However, the virus would still use CPU
resources attempting to link the objects, even though it couldn't
write to the output file (since it was a directory).
waithit
) fails, probably indicating
that this host ``got lost''. Bit 2 is set when a named host has no
addresses, or when several different attack attempts fail. Bit 3 is
set when Phase 0 of the crack routines successfully retrieves an
address for the host.
netstat -r -n
as a subprocess. This shows the
routing table, with the addresses listed numerically. It gives up
after finding 500 gateways. It skips the default route, as well as the
loopback entry. It checks for redundant entries, and checks to see if
this address is already an interface address. If not, it adds it
to the list of gateways.
After the gateway list is collected, it shuffles it and enters the addresses in the host table.
checkother
and othersleep
are at the heart of
the excessive propagation of the virus. It is clear that the author
intended for the virus to detect that a machine was already infected, and
if so to skip it. The code is actually fraught with timing flaws and
design errors which lead it to permit multiple infections, probably more
often than the designer intended.
An active infection uses the othersleep
routine for two
purposes, first to sleep so that it doesn't use much processor
time, and second to listen for requests from ``incoming'' viruses. The
virus which is running othersleep
is referred to as the
``listener'' and the virus which is running checkother
is
referred to as the ``tester''.
If the connection is successful, the checker sends a magic number, and listens (for up to 300 seconds) for a magic number from the listener. If the magic number is wrong, the checker assumes it is being spoofed and continues to run.
The checker then picks a random number, shifts it right by three
(throwing away the lower three bits) and sends it to the listener. It
expects a number back within ten seconds, which it adds to the one
sent. If this sum is even, the sender increments pleasequit
,
which (as noted in section [pleasequit]) does very little.
Once it has finished communicating (or failing to communicate) with the listener, the checker sleeps for five seconds and tries to become a listener. It creates a TCP stream socket, sets the socket options to indicate that it should allow multiple binds to that address (in case the listener still hasn't exited, perhaps?) and then binds the socket to port 23357, and listens on it (permitting a backlog of up to ten pending connections.)
othersleep
routine is run when the main body of the virus
wants to idle for a period of time. This was apparently intended to help the
virus ``hide'' so that it wouldn't use enough processor time to be
noticed. While the main program sleeps, the listener code waits to see
if any checkers have appeared and queried for the existence of a
listener, as a simple ``background task'' of the main virus.
The routine first checks to see if it has been set up as a listener;
if not, it calls the normal sleep
function to sleep for the
requested number of seconds, and returns.
If it is set up as a listener, it listens on the checking port with a timeout. If it times out, it returns, otherwise it deals with the connection and subtracts the elapsed real time from the time out value.
The body of the listener ``accepts'' the connection, and sends a magic number to the checker. It then listens (for up to 10 seconds) for the checker's magic number, and picks a random number. It shifts the random number right by three, discarding the lower bits, and sends it up to the checker; it then listens (for up to 10 seconds) for a random number from the checker. If any of these steps fail, the connection is closed and the checker is ignored.
Once the exchanges have occurred, the address of the incoming
connection is compared with the loopback address.
If it is not from the loopback address, the attempt is ignored. If it
is, then if the sum of the exchanged random numbers is odd, the
listener increments pleasequit
(with little effect, as
noted in section [pleasequit]) and closes the listener connection.