SENDMAIL -- An Internetwork Mail Router

Eric Allman†

Britton-Lee, Inc.
1919 Addison Street, Suite 105.
Berkeley, California 94704.

ABSTRACT

Routing mail through a heterogeneous internet presents many new
problems. Among the worst of these is that of address mapping.
Historically, this has been handled on an ad hoc basis. However, this
approach has become unmanageable as internets grow.

Sendmail acts as a unified "post office" to which all mail can be
submitted. Address interpretation is controlled by a production system,
which can parse both domain-based addressing and old-style ad hoc
addresses. The production system is powerful enough to rewrite
addresses in the message header to conform to the standards of a number
of common target networks, including old (NCP/RFC733) Arpanet, new
(TCP/RFC822) Arpanet, UUCP, and Phonenet. Sendmail also implements an
SMTP server, message queueing, and aliasing.

Sendmail implements a general internetwork mail routing facility,
featuring aliasing and forwarding, automatic routing to network
gateways, and flexible configuration.

____________________
†A considerable part of this work was done while under the employ of
the INGRES Project at the University of California at Berkeley.

In a simple network, each node has an address, and resources can be
identified with a host-resource pair; in particular, the mail system
can refer to users using a host-username pair. Host names and numbers
have to be administered by a central authority, but usernames can be
assigned locally to each host.

In an internet, multiple networks with different characteristics and
managements must communicate. In particular, the syntax and semantics
of resource identification change.
Certain special cases can be handled trivially by ad hoc techniques,
such as providing network names that appear local to hosts on other
networks, as with the Ethernet at Xerox PARC. However, the general case
is extremely complex. For example, some networks require point-to-point
routing, which simplifies the database update problem since only
adjacent hosts must be entered into the system tables, while others use
end-to-end addressing. Some networks use a left-associative syntax and
others use a right-associative syntax, causing ambiguity in mixed
addresses.

Internet standards seek to eliminate these problems. Initially, they
proposed expanding the address pairs to address triples of the form
{network, host, resource}. Network numbers must be universally agreed
upon, and hosts can be assigned locally on each network. The user-level
presentation was quickly expanded to address domains, composed of a
local resource identification and a hierarchical domain specification
with a common static root. The domain technique separates the issue of
physical versus logical addressing. For example, an address of the form
"eric@a.cc.berkeley.arpa" describes only the logical organization of
the address space.

Version 4.2 DRAFT Last Mod 6/7/85

Sendmail is intended to help bridge the gap between the totally ad hoc
world of networks that know nothing of each other and the clean,
tightly-coupled world of unique network numbers. It can accept old
arbitrary address syntaxes, resolving ambiguities using heuristics
specified by the system administrator, as well as domain-based
addressing. It helps guide the conversion of message formats between
disparate networks. In short, sendmail is designed to assist a graceful
transition to consistent internetwork addressing schemes.

Section 1 discusses the design goals for sendmail. Section 2 gives an
overview of the basic functions of the system.
In section 3, details of usage are discussed. Section 4 compares
sendmail to other internet mail routers, and section 5 evaluates
sendmail and outlines future plans.

1. DESIGN GOALS

Design goals for sendmail include:

(1) Compatibility with the existing mail programs, including Bell
    version 6 mail, Bell version 7 mail [UNIX83], Berkeley Mail
    [Shoens79], BerkNet mail [Schmidt79], and hopefully UUCP mail
    [Nowitz78a, Nowitz78b]. Compatibility with ARPANET mail
    [Crocker77a, Postel77] was also required.

(2) Reliability, in the sense of guaranteeing that every message is
    correctly delivered or at least brought to the attention of a human
    for correct disposal; no message should ever be completely lost.
    This goal was considered essential because of the emphasis on mail
    in our environment. It has turned out to be one of the hardest
    goals to satisfy, especially in the face of the many anomalous
    message formats produced by various ARPANET sites. For example,
    certain sites generate improperly formatted addresses, occasionally
    causing error-message loops. Some hosts use blanks in names,
    causing problems with UNIX mail programs that assume that an
    address is one word. The semantics of some fields are interpreted
    slightly differently by different sites. In summary, the obscure
    features of the ARPANET mail protocol really are used and are
    difficult to support, but must be supported.

(3) Existing software to do actual delivery should be used whenever
    possible. This goal derives as much from political and practical
    considerations as from technical ones.

(4) Easy expansion to fairly complex environments, including multiple
    connections to a single network type (such as with multiple UUCP
    networks or Ethernets [Metcalfe76]). This goal requires
    consideration of the contents of an address as well as its syntax
    in order to determine which gateway to use.
    For example, the ARPANET is bringing up the TCP protocol to replace
    the old NCP protocol. No host at Berkeley runs both TCP and NCP, so
    it is necessary to look at the ARPANET host name to determine
    whether to route mail to an NCP gateway or a TCP gateway.

(5) Configuration should not be compiled into the code. A single
    compiled program should be able to run as is at any site (barring
    such basic changes as the CPU type or the operating system). We
    have found this seemingly unimportant goal to be critical in real
    life. Besides the simple problems that occur when any program gets
    recompiled in a different environment, many sites like to "fiddle"
    with anything that they will be recompiling anyway.
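
As an illustration of goals (4) and (5) only, the NCP/TCP decision can
be sketched as a table lookup on the host part of an address. This is
not sendmail's mechanism, which is a production system driven by a
configuration file; the host names, gateway names, and the function
pick_gateway below are all hypothetical, chosen to show why the
contents of an address matter and why the table is better kept as data
than compiled in.

```python
# Hypothetical routing table, kept as data rather than compiled in.
# Host and gateway names are invented for illustration.
GATEWAY_BY_HOST = {
    "ncp-site": "ncp-gateway",
    "tcp-site": "tcp-gateway",
}


def pick_gateway(address, table=GATEWAY_BY_HOST):
    """Choose a gateway from the host part of a user@host address."""
    user, sep, host = address.rpartition("@")
    if not (user and sep and host):
        raise ValueError("expected user@host, got %r" % (address,))
    try:
        # Both kinds of host use the same address syntax; only the
        # host name itself says which gateway can reach it.
        return table[host.lower()]
    except KeyError:
        raise LookupError("no gateway known for host %r" % (host,))
```

Under these assumptions, pick_gateway("eric@tcp-site") selects the TCP
gateway and pick_gateway("eric@ncp-site") the NCP gateway, although the
two addresses are syntactically identical. Moving a host from NCP to
TCP changes one table entry and no code.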