Network Working Group                                         Eliot Lear
Internet Draft                                     IntelliGenetics, Inc.
Obsoletes: RFC977                                             July, 1991

                Network News Transfer Protocol Version 2
          A Protocol for the Stream-Based Transmission of News

Status of this Memo

A derivative of this draft document will be submitted to the RFC editor as a standards document. Distribution of this memo is unlimited. Please send comments to ietf-nntp@turbo.bio.net.

1. Introduction

NNTP specifies a protocol for the distribution, inquiry, retrieval, and posting of news articles using a reliable stream-based transmission of news among the Internet community. When used as a news reader protocol, NNTP is designed so that news articles are stored in a central database, allowing a subscriber to select only those items he wishes to read. The netnews model provides for indexing, cross-referencing, and expiration of aged messages[XXX].

First deployed on the Internet in 1986[XXX], NNTP has enjoyed success in no small part due to the public domain UNIX implementation of both server and client, written by Phil Lapsley, Erik Fair, and Brian Kantor, and supported to the date of this publication by Stan Barber with help from the networking community.

NNTP is designed for efficient transmission of news articles over a reliable full duplex communication method. Although designed primarily with the Internet in mind, the original specification has been implemented for DECNET and DATAKIT[XXX]. While the purpose of this document is to specify an Internet standard for news transmission, the implementors have gone to some pains to ensure the mechanism's portability into other environments (e.g. ISO).

Because NNTP is a transport protocol, the scope of this document will be limited strictly to issues of transport. Thus, inasmuch as content format does not involve the transport, it shall not be addressed herein, and the reader is referred to the appropriate specification for netnews message format, currently RFC-1036.

1.1. Reasons for Change

Since NNTP's introduction, the Internet has changed considerably, growing by several orders of magnitude. New demands from netnews and its transport protocol have brought about the need for revision of the original document.

Netnews on the Internet has grown to encompass over one thousand available interest groups. A variety of information including pictures, sounds, and specialized data such as gene sequence information can now be retrieved via netnews.

Authentication has been added, so that host addresses need not be the sole means of identifying a partner in a communication.

Binary capability has been added for a number of reasons; use of alternate character sets and ease of distribution of binary files are two examples.

Batch transfer has been added to reduce the number of protocol turnarounds required, in order to improve performance over long delay connections.

Several discrepancies have been flushed out: date formats have been modified to use ISO 3307 format, helping the netnews community enter the twenty-first century; distributions have been redefined; and the description of the newsgroup list has been modified to be slightly less confusing.

1.2. Compatibility

Version 2 of the NNTP protocol is designed to be backward compatible with the first version. New functionality must be negotiated by the client. If a client receives an improper response to either the OPTION or the AUTHINFO command, it should presume that the server is not running version 2. The server's default behavior is that of version 1.

1.3. Document Layout

This document is broken down into six sections:

   1. Introduction
   2. Protocol Overview
   3. Commands and Responses
   4. State Diagrams
   5. Comprehensive list of Reply Codes
   6. Appendix

Within the command descriptions, section numbers end with .1 for usage and description, .2 for valid responses, and .3 for examples.

1.3.1. Terminal and Non-Terminal Symbols

This document uses a particular format for commands, their parameters, and values contained in responses. Any word in CAPITAL LETTERS is meant to be a terminal symbol. Any word in <angle brackets> is meant to be a non-terminal symbol. All non-terminal symbols are unique throughout this document. Thus, for example, the non-terminal for an article number will always be interpreted as the article number within a group. In addition, [brackets] are used for optional arguments and bar (|) is used to indicate ``either or''. (Parentheses) will be used for grouping, and {braces} will be used to indicate semantic action. Ellipses (...) are used to indicate a repeating pattern.

1.4. Acknowledgments

This document is based on a draft written by Brian Kantor. It also contains wording based on Henry Spencer's C-News documentation, and some suggestions from Theodore Tso. It was further developed by the Internet Engineering Task Force NNTP working group. Particular thanks go to Erik Fair, Jim Thompson, Rich Salz, and Jim Galvin.

2. Protocol Overview

Every NNTP session will involve each of the following steps, though possibly not in order:

   o Connection,
   o Greeting, exchange of capabilities and requirements,
   o Authentication (confirmation of identity of each end),
   o News transfer, and
   o Conclusion/Disconnect.

While authentication for a particular session may not be required, implementations should respond properly to requests for authentication, either by attempting to satisfy the request or by denying it.

Greetings will consist of a banner transmitted by the server, followed by an exchange of negotiations. Once it is determined who is exchanging news, and how it is to be exchanged, the client will either offer articles to or request articles from the server, depending on the negotiated configuration of the connection.

2.1. Connection & Greeting

Clients will initiate a TCP connection on port 119[XXX] to the server with which they wish to exchange news.
Once the connection is established, the server will respond with a greeting, indicating that it is ready for service, or that it is unable to deal with the client. Because the server is in fact responding to the client, we list this interaction as a pseudo command (see Section 3.1).

2.2.1. Server to Server Exchanges

In the context of the following discussion, a client is defined as the program that initiates the NNTP connection.

NNTP is successful in large part because it eliminates transmission of duplicate articles. In the past this has been accomplished through a simple lock step IHAVE/SENDME protocol, using the USENET Message-ID component as a key. Version 2 of the protocol introduces several variations on this theme, meant to reduce the number of protocol exchanges (turnarounds).

When a client is to distribute netnews articles to a server, it may do so in one of several ways, most of which involve the server confirming that it does not already have the article stored. This type of negotiation is known as IHAVE/SENDME.

One modification of IHAVE/SENDME is designed to group transactions together. This method is known as BATCH. Under this scheme, the client sends an index of queued articles to the server, and the server replies with the subset of those articles that it wishes to receive. This method reduces the number of turnarounds required during the conversation, and is highly recommended when more than one article is queued for transmission.

A slightly more robust form of IHAVE/SENDME involves the client and server sharing an additional TCP connection. One channel is then used for IHAVE/SENDME, while the other is used to actually transfer the contents of the articles. This method has the benefit of being asynchronous, while also eliminating many protocol turnarounds.

The final method of server to server communications involves the use of the NEWNEWS command. When the client applies this method, it requests that the server send a list of articles received after a certain date.
The client will then decide which articles it wishes to receive and request them using the ARTICLE command.

2.2.2. News Reader to Server Exchanges

A set of commands is included in the NNTP specification specifically for the benefit of remote news readers. These commands allow for article selection and retrieval, and user information maintenance (e.g. .newsrc files). Traditionally, a news reader would retrieve a list of newsgroups and file numbers[XXX] from the server to determine which newsgroups have new messages.

When contacted for news reading purposes, the NNTP server will need to retain some state referencing the current group and current article. The current group is the group referenced in the most recent GROUP command. The current article is initialized to the first article in a group, and is changed by most of the news reader commands.

2.3. Connection

A session begins when a client connects to TCP port 119 on the server. The server will then respond with a greeting, ideally telling the client that it is ready to process transactions.

NNTP requires a reliable stream protocol, as there are no reliable integrity checks made on a message once it is introduced into the network.

2.4. Authentication

One of the additions to the NNTP protocol is the provision for authentication. The purpose of authentication in NNTP is to enable either end of a connection to independently establish and verify the identity of the remote end. In this manner articles in private newsgroups can be transmitted in either direction with some assurance that the other party is who it claims to be.

Either the server or the client process can initiate a request for authentication. The intent is to allow any form of authentication to occur at any time. In order for there to be interoperability, a table of authentication names is listed in Appendix XXX of this document. It is expected that additional methods will be added through the normal standardization process.
The server requests authentication by answering a client command with a return code indicating that authentication is required. If the client responds improperly, the server may conclude that no authentication exists.

The client requests authentication by issuing the AUTHINFO command. If the server responds improperly, the client may conclude that no authentication exists.

If the server has requested authentication by issuing a response code to a client request, the client should reissue that request once authentication is completed.

2.4.1. Out of Band Authentication

By definition, out of band authentication does not occur within NNTP. If either a server or a client wishes to participate in some form of out of band authentication, it must do so strictly on a prearranged basis, so that the NNTP session does not linger forever, waiting for a nonexistent authentication source. In other words, an implementation, whether client or server, must never send an out of band authentication request without prior arrangements with the other end.

2.5. Commands

Commands consist of a command word followed by zero or more parameters separated by one or more space or tab characters, terminated by a CR-LF pair. Commands and parameters are to be treated as case insensitive, and each command line may contain only one command. All commands are transmitted in ASCII[XXX], with the high order bit cleared. Command lines should not contain more than 512 characters, including the CR-LF pair.

2.6. Responses

Responses take several forms, depending on their purpose, but they share a number of characteristics. Every response line will begin with a three digit response code, followed by either the continuation mark ``-'' or a space to declare whether the response will be continued on the next line, and will end with a CR-LF pair. For example:

   [a] 200 Eyewitness News server is ready and waiting!

   [b] 200-Eyewitness News server is ready and waiting for use at
       200 7:00PM. Do you know where your armadillo is?
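The response line layout just described can be illustrated with a small parser. This is only a sketch under stated assumptions: the names ResponseLine, parse_response_line, and read_response are hypothetical and do not appear in this document, and the handling of a bare three digit code with nothing after it is a guess rather than something the text specifies.

```python
import re
from typing import Iterable, List, NamedTuple


class ResponseLine(NamedTuple):
    """One parsed response line (hypothetical helper type)."""
    code: int        # three digit response code
    continued: bool  # True when the continuation mark "-" follows the code
    text: str        # remainder of the line, informational unless noted


# Three digits, then optionally "-" or a space and the rest of the line.
_LINE = re.compile(r"(\d{3})(?:([- ])(.*))?")


def parse_response_line(raw: bytes) -> ResponseLine:
    """Split one CR-LF terminated response line into code, mark, and text."""
    if not raw.endswith(b"\r\n"):
        raise ValueError("response line must end with CR-LF")
    line = raw[:-2].decode("ascii")
    match = _LINE.fullmatch(line)
    if match is None:
        raise ValueError("response line must start with a three digit code")
    code, mark, text = match.groups()
    # Assumption: a bare code with no mark is treated as a final line.
    return ResponseLine(int(code), mark == "-", text or "")


def read_response(lines: Iterable[bytes]) -> List[ResponseLine]:
    """Collect lines until one arrives without the continuation mark."""
    collected: List[ResponseLine] = []
    for raw in lines:
        parsed = parse_response_line(raw)
        collected.append(parsed)
        if not parsed.continued:
            return collected
    raise ValueError("input ended while a continuation was still pending")
```

Fed the two lines of example [b], read_response returns two entries, the first flagged as continued and the second not. The sketch deliberately leaves the text uninterpreted, since whether the text carries meaning depends on the kind of response.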
Except where noted, continued lines must have the same response code as the previous line. Under all circumstances, every such line must be terminated by a CR-LF pair.

2.6.1. Simple Responses

Simple responses carry all of their information in the three digit response code. Other text may follow on the same line for informational purposes. However, programs should not attempt to parse anything other than the response code in this case. For example:

   500 Command not understood.

2.6.2. Status Responses

A status response contains useful information in the three digit code as well as in the text following the response code. The exact format of each such response will be specified later in this document. For example:

   211 188 14525 14731 misc.legal

2.6.3. Extended Responses

There are two forms of extended responses. The first, textual response, consists of a three digit response code, some informational text, a CR-LF pair, and then zero or more CR-LF terminated lines of additional information, terminated by a line containing nothing more than a period (``.''). If a textual response is to be parsed, the receiving side should expect one or more space or tab characters at any point where a space between words is required.

In all examples, lines beginning with ``C:'' originate from the client, and lines beginning with ``S:'' are sent from the server. For example:

   C: OPTION IMAGE=ON
   S: 104 IMAGE=ON
   C: IHAVE
   S: Send list of Message-IDs
   C: <9104181633.AA08820@msw.usc.edu> 24354
   C: <1991Apr19.182734.20692@athena.cs.uga.edu>
   C: .

The second form of extended response is byte stream transfer. In certain cases, when the client and server agree on either BINARY or IMAGE transfer, an extended response will involve an eight bit transfer of a certain number of octets. When this document uses the terms ``byte'' or ``bytecount'' it refers to octets.
For example:

   C: OPTION BINARY=ON
   S: 104 BINARY=ON
   C: ARTICLE <9104181633.AA08820@msw.usc.edu>
   S: 224 0 4249 <9104181633.AA08820@msw.usc.edu> sending binary data.
   S: {server sends 4249 bytes in binary mode}

When a byte count is specified it is of utmost importance that it be accurate, or the server and client will fall out of synchrony, which may cause connections to hang in a state where both ends are waiting for input.

2.6.4. Response Codes

Response codes indicate the results of actions taken by the server as requested by the client, or requests by the server for additional information.

The first digit of a response broadly indicates the result of the previous command, as follows:

   1xx - Informative message
   2xx - Command ok
   3xx - Command ok so far, send the rest of it.
   4xx - Command was correct, but couldn't be performed for some reason.
   5xx - Command unimplemented, or incorrect, or a serious program error occurred.

The next digit in the code indicates the function response category:

   x0x - Connection, setup, and miscellaneous messages
   x1x - Newsgroup selection
   x2x - Article selection
   x3x - Distribution functions
   x4x - Posting
   x5x - Authentication
   x6x - Batch transfers
   x8x - Nonstandard (private implementation) extensions
   x9x - Debugging output

The exact response codes that should be expected from each command are detailed in the description of that command. In addition, certain response codes may be sent at any time that the client is expecting a response code. These include the following:

   100 Help text
   19x Debug

In addition, there are several response codes that should be used by the server and recognized by the client, as appropriate:

   200 server ready - posting allowed
   201 server ready - no posting allowed
   400 service discontinued
   450 authentication required
   500 command not recognized
   501 command syntax error
   503 program fault - command not performed

2.7. Names and Identifiers

Several options and commands require identifiers whose meaning must be shared by hosts on both ends of a connection. For example, the argument to the IMAGE command requires an identifier; if this identifier is ``UNIX'', then both ends might not do LF/CRLF conversion. Similarly, authentication requires an identifier to indicate the method.

In each instance where such identifiers are required, a table is provided in the appendix of this document, and may be supplemented or modified by future standards documents. In order to avoid interoperability problems, use of identifiers not listed in the appendix should be by prearrangement only. In addition, it is suggested that such identifiers begin with ``X-''. Usage of identifiers listed in the appendix should be in accord with the definition.

2.8. TEXT Versus BINARY Versus IMAGE

This version of NNTP allows information to be transmitted in three different formats.

2.8.1. TEXT

By default all articles and article information (such as lists of Message-IDs) transmitted by either the client or the server are sent as ASCII text; each such article must be sent as a set of lines separated by carriage return-line feed characters (CR-LF). Transmission of text is terminated by a single period (``.'') on an otherwise blank line terminated by CR-LF. If a client needs to transmit a data line containing a single period, it must send two periods, and the server must reduce them to one.

2.8.2. BINARY

Binary data is information that contains bytes that may have the high order bit set. Binary transfers between NNTP servers are transfers of a specified quantity of 8-bit octets in network byte order. No matter what the internal representation of this information (big endian, little endian, etc.), binary information must always be represented in the same form when being transferred.

2.8.3. IMAGE

Image transfers may contain either BINARY or TEXT information, and must be prearranged between two hosts with identical internal representations. IMAGE allows hosts to perform no transformations of any kind on the data.

2.9. Canonical Message Format

To ensure interoperability, NNTP imposes a singular view of how any given article is to be presented over the network. Traditionally, this has meant that all information transmitted must be ASCII, and lines are to be terminated with carriage return-line feed ASCII characters (CR-LF). This is known as the canonical message format.

Although an NNTP server is free to store information in any desired format, it must transmit articles in canonical message format. Thus, a UNIX system that stores messages with LF as the end of line character must translate each LF character to CR-LF before transmitting the article.

There is one exception to this rule. If two servers agree on the internal representation by using the OPTION IMAGE verb, they may forgo any translation, and exchange messages in the selected internal format.

With the introduction of BINARY article transfers, a new canonical message format will be required for binary articles. This document does not specify or place any restrictions on that format, other than to state that only articles that adhere to that format may be transferred in binary mode. It is expected that a standard on the new canonical format for such messages will be published concurrently with this document.

3. NNTP Command Set

3.0.1. Overview

NNTP has the following command set:

   ARTICLE        BATCH        BODY
   AUTHINFO       IHAVE        GROUP
   DATE           SENDME       HEAD
   HELP                        LAST
   NEWNEWS                     LIST
   OPTION                      NEWGROUPS
   QUIT                        NEXT
   SENDSYS                     POST
                               STAT

The first set of commands implements general functionality required by all aspects of the protocol; the second set enables transport functionality, and the third set enables remote news reading and posting functionality.
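As a rough illustration of the three command sets, a server might classify an incoming command line before dispatching it. The grouping below is an assumption: it is inferred from the three alphabetical runs in the command list rather than stated command by command in this document, and the names GENERAL, TRANSPORT, READER, and command_set are hypothetical.

```python
# Assumed grouping, inferred from the alphabetical runs in the command list.
GENERAL = frozenset({
    "ARTICLE", "AUTHINFO", "DATE", "HELP",
    "NEWNEWS", "OPTION", "QUIT", "SENDSYS",
})
TRANSPORT = frozenset({"BATCH", "IHAVE", "SENDME"})
READER = frozenset({
    "BODY", "GROUP", "HEAD", "LAST", "LIST",
    "NEWGROUPS", "NEXT", "POST", "STAT",
})


def command_set(line: str) -> str:
    """Return which set the command word of a command line belongs to.

    The command word is matched case insensitively, and parameters are
    split on runs of space or tab characters, as Section 2.5 requires.
    """
    words = line.split()
    if not words:
        raise ValueError("empty command line")
    word = words[0].upper()
    if word in GENERAL:
        return "general"
    if word in TRANSPORT:
        return "transport"
    if word in READER:
        return "reader"
    return "unknown"
```

For instance, command_set("ihave <1234@foo.com>") yields "transport", while an unrecognized word yields "unknown", which a server would presumably answer with a 500 response.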
At the time of publication of this document, a separate Netnews User Protocol is being developed. As that protocol enters the standards process, news reader clients should use that protocol instead of NNTP. However, due to the installed base of news readers using NNTP, all servers should implement the third set of commands.

Commands are described in the order they appear in the above table.

3.1.

3.1.1.

The initial connection is treated as a pseudo command only because it elicits a response from the server.

3.1.2. Responses

   200 server ready - posting allowed
   201 server ready - no posting allowed
   400 Service unavailable.
   502 I'm not allowed to talk to you.

3.2. The ARTICLE Command

3.2.1. Usage: ARTICLE [ | ]

The ARTICLE command is used to fetch individual articles in their entirety by sending the headers, a blank line, and the text of the message.

The message-id parameter is the contents of the Message-ID header of the requested article[XXX]. If a message-id is specified, the article referred to by the message-id is transmitted as a textual response, or as otherwise negotiated as described previously in this document. If the article does not exist on the server, or is inaccessible for some reason, an error is returned. When used in this fashion, the ARTICLE command shall not alter the current article pointer or the current group pointer, because of semantic difficulties with articles posted to more than one newsgroup. In this context any responses that contain article numbers should return an article number of 0.

If no parameters are specified with the ARTICLE command, the current article is transmitted via textual response or an otherwise negotiated method.

If the parameter given to the ARTICLE command is a number, the article associated with that number on the server is transmitted. The current article pointer is set by this command if a valid article number is specified. Valid article numbers are those listed in the result of a GROUP command.

3.2.2. Responses to the ARTICLE Command

   220 article retrieved - head and body follow ( = article number, = message-id)
   224 article follows via BINARY transfer.
   225 article follows via IMAGE transfer.
   412 no newsgroup has been selected
   420 no current article has been selected
   423 no such article number in this group
   430 no such article found
   455 Permission denied.

Codes 220 through 225 require an article number. If the parameter to the ARTICLE command was a message-id, then the article number should be returned as 0, indicating no change in the current article pointer.

3.3. The AUTHINFO Command

3.3.1. Usage: AUTHINFO (IHAVE|IWANT) []

This command is used to exchange authentication credentials with the server. It is used to respond to a server's request for authentication in a 450 response code, or to initiate an authentication request. The AUTHINFO command is quite general, so that just about any form of authentication exchange may occur within an NNTP session, at either side's request, with any number of exchanges occurring.

The argument following IHAVE or IWANT names a form of authentication listed in Appendix XXX. This document will only specify the inner workings of one of those mechanisms, SIMPLE, which is intended to be an example.

Authentication data may be present on the command line. If it is, it must be NETASCII data, and the whole line must not exceed the maximum line length for an NNTP command. Whether or not data is present, the server may require more information through a 350 or 351 response code, or it may send authentication information by preceding it with a 251 response code. In particular, if a binary transfer is desired, it is required that a bytecount be provided on the AUTHINFO line. In the absence of other arrangements, additional information shall be transmitted via textual exchange, ending with a line containing a single period (``.'').

Note that this standard does not specify how authentication data is to be internally structured.
Rather, the functionality exists to implement any of the many schemes in existence. How a particular scheme is used within NNTP is beyond the scope of this document, and quite possibly the subject of a separate standard.

3.3.2. Response Codes

   250 Authentication accepted
   251 []
   350 Further authentication info needed
   351 Further authentication info needed
   353 Challenge text follows, response requested.
   451 use for authentication
   452 no authentication information available
   453 authentication rejected
   550 Improper authentication sequence.

250 should be used when successful authentication has occurred.

251 is used to indicate the impending transmission of authentication data from the server to the client, with an optional authdata argument.

350 and 351 are to be used as explained previously.

452 and 453 are to be used to indicate authentication failure. Implementors should use 452 if they understand the command, but have not implemented authentication. Implementors should use 550 when they discover that the client has made an authentication scheme specific protocol error.

3.3.3. Some Sample Exchanges

A. Simple Authentication Exchange

   S: 200 SomeUniversity.Edu NNTP server ready at Fri Jun 21 15:41:10 PDT 1991
   C: IHAVE <1990Apr101243.lear@genbank.bio.net>
   S: 451 SIMPLE Who are you?
   C: AUTHINFO IHAVE SIMPLE user=webber,password=x7e37cn46v56tr4
   S: 250 Authentication Accepted, now where were we?
   C: IHAVE <1990Apr101243.lear@genbank.bio.net>

B. Exchanging Additional Information

   S: 200 Classified.NSA.Gov NNTP server ready. Who cares what time it is?
   C: IHAVE <1990Apr101243.lear@genbank.bio.net>
   S: 450 Further authentication needed.
   C: AUTHINFO IHAVE KERBEROS-5 200
   S: 351 KERBEROS Ok. Send me 200 bytes of KERBEROS data.
   C: {binary data exchange}
   S: 250 Ok, I believe you.

C. Client initiated authentication.

   S: 200 HARRY-MUDD.IM-LYING.Org ready at Tue Jun 31 15:41:10 PDT 1919
   C: AUTHINFO IWANT KERBEROS-5
   S: 251 KERBEROS-5 195 bytes of authentication data on its way
   S: {binary data transmitted from server to client}
   C: IHAVE <1991HushData12374@ATM.CHASE-MANHATTEN.COM>

D. A Sample Disagreement

   S: 200 Agita.Correctness.Org server ready now, as always.
   C: AUTHINFO IWANT X1
   S: 452 No authentication available
   C: QUIT

3.4. The DATE Command

3.4.1. Usage: DATE

DATE returns the GMT date and time as known to the server in ISO3307 format YYYYMMDDhhmmss[.xxxxxx]. The timestamp is designed to be used in conjunction with the NEWGROUPS and NEWNEWS commands. This command should NOT be used to synchronize time between two computers; NTP [XXX] is the recommended method for clock synchronization on the Internet. This command merely requests the server's idea of the time, so that the client may keep track of new data since its last communication with this particular server. The client should not interpret the time information returned, but simply pass it back to the server at a later date.

3.4.2. Responses

   111 YYYYMMDDhhmmss[.xxxxxx]

3.4.3. An Example

   C: DATE
   S: 111 19911231010203.025

3.5. The HELP Command

3.5.1. Usage: HELP []

Provides a short summary of commands that are understood by this implementation of the server. The help text will be presented as a textual response, terminated by a single period on a line by itself. If a command is provided as an argument, the server may optionally give additional information about the command in question.

3.5.2. Responses

   100 help text follows
   490 help not available

3.6. The NEWNEWS Command

3.6.1. Usage: NEWNEWS [GMT] []

This command requests a list of Message-IDs of messages that have been received after the specified date.

The first parameter is a comma-separated list of newsgroup patterns specifying the newsgroups the client is interested in receiving.
The rules for matching are as follows:

   o A pattern and a newsgroup match only if they are identical, except that the ``*'' character (asterisk) in a pattern shall mean one or more of any NETASCII character.

   o If a pattern matches a newsgroup, !pattern forces a mismatch of that newsgroup; i.e. it negates the match.

   o A newsgroup matches a pattern list if, and only if, it matches at least one of the patterns and either the newsgroup does not mismatch any of the patterns, or the longest matched pattern is longer than the longest mismatched pattern.

Note that order is not significant.

The date parameter is a date in ISO3307 format in Greenwich Mean Time (GMT), as described in the DATE command. For backward compatibility, a server must accept the date and time in the form of yymmdd hhmmss ["GMT"].

The final parameter is an optional comma-separated list of distributions, as listed by the ``Distribution'' header in a netnews article. As used here, distribution is in no way tied to newsgroup name. Only message-ids of articles exactly matching at least one distribution in the list should be returned to the client. If this parameter is not specified, articles in all distributions should be examined and returned if they agree with the other NEWNEWS parameters. If the distribution ``world'' is used, only articles with that distribution or with no distribution header should be returned. ``World'' is the default distribution.

3.6.2. Responses

   230 list of new articles by message-id follows

3.6.3. Example

   C: NEWNEWS comp.*,!comp.sys.*,comp.sys.sun 199106211530.10
   S: 230 here comes a list of articles since June 21, 1991 15:30 GMT
   S: <1234@foo.com>
   S: <1991Jun22.002755.5674@uvm.edu>
   S: <1991Jun24.195403.15700@cbnewsj.att.com>
   S: .

3.7. The OPTION Command

3.7.1. Usage OPTION