\documentclass{article}
\usepackage{fullpage,multicol}
\setlength{\parskip}{1em}
\setlength{\parindent}{0pt}
\begin{document}
\begin{multicols}{2}

\begin{center}
{\Large Term Project Problem Statement\\6.852}\\
{\large Jeremy Brown \& Greg Hudson}
\end{center}

\section{Real World Problem}

Usenet (NetNews) is a global message-posting service handling hundreds of
thousands of messages every day.  End users connect to news server machines
to read the latest messages and to post new ones.  The algorithm we will model
in future documents is the flooding algorithm by which a message posted at a
single news server is eventually distributed to every news server in the
world.  In this document, we formally describe the framework in which this
algorithm must run, and state the problem in terms of that framework.

\section{Translating to Models}

News servers are physically connected via the global Internet; however, each
server is logically connected to only a few other servers.  Our model
essentially ignores the physical network and represents the logical network as
an arbitrary undirected connected graph: each node of the graph is represented
by an asynchronous process modeling a Usenet server machine, and each edge is
represented by a pair of asynchronous processes modeling network connectivity
between two servers (one process for each direction of communication).

News server machines may crash.  In the model, these crashes are modeled as
stopping failures, activated by stop inputs to server processes.  In addition,
a crashed news server may come back up; this restarting is performed by start
inputs to server processes.

Since news servers communicate via a reliable stream protocol (TCP), a server
which sends a message to another server can know whether or not that message
was received; this is represented in our model by having channels notify
sending processes each time a message is dropped rather than delivered.
Messages may be dropped either because of a momentary channel failure (modeled
as a drop input from outside) or because the recipient process is stopped.

Finally, our model makes the simplifying assumption that any message which is
successfully delivered to process $i$ is stored on nonvolatile media, so that
a stopping failure (corresponding to a machine crash) followed by a restart
(corresponding to a reboot) does not lose state.  We deliberately gloss over
the multi-message protocols needed to ensure this guarantee.

\section{Formal Model}

An article is originated at an arbitrary server $i$ by an input action
$post_i(a)$, where $a$ is an article.  Note that the message alphabet $M$
shared by servers and channels is left unspecified; the algorithm may define
it as needed.

\paragraph{Servers:} A server $i$ is a process whose external actions are:
\begin{tabbing}
In\=puts:\\
\>$post_i(a)$:\= $a$ an article; represents a user posting a\\
\>\> new message.\\
\>$stop_i$: stops the process entirely.\\
\>$start_i$:\= starts the process from wherever it was\\
\>\>last stopped; no state is lost.\\
\>Fo\=r each neighbor $k$:\\
\>\>$receive_{k,i}(m)$:\= $m \in M$, a message being\\
\>\>\> delivered from neighbor $k$\\
\>\>$dropped_{i,k}(m)$:\= $m \in M$, represents $m$ failing\\
\>\>\> to be delivered to $k$\\
Outputs:\\
\>Fo\=r each neighbor $k$:\\
\>\>$send_{i,k}(m)$: $m \in M$, $m$ sent toward $k$\\
\end{tabbing}
For purposes of algorithm analysis, we assume that a server task is guaranteed
a chance to act within time $d_h$; we also assume that a server completes
{\em at most} one task in any interval of length $d_l$.
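
To make this interface concrete, the following is a minimal, non-normative
Python sketch of a server skeleton.  The class and method names are our own,
and the bodies marked ``algorithm-specific'' are exactly what a solution to
the problem stated below must supply; the sketch is illustrative only and is
not part of the formal model.
\begin{verbatim}
# Illustrative skeleton only; names are ours and
# the algorithm-specific bodies are what a
# solution to the problem below must supply.
class Server:
    def __init__(self, i, neighbors):
        self.i = i
        self.neighbors = neighbors
        self.stopped = False
        self.outbox = []  # pending (k, m) pairs

    # input actions
    def post(self, a):         # post_i(a)
        if not self.stopped:
            pass               # algorithm-specific

    def stop(self):            # stop_i
        self.stopped = True    # state is retained

    def start(self):           # start_i
        self.stopped = False   # no state is lost

    def receive(self, k, m):   # receive_{k,i}(m)
        if not self.stopped:
            pass               # algorithm-specific

    def dropped(self, k, m):   # dropped_{i,k}(m)
        if not self.stopped:
            pass               # algorithm-specific

    # output action
    def send(self):            # send_{i,k}(m)
        if not self.stopped and self.outbox:
            return self.outbox.pop(0)
\end{verbatim}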
\paragraph{Channels:} Channels, unlike servers, are completely specified as
part of the problem; algorithms must work with what the channels provide.  A
$channel_{j,i}$ is a FIFO queue transferring messages from $j$ to $i$.  It is
defined as follows:
\begin{tabbing}
{\bf Signature:}\\
In\=puts:\\
\>$stop_i$:\= channel knows when its recipient\\
\>\> goes down...\\
\>$start_i$: ...and when it comes back up.\\
\>$drop_{j,i}$:\= drop the first message on the delivery queue\\
\>$send_{j,i}(m)$: $m \in M$, $m$ sent from $j$ destined for $i$\\
Outputs:\\
\>$receive_{j,i}(m)$: $m \in M$, message delivered to $i$\\
\>$dropped_{j,i}(m)$: $m \in M$, message to $i$ lost\\
Internal:\\
\>$loseit_{j,i}(m)$:\= $m \in M$, recipient is dead, so move\\
\>\> message to the drop queue\\
\end{tabbing}
\begin{tabbing}
{\bf State:}\\
ab\= \kill
\>$queue$, a queue of messages in $M$, initially empty.\\
\>$ilive$, a boolean, initially true.\\
\>$dropqueue$,\= a queue of messages in $M$, initially\\
\>\> empty.\\
\end{tabbing}
\begin{tabbing}
ab\= \kill
{\bf Transitions:}\\
$stop_i$:\\
\>Ef\=fect:\\
\>\>$ilive := false$\\
$start_i$:\\
\>Effect:\\
\>\>$ilive := true$\\
$drop_{j,i}$:\\
\>Effect:\\
\>\>if\= $queue$ non-empty, first element of $queue$ is\\
\>\>\> moved to end of $dropqueue$\\
$send_{j,i}(m)$:\\
\>Ef\=fect:\\
\>\>$m$ added to end of $queue$\\
$receive_{j,i}(m)$:\\
\>Precondition:\\
\>\>$ilive$ true\\
\>\>$m$ first message in $queue$\\
\>Effect:\\
\>\>$m$ removed from head of $queue$\\
$loseit_{j,i}(m)$:\\
\>Precondition:\\
\>\>$ilive$ false and $queue$ non-empty\\
\>Effect:\\
\>\>move first element of $queue$ to end of $dropqueue$\\
$dropped_{j,i}(m)$: $m \in M$, message being thrown away\\
\>Precondition:\\
\>\>$m$ first message in non-empty $dropqueue$\\
\>Effect:\\
\>\>$m$ removed from head of $dropqueue$\\
\end{tabbing}
\begin{tabbing}
{\bf Tasks:}\\
ab\= \kill
\>\{$receive_{j,i}, dropped_{j,i}$\}, \{$loseit_{j,i}$\}\\
\end{tabbing}
For purposes of algorithm analysis, we assume that the first message in a
channel queue is guaranteed to be delivered or dropped within time $l_h$.  We
also assume that a channel delivers at most one message in any interval of
length $l_l$.

\paragraph{Formal Problem Statement:} Complete the specification of the server
processes such that the composition of servers and channels satisfies the
following correctness property: in any fair execution, and for any finite
number of faults that do not combine to permanently break graph connectivity,
for each $post_i(a)$ event a message containing $a$ is eventually received by
every server process, including those which stop and later start again.

Additionally, we would like the solution to:
\begin{enumerate}
\item Tolerate failures.
\item Minimize communication bandwidth.
\item Maximize the speed with which a message is distributed.
\end{enumerate}

In our next document, we will present a ``flooding'' algorithm which satisfies
the correctness property described above and is the algorithm actually used by
Usenet.  In a subsequent document we will prove the correctness of the
algorithm inductively, and also analyze its complexity in the failure-free
case.  Finally, we will informally discuss the impact of small numbers of
failures on the algorithm's complexity.
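
For concreteness, we close with a small Python rendering of the channel
automaton defined above.  The packaging as a class and the method names are
our own and purely illustrative; the state components and transitions follow
the formal definition directly, with the preconditions of output and internal
actions written as guards.
\begin{verbatim}
from collections import deque

# Non-normative rendering of channel_{j,i}; the
# state and transitions follow the definition.
class Channel:
    def __init__(self):
        self.queue = deque()      # in transit to i
        self.ilive = True         # recipient i up?
        self.dropqueue = deque()  # to report as lost

    # input actions
    def stop_i(self):             # stop_i
        self.ilive = False

    def start_i(self):            # start_i
        self.ilive = True

    def drop(self):               # drop_{j,i}
        if self.queue:
            m = self.queue.popleft()
            self.dropqueue.append(m)

    def send(self, m):            # send_{j,i}(m)
        self.queue.append(m)

    # output actions (preconditions as guards)
    def receive(self):            # receive_{j,i}(m)
        if self.ilive and self.queue:
            return self.queue.popleft()

    def dropped(self):            # dropped_{j,i}(m)
        if self.dropqueue:
            return self.dropqueue.popleft()

    # internal action
    def loseit(self):             # loseit_{j,i}(m)
        if not self.ilive and self.queue:
            m = self.queue.popleft()
            self.dropqueue.append(m)
\end{verbatim}

\end{multicols}
\end{document}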