<!-- Mail as an attachment to: monthly@freebsd.org -->
<project cat='kern'>
  <title>new sendfile(2)</title>

  <contact>
    <person>
      <name>
        <given>Gleb</given>
        <common>Smirnoff</common>
      </name>
      <email>glebius@FreeBSD.org</email>
    </person>
  </contact>

  <links>
    <url href="https://svnweb.freebsd.org/base?view=revision&amp;revision=293439">Commit to head</url>
    <url href="http://www.slideshare.net/facepalmtarbz2/new-sendfile-in-english">Slides</url>
    <url href="https://events.yandex.ru/lib/talks/2682/">Presentation (in Russian)</url>
  </links>

  <body>
    <p>
      <p>The sendfile(2) system call was introduced in 1998 as an alternative to the traditional
      read(2)/write(2) loop, speeding up server performance 10x.
      Since then it has been adopted by all major operating systems and is used in any serious web
      server software. Wherever there is high traffic, there is sendfile(2) under the hood.</p>
      
      <p>Now, with FreeBSD 11, we are making the next revolutionary step in serving traffic:
      sendfile(2) no longer blocks on disk I/O. It immediately returns control to the application and
      runs all the needed I/O in the background. The original sendfile(2) waited for the disk read to
      complete, put the data read into the socket, and only then returned. A web server handling
      thousands of clients and thousands of requests had to spawn extra contexts to run sendfile(2)
      in order to avoid stalls. Alternatively, it could use special tricks such as the SF_NODISKIO flag,
      which forces sendfile(2) to serve only content that is already cached in memory. This is now in
      the past: a web server can simply use sendfile(2) as it would use write(2), without any extra
      care. The new sendfile(2) eliminates the overhead of extra contexts, of short writes, and of the
      additional syscalls needed to prepopulate the cache, bringing performance to a new level.</p>

      <p>The new syscall is built on top of two new kernel features. The first is an
      asynchronous VM pager interface, together with a VOP_GETPAGES_ASYNC() file system method
      implemented for UFS. The second is the concept of "not ready" data in sockets. When the syscall
      is invoked, VOP_GETPAGES_ASYNC() is called first and dispatches the I/O request. The buffers
      whose pages are yet to be populated are then put into the socket buffer, but flagged as not yet
      ready, and control returns to the application immediately. When the I/O finishes,
      the buffers are marked as ready and the socket is kicked to continue transmission.</p>
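
      <p>The queueing behavior described above can be illustrated with a small user-space toy
      model. This is only a sketch under stated assumptions, not the kernel implementation: the names
      toy_buf, toy_sockbuf, toy_append_not_ready(), toy_transmit() and toy_io_done() are hypothetical,
      and the model assumes that a stream socket has to send data in order, so a ready buffer queued
      behind a not-ready one must wait.</p>

      <pre><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model only; none of these names exist in the FreeBSD kernel. */
#define TOY_MAX_BUFS 8

struct toy_buf {
    size_t len;    /* payload bytes this buffer will carry */
    bool   ready;  /* false while the disk read is still in flight */
};

struct toy_sockbuf {
    struct toy_buf bufs[TOY_MAX_BUFS];
    size_t count;  /* buffers queued so far */
    size_t next;   /* index of the first buffer not yet transmitted */
};

/* Queue a buffer whose pages are still being read; returns its index, or -1 if full. */
static int toy_append_not_ready(struct toy_sockbuf *sb, size_t len)
{
    if (sb->count == TOY_MAX_BUFS)
        return -1;
    sb->bufs[sb->count].len = len;
    sb->bufs[sb->count].ready = false;
    return (int)sb->count++;
}

/* Transmit the contiguous run of ready buffers at the head; returns bytes sent. */
static size_t toy_transmit(struct toy_sockbuf *sb)
{
    size_t sent = 0;

    while (sb->next < sb->count && sb->bufs[sb->next].ready) {
        sent += sb->bufs[sb->next].len;
        sb->next++;
    }
    return sent;
}

/* I/O completion: mark the buffer ready, then kick the socket to continue sending. */
static size_t toy_io_done(struct toy_sockbuf *sb, int idx)
{
    assert(idx >= 0 && (size_t)idx < sb->count);
    sb->bufs[idx].ready = true;
    return toy_transmit(sb);
}
```
]]></pre>

      <p>In this model, if the read for a second queued buffer completes before the first, nothing
      is sent yet. Once the first buffer completes, both go out in their original order.</p>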
      
      <p>The new sendfile(2) also adds flags that give an application extra control over the
      content being sent. It is now possible to prevent content from being cached in memory, which is
      useful when the content is unlikely to be reused any time soon and is better freed than kept
      in the cache. It is also possible to specify the readahead with every syscall, if the
      application can predict client behavior.</p>
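
      <p>As a rough sketch of how an application might combine these two controls, the helper below
      packs a readahead amount and a no-cache request into a single flags argument. It assumes that
      the FreeBSD 11 headers expose these features as the SF_NOCACHE flag and the SF_FLAGS() macro,
      with the readahead counted in pages, as the sendfile(2) manual page is understood to describe.
      The fallback definitions exist only so the sketch compiles on other systems and should be
      checked against the real headers. The helper names sketch_sendfile_flags() and
      sketch_send_once() are hypothetical.</p>

      <pre><![CDATA[
```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Assumption: on FreeBSD 11 and later, <sys/socket.h> provides SF_NOCACHE and
 * SF_FLAGS(). The fallbacks below are stand-ins so the sketch compiles
 * elsewhere; verify the values against your system headers before use.
 */
#ifndef SF_NOCACHE
#define SF_NOCACHE 0x00000010
#endif
#ifndef SF_FLAGS
#define SF_FLAGS(rh, flags) (((rh) << 16) | (flags))
#endif

/* Hypothetical helper: readahead goes in the upper bits, flag bits in the lower. */
static int sketch_sendfile_flags(int readahead_pages, int skip_cache)
{
    return SF_FLAGS(readahead_pages, skip_cache ? SF_NOCACHE : 0);
}

#ifdef __FreeBSD__
#include <sys/uio.h>

/*
 * Hypothetical helper: one sendfile(2) call that asks for 16 pages of
 * readahead and requests that the sent pages not be kept in the cache.
 */
static int sketch_send_once(int file_fd, int sock_fd, off_t offset,
    size_t nbytes, off_t *sent)
{
    return sendfile(file_fd, sock_fd, offset, nbytes, NULL, sent,
        sketch_sendfile_flags(16, 1));
}
#endif
```
]]></pre>

      <p>A server streaming large files that are rarely requested twice could pass such flags on
      every call, while leaving them at zero for small, frequently requested content.</p>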
      
      <p>The new sendfile(2) is a drop-in replacement, API and ABI compatible with the old one.
      Applications do not even need to be recompiled to benefit from the new implementation.</p>
      
      <p>This work is a joint effort between two companies: NGINX, Inc. and Netflix. Many people were
      involved in the project. At the initial stage, before any code was written, the idea of such an
      asynchronous drop-in replacement was discussed among Gleb Smirnoff, Scott Long, Kostik Belousov,
      Adrian Chadd, and Igor Sysoev. The initial prototype was coded by Gleb, under the supervision of
      Kostik on the VM parts of the patch and under constant pressure from Igor, who demanded that
      nginx be able to run with the new sendfile(2) with no modifications. The prototype demonstrated
      good performance and stability and quickly went into Netflix production in late 2014. During 2015
      the code matured while serving 35% of North American traffic. Scott Long, Randall Stewart,
      Maksim Yevmenkin, and Andrew Gallatin added their contributions to the code.</p>
      
      <p>Now, right after the Netflix video streaming service became available to customers worldwide,
      we are releasing the code behind our success to the FreeBSD community,
      making it available to all FreeBSD users!</p>
    </p>
  </body>

  <sponsor>
    Netflix
  </sponsor>
  <sponsor>
    NGINX, Inc.
  </sponsor>

  <help>
    <task>
      SSL_sendfile(): an extension to the new sendfile(2) that allows session keys to be uploaded to
      the kernel, so that sendfile(2) can then be used on an SSL-enabled socket.
    </task>
  </help>
</project>

