Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!usc!cs.utexas.edu!swrinde!sdd.hp.com!decwrl!pa.dec.com!reid
From: reid@pa.dec.com (Brian Reid)
Newsgroups: news.groups,news.lists,news.admin.misc,news.lists.ps-maps
Subject: USENET FLOW ANALYSIS for JUL 93: Who stores how much news
Date: 12 Aug 1993 00:53:37 GMT
Organization: DEC Network Systems Laboratory
Lines: 70
Approved: reid@pa.dec.com
Message-ID: <24c4ah$7mv@usenet.pa.dec.com>
NNTP-Posting-Host: torrey.pa.dec.com
Keywords: arbitron, statistics, lifetime
Originator: reid@torrey.pa.dec.com
Xref: senator-bedfellow.mit.edu news.groups:78010 news.lists:2719 news.admin.misc:3852 news.lists.ps-maps:1161

Analysis of stored news articles, JUL 93.

This is an analysis of the contents of /usr/spool/news at the sites
reporting "inpaths" data. The "inpaths" program has been posted in
news.lists.ps-maps,comp.sources.d,news.admin.misc. Please consider
installing and running this program at your site.

Presumed size of overall network:              97000
Number of sites surveyed:                      441 (0.5%)
Average age of articles kept online:           9.3 days
Average age of disk space used by news:        12.2 days
Average disk space used by news:               208.3 megabytes
Average number of articles stored:             69432
Estimated worldwide disk space used by news:   20 terabytes

Distribution of expiration times used

   Days  Sites
    0-1      9 *********
    1-2      5 *****
    2-3     22 **********************
    3-4     16 ****************
    4-5     27 ***************************
    5-6     30 ******************************
    6-7     26 **************************
    7-8     31 *******************************
    8-9     34 **********************************
   9-10     59 ***********************************************************
  10-12     34 **********************************
  12-14     18 ******************
  14-16     28 ****************************
  16-18     15 ***************
  18-20     23 ***********************
  20-25     15 ***************
  25-30     12 ************
  30-35      5 *****
  35-40      4 ****
  40-45      1 *
  45-50     10 **********
  50-75      2 **
 75-100      4 ****
100-125      3 ***
125-150      1 *
150-175      2 **
175-200      2 **
200-225      0
225-250      1 *
250-275      0
275-300      0
300-325      0
325-350      0
350-375      0
375-400      1 *

Notes:

The "average disk space used by news" assumes that the host operating
system allocates disk space in a fixed "chunk size" of about 1000
bytes. The size of each stored message is rounded up to the next
multiple of that chunk size.

The "average age of articles" is an average counting each article
equally. The "average age of disk space" is weighted by size.

The "expiration time" for a site is not necessarily constant. Some
newsgroups that are considered more valuable are given longer
expiration times, while obvious junk is given a shorter expiration
time. To come up with a single expiration time for a site, we find the
average age of stored articles and then double it, rounding to the
nearest integer. If you expire all articles older than 14 days, and if
the arrival rate is constant, then the average age of articles at your
site will be 7 days.
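
The calculations described in the notes can be sketched roughly as follows. This is only an illustration, not the actual "inpaths" code: the function names, the (age_days, size_bytes) input format, the choice to weight the disk-space average by the rounded-up size, and the round-half-up tie rule are all assumptions made for the example.

```python
import math

CHUNK_BYTES = 1000  # assumed allocation unit ("about 1000 bytes" in the notes)


def allocated_bytes(size_bytes, chunk=CHUNK_BYTES):
    """Round a message size up to the next multiple of the chunk size."""
    return (size_bytes + chunk - 1) // chunk * chunk


def estimated_expiration_days(avg_age_days):
    """Double the average article age and round to the nearest integer.

    Ties are rounded up here; the post does not say how ties are handled.
    """
    return math.floor(2 * avg_age_days + 0.5)


def spool_stats(articles, chunk=CHUNK_BYTES):
    """Summarize a spool given (age_days, size_bytes) pairs (hypothetical format).

    Assumes at least one article and that every article has a nonzero size.
    """
    sizes = [allocated_bytes(size, chunk) for _, size in articles]
    ages = [age for age, _ in articles]
    total = sum(sizes)
    # Each article counts equally.
    avg_age_articles = sum(ages) / len(ages)
    # Each article counts in proportion to the disk space it occupies
    # (assumption: the rounded-up size is used as the weight).
    avg_age_disk = sum(a * s for a, s in zip(ages, sizes)) / total
    return {
        "disk_bytes": total,
        "avg_age_articles": avg_age_articles,
        "avg_age_disk": avg_age_disk,
        "expiration_days": estimated_expiration_days(avg_age_articles),
    }


def worldwide_terabytes(sites, avg_megabytes_per_site):
    """Scale the per-site average up to the presumed network size (decimal units)."""
    return sites * avg_megabytes_per_site / 1_000_000


# A small 1-day-old article and a larger 13-day-old article.
stats = spool_stats([(1, 500), (13, 2500)])
# stats["disk_bytes"] is 4000 (1000 + 3000 after rounding up)
# stats["avg_age_articles"] is 7.0, stats["avg_age_disk"] is 10.0
# stats["expiration_days"] is 14
```

With the figures from this report, worldwide_terabytes(97000, 208.3) comes to about 20.2, which agrees with the 20 terabytes quoted above. The toy spool also shows why the two averages differ: a large old article raises the disk-space average more than the per-article average.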