Received: from SOUTH-STATION-ANNEX.MIT.EDU by po10.MIT.EDU (5.61/4.7) id AA24868; Mon, 18 Nov 96 22:42:38 EST
Received: from HARLIE.MIT.EDU by MIT.EDU with SMTP id AA17118; Mon, 18 Nov 96 22:42:36 EST
Received: by harlie.mit.edu (8.6.12/4.7) id WAA11514; Mon, 18 Nov 1996 22:42:36 -0500
Message-Id: <199611190342.WAA11514@harlie.mit.edu>
To: star-maintainers@MIT.EDU
Subject: RFD - Backup system: tape retiring, system overhaul/upgrade
Date: Mon, 18 Nov 1996 22:42:36 EST
From: Emil Sit

-----BEGIN PGP SIGNED MESSAGE-----

Exabyte tapes apparently can be used on the order of 1d10 times before suffering from random data lossage. On anxiety-closet, at least, the tapes have been used almost 30 times each. This means that the tapes probably aren't very reliable.

In order to fix that, Joe Foley and I have decided that we should start retiring tapes on a fairly regular basis. I discovered after the meeting that we have 3 unused exabyte tapes, so next week, I will move to allocate money to get more.

There doesn't appear to be a way to remove tapes from the database explicitly (ie, no opposite of "add-tape"). Instead, the system expires tapes which are too old according to its configuration files. The shipped expire times (ie, the ones we use) are:

    #          type  age   uses  formats...
    #          ----  ---   ----  ----------
    tape-type  exb   2555  50    8200 8500

Observe that "age" is in days. 2555 days is seven years! I suspect that lowering this to something reasonable, like age = 365, uses = 10, would cause the system to rapidly expire tapes. We could then start adding new tapes.

Now, the system does *not* label each tape with the host it is associated with. This is functional, but means that if exb-1 is asked for and exb-1 for the wrong host is inserted, the backup system will happily use the tape. Joe and I didn't realize this at the time we added exb-5 to anxiety; jweiss has suggested that in the future, we label tapes with numbers which are not used by other hosts.
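To make the collision problem concrete, here is a minimal Python sketch (not part of the backup software) that lists tape labels claimed by more than one host. The host names and inventories are made up for illustration, and `find_shared_labels` is a hypothetical helper; it assumes each host's tape list has already been pulled out of that host's own database by some means.

```python
from collections import defaultdict


def find_shared_labels(tapes_by_host):
    """Return {label: sorted list of hosts} for labels used by 2+ hosts."""
    hosts_by_label = defaultdict(set)
    for host, labels in tapes_by_host.items():
        for label in labels:
            hosts_by_label[label].add(host)
    # Keep only the labels that more than one host knows about.
    return {
        label: sorted(hosts)
        for label, hosts in hosts_by_label.items()
        if len(hosts) > 1
    }


# Hypothetical inventories, for illustration only.
example = {
    "host-a": ["exb-1", "exb-5"],
    "host-b": ["exb-1", "exb-2"],
}
shared = find_shared_labels(example)
print(shared)
```

Any label this reports is one that the backup system would happily accept on the wrong host.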
So, we could implement a labeling scheme that goes something like:

    HOST               exb-##
    ^^^^               ^^^^^^
    charon             10--19
    bloom-picayune     20--29
    anxiety-closet     30--39
    bloom-beacon       40--49
    senator-bedfellow  50--59
    penguin-lust       60--69

    [hostlist taken from btc:/usr/local/rmt/.klogin, minus dragons-lair]

such that each host will be guaranteed a set of unique tape numbers. The numbers 0--9 can be used for misc purposes or ignored.

Along these lines...

According to the docs for backup-2.6, the system is designed to have a single database, which is accessible to all the machines to be backed up and knows about all the tapes, runs, etc. We currently have individual databases for each host. If we decide to implement the above numbering scheme, we might want to consider going to a single database system. Was this considered when we first set up this system? (I couldn't find anything relevant to it in the anxiety-maint meeting, searching for "backup" in the subject.)

We would probably want to put the single database somewhere in AFS, which would probably involve making AFS principals for the various rcmd.host instances. This would allow us to centralize the config files too.

This being a fairly radical restructuring of our current backup system (as I understand it), we might want to consider some other points, in roughly decreasing priority:

* Upgrading: The backup software on anxiety appears to be 2.5. Versions 2.6 and 2.7alpha exist. I'd been reading the 2.6 docs, as that was the latest version available in foo-server/common/backups. There is no way to ask the software for its version (most of the perl files don't even have $Id$ tags). If we're going to do any sort of moving around of software, we might want to consider upgrading to 2.6. It has some possibly useful changes/fixes. (See the Changes file in backups/backup.)

* Amanda: Eric (nocturne) is supposedly working on a splufty new backup system called Amanda and making it useful.
  If this is expected to be available soon and adaptable for our needs, we might want to use it. Eric?

* Encrypted backups: krb5? ssh?

We'll probably want to run any new system (other than tape expiration) in parallel with the existing one, for some time, I suppose.

The main suggestion, which should definitely be implemented, is the tape expiration change in backup.config. With that will probably go the numbering scheme. (I'm not sure what to do about anxiety exb-5. It won't expire for a while; we could just leave it and let it expire in due course.) Barring any major objection, I will begin to implement the changes mentioned in this paragraph for anxiety-closet.

Comments are welcome and requested.

Emil

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2
Comment: Processed by Mailcrypt 3.4, an Emacs/PGP interface

iQBVAwUBMpEsjyWuZ7zmNWHpAQHANAIAqCVrfpy+5fndq6cjymMacZeyJHjJmocx
AYJr79XNlW1uUyoejtNrT7BVUJP9NaCeKwt4Fup85cloaErbmLOh5A==
=W76i
-----END PGP SIGNATURE-----

--
Emil Sit / Bronx Science '95, MIT '99 -- ESG, SIPB.
Email: sit@mit.edu / Web: http://web.mit.edu/sit/www/
PGP KeyID: 0xE63561E9 / Fingerprint: A68FD0693EDABA19 2671EC1F22498F58