Athena Backup System Deliverables
The athena-backup mailing list is archived in a discuss meeting.
This document revised: 23 December 1996
(There aren't any)
This document describes the specific deliverables for the Athena Backup System (ABS), as well as the Acceptance Criteria for each of these deliverables. It will evolve and change to reflect the most up-to-date status of the development of the system.
The ABS is a client/server architecture consisting of 3 components:
- client user interface (all supported Athena platforms)
- database front end referred to as "the master" (Solaris only)
- tape/device interface referred to as "the slave"
Since the architecture is distributed, the master (1) and slaves (1 or more) can run on different servers.
Supporting technologies for the ABS:
- Oracle RDBMS; used by master to manage information about tapes, etc.
- ONC RPC; this is a kerberized version of the ONC RPC, used by all three components to communicate among themselves.
- Tcl/Tk; public domain user interface/GUI tool used by client to provide a programmable command line interface to the ABS Master
There is a good amount of documentation in the "athena-backup" locker
which describes in detail the architecture of the system.
ABS deliverables can be divided into several broad categories:
- Steps that help us to produce the system (Discovery, Delivery & Tech. Integration),
- Running Code (Delivery & Technology Integration), and
- Steps that help us to operate the system (Service and Support).
Client -- Command line.
- implement -- COMPLETED 10/26/95
- developer test -- COMPLETED 12/95
Client -- GUI
Implementation work suspended until after first Full release
- implement (skeletal prototype done)
- prepare for eventual rework by updating to current API calls.
Master
- implement DBMS functions -- COMPLETED 8/26/95
- implement Slave-related -- COMPLETED 1/96
- Developer Test of Master -- COMPLETED 1/96
- Code Review -- 25% complete
- Functional Test (developer independent QA) -- 50% complete
Slave
- implement Slave Functions -- COMPLETED 11/17/95
- Developer Test of Slave -- COMPLETED 2/7/96
- Code Review -- 20% complete
- Revisions made to code in response to 12/21/95 review -- COMPLETED
- scan tape function unit test -- COMPLETED 12/27/95
- Functional Test (developer independent QA)
- Developer Test of restore -- begun 12/28/95 COMPLETED 2/2/96
RPC
- Add Kerberos + data integrity to Solaris COMPLETED 7/95
- port changes to other platforms
System
- Developer test & Debug of integrated system BEGUN 11/26/95
- Disaster Test
- "formal" system test
- Configuration/Performance Test
- Database Recovery Test -- preliminary one done 2/2/96
- Alpha Test -- Begun 2/15/96
- Beta Test -- Tentatively Scheduled for 7/1/96
Documentation
- Design Specs
- System Requirements -- Revised 1/31/96
- System Overview -- Revised 12/31/95
- Master Spec -- Revised 2/2/96
- Slave Spec -- Revised 1/31/96
- Crippled Mode Spec -- Revised 12 July 1996
- Core Technology Spec COMPLETED 8/95
- Coding Style Guide -- Revised 1/31/96
- Administrators Guide -- outline drafted 11/95
- Quick Start Guide for Administrators drafted 2/96
- Administrators Man Pages
[Schedule chart, April through August 1996. Rows 1 through 8 correspond, in order, to the eight tasks listed below: 1 Alpha test, 2 Fill Cell & perf, 3 Systest Plan, 4 Systest Suite, 5 Bugfix, 6 Admin Guide, 7 Improved UI, 8 Migration Plan. The start of Beta is marked as tentative ("?").]
- Alpha Test of system by Ops: 2/15/96
- Fill db with large Cell, test and tune performance (delgado) 3/20/96 - 5/15/96
- Write system test plan (vrt) 5/1/96 - 6/1/96
- Write system test suite (vrt/delgado) 5/15/96 - 6/15/96
- Debug items identified by test suite (delgado) 6/1/96 - 7/1/96
- Write Administrator's guide (dkk/bwmelans) 5/15/96 - 6/15/96
- Write improved user interface (urop/delgado) 6/1/96 - 8/1/96
- Write Migration Plan (bwmelans/dkk) 6/15/96 - 7/15/96
Description of ABS Beta phase
Criteria to enter beta
- jis OK's Requirements
- System passes test suite
- Administrator's Guide written, reviewed, and accepted
- Code review complete
- Performance OK
- measure current time; make sure we're not worse
- preferably better
- Crippled Mode implemented and tested
- Brian Melanson completes Oracle administrator training
- DLT drives tested
Criteria to leave beta
- Migration plan written, reviewed, and accepted
- All documentation written, reviewed, and accepted
- Disaster Recovery test successfully passed
- Acceptance tests passed
- Tested Solaris and SunOS Media Slave
- Writer:
- Jeff Schiller
- Location:
- /mit/athena-backup/doc/ad-hoc-requirements.txt
This document is a very short vision of what operational problems the ABS needs to solve to be successful. It has sufficient detail to allow for the definition of a High-Level Design and Architecture. More formal requirements will be formulated in the delivery phase.
- Writer:
- Athena Backup System Team
- Reviewed By:
- Ted Ts'o, Jeff Schiller, Bill Cattey, Tom Coppeto
- Location:
- /mit/athena-backup/doc/requirements
To make formulating tests easier, and to make sure that what is implemented is satisfactory, a list of requirements has been created. This list will continue to evolve as the real needs become progressively better understood. The operational definition of a requirement for this list is: it must be something for which you can name an explicit test.
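The operational definition above can be checked mechanically. The following is a minimal sketch in Python, assuming a hypothetical in-memory list in which each entry names the test that verifies it. The structure, identifiers, requirement text, and test names are illustrative assumptions and are not taken from the ABS requirements document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Requirement:
    """One entry in a requirements list (hypothetical structure)."""
    ident: str
    text: str
    test: Optional[str] = None  # name of the explicit test, if one exists


def untestable(requirements: List[Requirement]) -> List[str]:
    """Return the ids of entries that name no explicit test.

    Under the operational definition, such entries do not yet
    qualify as requirements.
    """
    return [r.ident for r in requirements if not (r.test and r.test.strip())]


# Illustrative entries only; not from the real ABS requirements list.
example = [
    Requirement("R1", "Restore a single file from tape", "restore_single_file"),
    Requirement("R2", "System should be easy to use"),
]
```

Calling untestable(example) flags the second entry, which would need to be reworded into something testable or dropped from the list.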
- Criteria:
- 1) Design addresses all of the problems listed in Requirements
2) Architecture is consistent with MIT's operational environment
3) Core technologies recommended are deemed acceptable.
- Assigned to:
- Diane Delgado & Jonathon Weiss
- Reviewed By:
-
- Location:
- /mit/athena-backup/doc/draft/abs-overview.ps
- Revised:
- Jonathon Weiss, November 1995.
This document describes the overall design of the ABS, its major components, and the core technologies to be used in construction of the system. It has sufficient detail to allow for the writing of Functional Specifications.
Acceptance of this Design and Architecture included the following compromises with respect to core technologies:
- Use of ONC RPC for network communications was allowed; and
- Use of Oracle RDBMS for storage of data in the Master Component was allowed.
- Use of threads technology for Master and Slave Components was disallowed.
In November 1995, the team agreed to revise the document to codify current practices.
- Criteria:
- 1) Specification addresses all of the functions listed in High-Level Design for the Master Component
2) Design is consistent with MIT's development practices.
- Assigned to:
- D. Delgado
- Reviewed By:
-
- Location:
- /mit/athena-backup/doc/draft/abs-master.ps
This document describes the ABS Master Component in sufficient detail for it to be used as a reference for programmers working on this component, during initial development, testing, and after production deployment.
- Criteria:
- 1) Specification addresses all of the functions listed in High-Level Design for the Slave Component
2) Design is consistent with MIT's development practices.
- Assigned to:
- D. Delgado
- Reviewed By:
-
- Location:
- /mit/athena-backup/doc/draft/slave.ps
This document describes the ABS Slave Component in sufficient detail for it to be used as a reference for programmers working on this component, during initial development, testing, and after production deployment.
- Criteria:
- 1) Define a mode of operation that provides adequate functionality, but with the database offline.
- Assigned to:
- W. Cattey
- Reviewed By:
-
- Location:
- /mit/athena-backup/doc/draft/crippled.ps
It was agreed that additional effort should be taken to ensure the availability of key ABS services in the event of a prolonged database or Master outage. Crippled Mode addresses this concern. This document specifies the functionality, the implementation, and the method by which logs of operations performed in Crippled Mode will be re-integrated into the normal mode Master database when it comes back online.
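The re-integration step can be pictured as an ordered, repeatable replay of the offline log. The following is a rough sketch only, in Python, and is not the method defined in the Crippled Mode Spec: the record fields, the sequence numbering, and the use of a dictionary in place of the Master database are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class LogRecord:
    """One operation performed while the database was offline (hypothetical)."""
    seq: int        # sequence number assigned when the operation was logged
    tape_id: str    # tape the operation touched
    operation: str  # e.g. "backup" or "restore"


def reintegrate(master_db: Dict[str, List[LogRecord]],
                log: Iterable[LogRecord]) -> int:
    """Replay logged operations into master_db in sequence order.

    Records already present are skipped, so replaying the same log
    twice leaves the database unchanged. Returns the number applied.
    """
    applied = 0
    for record in sorted(log, key=lambda r: r.seq):
        history = master_db.setdefault(record.tape_id, [])
        if record in history:
            continue
        history.append(record)
        applied += 1
    return applied
```

Skipping records that are already present is a deliberate choice in this sketch: if re-integration is interrupted and restarted, running it again does not duplicate entries.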
- Criteria:
-
- Assigned to:
- vrt
- Accepted:
- Fall, 1994
- Accepted By:
- ABS Project Team
- Location:
- /mit/athena-backup/doc/abs_code_standards.txt
A short and simple checklist distilled by Mark Virtue from the Quality Assurance literature.
There will be three guides produced, each for a different audience:
- Programmers Guide (for developers),
- Administrators Guide (for ABS administrators),
- Operators Guide (for ABS operators).
Each guide will have the same overall style and organization, and cover the UI, the Master, the Tape Slave, and other components of the ABS in a level of detail that is appropriate for its audience. These guides will be kept current with each other, the code, the man pages, and the operating environment.
Programmers Guide
- Criteria:
-
- Assigned to:
-
- Reviewed By:
-
- Location:
-
Administrators Guide
- Criteria:
-
- Assigned to:
- Dave Krikorian/Brian Melanson
- Reviewed By:
-
- Location:
- /mit/athena-backup/doc/admin/absys.ps (Currently an incomplete draft)
Operators Guide
- Criteria:
- 1) Short! Probably a single page in length.
- Assigned to:
-
- Reviewed By:
-
- Location:
-
man pages [although there are a number of sub-deliverables here, the acceptance is for the complete set].
- Criteria:
- 1) Man pages are clearly written, correct and complete.
2) Man pages are developed in a manner that is consistent with the standard IS man page practices.
3) Man pages are installed in a location that is consistent with standard IS man page installation practices.
- Assigned to:
- Unassigned
- Reviewed By:
-
- Location:
- /mit/athena-backup/man/...
The following UNIX man pages will be written:
- Quick reference for the ABS [ABS] or [abs]
- User Interface [abs_ui]
- Master Component [abs_master]
- Tape Slave [abs_slave]
- Specific ABS commands [abs_backup], [abs_restore], etc.
- Criteria:
- 1) Source code will be kept under RCS
2) Builds will be done using imake, autoconf or something else [to be decided].
3) The C compiler to be used is [to be decided]
4) Source code is audited and complies with the Coding Style Guide.
5) Source code will be kept in the athena-backup locker in AFS below /afs/athena.mit.edu/astaff/project/athena-backup/src/...
- Assigned to:
- Diane Delgado
- Reviewed By:
-
- Location:
- /mit/athena-backup/src/
During development and alpha testing of the code, the sources will be kept in the location mentioned above. When beta testing (final acceptance testing) commences, all binaries will be transferred to their more permanent location, and sources will be checked into opsrc.
The executables to be delivered are:
Client with Command Line interface.
- Criteria:
-
- Assigned to:
-
- Reviewed By:
-
- Location:
-
Client with Graphical Interface
- Criteria:
-
- Assigned to:
-
- Reviewed By:
-
- Location:
-
Tape Master
- Criteria:
- Meets all requirements and passes all tests.
- Assigned to:
- Diane Delgado
- Reviewed By:
-
- Location:
-
Tape Slave
- Criteria:
- Meets all requirements and passes all tests.
- Assigned to:
- Diane Delgado
- Reviewed By:
-
- Location:
-
Fake-Master Driver
To test tape slave
- Criteria:
-
- Assigned to:
-
- Reviewed By:
-
- Location:
-
Master Test Driver
- Criteria:
-
- Assigned to:
-
- Reviewed By:
-
- Location:
-
Fake-Slave Test Driver
- Criteria:
-
- Assigned to:
-
- Reviewed By:
-
- Location:
-
- Assigned to:
- M. Virtue
- Reviewed By:
-
- Location:
- /mit/athena-backup/doc/...
This document will describe the methodology, the detailed plans, and the status of all testing for the ABS.
RPC Test -- module test
Integrated System Test
Acceptance Test
Confirm RPC Driver works when used for operation without database.
Performance Acceptance
Performance Acceptance test plan
- Criteria:
- 1)
2)
3)
- Assigned to:
- Unassigned
- Reviewed By:
-
- Location:
- /mit/athena-backup/doc/...
There will be a detailed plan for how to test the ABS in its target environment and evaluate whether it should go into production use. This plan will be drafted by the developers of the ABS, and approved by Athena Systems Support before execution. During this time, working together, the developers and Systems Support will look for:
- correct behavior of the ABS as specified
- whether changes need to be made to the specifications or behavior
- whether the ABS has any undesirable side-effects on the production Athena environment
- how gracefully some or all of the ABS components survive during network outages, power outages, ABS-operator initiated shutdowns, or disk or filesystem corruption on any or all of the ABS components
- time and effort to recover from planned or unplanned interruptions
- Assigned to:
- Diane Delgado
- Reviewed By:
-
- Location:
- /mit/athena-backup/doc/core-technologies/coretech.ps
This document describes the steps that need to be taken to make the core technologies selected during the Architecture phase ready for deployment in our environment. Specifically, it contains detailed instructions for the ongoing maintenance of ONC RPC and the Oracle RDBMS.
It also describes the set of steps that have to be taken (if any) on an ongoing basis to keep these third-party technologies current, and provides hints as to the impact on MIT-developed code.
This document will describe the steps needed to bring the ABS into full production, eventually replacing the older tape backup systems. The detailed plan, worked out by the customer (Athena Systems Support) with assistance from the ABS team, will include ABS administrator training and ABS operator training, and will address issues associated with phasing out the older system.
Note 1: The activities of normal day-to-day operation are documented in the Administrator's guide.
- Criteria:
- 1)
2)
3)
- Assigned to:
- Systems Support
- Reviewed By:
-
- Location:
- /mit/athena-backup/doc/...
This document will describe the steps that need to be taken to move from using our current backup system to a situation where we are using the ABS. It will identify and resolve any issues related to needing to keep backup tapes in our archives that were written by two different generations of backup system.
Note 2: The phase-out of the older system has been specified by Systems Support: Since most tapes are recycled after six months, the old archives are dealt with by keeping a single server host with a tape drive that runs the old system and knows how to restore that form of backup.
The Server Operations portion of this plan will be formatted in the template that has been recently [May, 1995] approved for such plans by Athena Systems Support.
For the current template, see
/afs/dev/user/cfields/servers/template.{ps,dvi}
For an example of what this looks like for a recently transferred service, the Software License Wrapper server, see
/afs/athena.mit.edu/astaff/project/slw/src/doc/slwd.{ps,dvi}.
- Criteria:
- 1)
2)
3)
- Assigned to:
- Unassigned
- Reviewed By:
-
- Location:
- /afs/athena.mit.edu/astaff/project/athena-backup/doc/...
This document will describe the plan for ongoing support of the ABS by the developers, and clearly delineate the responsibilities of all parties involved in the system after production deployment. This document will address, inter alia, handling bug fixes, requests for enhancements, procedures related to the ABS test suite (maintained as part of the project), porting to newer platforms or operating systems, procedures for notifying Systems Support of new binaries as a result of bug fixes and requested enhancements, and assisting Athena Systems Support in updating ABS software or during operational emergencies.
There is no end-user component in the current system, and therefore we don't think there are any Support Process Deliverables.
Appendix Documentation plan
All ABS project documentation is located in the 'athena-backup' locker, readable by system:anyuser.
All documents will be available in PostScript format for maximum transportability.