The University has recognized the importance of each unit producing and maintaining a Disaster Recovery Plan (also known as a business continuity or contingency plan) that addresses how the unit will continue doing business in the event of a severe disruption or disaster. The Disaster Recovery Planning Team, coordinated by the Client Advocacy Office (CAO), will be the primary resource for assisting each unit with the DRP initiative by providing education, awareness, and tools.
The team will work to identify, collect, and organize information and tools for disaster recovery planning and documentation, and to disseminate that information to University units in an effective and easily understood manner, so that unit plans can be promptly developed, tested, and distributed, with a copy provided to the CAO for central tracking purposes. After the initial effort, the responsibility for providing support will transition from the DRP Team to the Client Advocacy Office.

Definitions:
Business Continuity is an all-encompassing term covering both disaster recovery planning and business resumption planning. Disaster Recovery is the ability to respond to an interruption in services by implementing a plan to restore an organization’s critical business functions. Both are differentiated from Loss Prevention Planning, which comprises regularly scheduled activities such as system back-ups, system authentication and authorization (security), virus scanning, and system usage monitoring (primarily for capacity indications). The primary focus of this effort is on Disaster Recovery Planning.
Developing the Plan:

The following ten steps, more thoroughly described in the document that follows, generally characterize Disaster Recovery Plans:

Purpose and Scope for a Unit Disaster Recovery Plan

The primary reason for a unit to engage in business continuity and contingency planning (also known as “disaster recovery” planning) is to ensure the ability of the unit to function effectively in the event of a severe disruption to normal operations. Severe disruptions can arise from several sources: natural disasters (tornadoes, fire, flood, etc.), equipment failures, process failures, mistakes or errors in judgment, and malicious acts (such as denial of service attacks, hacking, viruses, and arson, among others). While the unit may not be able to prevent any of these from occurring, planning enables the unit to resume essential operations more rapidly than if no plan existed.

Before proceeding further, it is important to distinguish between loss prevention planning and disaster recovery planning. The focus of loss prevention planning is on minimizing a unit’s exposure to the elements of risk that can threaten normal operations.
In the technology realm, unit loss prevention planning includes such activities as providing for system back-ups, making sure that passwords remain confidential and are changed regularly, and ensuring that operating systems remain secure and free of viruses. Disaster recovery planning focuses on the set of actions a unit must take to restore service and normal (or as nearly normal as practical) operations in the event that a significant loss has occurred.

A systematic disaster recovery plan does not focus unit efforts and planning on each type of possible disruption. Rather, it looks for the common elements in any disaster (i.e., loss of information, loss of personnel, loss of equipment, loss of access to information and facilities) and seeks to design the contingency program around all main activities the unit performs. The plan will specify the set of actions to implement for each activity in the event of any of these disruptions, so that the unit can resume doing business in the minimum amount of time.

Disaster Recovery Planning consists of three principal sets of activities.
1. Identifying the common elements of plausible disruptions that might severely disrupt critical or important unit operations.
2. Anticipating the impacts and effects that might result from these operational disruptions.
3. Developing and documenting contingent responses so that recovery from these interruptions can occur as quickly as possible.
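The "common elements" approach described above can be pictured as a coverage table: one row per unit activity, one column per type of loss, and a documented response in every cell. The following is a minimal sketch of such a table with a completeness check. The activity names, the response text, and the function name `missing_responses` are hypothetical and are used only for illustration; they do not come from any University template.

```python
# Illustrative sketch only. Activities and responses are hypothetical examples.
# The four loss types are the common elements named in the text above.
LOSS_TYPES = ("information", "personnel", "equipment", "access")

plan = {
    "payroll processing": {
        "information": "Restore payroll files from the offsite backup.",
        "personnel": "Hand off to the cross-trained backup processor.",
        "equipment": "Run payroll on a borrowed workstation.",
        "access": "Relocate staff to the alternate site.",
    },
    "student advising": {
        "information": "Restore advising records from the offsite backup.",
        "personnel": "Reassign appointments to another advisor.",
        # No response documented yet for loss of equipment or access.
    },
}


def missing_responses(plan):
    """Return {activity: [loss types that have no documented response]}."""
    gaps = {}
    for activity, responses in plan.items():
        missing = [loss for loss in LOSS_TYPES if not responses.get(loss)]
        if missing:
            gaps[activity] = missing
    return gaps


if __name__ == "__main__":
    for activity, losses in missing_responses(plan).items():
        print(f"{activity}: no response documented for loss of {', '.join(losses)}")
```

A check of this kind is only a planning aid: it shows where the written plan is silent, not whether the documented responses would actually work.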
The major outcome of a Unit Disaster Recovery Planning Project is the development of a unit plan. The plan benefits the unit in that it:
• Establishes the criteria and severity of a disruption based on the impact the disruption will cause to the unit’s critical functions.
• Determines critical functions and systems, and the associated durations required for recovery.
• Determines the resources required to support those critical functions and systems, and defines the requirements for a recovery site.
• Identifies the people, skills, resources and suppliers needed to assist in the recovery process.
• Identifies the vital records that must be stored offsite to support resumption of unit operations.
• Documents the appropriate procedures and the information required to recover from a disaster or severe disruption.
• Addresses the need to maintain the currency of the plan’s information over time.
• Addresses testing the documented procedures to ensure their completeness and accuracy.

Objective and Goals for a Disaster Recovery Planning Project

The primary objective of any contingency plan is to ensure the ability of the unit to function effectively in the event of an interruption due to the loss of information, loss of personnel, or loss of access to information and facilities. The goals for contingency planning are to provide for:
• The continuation of critical and important unit operations in the event of an interruption.
• The recovery of normal operations in the event of an interruption.
• The timely notification of appropriate unit and university officials in a predetermined manner as interruption severity or duration escalates.
• The offline backup and availability, or alternative availability, of critical components, including: Data files, Software, Hardware, Voice and Data Communications, Documentation, Supplies and forms, People, Inventory Lists.
• An alternate method for performing activities electronically and/or manually.
• Any required changes in user methods necessary to accomplish such alternate means of processing.
• The periodic testing of the plan to ensure its continuing effectiveness.
• Documentation on the business unit’s plan for response, recovery, resumption, restoration, and return after severe disruption.

Contingency planning seeks to accomplish the goals above while minimizing certain exposures to risks that may impact the recovery and business resumption process, including:
• The number of decisions that must be made following a disaster or severe disruption.
• Single point of failure conditions in the unit infrastructure.
• Dependence on the participation of any specific person or group of people in the recovery process.
• The lack of available staff with suitable skills to effect the recovery.
• The need to develop, test, or debug new procedures, programs or systems during recovery.
• The adverse impact of lost data, recognizing that the loss of some transactions may be inevitable.

Conducting the Business Disaster Planning Project

There are three phases of a Disaster Recovery Planning Project.
• Phase I gathers the information needed to identify critical systems, potential impacts and risks, resources, and recovery procedures.
• Phase II is the actual writing and testing of the Disaster Recovery Plan.
• Phase III is ongoing and consists of plan maintenance and audits.
I. Information Gathering

Step One – Organize the Project

The scope and objectives of the plan and the planning process are determined, a coordinator is appointed, the project team is assembled, and a work plan and schedule for completing the initial phases of the project are developed.

Step Two – Conduct Business Impact Analysis

Critical systems, applications, and business processes are identified and prioritized. Interruption impacts are evaluated and planning assumptions, including the physical scope and duration of the outage, are made.

Step Three – Conduct Risk Assessment
The physical risks to the unit are defined and quantified. The assessment identifies the vulnerability of the critical systems by examining physical security, backup procedures and/or systems, data security, and the likelihood of a disaster occurring. By definition, risk assessment is the process of not only identifying but also minimizing the exposures to certain threats that an organization may experience. While gathering information for the DRP, system vulnerability is reviewed and a determination is made either to accept the risk or to make modifications to reduce it.

Step Four – Develop Strategic Outline for Recovery
Recovery strategies are developed to minimize the impact of an outage. Recovery strategies address how the critical functions identified in the Business Impact Analysis (Step Two) will be recovered, to what level resources will be required, the period in which the functions will be recovered, and the role central University resources will play in augmenting or assisting unit resources in effecting timely recovery. The recovery process normally consists of these stages:
1. Immediate response
2. Environmental restoration
3. Functional restoration
4. Data synchronization
5. Restoration of business functions
6. Interim site
7. Return home

Step Five – Review Onsite and Offsite Backup and Recovery Procedures

Vital records required for supporting the critical systems, data center operations, and other priority functions identified in the Business Impact Analysis are verified, and the procedures needed to recover them and to reconstruct lost data are developed. In addition, the review of the procedures to establish and maintain offsite backup is completed. Vital records include everything from the libraries, files, and code to forms and documentation.

Step Six – Select Alternate Facility
This step addresses determining recovery center requirements, identifying alternatives, and recommending and selecting an alternate facility or site. Consideration should be given to the use of University resources (e.g., Administrative Information Services, Computer Lab, or another unit) as alternative sites before seeking outside solutions. For further information on alternative University sites, please contact the Client Advocacy Office at 517-353-4856.

II. Writing and Testing the Plan

Step Seven – Develop Recovery Plan

This phase centers on documenting the actual recovery plan.
This includes documenting the current environment as well as the recovery environment and the action plans to follow at the time of a disaster or severe disruption, specifically describing how recovery (as defined in the strategies) is accomplished for each system and application.

Step Eight – Test the Plan

A test plan/strategy is developed for each recovery application as well as for the operating environment. The plans and the assumptions made are tested for completeness and accuracy. Modifications are made as necessary based on the results of the testing. This portion of the project continues for the life of the plan.
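Because testing continues for the life of the plan, a unit may find it useful to keep a simple record of test results and flag the parts of the plan whose latest test failed. The following is a minimal sketch of such a record, not a prescribed format. The class `RecoveryTest`, the function `needs_plan_changes`, and all sample systems and dates are hypothetical.

```python
# Illustrative sketch only. Names, systems, and dates are hypothetical.
from dataclasses import dataclass
from datetime import date


@dataclass
class RecoveryTest:
    target: str        # application or operating environment that was tested
    tested_on: date    # date the test was run
    passed: bool       # True if recovery met the plan's assumptions
    notes: str = ""    # what went wrong, if anything


def needs_plan_changes(results):
    """Return the targets whose most recent test failed, in alphabetical order."""
    latest = {}
    # Process tests oldest first so the last entry kept is the most recent.
    for result in sorted(results, key=lambda r: r.tested_on):
        latest[result.target] = result
    return sorted(t for t, r in latest.items() if not r.passed)


results = [
    RecoveryTest("registration system", date(2024, 3, 1), False, "backup tape unreadable"),
    RecoveryTest("registration system", date(2024, 4, 1), True),
    RecoveryTest("email server", date(2024, 4, 10), False, "contact list out of date"),
]

if __name__ == "__main__":
    for target in needs_plan_changes(results):
        print(f"Plan section for '{target}' needs modification and a retest.")
```

In this sample data the registration system failed an earlier test but passed the latest one, so only the email server is flagged for modification.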