BUSINESS CONTINUITY MANAGEMENT AND DISASTER RECOVERY
7CCSMSEM Security Management Dr. Jose M. Such
Learning Outcomes
– Business Continuity Management
– Business Continuity Plans
– Disaster Recovery Plans
Business Continuity Management
• Business Continuity Management (BCM) is a holistic management process that identifies potential impacts that threaten an organisation and provides a framework for building resilience and capability for an effective response that safeguards the interests of its key stakeholders, reputation, brand and value creation activities.
BCM’s Relationship with Risk Management
• BCM may arise from wider business and IT risk analyses
• BCM priorities must be defined by business managers and take into account the entirety of their business processes
• BCM priorities are defined by a Business Impact Assessment (BIA)
• Recall that the BIA was needed anyway to calculate asset value for risk assessment
• BCM focuses on how to recover once a threat materialises
Steps for a Business Continuity Plan
How can we get a Business Continuity Plan to work?
It’s a three-step process!
Assigning Responsibilities
Senior management must approve the plan!
This is likely to mean an appointment at boardroom or executive level to oversee the programme, and a second appointment to take it forward.
Establishing and Implementing the Plan
Scope, aim and objectives, and the activities required if the plan is triggered.
For example: what problems are likely to arise, and how can we deal with them?
Ongoing Management
Regular review of the continuity plan, because it can easily become out of date and no longer reflect real business operations.
Similar to the Plan-Do-Check-Act model!
What does a Business Continuity Plan involve?
Understanding the organisation and its risks!
Business Impact Analysis + Risk Assessment
Determining the Business Continuity Management Strategy
Identifying the actions needed to maintain the critical activities that support the organisation’s products and services
Developing and Implementing the BCM Response
How do we meet our expected recovery times? And what tactics can we deploy to protect resources?
Business Impact Analysis (extended)
A business impact analysis (BIA) predicts the consequences of disruption of a business function and process and gathers information needed to develop recovery strategies.
List products/services that can be disrupted
For each identified product, consider impact of disruption in terms of stakeholders and the organisation’s ability to meet its aims/objectives.
Recovery Time Objective (RTO)
Point in time at which each key product/service would need to be resumed in the event of a disruption.
Maximum length of disruption that can be tolerated before the business is interrupted.
In other words, if a service or key product is disrupted, how long will it take until the disruption is felt by the business in terms of profit or reputation? For example, you might have a 10-hour window before the information is required.
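As a worked illustration of the 10-hour example, a minimal Python sketch can turn an RTO into a concrete deadline and check an estimated recovery duration against it. The function names, the start time and the 6-hour and 12-hour estimates are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta


def resumption_deadline(disruption_start: datetime, rto: timedelta) -> datetime:
    """Latest point in time by which the product/service must be resumed."""
    return disruption_start + rto


def meets_rto(estimated_recovery: timedelta, rto: timedelta) -> bool:
    """True if the estimated recovery duration fits within the RTO."""
    return estimated_recovery <= rto


# Hypothetical figures: the disruption starts at 09:00 and the RTO is 10 hours.
start = datetime(2024, 1, 15, 9, 0)
rto = timedelta(hours=10)

deadline = resumption_deadline(start, rto)
print(deadline)                              # 2024-01-15 19:00:00
print(meets_rto(timedelta(hours=6), rto))    # True
print(meets_rto(timedelta(hours=12), rto))   # False
```

A recovery tactic whose estimated duration exceeds the RTO is a signal that the strategy for that product or service needs revisiting.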
Critical activities necessary to deliver key products / services
These are the activities that you will need to protect!
Business Impact Analysis (Final Step)
Quantify resources required over time to maintain critical activities at an acceptable level (and meet the RTO)
People, premises, technology, information and supplies/partners.
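One way to record this is as a table of resources needed at successive points after the disruption. The sketch below is illustrative only: the time bands, resource names and all figures are hypothetical and are not taken from any real BIA.

```python
# Hypothetical resource requirements for one critical activity,
# keyed by hours elapsed since the disruption.
requirements = {
    4: {"people": 3, "workstations": 3},
    24: {"people": 8, "workstations": 6},
    72: {"people": 12, "workstations": 10},
}

# Hypothetical day-to-day (business as usual) levels for comparison.
day_to_day = {"people": 10, "workstations": 10}


def peak_requirement(reqs: dict) -> dict:
    """Highest level of each resource needed at any point in time."""
    peak = {}
    for needs in reqs.values():
        for resource, amount in needs.items():
            peak[resource] = max(peak.get(resource, 0), amount)
    return peak


def exceeds_normal(peak: dict, normal: dict) -> list:
    """Resources whose peak recovery need is above the day-to-day level."""
    return sorted(r for r, amount in peak.items() if amount > normal.get(r, 0))


peak = peak_requirement(requirements)
print(peak)                              # {'people': 12, 'workstations': 10}
print(exceeds_normal(peak, day_to_day))  # ['people']
```

In this made-up case the recovery effort needs more people than normal operations do, which is the kind of gap the BIA is meant to surface in advance.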
Considerations:
Resources required after a disaster may exceed those needed for day-to-day work
If there is an appointed team in a large organisation, it is likely that no single individual knows the end-to-end plan.
Consult the entire organisation about the plan (workshops, questionnaires, interviews, etc.)
Think of consequences and not necessarily the event (i.e. “What do we do if the computer system in the office is damaged and no longer usable?” is better than “What do we do if a water pipe bursts?”)
Determining the BCM strategy
Identifying the actions that need to be taken to maintain the critical activities that underpin the delivery of your organisation’s products and services
How to meet the “RTO” for each activity!
What can we do to ensure the recovery time is satisfied?
Can we introduce new documentation for staff to read?
Do we need new procedures in place?
Tactics to protect resources
Should we maintain the same technology at several locations?
How do we secure and keep copies of data necessary to our business operations?
What can our suppliers do to help?
How do we report everything to stakeholders?
Developing and Implementing BCM response
Concerned with the development and implementation of appropriate plans and arrangements to ensure the management of an incident and the continuity and recovery of critical activities that support key products and services.
What plans do we need?
A single document (good for a small team), or it can be split into three plans: Incident Management, Business Continuity and Business Recovery.
Let’s consider a single plan for simplicity.
BCM Plan Content
Purpose, Scope and Content
What is this plan going to cover? What are we going to look at (e.g. is the scope limited to the building premises, or does it also cover roles within the organisation)? And what will this document contain?
Document Maintainer
You should document who owns the plan and who is responsible for reviewing, amending and updating it at regular intervals. A system of version control should also be adopted.
Plan Invocation
The method by which the plan is invoked should be clearly documented, setting out the individuals who have the authority to invoke the plan and under what circumstances. The plan should also set out the process for mobilising and standing down the relevant teams.
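The "who" and "under what circumstances" of invocation can be made explicit as a simple rule. The following is a minimal sketch, not a prescribed method: the role names and trigger circumstances are hypothetical placeholders that a real plan would replace with its own.

```python
# Hypothetical invocation rules; a real plan would name its own roles and triggers.
AUTHORISED_ROLES = {"BCM lead", "Deputy BCM lead", "Chief executive"}
TRIGGERS = {"premises inaccessible", "IT systems unavailable", "loss of key supplier"}


def can_invoke(role: str, circumstance: str) -> bool:
    """The plan may be invoked only by an authorised role, for a listed trigger."""
    return role in AUTHORISED_ROLES and circumstance in TRIGGERS


print(can_invoke("BCM lead", "premises inaccessible"))      # True
print(can_invoke("Receptionist", "premises inaccessible"))  # False
print(can_invoke("BCM lead", "printer out of paper"))       # False
```

Writing the rule down this plainly makes it easy to spot gaps, such as no deputy being authorised when the lead is unavailable.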
Roles and Responsibilities
The plan should list all individuals with a role in its implementation and explain what that role is.
Incident Management
Document required tasks to manage the initial phase of the incident and who is responsible for each task. Tasks include:
● Site evacuation; mobilisation of safety, first-aid or evacuation-assistance teams;
● Locating and accounting for those who were on site or in the immediate vicinity;
● On-going employee/customer communications and safety briefings;
● An up-to-date contact list / location of the plan;
● Identifying a robust room from which to manage the incident (i.e. a war room!)
Business continuity and recovery
In terms of business continuity and recovery:
● Set out the critical activities to be recovered;
● The timescales in which they are to be recovered and the recovery levels needed;
● The resources available at different points in time to deliver your critical activities;
● The process for mobilising these resources;
● Detail the actions and tasks needed to ensure the continuity and recovery of your critical activities.
Exercising BCM arrangements
This element of the recovery life-cycle ensures that an organisation’s arrangements are validated by exercise and review and that they are kept up to date.
Why exercise the arrangements?
You cannot consider any arrangements as reliable until they are tested! Exercising involves:
● Validating plans,
● Rehearsing key staff,
● Testing systems that are relied upon during an incident
How often do you exercise arrangements?
Depends on the organisation. Recommended at least once a year. Likely to find weaknesses that need to be improved for the next exercise.
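The recommended frequency of at least once a year can be tracked with simple date arithmetic. This is a minimal sketch under one stated assumption: "once a year" is modelled as a fixed 365-day interval. The dates used are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "at least once a year" is modelled as a 365-day interval.
EXERCISE_INTERVAL = timedelta(days=365)


def next_exercise_due(last_exercise: date) -> date:
    """Date by which the next exercise should have taken place."""
    return last_exercise + EXERCISE_INTERVAL


def is_overdue(last_exercise: date, today: date) -> bool:
    """True if more than one interval has passed since the last exercise."""
    return today > next_exercise_due(last_exercise)


last = date(2023, 1, 10)  # hypothetical date of the last exercise
print(next_exercise_due(last))              # 2024-01-10
print(is_overdue(last, date(2023, 12, 1)))  # False
print(is_overdue(last, date(2024, 6, 1)))   # True
```

An organisation that exercises more often would only need to shorten the interval.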
Four types of exercises
Discussion based exercise
Bring staff together and inform them about their responsibilities. Discuss with staff to identify problems and solutions.
Testing
Not everything can be tested, but you can consider the contact list, the activation process and the hardware that is relied upon (e.g. communication lines, power supply, etc.)
See KEATS for CIS Table-top Exercises
Table-top exercise (i.e. think board game)
Bring staff together to make decisions as events unfold, in much the same way as if the incident had actually happened. Roundtable format; might last from two hours to half a day. The benefit is that it generates a high level of realism and lets everyone get to know each other.
Live exercise
Live exercises are a necessity for components such as evacuation that cannot be tested effectively in any other way. While single component tests are relatively easy to set up, full tests are much more complex and can be costly.
DISASTER RECOVERY
• Disaster Recovery is that part of business continuity that addresses the need to recover IT services and voice services and data following a business-threatening impact
• Disaster Recovery prioritises those services and information that are critical to the business
• Disaster Recovery includes planning for crisis situations and having in place the means to identify incidents, contain them and recover from them
Main goals for a disaster recovery plan
To minimize interruptions to the normal operations.
In other words, maintain business operations as normal.
To limit the extent of disruption and damage.
A disaster can severely damage (or destroy) an organisation.
To minimize the economic impact of the interruption.
If you can’t access the office for two weeks, can you still make profit?
To establish alternative means of operation in advance.
Signing contracts with back-up establishments or keeping redundant copies of the data for later use.
To train personnel with emergency procedures.
One big question is where the emergency procedures are kept. What if they can’t be accessed? Should staff be aware of them in advance?
To provide for smooth and rapid restoration of service.
Prevent long-term fallout for the business!
What is involved in a Disaster Recovery Plan?
IT services:
Which business processes are supported by which systems? What are the risks?
People:
Who are the stakeholders, on both the business and IT side, in a given DR process?
Suppliers:
Which external suppliers would you need to contact in the event of an IT outage? Your data recovery provider, for example.
Locations:
Where will you work if your normal premises are rendered inaccessible?
Testing:
How will you test the DR plan? Will you simulate parts of it during the weekend? Will you try “brown-envelope” training?
Training:
What training and documentation will be provided to end users?
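The question of which business processes are supported by which systems can be captured as a simple dependency map. The sketch below is illustrative only: every process and system name in it is hypothetical.

```python
# Hypothetical mapping of business processes to the IT systems they rely on.
DEPENDENCIES = {
    "payroll": {"hr_database", "email"},
    "order processing": {"web_shop", "payment_gateway", "email"},
    "customer support": {"ticketing", "email"},
}


def affected_processes(failed_system: str) -> list:
    """Business processes that depend on the failed system, in alphabetical order."""
    return sorted(
        process
        for process, systems in DEPENDENCIES.items()
        if failed_system in systems
    )


print(affected_processes("email"))
# ['customer support', 'order processing', 'payroll']
print(affected_processes("payment_gateway"))
# ['order processing']
```

A system that appears under many processes, like the email system here, is a natural candidate for high priority in the DR plan.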
Structuring the Disaster Recovery Plan
Introduction:
A summary of the objectives and scope of the plan, including IT services and locations covered, the different services, and testing and maintenance activities. Also includes a revision history to track changes.
Roles and responsibilities
A list of the internal and external stakeholders involved in each DR process covered, complete with their contact details and a description of their duties.
Incident response:
When should the DR plan be triggered, and how and when should employees, management, partners and customers be notified?
DR procedures:
When the DR plan is triggered, the stakeholders can start to action a DR process for each affected IT service. Those procedures are set out step-by-step.
Appendices:
A collection of any other lists, forms and documents relevant to the DR plan, such as details on alternate work locations, insurance policies, and the storage and distribution of DR resources.
Example: Step-by-step plan for Mobile Site
What if the office is no longer accessible due to the weather (flood) or another threat (terrorist attack or meteorite strike)?
One plan of action might be to move the team to a “mobile site” such as a trailer… while the office is out of operation!
But how do we “plan” for such an event?
Step 1: Notify senior management and prepare purchase for backup equipment
1. Notify ____________ of the nature of the disaster and the need to select the mobile site plan.
2. Confirm in writing the substance of the telephone notification to ____________ within 48 hours of the telephone notification.
3. Confirm all needed backup media are available to load the backup machine.
4. Prepare a purchase order to cover the use of backup equipment. Notify ____________ of plans for a trailer and its placement (on ____________ side of ____________).
Step 2: Do communication lines/channels need to be re-routed?
1. Depending on communication needs, notify telephone company (____________) of possible emergency line changes.
2. Begin setting up power and communications at ____________.
a. Power and communications are prearranged to hook into when trailer arrives.
b. At the point where telephone lines come into the building (____________), break the current linkage to the administration controllers (____________).
c. These lines are rerouted to lines going to the mobile site. They are linked to modems at the mobile site.
d. The lines currently going from ____________ to ____________ would then be linked to the mobile unit via modems.
e. This can conceivably require ____________ to redirect lines at ____________ complex to a more secure area in case of disaster.
Step 3: Get mobile site up and running!
1. When the trailer arrives, plug into power and do necessary checks.
2. Plug into the communications lines and do necessary checks.
3. Begin loading system from backups.
4. Begin normal operations as soon as possible:
a. Daily jobs
i. Job 1
ii. Job 2
b. Daily saves
i. Saving the following data onto…
c. Weekly saves
i. Saving the following data onto…
Step 4: Plan to transition site back to home-base when it is ready
1. Plan a schedule to back up the system in order to restore it on a home-base computer when a site is available. (Use regular system backup procedures.)
2. Secure mobile site and distribute keys as required.
3. Keep a maintenance log on mobile equipment.
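During an incident it helps to know which step comes next. The sketch below treats the four steps of this example as an ordered checklist; the step titles are paraphrased from the example, while the tracking logic and the function name are hypothetical.

```python
from typing import Optional

# Step titles follow the mobile-site example; the tracking logic is a hypothetical sketch.
STEPS = [
    "Notify senior management and prepare purchase for backup equipment",
    "Re-route communication lines/channels if needed",
    "Get mobile site up and running",
    "Plan to transition site back to home-base",
]


def next_step(completed: set) -> Optional[str]:
    """First step, in order, that has not been completed; None if all are done."""
    for number, title in enumerate(STEPS, start=1):
        if number not in completed:
            return f"Step {number}: {title}"
    return None


print(next_step(set()))         # Step 1: Notify senior management and prepare purchase for backup equipment
print(next_step({1, 2}))        # Step 3: Get mobile site up and running
print(next_step({1, 2, 3, 4}))  # None
```

Keeping the order explicit matters here because later steps, such as loading the system from backups, depend on earlier ones having been completed.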
CERTs
• A Computer Emergency Response Team (CERT) is an expert group that handles computer security incidents.
• CERT in the UK
• National Cyber Security Centre (NCSC), part of GCHQ
• https://www.ncsc.gov.uk
• Incident Management Guidance
• https://www.ncsc.gov.uk/collection/10-steps-to-cyber-security?curPage=/collection/10-steps-to-cyber-security/the-10-steps/incident-management
Standards?
Access to BCM and DR plans is critical. There is no point spending money on planning if the plans cannot be accessed during an emergency!
ISO 22301
Societal Security — Business continuity management systems — Requirements
Specifies the requirements for a management system to protect against, reduce the likelihood of, and ensure your business recovers from disruptive incidents
ISO 22313
Societal Security — Business continuity management systems — Guidance
Provides guidance based on good international practice for planning, establishing, implementing, operating, monitoring, reviewing, maintaining and continually improving a documented management system.
It enables organizations to prepare for, respond to and recover from disruptive incidents when they arise.