IT Manager's Handbook: Getting Your New Job Done

Authors: Bill Holtsnider,Brian D. Jaffe

Don't be surprised if your testing isn't 100% successful. Developing and testing disaster recovery plans are iterative processes. Also, remind yourself that it is certainly better to identify problems during a test than during a real crisis.

After the Disaster

An often-forgotten aspect of disaster recovery planning is the set of steps required once the disaster abates and things can return to normal. While in disaster-recovery mode, you've been running critical systems and applications out of your disaster recovery facility. When the disaster is over and you can return to your primary facility, you need a plan to get data from the disaster recovery facility (which now has the most current data) back to your primary site.

The process will probably be somewhat similar to the one you used to get then-current data to your disaster recovery facility (duplication of transaction logs, tapes, data replication, etc.). Of course, this process has to be done after you've repaired any damage that may have occurred to the primary facility, or the systems it contains.

Regional and Catastrophic Disasters

Many disasters affect an entire region, whether they result from the forces of nature or from infrastructure failures (e.g., Hurricane Katrina in 2005 and the Northeast blackout of 2003). As such, if your organization suffers a disaster, it's likely that many other organizations in your geographic area will also suffer, and they may be scrambling for the same recovery resources that you're planning to rely on. For example:


The local computer retailer may be rushed by other organizations trying to get their hands on hardware.

Hotel rooms may be sold out.

Because telecommunications companies will be working around the clock to get regular services restored, your plans to install temporary lines may not pan out.

The very vendors and service providers you had planned on may not be functioning because they are dealing with the same disaster you are.

When crafting your disaster recovery plan, keep this in mind and incorporate some non-local options for each category of resource in the plan.
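One way to check a plan for this weakness is to tag each recovery resource with a category and a location flag, then list any category that has only local providers. The following is a minimal sketch, not a prescribed method: the `Provider` record, its `is_local` flag, the category names, and the sample providers are all hypothetical and would need to be replaced with the contents of a real plan.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """A vendor or service the recovery plan relies on (hypothetical fields)."""
    name: str
    category: str   # e.g., "hardware", "lodging", "telecom"
    is_local: bool  # True if the provider is in the same region as the primary site


def categories_without_nonlocal_option(providers):
    """Return the sorted categories for which every listed provider is local."""
    has_nonlocal = defaultdict(bool)
    for p in providers:
        # A category passes as soon as one of its providers is outside the region.
        has_nonlocal[p.category] = has_nonlocal[p.category] or not p.is_local
    return sorted(c for c, ok in has_nonlocal.items() if not ok)


# Illustrative data only; these are not real vendors.
plan = [
    Provider("Downtown computer store", "hardware", is_local=True),
    Provider("Out-of-state reseller", "hardware", is_local=False),
    Provider("Airport hotel", "lodging", is_local=True),
    Provider("Regional carrier", "telecom", is_local=True),
]

gaps = categories_without_nonlocal_option(plan)
print(gaps)  # ['lodging', 'telecom']
```

Any category the check reports is one where a regional disaster could leave you competing with every other affected organization for the same resource.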

Finally, it's important to remember that in the event of a truly catastrophic disaster, employees' priorities will quickly shift from trying to aid their employer to trying to aid their families. For dealing with the human side of disasters, see the next section.

The ACT Model

Most of the chapter has focused on the mechanics and technologies of dealing with a crisis. However, it's also wise to plan for individuals (staff, employees, etc.) during a crisis.

Consider the guidelines of the ACT model:

A: Acknowledge and name the trauma. Have the courage to use names and the real words describing what happened, including the word “death” (if warranted, of course). It's important to validate the accuracy of information and share only what is known for sure. Know the facts and don't speculate. (New York City Mayor Rudy Giuliani's refusal to speculate on the cost in human lives during the 9/11 disaster was an excellent example of the value of waiting until all the facts were in. He was asked about the toll repeatedly for many days, but refused to give a number until he knew the facts.) Straying from the script can be very harmful. Personally acknowledge the incident in order to position company management as also being impacted by the event, thereby aligning leaders with the staff.

C: Communicate both competence and caring. Competence and caring aren't mutually exclusive! Demonstrate expertise in dealing with the issue and express compassion for the personal impact on those who are affected.

T: Transition. Communicate an expectation of recovery. Show sensitivity and flexibility as people return to life and return to work. Communicate clearly that leadership will take steps to help people transition back to health and productivity. Identify internal and external resources for additional information and support. Be especially visible and accessible to employees for support and information.

The ACT model focuses on direct and honest communications and demonstrating sensitivity.

9.3 A Word about Incident Response, Business Continuity, and Disaster Recovery

Incident Response, Business Continuity Planning (BCP), and Disaster Recovery (DR) are often used as interchangeable terms. Although the three types of planning are similar, there are differences.


Incident Response: This is the name for the immediate steps taken to ensure the protection of people. The focus is on employee safety and well-being. Incident response includes simple building evacuation procedures, communications, accounting for all employees to ensure that they are safe, and determining next steps like initiating business continuity and disaster recovery plans.

Business Continuity Planning: BCP is a methodology used to create a plan for how an organization will resume partially or completely interrupted critical function(s) within a predetermined time after a disaster or disruption. BCP helps prioritize recovery activities for critical functions. BCP may be a part of a larger organizational effort to reduce operational risk, and thus has a number of overlaps with the practice of risk management.

Disaster Recovery: DR is the ability to establish (or restart) an infrastructure for resuming critical functions after a disaster.

These areas do overlap and, to a certain degree, are interdependent. Most confusion rests with the terms “business continuity planning” and “disaster recovery.” To help differentiate between the two, it's best to think of disaster recovery as those areas and issues related to the IT environment and infrastructure (data center, servers, access, data, applications, etc.). The remaining items, the business's day-to-day function, would generally fall into the area of business continuity planning.

Business continuity essentially picks up where disaster recovery leaves off. In a perfect world you have a parallel data center that is a mirror image of everything in your primary data center. When disaster hits, you can invoke your disaster recovery plans to bring the parallel data center online with all applications and no loss of data. Then business continuity comes into play as the user departments attempt to continue functioning. “Continuing their normal function” becomes quite challenging if there is no electricity and/or phones, travel is severely curtailed, people can't get to the office, etc.

For example, disaster recovery planning would cover the following issues:


Secondary data centers

Off-site storage of backup tapes

Redundancy/availability of technical resources

Replication of data

Procedures and resources for installing applications and related systems and making them available

UPS and generator solutions

BCP would cover issues such as:


Communication plans

Lists of cell phone numbers

Plans for communicating to the employee population at large, as well as those critical in a disaster scenario (emergency phone numbers, websites)

Where people will meet to continue to conduct business if the office is unavailable

Locations of nearby hotels, restaurants, and other services if the need arises

Plans for continuing key business operations and working with critical customers and partners (banks, suppliers, etc.) during a crisis

Copies of vital files and information

Availability of cash

What would the sales department do? Could/should they continue to try to sell products? How would they reassure customers?

How would the accounting department make sure that the company's suppliers are paid and that receivables are collected?

How would customers, suppliers, partners, clients, investors, and employees be kept informed?

Could payroll continue to pay employees during the disaster?

As a general rule, the IT department's greater priority is disaster recovery, while user departments are generally concerned about BCP.

9.4 The Hidden Benefits of Good Disaster Recovery Planning

One seldom recognized value of good disaster recovery planning is that it often helps tremendously with regular, everyday corporate process execution. Good disaster recovery makes for good everyday business performance.


You need to know the details of every piece of hardware in your company anyway: where it is, how old it is, what applications/services it provides, who supports it, etc. But if you can't find the time to do this kind of inventory on a regular basis, you should find the time to do it for a formal disaster recovery plan.

If you don't have current contact lists for people outside your direct reports, here is a chance to create one and keep it current.

Naturally, you should have very complete data backup procedures. But if you haven't yet set up off-site backup, creating a disaster recovery plan can force you (or higher-ups, once you put it in the context of disaster recovery) to find the time and money to meet this basic need.
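The hardware inventory described above can be kept as a simple structured record, which makes gaps easy to spot. The following is a minimal sketch under stated assumptions: the `HardwareAsset` field names, the asset tags, and the sample entries are hypothetical illustrations, not a format the plan requires.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class HardwareAsset:
    """One inventory entry; the field names are illustrative assumptions."""
    asset_tag: str
    location: str                                  # where it is
    purchased: date                                # used to work out how old it is
    services: list = field(default_factory=list)   # applications/services it provides
    support_contact: str = ""                      # who supports it

    def age_in_years(self, today):
        """Whole years elapsed between the purchase date and `today`."""
        years = today.year - self.purchased.year
        if (today.month, today.day) < (self.purchased.month, self.purchased.day):
            years -= 1
        return years


def incomplete_entries(assets):
    """Return asset tags missing a location, a service list, or a support contact."""
    return [a.asset_tag for a in assets
            if not (a.location and a.services and a.support_contact)]


# Illustrative data only.
inventory = [
    HardwareAsset("SRV-001", "Primary data center, rack 4", date(2019, 3, 15),
                  ["payroll database"], "dba-team@example.com"),
    HardwareAsset("SRV-002", "", date(2021, 11, 2), ["file shares"], ""),
]

print(incomplete_entries(inventory))                   # ['SRV-002']
print(inventory[0].age_in_years(date(2024, 3, 14)))    # 4
```

Running a completeness check like this on a schedule serves both purposes at once: it keeps the everyday inventory current and keeps the disaster recovery plan's hardware list trustworthy.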

Chapter 10

Working with Users

Pay attention to what users do, not what they say.

Jakob Nielsen
