Friday, July 31, 2009

Why restore and herd when you can simply pre-empt and manage?

Val Slastnikov, Digitizer Group

This free report for "C" level Executives explains the concepts of Disaster Recovery and Business Continuity Planning and offers superior solutions to maximize the effectiveness of your company's DRP and BCP strategies.

Introduction. Still polishing your restoration skills?

Hurricanes, tornados, pandemics, terrorist attacks…What’s next?

The new millennium has not only brought us many exciting new technologies but many different catastrophic events that can cause large-scale disruption to business operations.

True disaster recovery and business continuity plans used to be only for large multi-billion dollar customers. Nowadays no company should be without them.

Every company wants its systems to be as resilient as possible. Disaster prevention, overall reliability, disaster mitigation, and recovery capabilities and performance are extremely important issues. However, the way companies address and manage these issues is even more important these days.

Why?

Because restoring your systems after any disaster takes time and effort. In fact, systems restoration is a professional skill and, as such, should be handled by professionals – not your company resources.

Just like restoring a vintage Corvette, bringing your IT systems back to life after failure can be time-consuming and costly. And while IT systems may not be a prized automobile, businesses can’t afford losing their most important assets – especially when times are hard.

Large corporations spend millions of dollars trying to put together the best and most cost-effective backup and recovery solutions. It is estimated that most large companies spend between 2% and 4% of their IT budget on disaster recovery planning, with the aim of avoiding larger losses in the event that the business cannot continue to function due to loss of IT infrastructure and data. Of companies that had a major loss of business data, 43% never reopen, 51% close within two years, and only 6% survive long-term.
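To put the 2% to 4% estimate into dollars, here is a minimal sketch of the arithmetic. The function name `dr_budget_range` and the $5 million IT budget are my own illustrative assumptions, not figures from any real company.

```python
def dr_budget_range(it_budget, low_share=0.02, high_share=0.04):
    """Return the (low, high) disaster recovery spend implied by a 2%-4% share."""
    return it_budget * low_share, it_budget * high_share


# Hypothetical annual IT budget of $5 million (an assumption for illustration).
low, high = dr_budget_range(5_000_000)
print(f"Estimated DR planning spend: ${low:,.0f} to ${high:,.0f}")
```

Plug in your own IT budget to see what the same share would mean for your company.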

But what if you are a small or medium size business and you don’t have a huge budget (sometimes any budget) to spend on those “unpredictable things”?

Experts say that merely backing up your data to a tape or disk at the end of the day, and having a basic data recovery procedure printed on a sheet of paper and stashed away somewhere for a “rainy day” and “just in case”, is simply not enough. You have to be pre-emptive in your approach. That means planning ahead of time and becoming intentional in your efforts, especially now, when every minute and every dollar you spend on technology counts.

So what do you need to do to become pre-emptive?

1. Definitions and components of DRP and BCP

Let’s start by defining Disaster Recovery and Business Continuity Planning. Then we will review the planning components and look into how we can make our planning really pre-emptive.
According to Wikipedia, “Disaster recovery is the process, policies and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster. Disaster recovery planning is a subset of a larger process known as business continuity planning and should include planning for resumption of applications, data, hardware, communications (such as networking) and other IT infrastructure.”

Having access to corporate data is one of the most important pieces of a proper disaster recovery plan. Even more important is that the data must be as current as possible and easily accessible.

Many customers are frustrated with the traditional methods of backing up data to tape, sending the tapes off-site, and then having to deal with the recovery process in the event of a major disaster. The ability to use disk-to-disk replication that automatically replicates data makes recovery processes much easier; however, these systems typically require expensive high-bandwidth circuits. The fact that true disaster recovery operations should be located in different geographic areas makes these high-bandwidth circuits even more expensive. For example, a customer based in Toronto may want to have its data recovery center hosted in Dallas, Texas so that a regional disaster does not impact both the headquarters office and the disaster recovery location. The cost incurred by the increased circuit distance can sometimes be enough to cancel an entire disaster recovery project.

So, what are the components that a disaster recovery planner should keep in mind when putting together their organization's business continuity plan (BCP)?

In order to be successful, the following key components must be considered in a BCP:

1. What type of disasters to avoid? (such as Natural, i.e. “The Acts of God” or Man-Made, i.e. caused by a human error or intentional harm)

2. Security holes, i.e. vulnerabilities in hardware and software

3. Control measures (or simply controls), i.e. steps or mechanisms that can reduce or eliminate computer security threats.  The definition of ‘Control measures’ comes from Enterprise Risk Management and specifies 3 types of such measures:

- Preventive measures (steps helping to avoid disastrous events from happening)
- Detective measures (steps designed to detect or discover unwanted events)
- Corrective measures (steps of correcting or restoring the systems after disaster takes place)

4. Strategies, i.e. procedures outlined in the organization's business continuity plan, which should indicate the key metrics of recovery point objective (RPO) and recovery time objective (RTO) for various business processes and the underlying IT systems and infrastructure that support those processes.

Needless to say, Recovery Point Objective (RPO) and Recovery Time Objective (RTO) are extremely important. Just look at the definitions:

According to Wikipedia, “Recovery Point Objective (RPO) describes the acceptable amount of data loss measured in time. The Recovery Point Objective (RPO) is the point in time to which you must recover data as defined by your organization. This is generally a definition of what an organization determines is an "acceptable loss" in a disaster situation. If the RPO of a company is 2 hours and the time it takes to get the data back into production is 5 hours, the RPO is still 2 hours. Based on this RPO the data must be restored to within 2 hours of the disaster.”

Again, according to Wikipedia, “The Recovery Time Objective (RTO) is the duration of time and a service level within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity. It includes the time for trying to fix the problem without a recovery, the recovery itself, tests and the communication to the users.”
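The two definitions are easier to tell apart with numbers in hand. Below is a minimal sketch, not a definitive method: the class and function names are my own, the 2-hour RPO and 5-hour restore come from the Wikipedia example above, and the 8-hour RTO and the other durations are hypothetical. It assumes periodic backups, so the worst-case data loss equals one full backup interval.

```python
from dataclasses import dataclass


@dataclass
class RecoveryObjectives:
    rpo_hours: float  # acceptable data loss, measured in time
    rto_hours: float  # time within which the business process must be restored


def meets_rpo(objectives, backup_interval_hours):
    # Worst case: the disaster strikes just before the next backup,
    # so up to one full backup interval of data is lost.
    return backup_interval_hours <= objectives.rpo_hours


def meets_rto(objectives, fix_attempt_hours, recovery_hours,
              test_hours, communication_hours):
    # Per the definition, RTO covers the attempt to fix the problem,
    # the recovery itself, the tests and the communication to the users.
    total = (fix_attempt_hours + recovery_hours
             + test_hours + communication_hours)
    return total <= objectives.rto_hours


# RPO of 2 hours as in the quoted example; the 8-hour RTO is an assumption.
objectives = RecoveryObjectives(rpo_hours=2, rto_hours=8)

print("Nightly backup meets RPO:", meets_rpo(objectives, backup_interval_hours=24))
print("Hourly backup meets RPO:", meets_rpo(objectives, backup_interval_hours=1))
print("5-hour restore meets RTO:", meets_rto(objectives, 1, 5, 1, 0.5))
```

Notice that the 5-hour restore counts against the RTO only. It does not change the RPO, which stays at 2 hours no matter how long the recovery takes.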

As I’ve mentioned earlier, these two criteria are mission-critical, and not only because they deal with the time, money and functionality of the company. If not set up properly, these same criteria may determine whether a company closes down and never reopens after a disaster, closes within two years, or survives and bounces back to full functionality.

This brings us to 2 very important issues and 2 Big Questions we should ask ourselves…

2. Major Issues and 2 Big Questions

Let’s review what we have achieved thus far:

We have defined what Disaster Recovery and Business Continuity measures must entail.
We have looked at important components of a successful Business Continuity Plan (that is, the one that helps the company to bounce back and restore its business processes and underlying systems to its full functionality with minimum downtime).

Now let’s name the 2 Major issues when it comes to Data Management and Data Recovery.

Major Issue # 1: Effectiveness.

Overall effectiveness of your Business Continuity Plan means being able to develop effective strategies in a lot of different areas, such as:

- effective data backup strategies
- effective data replication strategies
- effective data mirroring strategies
- effective redundancy strategies
- effective power backup strategies
- effective security strategies
- effective anti-virus strategies
- effective fire and other disaster prevention strategies.

That brings us to Big Question # 1:

How to improve your DRP and BCP initiatives without increasing the associated costs?

Major Issue # 2. Data Proliferation

Every business these days creates an unprecedented amount of data. Both private and public companies continue to generate data at an unprecedented rate, and usability problems result from attempting to store and manage that data. While initially these problems were primarily associated with paper documentation, in recent years data proliferation has become a major problem in storing data on computers as well.

While digital storage has become cheaper, the associated costs, from raw power to maintenance and from metadata to search engines, have not kept up with the proliferation of data. Although the power required to maintain a unit of data has fallen, the cost of facilities which house the digital storage has tended to rise.

That brings us to Big Question # 2:

How to address the issue of ever-growing Storage needs without increasing the associated costs?

3. Quiz: “7 Tiers of Disaster Recovery”

The Seven Tiers of Disaster Recovery was originally defined to help identify various methods of recovering mission-critical computer systems required to support business continuity.

The seven tiers of business continuity solutions offer a simple method to define current service levels within your company and the risks associated with them.

Quiz: “7 Tiers of Disaster Recovery”

Part 1. Review all 7 tiers of Disaster Recovery.
Part 2. Define where your company stands right now.
Part 3. Answer the question at the end. (Optional)

Here we go:

Part 1. Review all 7 tiers of Disaster Recovery:

Tier 0: No off-site data – Possibly no recovery
Businesses with a Tier 0 business continuity solution have no business continuity plan. There is no saved information, no documentation, no backup hardware, and no contingency plan. The time necessary to recover in this instance is unpredictable. In fact, it may not be possible to recover at all.

Tier 1: Data backup with no hot site
Businesses that use Tier 1 continuity solutions back up their data and send these backups to an off-site storage facility. The method of transporting these backups is often referred to as "PTAM" - the "Pick-up Truck Access Method." Depending on how often backups are created and shipped, these organizations must be prepared to accept several days to weeks of data loss, but their backups are secure off-site. However, this tier lacks the systems on which to restore data.

Tier 2: Data backup with a hot site
Businesses using Tier 2 business continuity solutions make regular backups on tape. This is combined with an off-site facility and infrastructure (known as a hot site) in which to restore systems from those tapes in the event of a disaster. This solution will still result in the need to recreate several hours or even days worth of data, but the recovery time is more predictable.

Tier 3: Electronic vaulting
Tier 3 solutions build on the components of Tier 2. Additionally, some mission-critical data is electronically vaulted. This electronically vaulted data is typically more current than that which is shipped via PTAM. As a result, there is less data recreation or loss after a disaster occurs. The facilities for providing Electronic Remote Vaulting consist of high-speed communication circuits, some form of channel extension equipment, and either physical or virtual tape devices and an automated tape library at the remote site. IBM's Peer-to-Peer VTS and Sun's VSM Clustering are two examples of this type of implementation.

Tier 4: Point-in-time copies
Tier 4 solutions are used by businesses that require both greater data currency and faster recovery than users of lower tiers. Rather than relying largely on shipping tape, as is common on the lower tiers, Tier 4 solutions begin to incorporate more disk based solutions. Several hours of data loss is still possible, but it is easier to make such point-in-time (PiT) copies with greater frequency than tape backups even when electronically vaulted.

Tier 5: Transaction integrity
Tier 5 solutions are used by businesses with a requirement for consistency of data between the production and recovery data centers. There is little to no data loss in such solutions; however, the presence of this functionality is entirely dependent on the application in use.

Tier 6: Zero or near-Zero data loss
Tier 6 business continuity solutions maintain the highest levels of data currency. They are used by businesses with little or no tolerance for data loss and who need to restore data to applications rapidly. These solutions have no dependence on the applications or applications staffs to provide data consistency. Tier 6 solutions often require some form of Disk mirroring. There are various synchronous and asynchronous solutions available from the mainframe storage vendors. Each solution is somewhat different, offering different capabilities and providing different Recovery Point and Recovery Time objectives. Often some form of automated tape solution is also required. However, this can vary somewhat depending on the amount and type of data residing on tape.

Tier 7: Highly automated, business integrated solution
Tier 7 solutions include all the major components being used for a Tier 6 solution with the additional integration of automation. This allows a Tier 7 solution to ensure consistency of data above that which is granted by Tier 6 solutions. Additionally, recovery of the applications is automated, allowing for restoration of systems and applications much faster and more reliably than would be possible through manual business continuity procedures.
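If you would like a quick way to place your company on this ladder, here is a rough sketch. It is only an illustration under a simplifying assumption of my own: that each tier builds on all the tiers below it, so your tier is the highest rung reached without a gap. The capability names in `DrCapabilities` are hypothetical labels I chose for the tiers above, not an official checklist.

```python
from dataclasses import dataclass

TIER_NAMES = {
    0: "No off-site data",
    1: "Data backup with no hot site",
    2: "Data backup with a hot site",
    3: "Electronic vaulting",
    4: "Point-in-time copies",
    5: "Transaction integrity",
    6: "Zero or near-zero data loss",
    7: "Highly automated, business integrated solution",
}


@dataclass
class DrCapabilities:
    offsite_backups: bool = False        # Tier 1
    hot_site: bool = False               # Tier 2
    electronic_vaulting: bool = False    # Tier 3
    point_in_time_copies: bool = False   # Tier 4
    transaction_integrity: bool = False  # Tier 5
    disk_mirroring: bool = False         # Tier 6
    automated_recovery: bool = False     # Tier 7


def estimate_tier(c):
    """Count consecutive capabilities from Tier 1 upward; stop at the first gap."""
    ladder = [
        c.offsite_backups, c.hot_site, c.electronic_vaulting,
        c.point_in_time_copies, c.transaction_integrity,
        c.disk_mirroring, c.automated_recovery,
    ]
    tier = 0
    for has_capability in ladder:
        if not has_capability:
            break
        tier += 1
    return tier


# Hypothetical company: tapes shipped off-site and a hot site, nothing more.
company = DrCapabilities(offsite_backups=True, hot_site=True)
tier = estimate_tier(company)
print(f"Estimated tier: {tier} ({TIER_NAMES[tier]})")
```

A real assessment involves far more than seven yes/no answers, but even this simple count makes the gaps between where you are and Tier 7 visible.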

Part 2. Define where your company stands right now.
.
.
.
.
.
Part 3. Answer the following question (Optional):

What is holding you and your company back from achieving Tier 7 in Disaster Recovery?
.
.
.
.
.
.
.
.
.
.
.
.
.

4. Major Issue # 1. Real Life Solution Video

Now, let’s get back to the question I asked you at the beginning (because you are probably dying to know the answer by now, right?)

That question was:

What do you need to do to become pre-emptive?

Well, instead of just giving you the answer to that, let me give you an example of being pre-emptive in your DRP and BCP approach.

And, if you are still not convinced as to why you should strive for Tier 7 in Disaster Recovery, let me demonstrate a Tier 7 Solution to you in a real life case scenario.

In the video “Database Down: Winning The Operational Challenge”, Wally Casey, VP of Sales of Tivoli Software, shows how Tivoli Software helps users restore their systems and data after a disastrous event. He also demonstrates how the software lets them visualize, monitor and automatically restore all business processes and underlying systems in real time, reducing the necessity of developing Recovery Point and Recovery Time objectives altogether!

Click here to view Part 1 “Visualization. Monitoring. Automation.”

Click here to view Part 2 “Storage. Security. Automation.”

Right in front of our eyes we can see how failure occurs, visualize what is happening, control what’s happening and automate the response – all in real time!

If this is not a pre-emptive approach in action – I don’t know what is, Ladies and Gentlemen!

So, after learning about Disaster Recovery Tiers and watching the Tier 7 Solution in all of its pre-emptive beauty, can you give me an honest answer to the following question:

Do you still want to hang on to the past and try to become a Master of Restoration? Or is it time to start looking into the future and eventually become a Pre-emptive Visionary?

The decision is entirely yours. I am just here to show you what’s now available – at no additional cost to you.

But, before we conclude, let me show you what can be done about Major Issue # 2.

5. Major Issue # 2. Expert Opinion Video

“Herding is the act of bringing individual animals together into a group (herd), maintaining the group and moving the group from place to place – or any combination of those.” (Wikipedia)

If you remember, I’ve pointed out earlier that Major Issue # 2 in Data Management and Data Recovery is Data Proliferation, that nasty pile-up of data on our PCs and Storage Devices that was too valuable to be discarded or simply had to be kept for reporting and auditing purposes…

If you remember, the question was:

How to address the issue of ever-growing Storage needs without increasing the associated costs?

Again, just like in the previous case, I want you to arrive at the answer on your own.

Yes, of course! I have another “Real Life Case Study” Video up my sleeve. This time it’s coming from Ron Riffe, one of the Storage Management Experts who addresses the magnitude of the Data Proliferation issue and how this issue can be dealt with correctly.

That is, instead of “Herding Storage” you can start “Managing Storage” and become pre-emptive in your Storage Management strategy as well.

Here’s the Expert Opinion Video for you:

Click here

~ Conclusion ~

Let us summarize what we’ve learned together about DRP and BCP Solutions today:

- Restoring your systems and data after a disaster can be painful and costly. It requires professional intervention and causes a lot of frustration

- It does not have to be that way if your Disaster Recovery Planner chooses to be proactive (Pre-Emptive Visionary), as opposed to reactive (Master of Restorations)

- There are 8 different Tiers in Disaster Recovery (including Tier Zero – trust me, you don’t want to be amongst those!) that determine success of your DRP and BCP strategy and, ultimately, define the survival rate of your company

- There are solutions out there that help you deal with the major issues in Disaster Recovery and Business Continuity Planning, such as Tivoli Provisioning Manager and Tivoli Storage Manager

- My role was to show you the scope of what is available. Whether or not you and your organization choose to use these solutions is entirely up to you.

Keep smiling!

/Val Slastnikov/
