Disasters will happen – Are you ready?
Find out with this checklist.
Companies of all sizes are deploying applications on virtualised IT infrastructures and in clouds. However, protecting these applications is challenging as traditional replication and disaster recovery (DR) solutions were not designed for virtualised data centres. Review this checklist with the technical and business management teams to ensure your organisation can recover from the next disaster.
Score these questions on a scale of 1 (not able to meet/support the requirement) to 5 (fully able to meet/support the requirement) to evaluate the disaster recovery process.
Key BC/DR Considerations
At what level in the company is there business acceptance of the BC/DR plan?
Often BC/DR plans are actually just technical DR point solutions sponsored at a department level to serve some functional purpose, such as meeting an audit requirement. If the primary sponsor doesn’t have support from the top-level business management team, there may not be a comprehensive BC/DR plan that can be used following a real emergency. Having a cost-effective, flexible technical solution helps more departments in the business support the BC/DR initiative to deliver full site recovery.
Is the company able to work after and even during a disaster?
Many organisations have backup; however, recovering from a backup can take days. Tapes need to be shipped and environments need to be rebuilt, which is not a process built for a mission-critical application. If the DR plan uses something other than backup, it is usually a storage-based solution. If there is inconsistent storage in the environment, there will be an inconsistent recovery plan. A hardware-agnostic solution delivers consistent recovery with reduced operational costs.
Are the recovery point objectives (RPOs) and recovery time objectives (RTOs) acceptable for the cost and administrative effort of the solution? Will the business goals be met?
Some solutions meet only a subset of the business goals of DR. The business needs a complete solution with RPOs of seconds and RTOs of minutes that can be executed easily and consistently during an outage without impacting the performance of the production environment. Backup usually has an RPO of 24 hours and an RTO of days. Can the business tolerate no productivity for that long?
Are multiple DR solutions being leveraged to meet the BC/DR goals? Is there expertise within the team to support each solution?
Multiple solutions cause confusion and configuration complexity during an actual disaster event. Administrators with the right specialities, one for each DR tool, all need to be available. An effective BC/DR solution needs to be very easy to use and to automate as much of the BC/DR process as possible. The goal should be a tool so simple that whoever is available can execute the failover.
Is recovery and protection possible at the application level?
Application groups ensure that all the virtual machines (VMs) supporting the mission-critical application are protected consistently. If the DR solution cannot effectively support application groups, ad hoc groupings must be used, which can cause errors, especially in high-pressure situations.
How long will it take to recover VMs and restore application availability? Can the DR process be tested?
An enterprise-class DR solution will replicate each VM write as it happens, not on a schedule. With all the data at the recovery site, recovery can happen immediately – there is no waiting for the last data synchronisation. A true enterprise-class DR solution will perform non-disruptive testing at any time for predictable RTO and predictable site recovery times.
Does the team have the training necessary to fail over a site?
The technical disaster recovery component of BC/DR is a difficult operation that requires testing and coordination.
1. Repeated and regular non-disruptive testing is critical.
2. Longer-term testing in isolated environments is sometimes necessary to determine functionality.
3. Are there staff at the recovery site who can perform the failover in case a disaster prevents the team at the primary site from participating in the recovery effort?
If the score is under 22, contact Concorde today to learn how to be prepared for the next disaster.
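For teams that want to tally the checklist consistently, the scoring can be sketched in a few lines of Python. This is only an illustration: it assumes the total is the plain sum of the seven questions above, each scored from 1 to 5, and the function names are hypothetical rather than part of any tool.

```python
# Illustrative sketch only: the helper names are hypothetical.
# Assumption: the total is the simple sum of the seven question scores.

QUESTION_COUNT = 7   # number of questions in the checklist above
MIN_SCORE = 1        # not able to meet/support the requirement
MAX_SCORE = 5        # fully able to meet/support the requirement
THRESHOLD = 22       # the checklist flags totals under 22


def total_score(scores):
    """Validate the per-question scores and return their sum."""
    if len(scores) != QUESTION_COUNT:
        raise ValueError(f"expected {QUESTION_COUNT} scores, got {len(scores)}")
    for score in scores:
        if not MIN_SCORE <= score <= MAX_SCORE:
            raise ValueError(f"score {score} is outside {MIN_SCORE}-{MAX_SCORE}")
    return sum(scores)


def needs_attention(scores):
    """Return True when the total falls under the checklist threshold."""
    return total_score(scores) < THRESHOLD


if __name__ == "__main__":
    example = [3, 2, 4, 3, 2, 3, 4]  # made-up scores, one per question
    print("Total:", total_score(example))
    print("Under threshold:", needs_attention(example))
```

With this scale the total ranges from 7 to 35, so a total under 22 means the average answer sits just above the midpoint of 3 or lower.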