I am going to tell you a story. Once upon a time, there was an organization. And this organization was interested in the cloud. And so, they migrated their datacenter to the cloud, but unfortunately, they believed that, since the cloud was everywhere, they did not have to worry about Business Continuity and Disaster Recovery (BCDR).
Some people may think that because their systems run in the cloud, they don’t need to worry about backup or disaster recovery. After all, isn’t the Azure fabric built with high availability in mind? Of course, it is.
For example, with Availability Sets, the Azure platform distributes the placement of Infrastructure-as-a-Service (IaaS) Virtual Machines (VMs) across the underlying infrastructure. If there is planned maintenance on the Azure platform, or an underlying hardware/infrastructure issue, the use of availability sets ensures that at least one VM remains running.
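To make the idea concrete, here is a minimal Python sketch of why spreading VMs across separate pieces of infrastructure keeps at least one running. It is a toy model, not the Azure placement algorithm: the round-robin assignment, the function names, the VM names, and the choice of two fault domains are all assumptions made for illustration.

```python
def place_vms(vm_names, fault_domain_count=2):
    """Assign each VM to a fault domain round-robin (illustrative assumption)."""
    return {name: index % fault_domain_count for index, name in enumerate(vm_names)}


def surviving_vms(placement, failed_domain):
    """Return the VMs that keep running when one fault domain goes down."""
    return sorted(name for name, domain in placement.items() if domain != failed_domain)


# Hypothetical three-VM web tier spread over two fault domains.
placement = place_vms(["web-0", "web-1", "web-2"])
for domain in range(2):
    print(f"fault domain {domain} fails, still running: {surviving_vms(placement, domain)}")
```

Whichever single domain fails in this model, the list of surviving VMs is never empty, which is the property the availability set is there to provide.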
And then there is Azure Storage. The data in your Microsoft Azure storage account is always replicated to ensure durability and high availability. If you use Locally Redundant Storage (LRS), then your data is replicated three times within a single datacenter in a single region.
With Geo-Redundant Storage (GRS), however, your data is replicated three times within the primary region and three more times in a secondary region hundreds of miles away from the primary region, providing the highest level of durability.
The truth is, even though the underlying fabric of Azure is built with fault tolerance and high availability in mind, it does not protect our workloads against the human factor.
For example, what if someone accidentally deletes a Virtual Machine? What if a patch applied to a Virtual Machine fails and corrupts the OS/system? Or what if our environment is compromised by ransomware? Perhaps an application outage, or a partial or full datacenter outage, takes your application down in your region. Or maybe you need the ability to recover your application in another region, and to do so within minutes.
These scenarios bring to light the need for a true, full end-to-end Business Continuity and Disaster Recovery (BCDR) solution. Even if you are running your entire datacenter within Azure, you still need to complete regular backups and prepare for disaster scenarios.
There are 2 puzzle pieces to the BCDR solution: Azure Backup, and Azure Site Recovery.
Azure Backup (AB)
If someone accidentally deletes an important file or folder, or worse, an entire Virtual Machine, you don’t have to worry. With Azure Backup, not only can you take a full Virtual Machine backup/snapshot, you can then use that backup to perform file-level recovery instantly!
How does this work? In Azure, the backed-up disks are mounted to the target system. This can either be the originating system, or an Administrator workstation. Then all you need to do is browse the disk like we do using File Explorer. It’s that easy.
Data Retention Compliance
Many industries have very specific data retention compliance requirements that need to be met.
For example, a law firm or Government entity may be required to maintain copies or backups of data for 7 years. Whereas a manufacturer may require 10 years or more. Whatever duration your organization or industry may require, Azure Backup can handle all your long-term storage and retention needs and enables you to do so with ease.
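As a small worked example of what a retention requirement means in practice, the Python sketch below computes the date until which a given backup must be kept. It does not call Azure Backup; the function name and the sample dates are assumptions for illustration, and the 7- and 10-year durations are simply the ones mentioned above.

```python
from datetime import date


def retention_expiry(backup_date, years):
    """Return the date until which a backup taken on backup_date must be kept."""
    try:
        return backup_date.replace(year=backup_date.year + years)
    except ValueError:
        # A February 29 backup has no same-day anniversary in a non-leap year.
        return backup_date.replace(year=backup_date.year + years, day=28)


taken = date(2017, 5, 15)  # hypothetical backup date
print("law firm keeps it until:", retention_expiry(taken, 7))
print("manufacturer keeps it until:", retention_expiry(taken, 10))
```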
With the most recent news of the WannaCry attack (affecting more than 100,000 organizations in 150 countries), everyone is more alert to the serious risks and liabilities of ransomware. With this most recent attack, even institutions such as hospitals and police departments have been affected! I read an article online that stated that some hospitals were forced to put life-saving operations on hold as a result!
In short, ransomware is serious stuff.
But why does it happen in the first place? Unfortunately, operating a business costs money. And to save money, some organizations may choose not to include all their important files in their backups, or may not even run backups on a regular schedule. Some may not even test their backups, causing further challenges and issues when they attempt to use them to recover. Also, some organizations may put their backups onto easily accessible network drives, which the ransomware then spreads to.
It has been stated that as many as 74% of tapes fail, and that as many as 82% of people don’t perform regular health checks on their backups. That is a large risk to take. But Azure Backup has enhanced security. For example, DPM and MABS will prompt for a Security PIN whenever a critical change occurs that could affect the integrity of the backups, such as a modification of the passphrase. This prevents attackers from re-encrypting your data and backups. And when such changes do occur, you are alerted to them.
But in case your cloud backups are deleted, Azure Backup retains them for 14 days after a delete operation, so that you can recover from an attack. Also, a minimum retention range check ensures that you will have more than one recovery point to fall back to in the case of an attack.
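The two safeguards can be expressed as simple checks. The following Python sketch is only a model of the rules described here, not the Azure Backup API: the function names are hypothetical, and treating the 14-day boundary as inclusive is an assumption.

```python
from datetime import datetime, timedelta

SOFT_DELETE_RETENTION = timedelta(days=14)  # window stated in the text


def is_recoverable(deleted_at, now):
    """True while a deleted backup is still inside the 14-day window.

    Counting the final moment of day 14 as recoverable is an assumption.
    """
    return timedelta(0) <= now - deleted_at <= SOFT_DELETE_RETENTION


def has_fallback(recovery_points):
    """True when more than one recovery point exists to fall back to."""
    return len(recovery_points) > 1


deleted = datetime(2017, 6, 1)  # hypothetical deletion time
print(is_recoverable(deleted, datetime(2017, 6, 10)))  # 9 days later
print(is_recoverable(deleted, datetime(2017, 6, 16)))  # 15 days later
```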
But this isn’t just a feature of the direct Azure Backup agent. You are protected by these enhanced security features as long as you are using one of the following: the MAB agent version 2.0.9052 or higher; Azure Backup Server (MABS) with Upgrade 1 together with MAB agent version 2.0.9052 or higher; or System Center Data Protection Manager (DPM) 2012 R2 UR12 or 2016 UR2 together with MAB agent version 2.0.9052 or higher.
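If you want to check an installed agent against that minimum, comparing the version numerically (rather than as text) avoids mistakes such as treating "2.0.10000" as older than "2.0.9052". The Python sketch below shows one way to do it. The helper names are hypothetical, the sample version strings are made up for the example, and how you obtain the installed version is left out.

```python
MIN_MAB_AGENT_VERSION = (2, 0, 9052)  # minimum version named in the text


def parse_version(text):
    """Turn a dotted version string such as '2.0.9052.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports_enhanced_security(agent_version):
    """True when the agent version is at or above the stated minimum."""
    return parse_version(agent_version) >= MIN_MAB_AGENT_VERSION


# Hypothetical version strings, used only to exercise the comparison.
for version in ["2.0.9037.0", "2.0.9052.0", "2.0.10000.0"]:
    print(version, supports_enhanced_security(version))
```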
But let me stress something very important: Backup is NOT Disaster Recovery. If you have hundreds of Virtual Machines you need to restore, that could take a lot of time, especially if they are large systems like File Servers, SQL Servers, SharePoint farms, etc. When (not if) disaster strikes, you need your business-critical systems back up and running ASAP!
This is where the other piece of the puzzle comes in: Site Recovery.
Azure Site Recovery (ASR)
Azure Site Recovery (ASR) is the other piece of the BCDR puzzle. Yes, it’s important to maintain backups of your critical data. But, when your business-critical N-tier application goes down, you don’t want to reach for your backups, which may be an entire day out of sync. There’s a better way.
Site Recovery replicates, fails over, and recovers your workloads so that they remain available when a failure occurs.
I’ve worked with organizations that either don’t have a secondary disaster/failover site, or that do but have it literally across the street! This does not meet compliance needs.
With Azure Site Recovery (ASR), you can protect your systems across different regions and in different fault zones. In Azure, when datacenters are built, there is always a minimum of 2 paired datacenters, at least 400 kilometers (250 miles) apart. This enables you to meet your compliance needs. The best part is that you are not restricted to using only the paired region. You can choose any region as your DR site within a geographic cluster.
Low RPO and RTO
When your business is experiencing a disaster, you want to ensure you are back up and running as quickly as possible. It’s a high-pressure situation, and you want to ensure the right components come online in the right order; else, your outage could grow.
With ASR you get continuous replication, which keeps data loss minimal, on the order of minutes. With continuous replication and recovery points generated every few minutes, Azure Site Recovery can meet the low Recovery Point Objective (RPO) and Recovery Time Objective (RTO) needs of your most critical applications.
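To see how these two measures are calculated, consider the short Python sketch below. It does not use any Site Recovery API: the function names are hypothetical, and the 5-minute recovery point interval and the timestamps are assumed values chosen only to make the arithmetic visible.

```python
from datetime import datetime, timedelta


def achieved_rpo(recovery_points, failure_time):
    """Data lost: time between the last usable recovery point and the failure."""
    usable = [point for point in recovery_points if point <= failure_time]
    if not usable:
        raise ValueError("no recovery point precedes the failure")
    return failure_time - max(usable)


def achieved_rto(failure_time, service_restored_time):
    """Downtime: time between the failure and the service being back online."""
    return service_restored_time - failure_time


# Assumed schedule: a recovery point every 5 minutes starting at 12:00.
start = datetime(2017, 6, 1, 12, 0)
points = [start + timedelta(minutes=5 * i) for i in range(4)]
failure = datetime(2017, 6, 1, 12, 13)
restored = datetime(2017, 6, 1, 12, 40)

print("data lost:", achieved_rpo(points, failure))   # 0:03:00
print("downtime:", achieved_rto(failure, restored))  # 0:27:00
```

The more frequent the recovery points, the smaller the first number can be; the second depends on how quickly the failover itself can be carried out.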
By leveraging the built-in orchestration engine, you can create powerful Recovery Plans that can help recover your entire application with a single click of a button. You no longer have to wait for confirmation from the Database Administrators that the SQL Servers are up and running before bringing the App Logic and Web Tiers online. Use powerful automation, either through Azure Automation or by using your own scripts, and improve your recovery time drastically!
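The ordering logic behind such a plan can be sketched in a few lines of Python. This is a conceptual model, not the Recovery Plan engine or its API: the group names, the machine names, and the `start_machine` callback are hypothetical stand-ins for whatever actually brings a machine online.

```python
def run_recovery_plan(groups, start_machine):
    """Start each group in order; stop the plan if any machine fails to start.

    groups is an ordered list of (group_name, [machine, ...]) pairs, and
    start_machine is a callable returning True when a machine comes up.
    """
    started = []
    for group_name, machines in groups:
        for machine in machines:
            if not start_machine(machine):
                raise RuntimeError(f"{machine} in group {group_name!r} failed to start")
            started.append(machine)
    return started


# Hypothetical three-tier application: database first, web tier last.
plan = [
    ("database", ["sql-01", "sql-02"]),
    ("app logic", ["app-01"]),
    ("web", ["web-01", "web-02"]),
]

print(run_recovery_plan(plan, start_machine=lambda machine: True))
```

Because a later group is only reached once every machine in the earlier groups has started, the web tier never comes online ahead of the SQL Servers it depends on.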
Some replication tools and technologies just make copies of your Virtual Machines. But this does not account for application consistency. Azure Site Recovery has been designed with application-level protection and recovery in mind.
Why is Application-Consistent important versus Crash-Consistent?
Crash-consistent snapshots don’t work well for database applications because they don’t capture any data that is in memory or any pending I/O operations. An application-consistent snapshot captures all the data that’s in memory and all transactions in process. This is done by using Volume Shadow Copy Service (VSS) to pause (quiesce) the database application, flush its memory cache, complete all its writes, and then perform the snapshot.
When the snapshot is complete, the database is notified to resume its operation. Therefore, when restoring an application-consistent snapshot, there is no additional work required.
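The difference between the two snapshot types can be shown with a toy model in Python. This does not call VSS or any real database; `ToyDatabase` and both snapshot functions are invented for illustration, with "memory" and "disk" represented as plain lists.

```python
class ToyDatabase:
    """A stand-in database that buffers writes in memory before flushing."""

    def __init__(self):
        self.disk = []                # writes that have reached storage
        self.memory = []              # writes still buffered in memory
        self.accepting_writes = True

    def write(self, record):
        if not self.accepting_writes:
            raise RuntimeError("database is quiesced")
        self.memory.append(record)

    def flush(self):
        self.disk.extend(self.memory)
        self.memory.clear()


def crash_consistent_snapshot(db):
    """Copy only what is on disk; buffered writes are missed."""
    return list(db.disk)


def application_consistent_snapshot(db):
    """Quiesce, flush buffered writes, copy the disk, then resume."""
    db.accepting_writes = False       # pause (quiesce) the application
    try:
        db.flush()                    # complete all pending writes
        return list(db.disk)          # take the snapshot
    finally:
        db.accepting_writes = True    # notify the application to resume


db = ToyDatabase()
db.write("order-1")
db.flush()
db.write("order-2")                   # still only in memory

print(crash_consistent_snapshot(db))        # ['order-1']
print(application_consistent_snapshot(db))  # ['order-1', 'order-2']
```

In this model the crash-consistent copy is missing the second order, which is the kind of gap that forces extra recovery work after a restore.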
So, maybe you’re like that organization from the beginning of our story, and you have migrated your datacenter into the cloud. With the recently announced public preview of Disaster Recovery for Azure IaaS Virtual Machines, we now have the complete BCDR story.
The Final Chapter
The final chapter in our Business Continuity and Disaster Recovery (BCDR) story ends like this: Azure is the first public cloud to have a native first-class complete BCDR solution.
The Backup and Disaster Recovery technologies are built into the fabric, from the ground up. This isn’t some 3rd party solution that is deployed on top of Azure, but rather, a complete solution that is woven throughout its core.
And so the organization that migrated their datacenter to the cloud lived happily ever after (after implementing Azure Backup and Site Recovery, of course).