While assisting a customer recently to deploy and configure Azure Site Recovery (ASR) in order to migrate on-premises VMware workloads into Azure, we ran into a few issues and potential “bugs”.
I say “potential” bugs because I have not yet heard back from the ASR Product Team on whether this behavior is in fact a “bug” or is “by design”.
UPDATE: I have been in communication with the ASR team, but have not yet received permission to include our exchange as part of this post. If I do receive permission, I will update this article accordingly.
Firstly, to put things into context, this customer’s environment is solely using VMware. They don’t have any Hyper-V servers at all.
Thankfully, with Microsoft’s acquisition of InMage, and the integration of that technology with Azure Site Recovery (ASR), ASR is now an effective migration tool.
Note: Previously, following its initial acquisition of InMage, Microsoft released a tool called Microsoft Migration Accelerator. This was basically a re-branding of InMage under the Microsoft label. Now, however, Microsoft has taken that technology and combined it into the Azure Site Recovery product.
For further insight into this element, please view the Channel 9 recording of the “Microsoft Azure Migration Roadmap” session at Ignite 2015.
With that context in mind, we’re going to work through the first few steps of this Microsoft Azure documentation: Set up protection between on-premises VMware virtual machines or physical servers and Azure.
Following The Documentation
Per that Azure documentation, and as of this writing, three servers are required: the Configuration Server, the Master Target Server, and the Process Server.
Skipping all the “capacity planning”, “before you start”, etc. parts (which you can read yourself), we’re going to start at Step 1 – Create A Site Recovery Vault.
Step 1 – Create A Site Recovery Vault
So the first thing that we need is a Site Recovery Vault. It’s very simple to set up.
Just log into the Azure Management Portal, and in the bottom menu bar click New > Data Services > Recovery Services > Site Recovery Vault > Quick Create.
Provide a Name, select a Region, and select a Subscription (if applicable). Then click Create Vault.
Once the Vault has been created, it should show as “Active” in the status.
Great, that was the easy stuff! And we’re now done with Step 1. Next up is Step 2 – Deploying the Configuration Server.
Step 2 – Deploying the Configuration Server
From within the Azure Recovery Services workspace, click on the name of your Recovery Vault. This will bring you to the Quick Config page for the specific Vault. You can also get to the Quick Config page by clicking on the little cloud icon.
You will see a Setup Recovery list; the option you choose there determines the steps that guide you through the process.
In this specific example, as stated at the outset, the customer is using VMware and migrating to Azure. So the option I will choose is “Between an on-premises site with VMware/physical servers and Azure”.
With the specific Setup Recovery method selected, we are presented with a series of steps to follow. The first step is “Prepare Target(Azure) Resources” and there are multiple activities involved with this, which we will go through.
Setup Recovery – Step 1 – Prepare Target(Azure) Resources
This first setup step states: “After you deploy the Configuration Server, download and copy the registration key file to the Configuration Server. Launch the installer on the Configuration Server and use the key file to register the server to the vault. Generate registration key file creates a new key every time you click on it and only the latest key is valid at any given time. After the Configuration Server has been registered, deploy the Master Target Server. Once deployed, log in into the server and register it to the Configuration Server.”
So in a nutshell we need to:
- Deploy the Configuration Server
- Download the Registration Key
- Register the Configuration Server with the Recovery Vault
- Deploy the Master Target Server
- Register the Master Target Server with the Configuration Server
We won’t re-trace all the little step-by-step pieces, as those are covered in the referenced Azure documentation. I will, however, point out the areas where the documentation isn’t clear, and the issues/errors that I encountered when following the steps provided by Microsoft.
Deploy the Configuration Server
On the first step, “Deploy the Configuration Server”, please note the following.
The issue that I encountered was specific to using the “VPN” network connectivity type. I have been told by my customer that they did not encounter the same issue when using the “Public Internet” network connectivity type.
So if you are following along, ensure that you select the “VPN” network connectivity type option.
While the automated process is deploying the Configuration Server, you can watch its progress in the Jobs area.
In my experience, it took the deployment job approximately 5 minutes to complete, but it was another approximately 5 minutes until the VM was in a “Running” state.
I’m going to skip the “Download the Registration Key” step in this post, as it literally is just a link which will download the “*.VaultCredentials” file.
Moving on to the “Register the Configuration Server with the Recovery Vault” step.
Register the Configuration Server with the Recovery Vault
This is where the confusion occurs.
Within the Azure documentation, the first step it tells you to do after completing the deployment of the Configuration Server is:
- In the Quick Start page click Prepare Target Resources > Download a registration key. The key file is generated automatically. It’s valid for 5 days after it’s generated. Copy it to the configuration server.
Notice the last sentence of that step (highlighted in red in the original post). It says to copy the Registration Key (the .VaultCredentials file) to the Configuration Server. Sounds simple enough, right? But let’s look at the second step. The very next step says:
- In Virtual Machines select the configuration server from the virtual machines list. Open the Dashboard tab and click Connect. Open the downloaded RDP file to log onto the configuration server using Remote Desktop. If your Configuration server is deployed on a VPN network, use the internal IP address (this is the IP address you specified when you deployed the configuration server and can also be seen on the virtual machines dashboard page for the configuration server virtual machine) of the configuration server to Remote desktop to it from your on-premises network. The Azure Site Recovery Configuration Server Setup Wizard runs automatically when you log on for the first time.
So there are a few things that we need to draw your attention to.
First, the very first step tells you to copy the Registration Key file to the Configuration Server, yet it is only the second step that tells you how to connect to the VM. It’s a little backwards, but not that big of a deal.
The more confusing part is in the second step, where it tells you how to connect to the Configuration Server VM. It specifically says to go to Virtual Machines in the Azure portal, select the Configuration Server, open the Dashboard, and click Connect.
But if you look at the deployment of the Configuration Server, although you see it has been assigned the static IP that was entered in the dialog, the Connect button is disabled/grayed out!
Not the REAL Solution
So why is that? Why can’t we connect? Well there are a few other clues. Note that the wording also says to “use the internal IP address” and to “Remote desktop to it from your on-premises network”.
What I discovered first is that the auto-created VM for the Configuration Server does not come with ANY VM Endpoints at all. Notice on the Configuration Server Endpoints tab, the complete lack of Endpoints.
Now compare that to any other Virtual Machine deployed in Azure. By default there are 2 Endpoints included; namely RDP and PowerShell.
So you might be thinking, well that’s easy enough to fix. Just add a new Endpoint for RDP. And that’s what I thought at first. So, while on the Endpoints tab for the Configuration Server, I clicked Add, chose to “Add a stand-alone endpoint”, and selected the default configuration for Remote Desktop.
Seemed logical enough. And after the endpoint was added, the Azure VM Connect button was enabled and I was able to RDP to the VM to continue the registration with the Vault.
And just as the Azure documentation said, the Azure Site Recovery Configuration Server Setup Wizard ran automatically when I logged in. So I walked through the wizard per the documentation, and everything was working, until I reached the Azure Site Recovery Registration step.
I selected the Registration Key (the .VaultCredentials file) that had been generated, which I had copied to the Configuration Server. The wizard continued its process, but then at the end presented the following message.
So off to the Internet we go, and I searched for “cs registration with azure vault has failed”. This search turned up the following forum post: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/39eedb71-7087-4b26-b16e-f056d74fc9f5/failed-to-register-configuration-server-with-site-recovery-vault-for-protection-between-vmware-site?forum=hypervrecovmgr.
At the bottom of that forum post, Microsoft employees posted the following answer:
There is a known issue with VPN deployments, where Configuration Server registration fails if you manually add an RDP endpoint to the Configuration Server Azure IaaS VM before completing the registration. The workaround is to RDP to the Configuration Server VM from another machine that is on the VPN network using the Configuration Server’s internal IP address (open mstsc and type in the internal IP address of the Configuration Server VM) and complete the registration.
So notice that it says there is a “known issue” with completing the registration if you manually add the RDP endpoint to the server (which I did). Also notice that the suggested workaround is to “RDP to the Configuration Server VM from another machine that is on the VPN network”. So we have to RDP from a VM that is on the same Azure vNet! My customer’s environment at least had another VM in the same vNet, but keep in mind that the Configuration Server is not joined to any domain. In my own environment, where I reproduced the issue, I had to deploy another VM to accomplish this.
I find this slightly confusing: shouldn’t we be able to connect to any Azure VM remotely from anywhere, unless we specifically configure it otherwise (e.g. via Network Security Groups)? Especially if it is not joined to a domain? Whatever the reason, even though adding the RDP endpoint to the Configuration Server VM did allow RDP access, it didn’t solve the issue.
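Before attempting the RDP hop from a machine on the VPN network, it can help to confirm that the Configuration Server’s internal IP address actually answers on the RDP port from that machine. The following is only a minimal Python sketch of such a check, not anything from the Azure documentation: the function name can_connect is my own, 3389 is the default RDP port, and the address in the usage comment is a placeholder for whatever static internal IP you entered when deploying the Configuration Server.

```python
import socket

RDP_PORT = 3389  # default Remote Desktop (RDP) TCP port


def can_connect(host, port=RDP_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when the port cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (placeholder address, substitute your own internal IP):
#   print(can_connect("10.0.0.4"))
```

A result of True only shows that something is listening on that port; it does not prove that an RDP logon will succeed.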
The “REAL” Solution, Well Sort Of
Note: Before double RDPing between VMs, I removed the RDP Endpoint that I had previously added due to the “known issue”.
So after RDP’ing to a VM on the Azure vNet, I then RDP’d to the Configuration Server to continue with the configuration. I attempted to do what it said, and used CSPSConfigTool.exe to register with the Vault. As before, I browsed to the Vault credentials file, and clicked Register.
This time however, I encountered a different error, specifically “Failed to register DRA. Error code: 1. Error logs are available at <SystemDrive>\ProgramData\ASRLogs”
Well at least there is a log we can look at! In fact, there are 2; named “DRAdapteMSI.log” and “DRASetupWizard.log”.
Both log files showed errors, however the first log file (DRAdapteMSI.log) showed that the installation completed successfully.
The second log file (DRASetupWizard.log) showed the error: “The ASR cannot be registered due to an internal error. Run Setup again to register the server.”
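Rather than reading both logs end to end, the error lines can be pulled out with a few lines of script. This is a rough Python sketch under a couple of assumptions of mine: that the logs are plain text readable as UTF-8 (undecodable bytes are replaced rather than causing a failure), and that searching for the word “error” is a good enough filter. The function name find_error_lines is hypothetical, and the C: drive in the usage comment stands in for whatever <SystemDrive> is on your server.

```python
from pathlib import Path


def find_error_lines(log_dir, keyword="error"):
    """Yield (file name, line number, text) for each log line containing keyword."""
    for log_file in sorted(Path(log_dir).glob("*.log")):
        # errors="replace" keeps the scan going if a log holds odd bytes.
        with log_file.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if keyword.lower() in line.lower():
                    yield log_file.name, number, line.strip()


# Example usage (assumes the system drive is C:):
#   for name, number, text in find_error_lines(r"C:\ProgramData\ASRLogs"):
#       print(f"{name}:{number}: {text}")
```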
So, what’s this “internal error”? Well, despite my best efforts and reading through the entire log file, there is no indication what this error specifically is. But rest assured, there is still another place to look, Azure.
If you navigate in the Azure Management Portal to Recovery Services > Recovery Vault > Jobs, you will see there are a few failed jobs.
Select one of the failed jobs, and click the Error Details button in the bottom menu. I noticed a few things about this error message, like the reference to VMM (although we are not using VMM at all, nor did we select an option with VMM), and also the mention of Endpoints and HTTPS.
Notice that the Job Error has a TechNet Wiki link. This link will bring you to the Microsoft Azure Site Recovery: Common Error Scenarios and Resolutions page. There is a section for “CS, PS, MT”, which is specific to the Configuration Server (CS), Process Server (PS), and Master Target (MT).
Unfortunately there is only one link within that section, entitled “A connection can’t be established to the Hyper-V Recovery Manager vault. Verify the proxy settings or try again later”. Almost as disappointing, the “resolution” (if you can call it that) is:
This error can occur if there is no internet connectivity. Check the proxy settings.
If internet connectivity exists, the failure may be caused by an internal server error. Retry registration by re-running setup in this case.
So, off to the Internet we go again. Using the same search results from my original search (“cs registration with azure vault has failed”), I found the following blog post: http://kenumemoto.blogspot.ca/2015/08/azure-site-recovery-asr-cs-registration.html.
That blog post mentioned:
According to the logs possible solutions are:
1) Internet connection failure or Invalid/expired vault credentials file. Verify the proxy settings, regenerate the vault credentials file and retry the operation with the updated credentials file.
2) Machine being out of sync from the time zone. Ensure that your current machine time corresponds to the selected time zone.
Since my Configuration Server VM had Internet access, I knew that this wasn’t the issue. I did notice however, that although the Configuration Server was deployed to the East US location, the default time zone used for VMs is UTC.
So I changed the Time Zone on the Configuration Server to Eastern Standard Time. I re-ran the CSPSConfigTool.exe tool, and this time it worked!
Note, however, that when reproducing the issue myself in my own lab/Azure subscription, I didn’t have to change my Configuration Server’s Time Zone at all.
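Both causes suggested in that blog post lend themselves to a quick sanity check before re-running CSPSConfigTool.exe. The Python sketch below is illustrative only: the function names are mine, the 5-day validity period is the figure quoted from the Azure documentation earlier in this post, and you have to supply the time at which you generated the key yourself. Keep in mind that only the most recently generated key is valid, so an age check alone cannot tell you the file is good.

```python
from datetime import datetime, timedelta, timezone

KEY_VALIDITY = timedelta(days=5)  # validity period quoted from the Azure docs


def key_is_expired(generated_at, now=None, validity=KEY_VALIDITY):
    """Return True if the registration key is older than its validity period."""
    now = now or datetime.now(timezone.utc)
    return now - generated_at > validity


def local_utc_offset(now=None):
    """Return this machine's current offset from UTC as a timedelta."""
    now = now or datetime.now(timezone.utc)
    # astimezone() with no argument converts to the system's local time zone.
    return now.astimezone().utcoffset()


# Example usage (the generation time is a made-up value):
#   generated = datetime(2015, 8, 20, 14, 0, tzinfo=timezone.utc)
#   print(key_is_expired(generated), local_utc_offset())
```

An offset of zero from local_utc_offset on the Configuration Server would match the default UTC time zone that I noticed on my customer’s VM.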
I have since continued on following the steps in the Azure documentation that was listed at the beginning of this article. If any further issues occur, I will update this article accordingly.