Thursday, May 8, 2008

Hyper-V plus failover clustering, an interesting marriage

Hyper-V is a really cool Windows add-on. By itself it is “just another hypervisor,” but with the addition of a bunch of other Windows Roles and Features it quickly becomes much more.
Take High Availability, for example: Hyper-V, plus a VM workload, plus Failover Clustering.


For those of you not already familiar with Failover Clustering, I am going to talk a bit about Windows clustering in general. First of all, I am speaking of Failover Clustering, not Network Load Balancing; that is a totally different kind of clustering.


In the generic sense, Failover Clustering is a way of taking a workload that runs on a clustered node and keeping that workload available. With Hyper-V, that means keeping a VM powered on.
The only big requirement is shared storage. This can be old-fashioned shared SCSI, a Fibre Channel SAN, or iSCSI. If Windows can see it as storage and you can present it to more than one server, then you have shared storage.


The Failover Clustering setup and validation wizards in Windows Server 2008 make clustering really super simple (it makes me cringe when I recall my first NT 4 cluster). You run the wizard, and if you follow its advice, you end up with a fully MSFT-supported cluster; you even get a recommended quorum configuration.


One limitation to consider is NTFS. By default, only one node (the clustering term for a member server in a cluster) can own a LUN at any one time. To be a bit more granular, only one server can write to an NTFS volume at any one time. It is possible to present a LUN to two Windows servers, but because NTFS is not a cluster-aware file system, even having one node writing while another merely reads means your volume will begin to corrupt very quickly.
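The single-owner rule can be sketched as a toy model. This is purely illustrative (the class and method names are my own invention; real ownership arbitration happens at the storage layer via SCSI reservations, not in code like this):

```python
class Lun:
    """Toy model of a shared LUN: only one node may own (and write to) it at a time."""

    def __init__(self, lun_id: str):
        self.lun_id = lun_id
        self.owner = None  # name of the node currently owning the NTFS volume

    def reserve(self, node: str) -> None:
        # A node must take ownership before it may mount the NTFS volume.
        if self.owner is not None and self.owner != node:
            raise RuntimeError(f"{self.lun_id} is already owned by {self.owner}")
        self.owner = node

    def release(self, node: str) -> None:
        # Only the current owner can give the LUN up.
        if self.owner == node:
            self.owner = None

    def write(self, node: str, data: str) -> None:
        # A write from a non-owner is exactly what corrupts NTFS in real life.
        if self.owner != node:
            raise RuntimeError(f"{node} does not own {self.lun_id}; write refused")
        print(f"{node} wrote {data!r} to {self.lun_id}")
```

The point of the sketch: a second node cannot even take a reservation, let alone write, until the first node lets go.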


This sets up a one-highly-available-VM-per-LUN model for Hyper-V.


A highly available (HA) guest is made up of three parts. Part one – a configuration file. Part two – the workload. Part three – the LUN (that contains the VHD).


When an HA guest is failed over from one node to another, all three parts must be moved between the nodes: the configuration is passed, the volume is passed, and the workload is passed.


The logistics behind this are that your HA guest is saved, its LUN is passed to the other node (assuming that all the bits of the VM reside in one folder on that LUN), and then the guest is started (resumed).
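The save / pass-the-LUN / resume sequence can be sketched as a tiny self-contained model. Again, every name here is hypothetical; the real work is done by the cluster service and Hyper-V's saved-state machinery, not by anything you would script yourself:

```python
from dataclasses import dataclass


@dataclass
class HaGuest:
    """Toy model of the three parts of an HA guest."""
    config: str    # part one: the configuration file
    workload: str  # part two: the workload (the VM itself)
    lun: str       # part three: the LUN that contains the VHD


def fail_over(guest: HaGuest, lun_owner: dict, source: str, target: str) -> list:
    """Save the guest, pass LUN ownership, resume: all three parts move together."""
    steps = [f"save {guest.workload} on {source}"]      # guest state lands on the LUN
    assert lun_owner[guest.lun] == source, "only the owning node can pass the LUN"
    lun_owner[guest.lun] = target                        # the whole LUN changes hands
    steps.append(f"move {guest.lun} (with {guest.config}) to {target}")
    steps.append(f"resume {guest.workload} on {target}") # started from saved state
    return steps
```

Notice that the LUN moves as a single unit; there is no step that could leave half the guest behind on the source node.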


The passing of the entire LUN is why you cannot put more than one VM workload on a shared volume; any other VMs on that LUN get dragged along with it and end up being ignored.


Why, you might ask? In a previous post I mentioned struggling with Failover Clustering for an hour or so, and above I mentioned that it is not Hyper-V that makes the guest highly available; it is Failover Clustering.


Failover Clustering acts upon that HA VM workload and does whatever it takes to keep that VM up and running. It is what controls the VM, not Hyper-V.


Hyper-V is still involved, but only in the sense that when the VM guest heartbeat is lost for a moment, Failover Clustering is right there, ready to move that VM in a snap and keep that darn thing running. If there is collateral damage, that is not the fault of Failover Clustering, but of the admin.


Will this behavior change? Who knows. Windows clustering has worked this way for a long time now, and so has NTFS. I guess that if you could get past the NTFS limitation, then you could do it.


That is enough for now. More later.
