Wednesday, July 22, 2015

Cold Migration of Shared Disks (Oracle RAC and NFS clusters)

Certain applications use shared disks (Oracle RAC and NFS clusters, due to their clustering features). These VMs can be vMotioned between hosts (for RAC, be careful with monster VMs and high loads, as the cluster timeout threshold is low and can be reached during cut-over), but Storage vMotion is not possible. Migration of the disks therefore has to be done while both virtual machines (or all VMs in the cluster) are shut down (cold migration). The method is to shut down both the primary and secondary node, remove the shared disk(s) to be migrated from the secondary node (without deleting them), migrate the disk(s) to the new LUN from the primary VM, and then re-add the disk(s) to the secondary node once the migration has completed (including re-configuring the multi-writer flag for each disk). After this, both VMs can be booted.

Note: These are not RDM disks but regular VMDKs with the multi-writer flag set
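
If you want to verify the flag and note the SCSI IDs from a script rather than the UI, something like the pyVmomi (Python) sketch below can be used. It is only a sketch: the vCenter address, credentials and VM name are placeholders, and it assumes the multi-writer flag is set the classic vSphere 5.x way, as scsiX:Y.sharing rows under Configuration Parameters.

    # Minimal pyVmomi sketch: list a VM's disks with their SCSI IDs and any
    # multi-writer configuration parameters. Hostname, credentials and VM
    # name are placeholders for your own environment.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    si = SmartConnect(host='vcenter.example.com', user='administrator',
                      pwd='secret', sslContext=ssl._create_unverified_context())
    content = si.RetrieveContent()

    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == 'rac-node1')   # placeholder VM name

    # Map controller key -> SCSI bus number so we can print "SCSI (x:y)"
    controllers = {d.key: d.busNumber for d in vm.config.hardware.device
                   if isinstance(d, vim.vm.device.VirtualSCSIController)}

    for dev in vm.config.hardware.device:
        if isinstance(dev, vim.vm.device.VirtualDisk) and dev.controllerKey in controllers:
            print('SCSI (%d:%d)  %s  mode=%s' % (
                controllers[dev.controllerKey], dev.unitNumber,
                dev.backing.fileName, dev.backing.diskMode))

    # The multi-writer flag (vSphere 5.x style) shows up as scsiX:Y.sharing rows
    for opt in vm.config.extraConfig:
        if opt.key.endswith('.sharing'):
            print('%s = %s' % (opt.key, opt.value))

    Disconnect(si)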

Instruction steps

Steps to migrate shared disks (Oracle RAC and NFS)

  • Identify the two VMs that share disks and note the VM names
  • Identify the disk(s) that should be migrated to the new LUN and note the SCSI ID for each disk (e.g. SCSI (1:0))
  • Note (mostly relevant for Oracle RAC) whether the disk is configured in Independent - Persistent mode
  • Ensure a maintenance/blackout window is in place
  • Shut down both VMs
  • For the secondary VM, go to Edit Settings -> Options -> General -> Configuration Parameters (see screen dump below) and verify that the "multi-writer" flag is set for the disks to be moved
  • While both VMs are shut down, remove the disk(s) from the secondary VM (without deleting them; the remove, migrate and re-add steps are also sketched as a script after this list)
  • From the primary VM, right-click and choose Migrate, and migrate the disk(s) to the new LUN
  • Wait for the process to finish
  • On the secondary VM, go to Edit Settings -> Hardware -> Add. Select Hard Disk, then Use existing hard disk. Browse for the disk in the new location and click Add. Make sure the same SCSI ID is used as before
  • For the secondary VM, go to Edit Settings -> Options -> General -> Configuration Parameters -> Add Row and add the multi-writer flag for each of the re-added disks
  • (If a disk is/was configured in Independent - Persistent mode, go to Edit Settings -> Hardware -> select the disk -> under Mode, check the Independent check-box and verify that the Persistent option is selected)
  • Boot the primary VM, then boot the secondary VM
  • Ensure that the application is functioning as expected. Done
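
If there are many shared disks to move, the disk-related steps above can also be driven through the vSphere API. The following pyVmomi sketch is illustrative only: the function names, the temporary device key and the example datastore path are my own placeholders, it assumes both VMs are already powered off, and it assumes the multi-writer flag is set as a scsiX:Y.sharing configuration parameter as described in this guide.

    # Sketch (pyVmomi): script the disk-related steps of the procedure above.
    # All VM names, datastore names and paths are placeholders.
    from pyVmomi import vim

    def find_disk(vm, bus_number, unit_number):
        """Return the VirtualDisk at the given SCSI ID, e.g. (1, 0) for SCSI (1:0)."""
        controller = next(d for d in vm.config.hardware.device
                          if isinstance(d, vim.vm.device.VirtualSCSIController)
                          and d.busNumber == bus_number)
        return next(d for d in vm.config.hardware.device
                    if isinstance(d, vim.vm.device.VirtualDisk)
                    and d.controllerKey == controller.key
                    and d.unitNumber == unit_number)

    def remove_disk(vm, disk):
        """Detach a shared disk from the secondary VM without deleting the VMDK."""
        dev_spec = vim.vm.device.VirtualDeviceSpec()
        dev_spec.operation = vim.vm.device.VirtualDeviceSpec.Operation.remove
        dev_spec.device = disk                  # no fileOperation -> files are kept
        return vm.ReconfigVM_Task(vim.vm.ConfigSpec(deviceChange=[dev_spec]))

    def migrate_disk(vm, disk, target_datastore):
        """Cold-migrate one disk of the (powered-off) primary VM to the new LUN."""
        locator = vim.vm.RelocateSpec.DiskLocator(diskId=disk.key,
                                                  datastore=target_datastore)
        return vm.RelocateVM_Task(vim.vm.RelocateSpec(disk=[locator]))

    def reattach_disk(vm, vmdk_path, bus_number, unit_number, independent=True):
        """Re-add the existing VMDK to the secondary VM at the same SCSI ID and
        add the multi-writer configuration parameter in the same task."""
        controller = next(d for d in vm.config.hardware.device
                          if isinstance(d, vim.vm.device.VirtualSCSIController)
                          and d.busNumber == bus_number)
        disk = vim.vm.device.VirtualDisk()
        disk.key = -101                         # temporary key for a new device
        disk.controllerKey = controller.key
        disk.unitNumber = unit_number           # keeps the same SCSI ID as before
        disk.backing = vim.vm.device.VirtualDiskFlatVer2BackingInfo(
            fileName=vmdk_path,                 # e.g. '[NewLUN] rac-node1/rac-node1_1.vmdk'
            diskMode='independent_persistent' if independent else 'persistent')
        dev_spec = vim.vm.device.VirtualDeviceSpec()
        dev_spec.operation = vim.vm.device.VirtualDeviceSpec.Operation.add
        dev_spec.device = disk                  # no fileOperation: the VMDK already exists
        spec = vim.vm.ConfigSpec()
        spec.deviceChange = [dev_spec]
        spec.extraConfig = [vim.option.OptionValue(
            key='scsi%d:%d.sharing' % (bus_number, unit_number),
            value='multi-writer')]
        return vm.ReconfigVM_Task(spec)

Each function returns a vCenter task, which you would wait on (for example with pyVim.task.WaitForTask) before continuing, mirroring the "wait for the process to finish" step above.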



Monday, July 13, 2015

Permanent Device Loss (PDL) and HA on vSphere 5.5

At my current client we are doing a number of non-functional requirement (NFR) tests involving storage. One of them is about removing a LUN to see if HA kicks in.

The setup is an EMC VPLEX Metro stretched cluster (or cross-cluster) configured in Uniform mode. So an active-active setup with site-replicated storage and 50% of the hosts on each site. And vSphere 5.5.

What complicates things in a stretched Metro cluster in Uniform mode is that even though storage is replicated between sites, the ESXi hosts only see the storage on their own site. So if you kill a LUN on site A in the VPLEX, the hosts in site A will not be able to see the LUNs on site B, and HA is therefore required.

My initial thought was that cutting/killing a LUN on the VPLEX would make the VMs on that LUN freeze indefinitely until storage became available again. This is what happened back in vSphere 4.x, and it was a real pain for the VMware admins (an all-paths-down (APD) scenario).

However, as of vSphere 5.0 U1 and later, HA can handle a Permanent Device Loss (PDL), where a LUN becomes unavailable while the ESXi hosts are still running and the array is still able to communicate with the hosts (if the array is down, you have an APD and HA will not kick in).

In vSphere 5.5, HA will handle this automatically if you configure two non-default advanced settings. Go to ESXi host -> Configuration -> Advanced Settings and set the following (a scripted way to apply the settings across hosts is sketched after the list):

  • VMkernel.Boot.terminateVMOnPDL = yes
  • Disk.AutoremoveOnPDL = 0
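
With more than a handful of hosts, pushing the two settings by script may be easier than clicking through the UI. Below is a pyVmomi sketch; the cluster name is a placeholder, and the value types for the two keys (boolean and integer) are my assumptions and may need adjusting to your build.

    # Sketch (pyVmomi): apply the two PDL-related advanced settings to every
    # host in a cluster. Cluster name is a placeholder; the value types used
    # for the two keys are assumptions.
    from pyVmomi import vim

    def set_pdl_options(content, cluster_name):
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.ClusterComputeResource], True)
        cluster = next(c for c in view.view if c.name == cluster_name)

        for host in cluster.host:
            host.configManager.advancedOption.UpdateOptions(changedValue=[
                vim.option.OptionValue(key='VMkernel.Boot.terminateVMOnPDL', value=True),
                vim.option.OptionValue(key='Disk.AutoremoveOnPDL', value=0),
            ])
            print('Updated %s (reboot still required)' % host.name)

    # set_pdl_options(si.RetrieveContent(), 'Stretched-Cluster-01')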

This has been documented well by Duncan Epping on Yellow-Bricks and on Boche.net. And a bit more info here for 5.0 U1.

See screen dump below for settings:

A reboot of the ESXi host is required for the two changes to take effect.