Showing posts with label VMotion. Show all posts

Thursday, March 14, 2013

vMotion error at 63% due to CBT file lock

A number of times in the past couple of years, we've had issues with vMotion on ESX 4.1 that occurred after storage/SAN breakdowns. ESX doesn't handle losing its storage very well, and this can create locks on the VMs that can only be fixed by rebooting the host (after shutting the hung VMs down first).

However, the other day I experienced the same sort of error on an ESXi 5.0 cluster which had not had any storage issues. This is quite inconvenient, because it means you can't put a host into maintenance mode.

When initiating a vMotion, the VM fails at 63% with the following error:

"The VM failed to resume on the destination during early power on. 
Reason: Could not open/create change tracking file.
Cannot open the disk '/vmfs/volumes/xxxxxx/vmname.vmdk' or one of the snapshot disks it depends on"

It should be mentioned that for this customer we use Symantec Netbackup 7.5 with agentless .vmdk backup. To speed up the backup process we have enabled Changed Block Tracking (CBT) on the VMs.

I found this KB article, but it only relates to ESX 4.0 and 4.1, and its suggestion is simply to disable CBT, which is not an option here.

After a talk with VMware Support, we found the error.

It turns out that there is a lock on one or more of the .ctk files, the files that keep track of changes to the .vmdks. These .ctk files are created automatically when CBT is enabled. If one or more of them are deleted, they will be recreated automatically.
In a normal setup, the .ctk files are only locked for a few seconds while the backup software accesses them.

The error looks like this:



To fix it, do the following:

Connect with PuTTY (SSH) to one of the ESX hosts (remember to enable SSH under Security Profile first).
cd to the directory of the .vmx file.

List all the .ctk files:

#ls -al | grep ctk

For each ctk file, check whether the file has a lock:

#vmkfstools -D vmname-ctk.vmdk

Look for "mode" in the output. If it is "mode 0", you're fine. If it is "mode 1", there's a lock. For "mode 2", something is completely wrong...


If you find a lock on a file, create a tmp directory and move the ctk file there (do this for all ctk's with locks):

#mkdir tmp

#mv vmname-ctk.vmdk tmp

This will also work when the VM is powered on.

And you're done. After this, the VM will vMotion without failing.
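
The manual steps above can be sketched as a small loop. This is only a rough sketch, not something VMware Support provided: the function names and the DRY_RUN variable are made up for the example, and it assumes that `vmkfstools -D` prints its result to the terminal with the text "mode <number>" in it, as described above.

```shell
#!/bin/sh
# Sketch only: lock_mode, move_locked_ctk and DRY_RUN are hypothetical names.

# Read `vmkfstools -D` output on stdin and print the first lock mode found.
# Assumption: the output contains the text "mode <number>".
lock_mode() {
    sed -n 's/.*mode \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Move every *-ctk.vmdk file in directory $1 that reports "mode 1" into $1/tmp.
# With DRY_RUN=1 the locked files are only listed, nothing is moved.
move_locked_ctk() {
    dir=$1
    for f in "$dir"/*-ctk.vmdk; do
        # Skip the unexpanded pattern when the directory has no ctk files.
        if [ ! -e "$f" ]; then
            continue
        fi
        mode=$(vmkfstools -D "$f" 2>&1 | lock_mode)
        if [ "$mode" = "1" ]; then
            echo "locked: $f"
            if [ "${DRY_RUN:-0}" != "1" ]; then
                mkdir -p "$dir/tmp"
                mv "$f" "$dir/tmp/"
            fi
        fi
    done
}
```

From the VM's directory, `DRY_RUN=1 move_locked_ctk .` would only list the locked files, and `move_locked_ctk .` would move them. Files reporting "mode 2" are deliberately left alone, since that case needs a closer look.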

This has been tested and works both on an ESX 4.1 classic cluster (where I had the same issue) and on ESXi 5.

The VMware engineer could not give me an exact root cause, but he was fairly sure that it was related to the backup software and that something had gone wrong while this software was accessing these files.

Sunday, September 16, 2012

Improved vMotion in vSphere 5.1 - data moving vMotion

I heard about the new and improved data moving vMotion in the VMworld keynote and wanted to try it out in the home lab. The improvement consists of vSphere being able to perform a simultaneous vMotion+svMotion so you can change both datastore and host at the same time.

I was expecting this feature to be available from the vSphere client by right clicking the VM and choosing 'migrate'. However, this is not the case. The option is there but it is greyed out stating that the VM has to be powered off to perform this action, see screenshot below:


I found an article on yellow-bricks pointing towards the vSphere web client. And for a deep dive, see this post by Frank Denneman.

From the vSphere web client the option is available by right-clicking the VM and choosing 'Migrate', see below.


One apparent limitation is that you cannot migrate between Datacenters, only between clusters within a given Datacenter.


Other than that, the feature works as expected. I did a vMotion plus datastore move from local storage to shared storage. This is the second feature (here's the first one) I've found that is only available in the vSphere web client and not in the vSphere client, which leads one to assume that VMware is actually serious about moving future administration away from the vSphere client.



Wednesday, November 17, 2010

vMotion between firewalls

Currently, I'm setting up a new VMware cluster as the existing hardware needs to be retired. The new cluster is in another management zone (and in another vCenter). To minimise downtime, I looked at doing vMotion between the two clusters.

What I did was to disconnect one host from its vCenter and then add it to the other vCenter directly by IP address. The host was added only to the datacenter, not to the newly created cluster. After that, I could drag and drop VMs between the clusters (EVC was enabled).

There were a couple of things that had to be tweaked before it worked.

vMotion had to be done between firewalls. When doing this, there are two important things to remember:

1. Set the default gateway of the vMotion interface (via vSphere Client)
2. Open inbound/outbound on port 8000 TCP in the firewall (see ESX configuration guide, page 150).
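
A quick way to check the second point from a host's shell could look like the sketch below. It assumes that `nc` (netcat) is available on the host and accepts the `-z` and `-w` options. The function name is made up for the example. The test connection may not leave through the vMotion interface, so treat the result as a hint rather than proof.

```shell
#!/bin/sh
# Sketch only: check_vmotion_port is a hypothetical helper name.
# Assumption: nc (netcat) is installed and supports -z (connect without
# sending data) and -w (timeout in seconds).
check_vmotion_port() {
    host=$1
    port=${2:-8000}   # 8000/TCP is the vMotion port mentioned above
    if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}
```

Calling `check_vmotion_port` with the vMotion IP address of the host on the other side of the firewall prints `reachable` when the TCP connection succeeds and `unreachable` otherwise.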

Furthermore, I encountered another issue. A number of VMs had a vmxnet NIC (they are fairly old VMs). When starting the vMotion, there was a warning that vmxnet is not supported on the target host, which runs ESX 4 (the source was ESX 3.5). However, after the vMotion the vmxnet NIC still worked. I also tried updating VMware Tools and upgrading the virtual hardware to version 7, and that worked as well. vmxnet is kept as the NIC after the upgrade.

Wednesday, April 15, 2009

EVC - CPU compatibility for VMotion

EVC (Enhanced VMotion Compatibility) increases the VMotion possibilities between different processor generations. EVC was introduced in ESX v3.5 u2 and Virtual Center v2.5 u2.

The Intel Nehalem processor type is supported from ESX v3.5 u4 (and VC 2.5 u2 as a minimum, I guess).

Link to KB article

Link to Nehalem architecture on Wikipedia

Update 2011.03.18: A BL460cG6 (E5520 Nehalem 45 nm) is compatible with a BL460cG7 (E5620 Nehalem 32 nm Westmere) in an EVC cluster with the Intel Xeon Core i7 level enabled (currently the second highest level for Intel processors). However, if Intel Xeon Core i7 32 nm is chosen, then only G7 blades can be used.