Showing posts with label Snapshot. Show all posts

Friday, April 27, 2012

Large VM crashes during snapshot commit

Snapshots can be your friend, but they can certainly also make your life miserable. The other day we had a rather large VM (20 GB memory, 8 vCPUs and 28 TB of storage spread across 22 .vmdk files) that crashed during a snapshot commit. The error stated: "Performing disk cleanup. Cannot power off." The snapshot had been taken while the VM was powered off, and only a few changes had been made to the VM before the snapshot was committed.


After the crash, the VM would not power on. The error stated: "Reason: Cannot allocate memory", and the error description (see screen dump below) indicated a disk lock or disk error. Fortunately, the VM could be started from the service console (ESX 4.1 classic) with 'vmware-cmd'.

After boot, vCenter stated that there were no snapshots on the VM. However, 22 delta files on a single LUN told a different story.

The normal cleanup procedure is to power off the VM and clone it. However, with 28 TB of storage in the VM, this was not an option.




Instead, the following did the trick: Log on to the service console, change directory to the folder where the .vmx file for the VM resides, take a new snapshot and then remove all snapshots (see this KB article for more info). This removes the new snapshot as well as the 'defective' snapshot.

To see if any snapshots are registered (that will probably not be the case):

vmware-cmd vmname.vmx hassnapshot

To take a new snapshot (with no quiesce and no memory, see this KB article for details):

vmware-cmd vmname.vmx createsnapshot snapshot-name description 0 0


As you can see in the screen dump below, I first tried to run the command without the two boolean arguments, which relate to QuiesceFilesystem and IncludeMemory.



To remove all snapshots:

vmware-cmd vmname.vmx removesnapshots

In the screen dump above, the removesnapshots command returns the value '1', which means that all is well and the snapshots are gone.
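The three steps above can be chained in a small wrapper. This is only a sketch: the function name, the VMWARE_CMD variable and the default snapshot name and description are made up for illustration, and it assumes that vmware-cmd exits with a non-zero status when a step fails.

```shell
#!/bin/sh
# Sketch only. VMWARE_CMD, the function name and the default
# snapshot name/description are illustrative assumptions.
VMWARE_CMD="${VMWARE_CMD:-vmware-cmd}"

cleanup_orphaned_snapshot() {
    vmx="$1"                          # path to the VM's .vmx file
    name="${2:-cleanup}"              # hypothetical snapshot name
    desc="${3:-cleanup}"              # hypothetical description

    # Show whether a snapshot is registered (informational only).
    "$VMWARE_CMD" "$vmx" hassnapshot

    # Take a new snapshot: first 0 = no quiesce, second 0 = no memory.
    # Assumption: a failed call exits non-zero, so stop here if it does.
    "$VMWARE_CMD" "$vmx" createsnapshot "$name" "$desc" 0 0 || return 1

    # Remove all snapshots, which also commits the orphaned delta files.
    "$VMWARE_CMD" "$vmx" removesnapshots
}
```

Run from the folder holding the .vmx file, it could be called as: cleanup_orphaned_snapshot vmname.vmx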




Saturday, April 11, 2009

Understanding the snapshot - how to check size of a snapshot

When a snapshot is created, the existing vmdk file is locked and a new vmdk, a delta file, is created. If multiple vmdks are attached to the VM, a separate delta file is created for each vmdk. If vmdks are placed on LUNs other than the one where the .vmx file resides, all delta files are still placed on the same LUN as the .vmx file. All changes made after the snapshot is taken are written to the new vmdk file(s). The delta vmdk files can grow until they reach the size of the original vmdk file. If a snapshot exists for too long, this can cause problems, as the SAN LUN can run out of disk space. If this happens, the VMs will start to crash. Therefore, as a general rule of thumb, snapshots should not be left unattended for more than one or two weeks unless it is ensured that there is sufficient space on the datastore. If the snapshot is needed for a longer period, it is recommended to make a clone instead.

To check the size of the snapshot, simply browse the datastore and look for a numbered vmdk file, e.g.

server123-000001.vmdk

If a second snapshot is taken, it is named:

server123-000002.vmdk

And so forth…
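The same check can be done from the service console instead of the datastore browser. The sketch below lists the numbered vmdk files in a folder together with their disk usage. The function name is hypothetical, and the pattern assumes a numeric suffix of at least six digits before .vmdk.

```shell
#!/bin/sh
# Sketch only: list_snapshot_disks is a made-up helper name.
# Prints disk usage and path for each numbered (snapshot) vmdk in DIR.
list_snapshot_disks() {
    dir="${1:-.}"
    # Assumption: snapshot disks carry a numeric suffix of six or more digits.
    for f in "$dir"/*-[0-9][0-9][0-9][0-9][0-9][0-9]*.vmdk; do
        [ -e "$f" ] || continue       # the glob matched nothing
        du -h "$f"
    done
}
```

Calling list_snapshot_disks with no argument checks the current folder. No output means no numbered vmdk files were found there.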

Below are a number of screenshots showing how files are created as snapshots are made:

1. In this first screenshot, the VM has just been created and has no snapshot:

2. Just after first snapshot taken – no further action taken

A new vmdk file is created, which is about 18 MB in size when no changes have been made yet. Note the filename, jnrrsnaphosttest-000001.vmdk.


3. After installation of a couple of applications

As changes are made, the new vmdk file increases in size. In this case it grows from the initial 18 MB to 198 MB. See the same file as above.

4. After second snapshot taken:

When a second snapshot is created, yet another vmdk file is created (e.g. server123-000002.vmdk), and so forth.

Friday, April 10, 2009

Monitor progress of snapshot deletion

When deleting (committing) a snapshot in Virtual Center, the task times out after 15 minutes. To follow the progress, log in to the service console and navigate to the folder where the .vmdk files are located. Run the following command:

watch "ls -oghut --full-time *.vmdk"

This way, you can see when the snapshot file is removed.
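If you would rather have the console tell you when the files are gone, a polling loop is one alternative to watch. This is a sketch under assumptions: the function name is made up, the snapshot data files are assumed to end in -delta.vmdk, and the ls options are the same GNU ls options used above.

```shell
#!/bin/sh
# Sketch only: wait_for_commit is a hypothetical helper name.
# Polls DIR until no *-delta.vmdk files remain, printing them on each pass.
wait_for_commit() {
    dir="${1:-.}"
    interval="${2:-10}"               # seconds between checks
    # Assumption: snapshot data files are named *-delta.vmdk.
    while ls "$dir"/*-delta.vmdk >/dev/null 2>&1; do
        ls -oghut --full-time "$dir"/*-delta.vmdk
        sleep "$interval"
    done
    echo "no delta files left in $dir"
}
```

For example, wait_for_commit . 30 checks the current folder every 30 seconds and returns once the delta files have disappeared.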
This info was originally found on: itknowledgeexchange.techtarget.com