Wednesday, October 14, 2009

Howto: Permission wars in VI3

UPDATE: This setup doesn't entirely work. Templates aren't visible to the users...

This past week, I have been working on an interesting problem. A new internal customer wanted a development environment where they could freely deploy and delete VMs, take snapshots, etc. In other words, they wanted more or less free hands, with the VMware team providing the virtual infrastructure as a service.

Now, from a virtual infrastructure operations perspective, giving a customer that much freedom is a bit of an administrative nightmare. For example, how do you ensure that a cluster is not overcommitted, and how do you make sure that all servers are properly registered in the CMDB?

To address the most important issue from a technical perspective: the customer should not be able to overcommit the cluster. If they can, then we can't do maintenance and there won't be full failover capacity. The obvious way to go about it is to enable HA and then check 'Prevent VMs from being powered on if they violate availability constraints'. However, HA does not calculate slot sizes in the most sensible way, and if you only have two hosts in a cluster, you risk not being able to deploy a new VM even though there are plenty of resources in the cluster.
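To see why a two-host cluster is hit hardest, here is a rough Python sketch of slot-based admission control. It is a simplified model and not VMware's actual algorithm: it assumes the slot size is driven by the largest memory reservation among the powered-on VMs, and that the slots of the largest host are held back for failover. The function name and all numbers are made up for illustration.

```python
def usable_slots(host_memory_mb, slot_size_mb, host_failures_tolerated=1):
    """Slots left for VMs after holding back the largest hosts for failover."""
    slots_per_host = sorted(mem // slot_size_mb for mem in host_memory_mb)
    # Worst case: the hosts with the most slots are the ones that fail.
    if host_failures_tolerated:
        survivors = slots_per_host[:-host_failures_tolerated]
    else:
        survivors = slots_per_host
    return sum(survivors)


hosts = [32768, 32768]  # hypothetical: two hosts with 32 GB each

# One VM with a large reservation inflates the slot size for every VM.
small_slot = 1024  # MB, hypothetical
large_slot = 8192  # MB, hypothetical

print(usable_slots(hosts, small_slot))  # 32
print(usable_slots(hosts, large_slot))  # 4
```

In this model, half of a two-host cluster is always held back, and a single large reservation cuts the remaining capacity to a handful of slots, regardless of how much memory is actually free.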

A colleague of mine suggested that I create a root resource pool in the cluster and then add permissions only on that resource pool and not on the host, cluster, or datacenter level. In theory, this is a pretty good idea, as you can set a hard limit on the resource pool for memory usage (which in my experience is the typical, visible, limiting factor in the cluster). In this case, I set a limit of 50% of available memory and then made the resource pool non-expandable. Note that the resource pool limit is measured against memory actually used, not against what is assigned to the VMs.
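The arithmetic behind the limit is simple, and a small Python sketch makes the difference between assigned and used memory concrete. The host sizes and VM figures are hypothetical, and the limit is calculated here from total cluster memory as an approximation; only the 50% share comes from the setup described above.

```python
def pool_limit_mb(host_memory_mb, share=0.5):
    """Hard memory limit for the pool as a share of total cluster memory."""
    return int(sum(host_memory_mb) * share)


hosts = [32768, 32768]        # hypothetical: two hosts, 32 GB each
limit = pool_limit_mb(hosts)  # 32768 MB

# (assigned MB, actually used MB) per VM, hypothetical figures
vms = [(16384, 6000), (16384, 7500), (8192, 3000)]
assigned = sum(a for a, _ in vms)
used = sum(u for _, u in vms)

# The limit is compared against used memory, so the pool can hold more
# assigned memory than the limit as long as actual usage stays below it.
print(limit, assigned, used)
print(assigned > limit, used <= limit)
```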


I created a role similar (I think ;-)) to Virtual Machine Administrator, which can do more or less anything at the virtual machine layer (deploy, delete, change, snapshot, mount ISOs, etc.), and added it at the resource pool layer. When I started testing, I discovered a number of issues: I couldn't create a VM, I couldn't delete a VM, and I couldn't browse the datastore from the VM summary page, even though these privileges were already included in the role. If the same role was applied at the cluster or datacenter level, it worked fine. So it makes a difference at which level the permissions are applied.

If I apply the role at the cluster level, then everything works from an access rights perspective, but then the role has too many permissions: users can deploy servers directly in the cluster and are not forced to deploy into the root resource pool. And then control is lost.

The only way I could work around this issue was to create two separate roles with two different permission sets and then apply them at two different levels of the datacenter.

The first role has very few permissions and is applied at the datacenter level (do not propagate). This could also be at cluster level, but currently I only have one cluster in the datacenter. The second role is the actual role that I created in the first place. This role is applied at the resource pool level (propagate rights), where a hard limit has been defined for memory.

The permission mapping that I have used for both roles is listed below.

With this setup, the user is completely locked down: they can only deploy servers in the defined resource pool and they are not allowed to overcommit. If they do, the VMs won't be able to power on.
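How the two roles combine can be sketched in a few lines of Python. This is a simplified model of permission lookup for a single user, not the VirtualCenter implementation: it assumes that a permission applies on the object where it is set, that it reaches descendants only when it propagates, and that the closest assignment wins. The object names are hypothetical.

```python
# Inventory hierarchy as child -> parent (hypothetical object names).
PARENT = {
    "datacenter": None,
    "cluster": "datacenter",
    "dev-pool": "cluster",
    "vm01": "dev-pool",
}

# (object, role, propagate) assignments for one user, as in the setup above.
ASSIGNMENTS = [
    ("datacenter", "Role 1", False),
    ("dev-pool", "Role 2", True),
]


def effective_role(obj):
    """Return the role that applies to obj, or None if the user has none."""
    node = obj
    while node is not None:
        for target, role, propagate in ASSIGNMENTS:
            # A permission applies on its own object, and on descendants
            # only when it is set to propagate.
            if target == node and (node == obj or propagate):
                return role
        node = PARENT[node]
    return None


for name in PARENT:
    print(name, effective_role(name))
```

In this model the user gets Role 1 on the datacenter object only, nothing on the cluster itself, and Role 2 on the resource pool and everything inside it.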

In relation to snapshots and running out of space on the LUN, this problem still persists but will not be addressed in this article.

Role 1 (do not propagate rights) - to be applied at datacenter level

Virtual Machine.Inventory.Create

Virtual Machine.Inventory.Remove (otherwise one can’t delete VM from disk)

Virtual Machine.Configuration.Add New Disk

Datastore.Browse Datastore (to be able to browse datastore from VM summary view)


Role 2 (propagate rights) - to be applied at resource pool level

Datastore.Browse Datastore

Datastore.File Management

Virtual Machine.Inventory.Create

Virtual Machine.Inventory.Remove

Virtual Machine.Inventory.Move

Virtual Machine.Interaction.Power On

Virtual Machine.Interaction.Power Off

Virtual Machine.Interaction.Reset

Virtual Machine.Interaction.Answer Question

Virtual Machine.Interaction.Console Interaction

Virtual Machine.Interaction.Device Connection

Virtual Machine.Interaction.Configure CD Media

Virtual Machine.Interaction.Tools Install

Virtual Machine.Configuration.Rename

Virtual Machine.Configuration.Add Existing Disk

Virtual Machine.Configuration.Add New Disk

Virtual Machine.Configuration.Remove Disk

Virtual Machine.Configuration.Change CPU Count

Virtual Machine.Configuration.Memory

Virtual Machine.Configuration.Add or Remove Device

Virtual Machine.Configuration.Modify Device Settings

Virtual Machine.Configuration.Settings

Virtual Machine.Configuration.Change Resource

Virtual Machine.Configuration.Reset Guest Information

Virtual Machine.Configuration.DiskExtend

Virtual Machine.State.Create Snapshot

Virtual Machine.State.Revert to Snapshot

Virtual Machine.State.Remove Snapshot

Virtual Machine.State.Rename Snapshot

Virtual Machine.Provisioning.Customize

Virtual Machine.Provisioning.Clone

Virtual Machine.Provisioning.Create Template From Virtual Machine

Virtual Machine.Provisioning.Deploy Template

Virtual Machine.Provisioning.Clone Template

Virtual Machine.Provisioning.Mark as Template

Virtual Machine.Provisioning.Mark as Virtual Machine

Virtual Machine.Provisioning.Read Customization Specifications

Virtual Machine.Provisioning.Allow Virtual Machine Download

Virtual Machine.Provisioning.Allow Virtual Machine Files Upload

Resource.Assign Virtual Machine to Resource Pool

Resource.Migrate

Resource.Relocate


Thursday, October 8, 2009

Howto: Check if SAN cables are connected in ESX

When installing an ESX host and you have someone other than yourself taking care of the cabling of the host, it is very handy to be able to check whether this has been done properly. You want to be able to verify that the HBAs have been physically connected to the fabric switches with fibre cables.

SSH to the ESX host.
cd to the /proc/scsi/qla2300 folder (if it's a QLogic HBA...).
In this folder there are a number of text files named with the numbers 1-x, corresponding to the number of HBA ports in your ESX host.
cat the files one at a time:

#cat 1
or
#cat /proc/scsi/qla2300/1

Look for the following line in the files:

Host adapter:loop state=READY, flags= 0x8430403

If it says READY, the HBA has been physically connected to the fibre switch. If it says DEAD, then it is not.
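With more than a couple of HBA ports, the check can be scripted. The following is a minimal Python 3 sketch that pulls the loop state out of the text shown above. It assumes the line format is as in the example (optionally with angle brackets around the state), and it does not assume that Python 3 is available in the ESX service console: the text can just as well be copied off the host first. The function names are my own.

```python
import glob
import os
import re

# Matches "loop state=READY" and also "loop state= <READY>".
LOOP_STATE = re.compile(r"loop state\s*=\s*<?\s*([A-Za-z_]+)")


def loop_state(proc_text):
    """Return the loop state (e.g. READY or DEAD) found in the text, or None."""
    match = LOOP_STATE.search(proc_text)
    return match.group(1).upper() if match else None


def is_connected(proc_text):
    """True if the HBA port reports a READY loop state."""
    return loop_state(proc_text) == "READY"


def scan(directory="/proc/scsi/qla2300"):
    """Map each numbered file in the directory to its loop state."""
    results = {}
    for path in sorted(glob.glob(os.path.join(directory, "[0-9]*"))):
        with open(path) as handle:
            results[path] = loop_state(handle.read())
    return results


sample = "Host adapter:loop state=READY, flags= 0x8430403"
print(loop_state(sample), is_connected(sample))  # READY True
```

Calling scan() on a machine without the qla2300 folder simply returns an empty dictionary.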

Friday, October 2, 2009

VTSP 4 certified

Today, I passed the VTSP 4 (VMware Technical Sales Professional) certification. Apparently, for your company to keep VMware Enterprise Partner status, a minimum of 2 x VCP's, 2 x VSP's, and 2 x VTSP's are required. We have the first two accreditations well covered but needed the VTSP's - so I had to take one for the team together with a couple of the other guys ;-)

To achieve this certification, you need to pass six online tests, which you can take at your own pace. These are available through Partner Central. There are self-study guides with each test. We received a nice and sweet offer from our distributor to get a two-day training session, so we got it handled quick and easy...