Hotplugging more than 15 SCSI devices in Ubuntu


Hi all,

Today I ran into something that took me a bit to figure out. I could not add new disks to a virtual machine running Ubuntu.

The basic scenario:

  • VMware ESXi 5.1
  • Ubuntu 12.04.3 LTS
  • 15 virtual harddrives already configured

I had to add more space to a filesystem without rebooting the server, which normally is very simple. This is what I normally do:

  1. Add a virtual disk in vSphere Client
  2. “rescan-scsi-bus -w -c” on the guest system (Ubuntu)
  3. fdisk -> create a partition and set its type to 8e (Linux LVM)
  4. pvcreate /dev/sdX1
  5. vgextend vgName /dev/sdX1
  6. lvextend -L +100G /dev/vgName/lvName
  7. sudo fsadm -v resize /dev/vgName/lvName
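
Step 3 needs the name of the disk that just appeared. A minimal sketch for spotting it, assuming the kernel exposes disks under /sys/block: take a snapshot of the disk names before and after the rescan and compare the two. The function names and the snapshot file names are made up for this example.

```shell
#!/bin/sh
# List SCSI-style disk names (sda, sdb, ...) found under a sysfs block directory.
# The optional directory argument exists only so the function can be tried on a fake tree.
list_disks() {
    ls "${1:-/sys/block}" | grep '^sd' | sort
}

# Print the names that are in the second snapshot file but not in the first.
new_disks() {
    comm -13 "$1" "$2"
}

# Usage on a real guest (run the rescan between the two snapshots):
#   list_disks > /tmp/disks.before
#   ... rescan ...
#   list_disks > /tmp/disks.after
#   new_disks /tmp/disks.before /tmp/disks.after
```

Both snapshots are sorted, which is what comm expects.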

I tried to do this, but no matter what, I just could not get my Ubuntu box to see the new disks (I added a few).

Now over to the solution: After quite some research, I figured out what I had to do, but first some theory.

VMware: When you add your 16th SCSI device, VMware will not only add the disk you ask for, but also add a new SCSI controller and attach the new disk to it. This is because a SCSI bus only has 16 IDs and the controller itself occupies one of them (duh). The new disk will be “SCSI (1:0)”. If you just want to test this, you can add a new disk to your VM and, in the last section of the wizard, assign it to (1:0). Before you apply the change to your VM, you will see that you are adding not only a new disk but also a new SCSI controller.

Ubuntu: If you just run rescan-scsi-bus on your Ubuntu system, it will happily do so, but it will not see your new disk, since it does not know about the new SCSI controller yet. You can tell, because the adapters are listed at the beginning of the output:

[ccne lines=”0″]
maglub@nfs-v001alt:~$ sudo rescan-scsi-bus -c -w
/sbin/rescan-scsi-bus: line 592: [: 1.03: integer expression expected
Host adapter 0 (ata_piix) found.
Host adapter 1 (ata_piix) found.
Host adapter 2 (mptspi) found.

[/ccne]
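
If you want the same adapter list without running a rescan, the kernel also exposes it in sysfs. A small sketch, assuming each adapter shows up as /sys/class/scsi_host/hostN with a proc_name file holding the driver name; the function name is my own.

```shell
#!/bin/sh
# Print "hostN driver" for every SCSI host adapter the kernel knows about.
# The optional argument overrides the sysfs directory (handy for trying this on a fake tree).
list_scsi_hosts() {
    dir=${1:-/sys/class/scsi_host}
    for h in "$dir"/host*; do
        # Skip the unexpanded glob and entries without a driver name
        [ -r "$h/proc_name" ] || continue
        echo "$(basename "$h") $(cat "$h/proc_name")"
    done
}

list_scsi_hosts
```

Running it before and after the steps below should show one more mptspi line once the new controller is known to the guest.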

So, the million dollar question is: How do you add this adapter without rebooting?

First, check the PCI bus, just to see that you don’t have the new scsi controller listed:

[ccne lines=”0″]
maglub@nfs-v001alt:~$ lspci
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 01)

00:10.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)
[/ccne]

No trace of the new controller. This is because you need to rescan the PCI bus as well. To do this, run the following (as root):

[ccne lines=”0″]
echo "1" > /sys/bus/pci/rescan
[/ccne]

If you check your PCI bus now, you will see the new scsi-controller:

[ccne lines=”0″]
root@nfs-v001alt:~# lspci
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 01)

00:10.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)

02:02.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)
02:03.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)
[/ccne]

This will also add your new disks, but if you are curious, you can scan your scsi bus for new disks to see what happens:

[ccne lines=”0″]
root@nfs-v001alt:~# rescan-scsi-bus -w -c
/sbin/rescan-scsi-bus: line 592: [: 1.03: integer expression expected
Host adapter 0 (ata_piix) found.
Host adapter 1 (ata_piix) found.
Host adapter 2 (mptspi) found.
Host adapter 3 (mptspi) found.

[/ccne]

The rescan-scsi-bus command can see your new scsi adapter! Voila!
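
The whole sequence can be sketched without rescan-scsi-bus as well, by writing to sysfs directly. This is only a sketch under a few assumptions: it must run as root, the function name is made up, and it uses the kernel's per-host scan file (writing "- - -" asks a host to scan all channels, targets and LUNs) instead of the script used above.

```shell
#!/bin/sh
# Rescan the PCI bus, then ask every SCSI host to scan all channels, targets and LUNs.
# The optional argument is a sysfs root; it defaults to /sys.
rescan_pci_and_scsi() {
    sys=${1:-/sys}
    echo 1 > "$sys/bus/pci/rescan" || return 1
    for scan in "$sys"/class/scsi_host/host*/scan; do
        # Skip the unexpanded glob when no SCSI hosts exist
        [ -e "$scan" ] || continue
        echo "- - -" > "$scan"   # wildcards for channel, target and LUN
    done
}

# Usage (as root on the guest):
#   rescan_pci_and_scsi
```

The new controller may take a moment to register after the PCI rescan, so running the function a second time is harmless if a disk is still missing.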

No more insane clicking in ESXi – set up your testbed in a minute


Hello all!

Very often, I find myself in a situation where I have to quickly set up a test environment. In my case, that usually means setting up:

  • VMware ESXi (nowadays vSphere Hypervisor) virtual machine
  • 1GB RAM
  • 16GB Disk divided into /, /boot, swap
  • Ubuntu Server, 64 bit, 12.04.2 LTS

You might have your own favorite setup; this is mine. The goodies on top of a minimal server installation are:

  • vi as the default editor
  • luxury items like ksh, zsh
  • necessary tools like pv, sysstat, open-vm-tools

I’ve done this so many times by now, that I just cannot bear the thought of doing it again.

Why? Well, doing it by hand is summarised by the following:

  1. Connect to my office over VPN
  2. Start up my Windows VM (I am a Mac owner; live with it)
  3. Start up the vCenter Client
  4. Right click the proper resource group -> New virtual machine…
  5. Typical
  6. Give the VM a decent name
  7. Choose datastore
  8. Linux/Ubuntu 64-bit
  9. Network -> VM Network/VMXNET 3
  10. 16GB disk
  11. Modify VM properties
  12. Choose the CD -> ISO file -> browse, browse (got my own quick install, modified Ubuntu ISO)
  13. Connect on power-on
  14. Power on

I mean, that is 14 steps that I could live without. So, I started looking at ways to do this from a terminal window.

For this story, you need to know that my office environment has the following setup:

  • synology02 – nfs/cifs files, a Synology DS1511+
    • Sharing a handful of nfs filesystems
  • esxi01 – My ESXi 5.something server which hosts all my VMs
    • 2 resource groups – NFSDev, NFSProd
    • 2 data stores: NFSDev and NFSProd – nfs mounted datastores located on the synology02
  • guran – my “central” server for more or less all and nothing

In my environment, I can browse my datastores when logged into “guran”, since all my virtual machines are located in the datastores on NFS. Look here:

[ccne lines=”0″]

malu@kmg-guran-0001:/mnt/synology02/files/vmware/datastores $ls -la
total 24
drwxr-xr-x 6 malu malu 4096 2012-10-02 12:49 .
drwxr-xr-t 9 malu malu 4096 2012-12-05 14:58 ..
drwxr-xr-x 19 malu malu 4096 2013-02-19 20:38 dev
drwxr-xr-x 8 malu malu 4096 2013-02-04 18:42 prd

malu@kmg-guran-0001:/mnt/synology02/files/vmware/datastores $ls -la dev
total 76
drwxr-xr-x 19 malu malu 4096 2013-02-19 20:38 .
drwxr-xr-x 6 malu malu 4096 2012-10-02 12:49 ..
drwxr-xr-x 2 root root 4096 2013-02-04 17:27 jira-v001fry
drwxr-xr-x 2 root root 4096 2013-01-04 11:11 kmg-buildbox-0001
drwxr-xr-x 2 malu malu 4096 2012-11-24 15:12 kmg-op5-0003
drwxr-xr-x 2 root root 4096 2013-01-04 11:10 kmg-op5-0004
drwxr-xr-x 2 root root 4096 2012-12-09 21:56 kmg-sandbox-0005
drwxr-xr-x 2 root root 4096 2012-12-09 19:49 kmg-sandbox-0005.save
drwxr-xr-x 2 root root 4096 2013-01-04 11:11 kmg-web-0001
drwxr-xr-x 2 root root 4096 2013-01-04 11:11 kmg-web-0002
drwxr-xr-x 2 root root 4096 2012-12-28 13:24 kmg-zenLoadbalancer-0001
drwxr-xr-x 2 malu malu 4096 2013-02-07 14:27 nexenta-v001test
drwxr-xr-x 2 root root 4096 2013-01-16 16:24 op5-v001test
drwxr-xr-x 2 root root 4096 2013-01-04 11:10 openstack-v001fry

[/ccne]

This is very useful, I must say. The goal for me is to be able to create new virtual machines without even thinking of starting up my Windows machine. I accomplished this by doing the following:

  1. I created a template.vmx file with the size of the VM I needed (which mounts my specially adapted Ubuntu ISO)
  2. Replaced all references to the name of the VM (in my case the hostname of the system) with the unique string “XXX_HOST_NAME_XXX”
  3. Figured out how to use this template properly

The basic recipe in my environment is:

  1. Create a new directory in the NFSDev datastore with the same name as the hostname of the new system
  2. Create a new _host_name_.vmx file from the template
  3. Create a new vmdk for the VM
  4. Register the VM in my one and only hypervisor/ESXi
  5. Start up the VM
    1. If there is already a VM with the same uuid/mac address, tell ESXi that I copied the VM

My template.vmx file looks like this:

[ccne lines=”0″]

.encoding = "UTF-8"
config.version = "8"
virtualHW.version = "8"
pciBridge0.present = "TRUE"
pciBridge4.present = "TRUE"
pciBridge4.virtualDev = "pcieRootPort"
pciBridge4.functions = "8"
pciBridge5.present = "TRUE"
pciBridge5.virtualDev = "pcieRootPort"
pciBridge5.functions = "8"
pciBridge6.present = "TRUE"
pciBridge6.virtualDev = "pcieRootPort"
pciBridge6.functions = "8"
pciBridge7.present = "TRUE"
pciBridge7.virtualDev = "pcieRootPort"
pciBridge7.functions = "8"
vmci0.present = "TRUE"
hpet0.present = "TRUE"
nvram = "XXX_HOST_NAME_XXX.nvram"
virtualHW.productCompatibility = "hosted"
powerType.powerOff = "default"
powerType.powerOn = "hard"
powerType.suspend = "default"
powerType.reset = "default"
displayName = "XXX_HOST_NAME_XXX"
extendedConfigFile = "XXX_HOST_NAME_XXX.vmxf"
floppy0.present = "TRUE"
scsi0.present = "TRUE"
scsi0.sharedBus = "none"
scsi0.virtualDev = "lsilogic"
memsize = "1024"
scsi0:0.present = "TRUE"
scsi0:0.fileName = "XXX_HOST_NAME_XXX.vmdk"
scsi0:0.deviceType = "scsi-hardDisk"
ide1:0.present = "TRUE"
ide1:0.fileName = "/vmfs/volumes/c262ee3b-00d1a1ed/images/kmg-ubuntu-12.04.2.LTS.iso"
ide1:0.deviceType = "cdrom-image"
floppy0.startConnected = "FALSE"
floppy0.fileName = ""
floppy0.clientDevice = "TRUE"
ethernet0.present = "TRUE"
ethernet0.virtualDev = "e1000"
ethernet0.networkName = "VM Network"
ethernet0.addressType = "generated"
chipset.onlineStandby = "FALSE"
guestOS = "ubuntu-64"
uuid.location = "56 4d 5d 9a c5 dc 8e a1-45 76 3d 90 34 83 82 d1"
uuid.bios = "56 4d 5d 9a c5 dc 8e a1-45 76 3d 90 34 83 82 d1"
vc.uuid = "52 75 89 2c 80 59 17 93-b9 0b 33 49 04 8c c8 a3"
snapshot.action = "keep"
sched.cpu.min = "0"
sched.cpu.units = "mhz"
sched.cpu.shares = "normal"
sched.mem.min = "0"
sched.mem.shares = "normal"

[/ccne]

Notice all the entries of “XXX_HOST_NAME_XXX”? I picked that string since it is very unlikely to be used by VMware, and it is easy to replace using “sed”.
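
To make the replacement concrete, here is a small sketch of the sed step on its own. The function name render_vmx is made up for this example, and it assumes the host name contains no “/” or “&”, since both are special to sed.

```shell
#!/bin/sh
# Fill in the host name placeholder in a .vmx template and write the result to stdout.
render_vmx() {
    template=$1
    host=$2
    # The trailing "g" replaces every occurrence on a line, not only the first
    sed -e "s/XXX_HOST_NAME_XXX/$host/g" "$template"
}

# Usage:
#   render_vmx template.vmx testbox-v003fry > testbox-v003fry.vmx
```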

To make things a bit easier, I first set up my ESXi host to accept my public key for logging in as root:

[ccne lines=”0″]

ssh root@myESXiHost

vi /etc/ssh/keys-root/authorized_keys

[/ccne]

After this, the recipe is easy:

[ccne lines=”0″]
newHostName=testbox-v003fry
templateDir=/mnt/synology02/files/vmware/datastores/dev/template
datastoreDir=/mnt/synology02/files/vmware/datastores/dev
cd "$datastoreDir"
mkdir "$newHostName"
#-- create the vmx file from the template
sed -e "s/XXX_HOST_NAME_XXX/$newHostName/g" "$templateDir/template.vmx" > "$datastoreDir/$newHostName/$newHostName.vmx"
#-- create the 16 GB disk, then register the VM and keep the VM id it prints
ssh root@192.168.2.204 vmkfstools -c 16g /vmfs/volumes/NFSDev/$newHostName/$newHostName.vmdk -a lsilogic
vmID=$(ssh root@192.168.2.204 vim-cmd solo/registervm /vmfs/volumes/NFSDev/$newHostName/$newHostName.vmx $newHostName pool0)
#-- turn on VM
ssh root@192.168.2.204 vim-cmd vmsvc/power.on $vmID &
sleep 1
#-- check there is a message and choose 2 (default, I copied it)
[ -z "$(ssh root@192.168.2.204 vim-cmd vmsvc/message $vmID _vmx1 | grep 'No message')" ] && ssh root@192.168.2.204 vim-cmd vmsvc/message $vmID _vmx1 2

[/ccne]

The ampersand (&) after power.on is there because the command will hang if a VM with the same uuid (defined in the vmx file) already exists on the host, which is most likely the case. We then need to sleep for one second, since there is no message in the message buffer until the ESXi host realizes that there is a conflict. After that, I check whether there is a message; if there is, I answer it and tell ESXi to use the default alternative (2, I copied the VM).
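
A fixed one-second sleep works on my host, but a slower host might need longer. One possible variation is to poll for the message instead. The retry helper below is a generic sketch of that idea, and the has_question function in the comments is hypothetical, built from the same vim-cmd calls as the recipe.

```shell
#!/bin/sh
# Run a command repeatedly until it succeeds or the attempts run out.
# usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    attempts=$1; delay=$2; shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Hypothetical use in the recipe (succeeds once a question is pending):
#   has_question() { ! ssh root@192.168.2.204 vim-cmd vmsvc/message "$vmID" _vmx1 | grep -q 'No message'; }
#   retry 10 1 has_question && ssh root@192.168.2.204 vim-cmd vmsvc/message "$vmID" _vmx1 2
```

If no other VM shares the uuid, no question ever appears, so the loop simply gives up after the last attempt.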

P.S. I found out the “pool0” reference to my NFSDev resource pool by browsing this page: http://communities.vmware.com/message/1114467

[ccne lines=”0”]
cat /etc/vmware/hostd/pools.xml | grep "YOUR-RESOURCE-POOL-NAME" -A1 | grep "<objID>" | sed 's/<objID>//;s/<\/objID>//g' | sed -e 's/^[[:blank:]]*//;s/[[:blank:]]*$//'
[/ccne]
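
The same lookup can be wrapped in a small function. This is a sketch that assumes, like the one-liner, that the objID element sits on the line right after the line containing the pool name; the function name pool_id is my own.

```shell
#!/bin/sh
# Print the objID found on the line following a given resource pool name.
# The optional second argument overrides the path to pools.xml.
pool_id() {
    name=$1
    file=${2:-/etc/vmware/hostd/pools.xml}
    grep -A1 "$name" "$file" |
        sed -n 's/^[[:blank:]]*<objID>\(.*\)<\/objID>[[:blank:]]*$/\1/p'
}

# Usage (on the ESXi host):
#   pool_id NFSDev
```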

Done. All for today.