
Taking hot backups with Oracle VM

Probably my biggest complaint about Oracle VM is its lack of a hot backup solution for virtual machines. There is simply no documented or supported way to back up running virtual machines, which is crazy. Of course you can install good old backup agents inside your VMs, but that is one of the things I want to avoid when deploying virtualization. Or you could suspend or halt a machine, copy the virtual disk off to somewhere else and be happy. But I feel like we should not have to stop our services for a backup. This is 2012, after all!
Sure, the latest release introduced a new feature to export an OCFS2 filesystem via NFS so it can be backed up from another machine. But that is only half of the solution, since a simple filesystem copy of a virtual disk image is very likely to be corrupt if a running machine writes to it while we are trying to read it.

One workaround suggested across forums and message boards is to clone your running machine and then copy the virtual disk of the stopped clone. This sounds like a dirty workaround, but if it flies, I'll be happy. Unfortunately, I could not find a way to clone VMs from command-line scripts, so this was not really practical for automated, day-to-day backups.
That changed today when I came across instructions that describe a new (provided as-is and unsupported) CLI for OVM Manager. It requires version 3.1.1 build 365, but updating from build 305 was quite easy.

And sure enough, I can now use ssh to log in to the CLI:

[root@ovm ~]# ssh -l admin -p 10000 localhost
admin@localhost's password: 
OVM> showversion
3.1.1.365

Diving right into the task at hand, I cloned a test VM:

OVM> clone Vm name=BTCminer_01 destType=Vm destName=BTCM01_backup serverPool=ptx_pool
Command: clone Vm name=BTCminer_01 destType=Vm destName=BTCM01_backup serverPool=ptx_pool
Status: Success
Time: 2012-07-19 12:14:08.539

OVM> show VM name=BTCM01_backup
Command: show VM name=BTCM01_backup
Status: Success
Time: 2012-07-19 12:21:38.055
Data: 
  Name = BTCM01_backup
  Id = 0004fb0000060000a0d318dfea1a8ecb
  Status = Stopped
  Memory (MB) = 1024
  Max. Memory (MB) = 2048
  Max. Processors = 8
  Processors = 8
  Priority = 10
  Processor Cap = 80
  High Availability = false
  Operating System = Oracle Linux 6
  Mouse Type = Default
  Domain Type = Xen PVM
  Keymap = en-us
  description = bitcoin miner test, burning away CPU
  Server = 08:00:20:ff:ff:ff:ff:ff:ff:ff:00:1b:24:78:cc:62  [ovm01]
  Repository = 0004fb0000030000d4d126daf6f36560  [ovm_repo1tb]
  Vnic 1 = 0004fb0000070000fe61d3745c1e09c4  [00:21:f6:42:42:01]
  VmDiskMapping 1 = 0004fb0000130000e5ff03a3b8fe3a6b

OVM> show VmDiskMapping id=0004fb0000130000e5ff03a3b8fe3a6b
Command: show VmDiskMapping id=0004fb0000130000e5ff03a3b8fe3a6b
Status: Success
Time: 2012-07-19 14:53:32.845
Data: 
  Name = 0004fb0000130000e5ff03a3b8fe3a6b
  Id = 0004fb0000130000e5ff03a3b8fe3a6b
  Slot = 0
  Emulated Block Device = false
  Virtual Disk Id = 0004fb000012000054b0f999972b7d64.img  [BTCminer_01 (2)]
  Vm Id = 0004fb0000060000a0d318dfea1a8ecb  [BTCM01_backup]
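
For scripting, the two IDs have to be pulled out of that output. The following is a minimal sketch, assuming the "Key = value  [label]" layout shown above stays stable; the helper name ovm_field is made up for illustration and is not part of the CLI.

```shell
#!/bin/bash
# Hypothetical helper: print the value of one "Key = value  [label]" line
# from OVM CLI output read on stdin. $1 is the exact key text,
# e.g. "Id" or "Virtual Disk Id".
ovm_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)              # drop the indentation
            prefix = key " = "
            if (index(line, prefix) == 1) {       # the key must start the line
                value = substr(line, length(prefix) + 1)
                sub(/[ \t]+\[.*$/, "", value)     # drop a trailing "[label]"
                print value
                exit
            }
        }'
}
```

Saving the output of "show VM" to a file and running `ovm_field "Id" < show_vm.txt` would print the machine ID; because the key has to start the line, "Id" does not accidentally match the "Vm Id" line.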

I now have a stopped (consistent) clone of my running machine, and I know the machine ID and the virtual disk image file. Sweet! I had already mounted the repository on my OVM server, so now I can copy the vm.cfg and the virtual disk to another filesystem (a plain local disk in my test case):

[root@ovm ~]# cp -pr /mnt/repository/VirtualMachines/0004fb0000060000a0d318dfea1a8ecb/ /var/www/html/ovmbackup/
[root@ovm ~]# cp -pr /mnt/repository/VirtualDisks/0004fb000012000054b0f999972b7d64.img /var/www/html/ovmbackup/
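
Since the whole point is a backup that can be trusted, it may be worth comparing the copy against the source before the clone is deleted. This is only a sketch; the function name copy_verified is made up, and comparing byte for byte means reading a large image twice.

```shell
#!/bin/bash
# Hypothetical helper: copy one file into a directory and confirm that the
# copy matches the source byte for byte. Returns non-zero on any failure.
copy_verified() {
    local src="$1" dest_dir="$2"
    local dest="$dest_dir/$(basename "$src")"
    mkdir -p "$dest_dir" || return 1
    cp -p "$src" "$dest" || return 1
    cmp -s "$src" "$dest"      # silent compare, exit status 0 only if identical
}
```

It would be called with the image path and the target directory, for example `copy_verified /mnt/repository/VirtualDisks/$disk_id /var/www/html/ovmbackup`, where disk_id holds the image file name reported by the CLI.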

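Of course, I was eager to see how restoring works. The import pulls the image over HTTP, which is why I copied it into the web server's document root; the URL below refers to it as backup.img.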

OVM> importVirtualDisk repository name=ovm_repo1tb server=ovm01 url='http://ovmmgr/ovmbackup/backup.img'
Command: importVirtualDisk repository name=ovm_repo1tb server=ovm01 url='http://ovmmgr/ovmbackup/backup.img'
Status: Success
Time: 2012-07-19 15:51:12.712

OVM> create VM name=recoverytest repository=ovm_repo1tb domainType=XEN_PVM memory=1024 on Server name=ovm01
Command: create VM name=recoverytest repository=ovm_repo1tb domainType=XEN_PVM memory=1024 on Server name=ovm01
Status: Success
Time: 2012-07-19 16:04:06.091

OVM> create vmDiskMapping name=recoverMap1 slot=1 storageDevice=backup.img on vm name=recoverytest
Command: create vmDiskMapping name=recoverMap1 slot=1 storageDevice=backup.img on vm name=recoverytest
Status: Success
Time: 2012-07-19 16:05:53.694
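
The three restore commands differ only in a handful of names, so they can be generated from parameters. A minimal sketch, using the command syntax exactly as typed above; the function name restore_commands and the "<vm>_map1" naming of the disk mapping are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the three restore commands, one per line.
# Arguments: repository, server, image URL, new VM name, memory in MB.
restore_commands() {
    local repo="$1" server="$2" url="$3" vm="$4" mem="$5"
    local disk="${url##*/}"    # file name part of the URL, e.g. backup.img
    printf "importVirtualDisk repository name=%s server=%s url='%s'\n" \
        "$repo" "$server" "$url"
    printf 'create VM name=%s repository=%s domainType=XEN_PVM memory=%s on Server name=%s\n' \
        "$vm" "$repo" "$mem" "$server"
    printf 'create vmDiskMapping name=%s_map1 slot=1 storageDevice=%s on vm name=%s\n' \
        "$vm" "$disk" "$vm"
}
```

Calling `restore_commands ovm_repo1tb ovm01 http://ovmmgr/ovmbackup/backup.img recoverytest 1024` prints the same sequence I typed by hand, apart from the mapping name.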

I cheated a little bit and assigned the virtual network in the GUI, then stopped the original VM and started the recovered machine. Eureka! It came up just as I expected. All the little pieces are now in place to build an automated backup process for our virtual machines.

Next Steps:
Put the backup steps in a simple script so that they can run automatically. I would love to be able to use public key authentication with that ssh server. If that does not work, I’ll have to play with modifying the provided “expect” scripts to do what I want.
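
A rough sketch of what the clone step of such a script could look like. It rests on two assumptions that I have not confirmed for this CLI build: that the ssh server accepts a single command as the remote-command argument, and that login works without a password prompt. The function names and the "<vm>_backup_<date>" naming are made up.

```shell
#!/bin/bash
# OVM_SSH can point at a stand-in command for testing without a manager.
OVM_SSH="${OVM_SSH:-ssh}"

# Assumption: the CLI runs one command passed as the ssh remote command.
ovm_cli() {
    "$OVM_SSH" -l admin -p 10000 localhost "$*"
}

# Run a CLI command and succeed only if the output reports "Status: Success".
ovm_ok() {
    local out
    out=$(ovm_cli "$@") || return 1
    printf '%s\n' "$out" | grep -q '^Status: Success'
}

# Clone a running VM to a stopped copy and print the name of the clone.
clone_for_backup() {
    local vm="$1" pool="$2"
    local clone="${vm}_backup_$(date +%Y%m%d)"
    ovm_ok "clone Vm name=$vm destType=Vm destName=$clone serverPool=$pool" || return 1
    echo "$clone"
}
```

Checking the "Status:" line rather than only the ssh exit code is deliberate, since I do not know whether the CLI reports failed commands through its exit status.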

I also don’t want to just trust the backup to work, especially with the snapshot taken while the VM is running. In theory, the filesystem inside the VM should survive this crash-consistent state, but I want to make really sure it does. Plus, there are a ton of other things that can go wrong. So in addition to automating the backup process, I’d like to automate the recovery as well. The idea is to import the backup back into OVM, change the virtual network to a sandbox and boot the VM. We can then run a series of basic tests against it to check whether all needed services inside the VM come back up the way they should. When this works, we have tested and verified that our backup can really be restored, and we also know how long a restore takes.
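
The recovered VM needs some time to boot, so the service checks have to retry for a while before declaring the restore a failure. A minimal sketch of such a retry loop; the function name wait_for and the host name in the usage note are made up.

```shell
#!/bin/bash
# Hypothetical helper: retry a check command until it succeeds or the
# attempts run out.
# Usage: wait_for <attempts> <delay-seconds> <command> [args...]
wait_for() {
    local attempts="$1" delay="$2"
    shift 2
    local i=0
    while [ "$i" -lt "$attempts" ]; do
        if [ "$i" -gt 0 ]; then
            sleep "$delay"     # pause between attempts, not before the first
        fi
        if "$@"; then
            return 0           # the check passed
        fi
        i=$((i + 1))
    done
    return 1                   # every attempt failed
}
```

For example, `wait_for 30 10 curl -fsS http://sandbox-vm/` would poll a web service on the restored machine for up to five minutes, and wrapping the call in `time` gives a rough idea of how long the VM takes to come back.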

