From eBower Wiki

Why KVM?

Why not? In my opinion there are three types of virtualization solutions. For corporate use you probably want to look at VMware and Citrix-based solutions. I'd even go so far as to suggest Hyper-V depending on your company's proclivities. The reason isn't performance or efficiency, it's management, especially in the VDI (Virtual Desktop Infrastructure) arena where you're dealing with thousands of these things. It's not that other solutions won't work; it's that the money you spend on a turn-key solution from VMware is probably far less than the money you'd spend building up a KVM-based equivalent - especially if you're looking for advice as to what to buy from some random guy on the Internet.

If you're just getting started with VMs or you want to run desktop VMs, I'd recommend VirtualBox. The benchmarks are actually surprisingly close to KVM and even better than VMware/Citrix solutions, so it's not like you're losing much by getting a friendly UI. Where VirtualBox really kills the competition is in the graphics and peripherals department. They make it easy and seamless to connect physical devices to your VM and they provide drivers that include graphics acceleration. If you want to try out the latest version of whatever distro you're not running at the moment, need to run a Windows app, or even just need Windows for an old device that doesn't have Linux drivers, VirtualBox is the way to go.

On the other hand, if you're an individual and you want to run a bunch of headless servers, KVM is a much better alternative. While you can (and I have) run VirtualBox headless via command line scripts, this is like using a chisel as a screwdriver. You can do it, but why? The virt-manager app is great in that I can connect to all of my HyperVisors at once, see the state of each of the VMs, and manage/maintain them all remotely without needing to log into the HyperVisor directly.

Of course these are not hard and fast rules. If you love KVM you can certainly run a desktop on it, and if you know VirtualBox you can use it as a headless server. Use what works for you, but if it doesn't involve KVM you probably don't care about what's on this page!

Terminology

I've already used a big word and I should probably explain what it is.

  • HyperVisor: Also known as the Host Machine, a HyperVisor is a basic OS and VM management infrastructure required to host Virtual Machines. In my case the HyperVisor is KVM running on a thin Ubuntu server.
  • Bare Metal: When someone refers to a bare-metal installation they are typically comparing it to the same configuration in a virtual machine. For example, "the bare metal benchmark was 2% faster than when running under KVM."
  • VDI: Virtual Desktop Infrastructure. I run virtualized servers, and only about half a dozen of them, each dedicated to a specific purpose. VDI is completely different: it involves running hundreds or thousands of nearly identical machines that can replace an entire desktop. You can imagine a VDI installation for a helpdesk consisting of a single desktop image that gets cloned and booted every time someone logs in - any changes to this desktop image will be wiped out on the next login. Conversely, you could have a dedicated image per user, using up more space but adding permanence (and management headaches) to each individual system. VDI is not covered here since that's a very different set of problems and solutions.

Installing the HyperVisor

I'm going to assume you're starting with a plain Ubuntu Server install. In my opinion a HyperVisor should be a very basic thing, essentially just KVM and SSH. The more stuff you run on the HyperVisor the more likely you are to screw something up or need to restart for the wrong reasons. Remember, when the HyperVisor needs to reboot ALL of the VMs need to reboot.

I'm starting with a fresh Ubuntu 12.04 install. I would always use an LTS release for this unless there's a very good reason not to. I'll also assume that you didn't tell Ubuntu to add any services (I've never really liked that screen where you can tell it to be an SSH server or a VM host - I have no idea what packages it's adding).

Install Packages

There are a bunch of packages you'll need to install. Thankfully, this is pretty much a single line.

sudo apt-get update && sudo apt-get install bridge-utils kvm libvirt-bin vim openssh-server && sudo apt-get dist-upgrade
  • bridge-utils allows you to set up a network bridge, see below for details.
  • kvm installs the base KVM package
  • libvirt-bin installs the KVM management system
  • vim is, well, vim. If you don't have it, you need it.
  • openssh-server is an SSH server. It's pretty much the only thing you need to manage the box.

The dist-upgrade at the end brings everything else on the box up to date - just in case you forgot to do that earlier.

Add Yourself to the libvirtd Group

This step is often forgotten: you'll need to give yourself permission to manage the VMs. The command below (cleverly) doesn't need me to know your username. Log out and back in afterwards so your session picks up the new group.

sudo adduser `id -un` libvirtd
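
If you want to confirm the change took, a quick check along these lines should do it. This is only a sketch: the helper name in_group is made up for illustration, and it relies on nothing more than `id -nG`, which lists the groups the account database has for a user.

```shell
# in_group USER GROUP: succeed if USER is listed as a member of GROUP.
# (Hypothetical helper name. It asks the account database, so it sees the
# change right away even though your current login session won't.)
in_group() {
    id -nG "$1" | tr ' ' '\n' | grep -qx -- "$2"
}

# Usage: report whether the current user can manage VMs yet.
if in_group "$(id -un)" libvirtd; then
    echo "$(id -un) is in libvirtd"
else
    echo "$(id -un) is NOT in libvirtd yet"
fi
```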

Configure Networking

By default KVM will create a little NAT device for you. This will allow your VMs to share your IP address and still access the Internet mostly transparently. For a desktop system this works well and this is how I leave most of my VirtualBox instances. But for a server this has some obvious flaws. The first thing we'll need to do is create a bridge.

We'll assume that you're using eth0 as your primary network connection. A bridge lets you connect as many network interfaces together as you want and have them all share the same physical interface. To your router it looks like someone installed a dumb Ethernet switch and is hooking up a bunch of physical computers to it behind the scenes. Your HyperVisor's eth0 interface maintains its own MAC address, but each VM you build will have a dedicated MAC address visible to anyone on your network. It also means your DHCP server will hand out unique IP addresses to each VM, again all in your standard network.

I'm going to assume you're using DHCP on your network. If you're not, you probably should - even for "static" IPs. The benefit of DHCP is that you have a single stop whenever you figure out you need to change something about your network. Want to switch to Google DNS or OpenDNS? Need to readdress because your work laptop has a collision when you try to print to your local printer? If you need to touch each of your VMs to make this change it's a big headache; for static IP addresses it's much easier to just associate a MAC address with an IP in your DHCP server. One exception to the DHCP rule is the DHCP server itself, but I don't know if you want to virtualize something as critical as a DHCP server. The other exception is IPv6, but we'll get there in a bit...

To set up the bridge we'll need to edit /etc/network/interfaces. First we'll set eth0 from dhcp to manual, then we'll add an entry for br0 as below:

auto eth0
iface eth0 inet manual

auto br0
iface br0 inet dhcp
        bridge_ports eth0
        bridge_stp off
        bridge_fd 0
        bridge_maxwait 0

Now restart networking or reboot and you should see no real difference. Here's what we did:

  • By changing eth0 to manual we told the system that the interface itself doesn't need an IP address. The address gets assigned to the bridge instead, which is what we want as our primary interface.
  • br0 is the bridge interface, it uses DHCP (if you need static IPs here for any reason, you should know what to do)
  • The only bridge_port we're adding is eth0, KVM will do the rest.
  • bridge_stp controls Spanning Tree. This is great if you've got a complicated network with loops, but in general it's just useless overhead. If you do get more complicated than mapping a single interface from a VM to this bridge (for example, mapping two interfaces on the same VM to it and then bridging those two interfaces together on the VM) you may want to enable this to prevent broadcast storms.
  • bridge_fd is the forwarding delay time. Essentially, if you have a system that takes a long time to get the Ethernet into a state that's happy, you can set this to something like 10, which would wait 10 seconds before activating the bridge.
  • bridge_maxwait is the maximum time the setup scripts will wait for the bridge ports to reach the forwarding state before carrying on. Setting it to 0 means no waiting at all.
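
Once networking is back up, it's worth checking that eth0 really did end up attached to br0. Here's a rough sketch that reads the port list straight from sysfs, where the kernel lists bridge members under /sys/class/net/BRIDGE/brif. The function name and the SYSFS_NET override are my own inventions for illustration.

```shell
# bridge_ports BRIDGE: list the interfaces attached to BRIDGE, one per line.
# SYSFS_NET is a hypothetical override so the function can be pointed at a
# different directory tree; by default it reads the real sysfs.
bridge_ports() {
    ls "${SYSFS_NET:-/sys/class/net}/$1/brif" 2>/dev/null
}

# Usage on the HyperVisor: expect eth0, and later typically one vnetN
# entry per running VM that is attached to the bridge.
bridge_ports br0 || echo "br0 not found (is the bridge up?)"
```

If you'd rather use the tool that came with bridge-utils, "brctl show" prints the same information as a table.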

IPv6

You'll note there's nothing here about IPv6. That's because the bridge works at the Ethernet level, so router advertisements (from radvd or your router) just pass through transparently.

Managing Your HyperVisor

Management is pretty simple. Whether or not your desktop machine is running KVM, you just need to run this:

sudo apt-get install virt-manager

You can install this on your HyperVisor if you want, but it comes with a bunch of GUI dependencies you may not want.

First you'll need to add a connection from File -> Add Connection (surprising, I know). You'll want QEMU/KVM as the Hypervisor, check the Connect to remote host box, use SSH as the Method and enter the username and hostname. You'll notice that there isn't much space for customizations here, that's what ~/.ssh/config is for. Your username (if you've got a Debian-based system) should never be root, that's just wrong from a security perspective and I won't get into why here. But if you use a non-standard port number, pubkey authentication, or other settings you can simply configure them in your SSH config file. In short, if you can type "ssh my-hypervisor" and get connected, all you need to do is put "my-hypervisor" in the hostname field and you should be all set.
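
As an illustration of the SSH config side, this sketch prints a Host block you could append to ~/.ssh/config. Every value in the usage line (the alias, the address, the user, the port and the key path) is a placeholder I made up, and ssh_stanza is not a real tool, just a small helper for this example.

```shell
# ssh_stanza ALIAS HOSTNAME USER PORT KEYFILE: print a Host block suitable
# for appending to ~/.ssh/config. All five values are placeholders.
ssh_stanza() {
    printf 'Host %s\n' "$1"
    printf '    HostName %s\n' "$2"
    printf '    User %s\n' "$3"
    printf '    Port %s\n' "$4"
    printf '    IdentityFile %s\n' "$5"
}

# Preview the block; add ">> ~/.ssh/config" once it looks right.
ssh_stanza my-hypervisor 192.0.2.10 me 2222 ~/.ssh/id_rsa
```

With a block like that in place, "ssh my-hypervisor" picks up the port and key automatically. The same alias should also work from the command line with "virsh -c qemu+ssh://my-hypervisor/system list --all" if you ever want to skip the GUI.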

It's up to you if you want to autoconnect every time you load up virt-manager.

Storage

Machines need storage! By default machines will be placed in /var/lib/libvirt/images and for me this works fine. However, one drive may not be enough.

Installation Media

I also want to have installation media stored someplace separate. On my fresh hypervisor I run this:

sudo mkdir /var/lib/libvirt/images/install-isos

I then put all of my installation media there. Mostly it's just the latest Ubuntu server, but I also run a MythTV front end and you never know when you'll need CentOS, XP or Windows 7.

In virt-manager, right-click on your HyperVisor and select Details to open up the configuration page. Select the Storage tab and click the "+" button to add a storage pool. I call mine "install-isos" and leave the type as "dir" since I just want to copy the ISOs there and have them picked up automatically. The next screen you should be able to just leave alone, it will default to the correct directory.

Note that as a side effect this will appear as a directory in your default storage pool. If this bothers you, feel free to create /var/lib/libvirt/install-isos instead, or create a new pool for your VMs at /var/lib/libvirt/images/vms and delete the default pool.
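
If you'd rather not click through the GUI, the same pool can probably be set up from the HyperVisor's shell with virsh. This is a sketch under the assumption that the pool name and directory match the ones above; make_dir_pool is a made-up wrapper, while pool-define-as, pool-start and pool-autostart are standard virsh subcommands.

```shell
# Assumed pool name and path, matching the ones used above.
POOL=install-isos
POOL_DIR=/var/lib/libvirt/images/install-isos

# make_dir_pool NAME PATH: define, start and autostart a "dir" pool.
# (Hypothetical wrapper; stops at the first step that fails.)
make_dir_pool() {
    virsh pool-define-as "$1" dir --target "$2" &&
    virsh pool-start "$1" &&
    virsh pool-autostart "$1"
}

# On the HyperVisor:
#   make_dir_pool "$POOL" "$POOL_DIR"
#   virsh pool-refresh "$POOL"   # rescan after copying new ISOs in
#   virsh vol-list "$POOL"       # list what the pool can see
```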

Physical Disks

With my old system I used the HyperVisor as a file server as well. I'm not sure if that was a great idea or not, but with my new system I'm running a dedicated Samba server. I won't tell you how to configure it here, but I did want the Samba server to handle the LVM rather than the HyperVisor - I just wanted the HyperVisor to present the disk as a whole to the file server.

Find the Disk Name

To do this, first find out what your disk name is. You can run this to check:

sudo fdisk -l | grep Disk

Let's say it's /dev/sdb for argument's sake. Now /dev/sdb is a pretty awful name to use because adding another disk to the mix could change it to /dev/sdc. So let's use something a bit more deterministic. Run this:

ls -l /dev/disk/by-id | grep -w sdb

You should end up with a bunch of names, including one starting with ata, one with scsi and one with wwn. All of these have embedded in them something unique to your drive. It will always be there and it will never change. Let's use one of these instead.
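
The grep works fine, but if you want something you can drop into a script, here's a rough equivalent that resolves each symlink and compares it against the device. The function name and its optional second argument are assumptions of mine, added so it can be tried against a scratch directory.

```shell
# by_id_names DEVICE [DIR]: print every symlink in DIR (default
# /dev/disk/by-id) that resolves to DEVICE.
by_id_names() {
    target=$(readlink -f "$1")
    for link in "${2:-/dev/disk/by-id}"/*; do
        [ "$(readlink -f "$link")" = "$target" ] && echo "$link"
    done
    return 0
}

# Usage: by_id_names /dev/sdb
```

Because it compares the fully resolved path, partitions such as /dev/sdb1 won't show up in the list, only the whole disk.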

Create a Partition

Using fdisk on a large disk doesn't work very well, since the traditional MBR partition tables it creates top out at 2TB. We need to use parted, which can create GPT tables. Run this:

sudo parted /dev/sdb

Of course, make sure that /dev/sdb is the right drive! You can use "p" to see if there are any partitions on the disk. If there are, you can use "rm n" to remove partition #n. Repeat until the drive is empty.

Now run "mklabel gpt" and answer yes to the prompt. The command "mkpart pri 1 -1" will create a single partition with the entire disk. Now "quit" and you're done.
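
The same thing can be scripted with parted's -s (script) flag, which never prompts. Treat this as a sketch rather than a recipe: the run/DRY_RUN wrapper is just a common shell idiom I've added so you can preview the commands, the disk id is made up, and I've used "1MiB 100%" as the partition bounds instead of "1 -1", which gives slightly different start and end points.

```shell
# run CMD...: execute CMD, or just print it when DRY_RUN is set.
# (A common shell idiom, not part of parted.)
run() {
    if [ -n "$DRY_RUN" ]; then echo "+ $*"; else "$@"; fi
}

# gpt_single_partition DISK: replace DISK's partition table with GPT and
# create one partition spanning the disk. DESTRUCTIVE: triple-check DISK.
gpt_single_partition() {
    run parted -s "$1" mklabel gpt &&
    run parted -s "$1" mkpart primary 1MiB 100% &&
    run parted -s "$1" print
}

# Preview the commands for a made-up disk id without touching anything:
DRY_RUN=1 gpt_single_partition /dev/disk/by-id/ata-EXAMPLE_DISK
```

To do it for real you'd drop DRY_RUN and run it as root, but only after you're absolutely sure of the disk name.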

Add the Storage Pool

So we go back to the HyperVisor details and add a storage pool (see above if you've forgotten). Give it a name (I use something like "0-3TB" to reference the first 3TB drive or "1-3TB" for the redundant drive) and set the type to "disk" because, well, it's a disk.

The target path should be /dev, the format (assuming it's not a tiny disk) is gpt, the source path should be /dev/disk/by-id/your_id_string and don't check the Build Pool checkbox.

You should shortly see the storage pool appear with a single partition filling the entire disk (well, it reports 99% but that's close enough!).

Link the Pool to the VM

Finally, we need to link this pool to your storage VM. Open up the VM in virt-manager and click the blue "i" button to see info about the VM. Under the list of hardware, there's an "Add Hardware" button. Click it.

Under Storage, select the "Select managed or other existing storage" option and find the drive you just added. I set mine as a Virtio disk with a RAW storage format.

Note that this seems a little buggy. If at first you don't succeed, try again. You may also be prompted that you'll need to shut down the VM for this to take effect. Working with a VM that's down may be a better alternative.

Mounting the Drive on the VM

Mounting the drive is pretty easy. You've already got /dev/vda which is the disk you created when you built the VM. The second disk appears as /dev/vdb. Can you guess where the third one would be?

Note that these disks are bare, unformatted data. That's what we want. I have a pair of 2TB and a pair of 3TB disks, but I'd rather have one large partition so I use LVM to merge them together into 5TB volumes. I also use encryption so breaking into my HyperVisor doesn't get you access to any of the data.

Creating a VM

Creating a VM via virt-manager is pretty trivial. Just click the "Create a virtual machine" button in the upper left of the UI and walk through the wizard.

  • The easiest is to use "Local" install media (this is actually local to the HyperVisor, not local to the machine running virt-manager). Browse for the install ISO (see the Installation Media section above to see how I handle things). Set the OS type and version.
  • Then tell it the number of CPU cores to dedicate and the amount of RAM. RAM is the only resource that VMs will consume as much of as possible, so be careful about adding too much memory to your machines.
  • I like to create a new disk image and pre-allocate the space. But the majority of my installations take up so little space this isn't a problem.
  • For a desktop I just create the VM, but for a server I change the Virtual Network from a NAT to "br0" so I have a "real" IP address.

That's it. There are plenty of references to create more advanced VMs or do nifty things with them, but in general it's just a few text strings and walking through a wizard.
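
For the curious, the wizard's choices map fairly directly onto virt-install, the command-line cousin of virt-manager (from the virtinst package). The sketch below is an assumption-laden example, not what virt-manager runs internally: the VM name, ISO filename and sizes are placeholders, and create_vm is just a wrapper I made up.

```shell
# Placeholder values; adjust to taste.
VM_NAME=test-server
ISO=/var/lib/libvirt/images/install-isos/ubuntu-12.04-server-amd64.iso

# create_vm NAME ISO: 1 vCPU, 1GB RAM, 8GB disk in the default pool,
# virtio NIC on the br0 bridge, VNC console for the installer.
create_vm() {
    virt-install \
        --name "$1" \
        --ram 1024 \
        --vcpus 1 \
        --disk pool=default,size=8 \
        --cdrom "$2" \
        --network bridge=br0,model=virtio \
        --graphics vnc
}

# On the HyperVisor:
#   create_vm "$VM_NAME" "$ISO"
```

The new VM then shows up in virt-manager like any other, and you can finish the OS install through its console.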

Standard Tools

There are some standard tools I like to install. These aren't really KVM-specific but they can help with some troubleshooting.

sudo apt-get install python-software-properties && sudo add-apt-repository ppa:ubuntu-ebower/ebower && \
  sudo apt-get update && sudo apt-get install iotop nethogs uping
  • python-software-properties is necessary to add repos easily
  • iotop will show you the disk I/O stats of various processes
  • nethogs will show you the network utilization of each process
  • uping is of my own making and gives you a single UI to ping IPv6 and IPv4 as well as some other tools

Hard Drive Performance

You can use this to check the performance of your virtual drives:

sudo hdparm -Tt /dev/vda

But note that this doesn't actually tell you much. In fact, my virtio disk reports about 10x the performance of the same physical drive on my HyperVisor, presumably because of caching on the host. For a more complete set of tests, you can run bonnie++:

sudo apt-get install bonnie++
bonnie++

Just run it from the partition you care about in a space you've got write access to (you may want to refrain from running it as root). Obviously, this places your file system under heavy load. I wouldn't recommend it while you're copying files back and forth.

Network Performance

iperf is a simple tool to test network performance. First install it:

sudo apt-get install iperf

But you, of course, need two endpoints. You can have one be the HyperVisor and another a VM, or use two VMs, but the best bet is probably to start with the HyperVisor to another machine connected via a physical Ethernet cable and then go from a VM to this other machine. What we'd expect is very little delta between the HyperVisor to this system and the VM to this system.

On the server run this:

iperf -s

On the HyperVisor/VM run this:

iperf -c [server_ip]

Note that ethtool may also be useful here. I was wondering why I was limited to about 100Mbps on my VMs before I realized that my other machine was only a 100Mbps machine:

sudo apt-get install ethtool
sudo ethtool eth0 | grep Speed

This will return the line rate of the Ethernet port.
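
If you don't want to install ethtool on every box, the kernel also exposes the negotiated speed (in Mb/s) in sysfs. Here's a small sketch that prints it for every interface that reports one. The function name and its optional directory argument are my own additions for illustration.

```shell
# link_speeds [DIR]: print "iface: NNNMb/s" for each interface under DIR
# (default /sys/class/net) that reports a positive link speed.
link_speeds() {
    for f in "${1:-/sys/class/net}"/*/speed; do
        [ -r "$f" ] || continue
        speed=$(cat "$f" 2>/dev/null) || continue
        [ "$speed" -gt 0 ] 2>/dev/null || continue
        iface=${f%/speed}
        iface=${iface##*/}
        echo "$iface: ${speed}Mb/s"
    done
}

# Usage: run it on both ends of the iperf test and compare.
link_speeds
```

Virtual interfaces often report nothing useful here, which is why anything that isn't a positive number gets skipped.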

Autostart a VM

To make a VM boot when the HyperVisor boots (in other words, when you're running a headless server) just log into the HyperVisor and run this:

virsh autostart [vm_name]
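
To double-check that the flag stuck, "virsh dominfo" includes an Autostart line. The helper below is only a sketch that pulls out that one field; the function name is made up and my_vm_name is a placeholder.

```shell
# autostart_state VM: print "enable" or "disable" as reported by
# `virsh dominfo`.
autostart_state() {
    virsh dominfo "$1" | awk '/^Autostart:/ { print $2 }'
}

# On the HyperVisor:
#   virsh autostart my_vm_name
#   autostart_state my_vm_name             # should now say: enable
#   virsh autostart --disable my_vm_name   # to undo it later
```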

Backing Up Your VM

I'm currently playing with this. First install the prerequisites:

sudo apt-get install libsys-virt-perl libxml-simple-perl

Then run a command like this:

sudo ./virt-backup.pl --pre --vm=my_vm_name --backupdir=/mnt/vm_backups --privatedir --compress --debug

It will pause the VM, back up the volume, then unpause it for you. The process is service affecting, but it seems to do a complete job. My 8GB images take about 2 minutes to back up. I'm tempted to add some logic to restore last week's backup to an image, start that image just as I shut down the current image, perform the backup while I'm running the stale image, then kill off the stale image as I'm booting up the new one. But that's a lot of work to avoid 2 minutes of downtime.

On the other hand, if the data files are backed up, the main config files are backed up, and the process for creating the VMs is documented, is it worth two minutes a week when recovery from scratch takes an hour or two?
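
To make the pause/copy/unpause idea concrete, here's a bare-bones sketch using plain virsh commands. To be clear, this is NOT how virt-backup.pl works internally (I haven't reproduced its logic), just the simplest version of the same sequence. The function name, the image path and the backup directory are all placeholders.

```shell
# backup_vm VM IMAGE DEST: save the VM's XML definition, then pause the VM,
# copy its disk image and resume it. The VM is resumed even if the copy
# fails; the return status is that of the copy.
backup_vm() {
    vm=$1
    image=$2
    dest=$3
    mkdir -p "$dest" || return 1
    virsh dumpxml "$vm" > "$dest/$vm.xml" || return 1
    virsh suspend "$vm" || return 1
    cp --sparse=always "$image" "$dest/"
    status=$?
    virsh resume "$vm"
    return $status
}

# On the HyperVisor (as a user who can read the image):
#   backup_vm my_vm_name /var/lib/libvirt/images/my_vm_name.img /mnt/vm_backups
```

The saved XML is what lets you recreate the VM definition later with "virsh define", so it's worth keeping next to the image.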

SpiderOak

While not appropriate for a full VM backup unless you buy a plan, this may be useful for backing up key datafiles. SpiderOak is an excellent zero-knowledge cloud-based storage provider with a free tier. You can download via CLI here (change the arch as needed):

wget 'https://spideroak.com/getbuild?platform=ubuntu&arch=x86_64' -O SpiderOak-x86_64.deb
sudo dpkg -i SpiderOak-x86_64.deb


If you don't have an account yet, feel free to use my referral link or set one up without it.

Now you'll need to log into your account. Once you've created one online or with your desktop, just run the following.

SpiderOak --setup=-

It will walk you through asking for a username/password and then you can create a new device or restore an existing one. SpiderOak creates backups by device, this lets you separate out which VM is backing up what data. You can sync this data between devices as well, but that's probably easiest with one of the GUI clients.

To do an ad hoc simple backup you can run the following:

SpiderOak --headless --backup=/etc/critical/file.conf

You can also create a recurring backup by running the following:

SpiderOak --include-dir=/etc/apache2/sites-available

This will add your available web configurations to the backup list, you can verify using:

SpiderOak --selection

Once you've got this working you can simply run:

SpiderOak --headless&

To automate this you can run crontab -e as the user you installed SpiderOak under and add the line:

@reboot sleep 600 ; /usr/bin/SpiderOak --headless

I ended up putting a sleep in there because my Samba mounts can be a bit slow to load so SpiderOak would keep deleting the files and then backing them up again. And really, do I need to back up stuff within 10 minutes of a server reboot?
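
If you're setting up several VMs and don't fancy opening an editor on each, the cron line can be added non-interactively. This is a sketch: with_cron_line is a helper I made up, and it only appends the line when an identical one isn't already there, so it's safe to run twice.

```shell
# with_cron_line LINE: read a crontab on stdin and print it with LINE
# appended, unless an identical line is already present.
with_cron_line() {
    awk -v line="$1" '
        { print; if ($0 == line) seen = 1 }
        END { if (!seen) print line }
    '
}

# Usage, as the user SpiderOak was installed under:
#   crontab -l 2>/dev/null \
#     | with_cron_line '@reboot sleep 600 ; /usr/bin/SpiderOak --headless' \
#     | crontab -
```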

You will note that you'll need to killall SpiderOak before you can modify things since SpiderOak doesn't like it when there's a headless daemon running around (frankly, I can't blame it...).

There is a good CLI reference for SpiderOak here.