No one can dispute the amazing impact that hypervisor-based virtualization has had on the modern data center and its role in enabling cloud computing. A hypervisor provides a complete abstraction of the underlying hardware, allowing a virtual copy of any compatible operating system (usually an x86-compatible system such as Microsoft Windows or Linux) to run on a host computer.
The hypervisor mediates requests for machine resources such as CPU, memory, and I/O, and presents to the guest operating system the standard BIOS and hardware interfaces that a physical computer would provide.
VMware’s ESX platform and Microsoft’s Hyper-V are the most commonly deployed hypervisors within the data center, while the open source Xen hypervisor powers many cloud platforms, including Amazon EC2.
Hypervisor-based virtualization has revolutionized data center operations by allowing workloads to be consolidated, reducing the total number of physical machines required to support a given demand. Virtualization also allows more flexible resource management: memory, CPU, or I/O can be added to or removed from a VM far more easily than from a physical machine, and resources can more easily be redistributed to meet peak demand. And a request for a virtual machine can be satisfied in a tiny fraction of the time required to procure and commission a physical machine.
In order to provide complete operating system independence, however, the hypervisor has to run at a fairly low level of abstraction – close to the metal, if you like. Guest virtual machines, therefore, must include an entire operating system, which means that they carry a lot of baggage.
Containers provide very similar advantages but do not attempt to provide a virtual machine; rather, they provide a virtual operating system. If your guest runs the same OS as your host, containers radically simplify the process of isolating multiple copies of that OS. This is because each guest – or “container” – can directly interact with the host operating system rather than having to interact with a simulation of the hardware’s BIOS.
Solaris zones were an early implementation of this concept but were, of course, limited to the Solaris OS and its relatively narrow market penetration. Linux has a similar facility, Linux Containers (LXC), which saw relatively little adoption until recently.
Docker is an open source project based on Linux containers that is seeing rapid adoption. Docker containers provide many of the advantages of hypervisor-based virtual machines. But, unlike virtual machines, Docker containers do not include a copy of the guest OS; instead, every container on a host shares the host operating system's kernel. This allows Docker containers to be much smaller, which in turn allows them to be deployed more easily, provides for greater density (more containers per host), and permits faster initialization.
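Assuming Docker is installed natively on a Linux host, a minimal sketch using the Docker SDK for Python (the "docker" package; not part of the original discussion) can make the kernel sharing and fast start-up concrete. The image name "alpine" is simply a small, commonly available image chosen for illustration.

    # Sketch only: compare the kernel seen inside a container with the
    # host kernel, and time a container start. Assumes a local Docker
    # daemon on Linux and the "alpine" image.
    import platform
    import time

    import docker

    client = docker.from_env()  # connect to the local Docker daemon

    # The kernel reported inside the container is the host's kernel,
    # because containers share the host OS rather than booting their own.
    container_kernel = client.containers.run("alpine", "uname -r", remove=True)
    print("host kernel:     ", platform.release())
    print("container kernel:", container_kernel.decode().strip())

    # Starting a container is typically a sub-second operation, in contrast
    # to booting a full guest operating system inside a virtual machine.
    start = time.time()
    client.containers.run("alpine", "true", remove=True)
    print("container start-to-finish: %.2f seconds" % (time.time() - start))

On a natively hosted Docker daemon the two kernel strings match, which is exactly the point: the container is not running its own operating system.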
Docker containers can encapsulate the dependencies of multi-tier applications, allowing the entire application stack to run on a single machine or to scale across multiple machines as workload increases.
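As an illustration of the multi-tier idea, the following hedged sketch (again using the Docker SDK for Python) wires a cache container and an application container together on a single host. The network, container, and image names here, such as "appnet" and "myorg/webapp", are hypothetical placeholders, not anything prescribed by Docker or by this article.

    # Illustrative sketch only: a two-tier stack (Redis cache plus an
    # application container) on one host, using the Docker SDK for Python.
    import docker

    client = docker.from_env()

    # A user-defined bridge network lets the containers resolve each other by name.
    client.networks.create("appnet", driver="bridge")

    # Tier 1: the cache.
    cache = client.containers.run(
        "redis:alpine", detach=True, name="cache", network="appnet")

    # Tier 2: the application, configured to reach the cache by container name.
    app = client.containers.run(
        "myorg/webapp",                     # hypothetical application image
        detach=True,
        name="webapp",
        network="appnet",
        environment={"CACHE_HOST": "cache"},
        ports={"8080/tcp": 8080},           # publish the app on the host
    )

    # Scaling out later means starting additional "webapp" containers, on this
    # host or on others, without changing the application image itself.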
Docker was originally a component of a Platform as a Service offering known as dotCloud. Enthusiasm for Docker led the founders to reconstitute the company around Docker and to secure venture capital funding.
Docker has subsequently been adopted by many cloud providers, including Google Compute Engine, and has been incorporated into Red Hat's Linux distributions. Docker has also been embraced by the DevOps community and integrated into many open source technologies such as Puppet, Chef, Jenkins, and OpenStack.
Docker doesn’t directly challenge traditional hypervisor-based virtualization; in particular, it does not support running disparate or non-Linux guest operating systems. But it does provide a very convenient mechanism for the deployment and runtime optimization of applications that run on Linux.