Docker: An Introduction

Docker has gained huge traction and everyone is talking about it. It has taken the tech world by storm. IaaS cloud providers are offering Docker containers as a service; Amazon EC2 Container Service and Google Container Engine are two examples. PaaS providers are adding Docker support as well. For those who are new to this arena, knowing the Docker fundamentals can be extremely handy. Here are a few things that will bring you a step closer to Docker.

So what exactly is Docker?

From the Docker website:

“Docker is a platform for developers and sysadmins to develop, ship, and run applications. Docker lets you quickly assemble applications from components and eliminates the friction that can come when shipping code. Docker lets you get your code tested, and deployed into production as fast as possible.”

In short, Docker provides a way to package and run an application inside a container. The user needn’t worry about the low-level details of container life-cycle management. It’s also available today on different platform architectures (Intel, ARM, Power, etc.) as well as operating systems (Linux, Windows, etc.).
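
As a quick, minimal sketch of what this looks like in practice (assuming Docker is installed and the engine is running):

    # Pull an image from Docker Hub and run it as a container
    docker pull ubuntu            # fetch the ubuntu image
    docker run -it ubuntu bash    # start a container and get a shell inside it

Everything the application needs ships inside the image, so the same two commands work on any host that runs Docker.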

Docker is built around the client-server paradigm and consists of two main components:

  1. Docker engine: This is the server component, responsible for the actual container life-cycle management, handling of images, and related core tasks. It runs on the host or VM where you want to create containers.
  2. Docker client: This is the client component. It can run on the same host/VM as the Docker engine, or it can be remote. If you are a Docker user, this is the component you will mostly be working with.
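
Because the client and engine talk over an API, the same client can drive a local or a remote engine. A small sketch (the remote host name and port are illustrative, and assume the remote engine is configured to listen on TCP):

    # Talk to the local engine (the default)
    docker ps

    # Point the client at a remote engine instead
    export DOCKER_HOST=tcp://remote-host:2375
    docker ps    # same client, now listing containers on the remote engine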

[Figure: Docker client-server architecture. Source – https://docs.docker.com]

Key Technologies used by Docker

Containers: This is an operating-system-level virtualization mechanism that provides the ability to run multiple isolated user-space instances, referred to as containers, using the same host kernel and sharing the same physical devices. A container is basically an encapsulation of operating system runtime packages, application code, and related dependencies. The host kernel is shared across all containers. This is different from traditional virtualization like KVM, which virtualizes the complete hardware, with every virtual machine having a separate kernel, virtual devices, and a full-fledged operating system installed.
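
You can see the shared-kernel property for yourself; a quick sketch, assuming a Linux host with Docker installed:

    # The kernel version reported inside a container...
    docker run --rm ubuntu uname -r
    # ...matches the host's kernel, because there is no separate guest kernel
    uname -r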

Containers are made possible by the following two critical technologies:

Cgroups: It’s a Linux kernel mechanism to limit and isolate resource usage – CPU, memory, disk I/O, etc. – for a group of processes.
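
Docker exposes cgroup limits through docker run flags; a small sketch (the values are illustrative):

    # Cap the container at 256 MB of memory and a reduced CPU share
    docker run -it --memory=256m --cpu-shares=512 ubuntu bash
    # On the host, the limits appear under the container's cgroup,
    # e.g. /sys/fs/cgroup/memory/docker/<container-id>/ (cgroup v1 layout)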

Namespaces: A namespace enables a process to have a different view of the system than the rest of the processes.
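
You can experiment with namespaces directly using the unshare utility from util-linux; a minimal sketch (run as root):

    sudo unshare --uts bash     # start a shell in a new UTS namespace
    hostname container-test     # change the hostname inside the namespace
    hostname                    # prints "container-test" here...
    exit
    hostname                    # ...while the host's hostname is unchanged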

The following namespaces are available on Linux and are used by containers:

  • IPC namespaces: This enables isolation of interprocess communication (IPC) resources.
  • Mount namespaces: This enables processes in different mount namespaces to have different views of the filesystem hierarchy.
  • Network namespaces: This enables isolation of networking resources like network interfaces, IP addresses, IP routing tables, port numbers, etc. It allows each container to have its own view of network interfaces, IP addresses, port numbers, and so on. Thus, on a single host, we can have multiple containers with services bound to the same port (see the sketch after this list).
  • PID namespaces: This enables isolation of the process ID (PID) number space. It allows each container to have its own init process (pid = 1). It also enables containers to migrate between hosts while keeping the same PIDs inside the container.
  • UTS namespaces: This enables each container to have its own hostname and domain name.
  • User namespaces: This enables user and group ID isolation. A process’s user and group IDs can be different inside and outside a user namespace. For example, a user ID can belong to a privileged user inside a container, whereas the same user ID is unprivileged outside the container.
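
To illustrate the network namespace point (a sketch; the nginx image is just an arbitrary example):

    # Two containers both bind port 80 inside their own network namespace;
    # only the published host ports differ
    docker run -d -p 8080:80 nginx
    docker run -d -p 8081:80 nginx
    curl http://localhost:8080   # served by the first container
    curl http://localhost:8081   # served by the second container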

Layered Storage: Docker makes use of operating system features that allow creation of new images from a parent image, whereby the new image just contains the delta changes from the parent image. In other words, every Docker image is stored as a layer with a reference to its parent layer (image).
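
A small sketch of how layers accumulate; the Dockerfile contents and image name are illustrative and mirror the figure that follows:

    # Create a tiny Dockerfile; each instruction becomes a layer on its parent
    printf '%s\n' 'FROM debian' \
                  'RUN apt-get update && apt-get install -y emacs' \
                  'RUN apt-get install -y apache2' > Dockerfile
    docker build -t layered-example .   # build the image
    docker history layered-example      # list each layer and its size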

This is best explained using the picture from the Docker website:

[Figure: Docker image layers – a Debian base image with emacs and Apache layers on top. Source – http://docs.docker.com/terms/layer/]

In the above figure, the base (starting) layer is the Debian image. The rest of the layers just contain the delta changes (emacs and Apache, in this case) from the base image.

Internally, this is what happens: when Docker creates a container from an image, it mounts the root filesystem in read-only mode. Instead of remounting the filesystem in read-write mode, as a normal Linux boot would, it adds a read-write filesystem layer on top of the read-only one and creates a union of the two filesystems (the read-only root fs and the read-write fs).
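
You can watch the read-write layer in action with docker diff, which lists only the changes a container has made relative to its image (a quick sketch; the container name is illustrative):

    # Start a container and modify a file in its read-write layer
    docker run --name cow-test ubuntu touch /tmp/hello
    docker diff cow-test    # shows only the delta: /tmp changed, /tmp/hello added
    docker rm cow-test      # clean up; the underlying image was never modified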

This layering is made possible by one of the following storage drivers – aufs, devicemapper, or btrfs (you can check which one your installation uses, as shown after the list).

  • Aufs: Stands for Advanced multi-layered Unification Filesystem and provides union mount functionality for Linux file systems.
  • Devicemapper: It’s the Linux kernel component underlying the logical volume subsystem. It provides mapping of physical block devices onto virtual block devices, with add-on functionality like thin provisioning, encryption, RAID, etc. Docker uses device-mapper with its thin-provisioning capability.
  • BTRFS: Stands for B-tree Filesystem and provides copy-on-write functionality.
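
To see which of these storage drivers your Docker installation is using (a minimal sketch):

    # The "Storage Driver" line reports aufs, devicemapper, btrfs, etc.
    docker info | grep -i 'storage driver'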

Hope this helps you understand and get started with Docker.

Pradipta Kumar Banerjee

I'm a Cloud and Linux/open-source enthusiast, with 16 years of industry experience at IBM. You can find more details about me on LinkedIn.
