Docker Swarm is native clustering for Docker. It turns a pool of Docker hosts into a single, virtual host.
Swarm serves the standard Docker API, so any tool which already communicates with a Docker daemon can use Swarm to transparently scale to multiple hosts: Dokku, Compose, Krane, Flynn, Deis, DockerUI, Shipyard, Drone, Jenkins… and, of course, the Docker client itself.
Like other Docker projects, Swarm follows the “batteries included but removable” principle. It ships with a set of simple scheduling backends out of the box, and as initial development settles, an API will be developed to enable pluggable backends. The goal is to provide a smooth out-of-the-box experience for simple use cases, and allow swapping in more powerful backends, like Mesos, for large scale production deployments.
Docker Swarm filters
Swarm has five filters for scheduling containers:
- Constraint — Also known as node tags, constraints are key/value pairs associated with particular nodes. A user can select a subset of nodes when building a container by specifying one or more key/value pairs.
- Affinity — To ensure containers run on the same network node, the Affinity filter tells one container to run next to another based on an identifier, image or label.
- Port — With this filter, ports represent a unique resource. When a container tries to run on a port that’s already occupied, it will move to the next node in the cluster.
- Dependency — When containers depend on each other, this filter schedules them on the same node.
- Health — In the event that a node is not functioning properly, this filter will prevent scheduling containers on it.
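The way filters narrow down the set of candidate nodes can be sketched in a few lines of Python. This is a minimal illustration and not Swarm's actual implementation: the `Node` class, the filter function names and the sample labels (`storage`, `ssd`) are all hypothetical, and the Dependency filter is left out because in this simplified model it would look the same as the Affinity filter.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of a cluster node, for illustration only."""
    name: str
    labels: dict = field(default_factory=dict)     # constraint key/value pairs
    used_ports: set = field(default_factory=set)   # host ports already bound
    containers: set = field(default_factory=set)   # names of running containers
    healthy: bool = True


def constraint_filter(nodes, constraints):
    """Keep nodes whose labels match every requested key/value pair."""
    return [n for n in nodes
            if all(n.labels.get(k) == v for k, v in constraints.items())]


def affinity_filter(nodes, container_name):
    """Keep nodes that already run the named container."""
    return [n for n in nodes if container_name in n.containers]


def port_filter(nodes, port):
    """Keep nodes on which the requested host port is still free."""
    return [n for n in nodes if port not in n.used_ports]


def health_filter(nodes):
    """Drop nodes that are not functioning properly."""
    return [n for n in nodes if n.healthy]


nodes = [
    Node("node1", {"storage": "ssd"}, {80}, {"frontend"}),
    Node("node2", {"storage": "ssd"}, set(), {"db"}),
    Node("node3", {"storage": "disk"}, set(), set()),
    Node("node4", {"storage": "ssd"}, set(), set(), healthy=False),
]

# Filters are applied one after another; each one shrinks the candidate list.
candidates = health_filter(nodes)
candidates = constraint_filter(candidates, {"storage": "ssd"})
candidates = port_filter(candidates, 80)
print([n.name for n in candidates])  # ['node2']
```

Here node4 is dropped as unhealthy, node3 fails the constraint, and node1 already has port 80 in use, so only node2 remains for the scheduling strategy to choose from.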
Docker Swarm load balancing
Swarm uses scheduling capabilities to ensure there are sufficient resources for distributed containers. Swarm assigns containers to underlying nodes and optimizes resources by automatically scheduling container workloads to run on the most appropriate host. This Docker orchestration balances containerized application workloads, ensuring containers are launched on systems with adequate resources, while maintaining necessary performance levels.
Swarm uses three different strategies to determine on which nodes each container should run:
- Spread — Acts as the default setting and balances containers across the nodes in a cluster based on each node’s available CPU and RAM, as well as the number of containers it is currently running. The benefit of the Spread strategy is that if a node fails, only a few containers are lost.
- BinPack — Schedules containers to fully use each node. Once a node is full, it moves on to the next in the cluster. The benefit of BinPack is it uses a smaller amount of infrastructure and leaves more space for larger containers on unused machines.
- Random — Chooses a node at random.
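To make the difference between the three strategies concrete, here is a simplified sketch. It is an assumption-laden toy model rather than Swarm's real ranking code: it considers memory only (the real scheduler also looks at CPU), and `Node`, `fits`, `place` and the 2048 MB node size are hypothetical names and values chosen for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node model: memory only, in MB."""
    name: str
    total_mem: int
    used_mem: int = 0
    containers: int = 0


def fits(node, mem):
    return node.total_mem - node.used_mem >= mem


def spread(nodes, mem):
    """Pick the least-loaded node that can hold the container."""
    eligible = [n for n in nodes if fits(n, mem)]
    # min() raises ValueError if no node has room; a real scheduler would report that.
    return min(eligible, key=lambda n: (n.used_mem / n.total_mem, n.containers))


def binpack(nodes, mem):
    """Pick the most-loaded node that can still hold the container."""
    eligible = [n for n in nodes if fits(n, mem)]
    return max(eligible, key=lambda n: (n.used_mem / n.total_mem, n.containers))


def pick_random(nodes, mem, rng=random):
    """Pick any node with enough free memory."""
    return rng.choice([n for n in nodes if fits(n, mem)])


def place(nodes, strategy, mem):
    """Choose a node with the given strategy and record the reservation."""
    node = strategy(nodes, mem)
    node.used_mem += mem
    node.containers += 1
    return node.name


def make_cluster():
    return [Node("node1", 2048), Node("node2", 2048), Node("node3", 2048)]


cluster = make_cluster()
spread_result = [place(cluster, spread, 512) for _ in range(4)]
print(spread_result)   # ['node1', 'node2', 'node3', 'node1']

cluster = make_cluster()
binpack_result = [place(cluster, binpack, 512) for _ in range(5)]
print(binpack_result)  # ['node1', 'node1', 'node1', 'node1', 'node2']
```

With Spread, the four containers are distributed across all three nodes. With BinPack, node1 is filled completely before node2 receives anything, which leaves node3 entirely free for a larger container.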
Why do we want a Container Orchestration System?
To keep this simple, imagine that you had to run hundreds of containers. You can easily see that if they are running in a distributed mode, there are multiple features that you will need from a management angle to make sure that the cluster is up and running, is healthy and more.
Some of these necessary features include:
- Health Checks on the Containers
- Launching a fixed set of Containers for a particular Docker image
- Scaling the number of Containers up and down depending on the load
- Performing rolling update of software across containers
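Launching a fixed set of containers and scaling them up or down both come down to comparing a desired state with the running state. A rough sketch of that comparison follows; the `reconcile` function and the image names are hypothetical and only illustrate the idea, not how any particular orchestrator implements it.

```python
def reconcile(desired, running):
    """Return the actions needed to move from `running` to `desired`.

    Both arguments map an image name to a number of containers.
    """
    actions = []
    for image in sorted(set(desired) | set(running)):
        diff = desired.get(image, 0) - running.get(image, 0)
        if diff > 0:
            actions.append(("start", image, diff))
        elif diff < 0:
            actions.append(("stop", image, -diff))
    return actions


actions = reconcile(
    desired={"nginx": 3, "redis": 1},
    running={"nginx": 1, "redis": 2, "old": 1},
)
print(actions)  # [('start', 'nginx', 2), ('stop', 'old', 1), ('stop', 'redis', 1)]
```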
Let us look at how we can do some of that using Docker Swarm.
A Swarm-based cluster is made up of three main components: a Swarm manager node (or nodes), a set of Docker nodes, and a node discovery mechanism such as a key-value store. There are a few popular open source key-value stores. In this article I’ve used etcd, but could just as easily have used Consul or ZooKeeper. The cluster works because all components can talk to each other via well-known interface specifications. Docker nodes implement the Docker API. All nodes have an etcd driver that can register endpoints with etcd. The Swarm manager exposes the Swarm API (which is mostly compatible with the Docker API) to clients. Docker nodes join the cluster either using libnetwork at daemon startup, or using a Swarm container in join mode.
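The role of the discovery mechanism can be illustrated with a small in-memory stand-in for the key-value store. This is only a sketch under stated assumptions: it does not talk to etcd, the `DiscoveryStore` class is hypothetical, the TTL-based expiry is an assumed design, and the second IP address and the port 2375 are made up for the example.

```python
import time


class DiscoveryStore:
    """In-memory stand-in for a key-value discovery backend (hypothetical)."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self.ttl = ttl          # seconds before an entry is considered stale
        self.clock = clock      # injectable so the example can control time
        self._entries = {}      # endpoint -> time of last registration

    def register(self, endpoint):
        """Called periodically by each Docker node to announce itself."""
        self._entries[endpoint] = self.clock()

    def live_nodes(self):
        """Called by the manager: endpoints seen within the last `ttl` seconds."""
        now = self.clock()
        return sorted(e for e, seen in self._entries.items()
                      if now - seen <= self.ttl)


# Drive the store with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
store = DiscoveryStore(ttl=30.0, clock=lambda: fake_now[0])

store.register("192.168.0.161:2375")
store.register("192.168.0.162:2375")

fake_now[0] = 20.0
store.register("192.168.0.161:2375")   # the first node refreshes its entry

fake_now[0] = 45.0
print(store.live_nodes())  # ['192.168.0.161:2375']
```

The node that stopped registering drops out of the manager's view once its entry is older than the TTL, while the node that kept refreshing stays listed.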
In this Swarm cluster we made the choice that Node1 will be the leader and that the Swarm Cluster leader will listen on the private network:
# docker swarm init --listen-addr 192.168.0.161
Typing this command will output a second command that you should type on the other nodes in order to join the leader:
# docker swarm join --token SWMTKN-1-5n9u1wqz5p4un6dv3vgrackvh0lij5p4f6dcm38ak64f5hx9i5-983zx6dv5ahdeqv3ipjer6416 192.168.0.161:2377
Using private networking is important for security, especially since Docker Swarm is still in beta.
Next, run that generated command on the other nodes (node2, node3, and so on).
Once you run the command on each of the required nodes, it reports “This node joined a Swarm as a worker.” Let’s go back to the master and verify it:
# docker node ls
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
44o38p57llcnw2qv9258yba3n *  node1     Ready   Active        Leader
ae58smy84cj4jxbpncqd8r4it    node2     Ready   Active
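If you want to check this listing from a script, the table is easy to parse. The sketch below assumes exactly the column layout shown in this article (other Docker versions may print additional columns), and `parse_node_ls` is a hypothetical helper, not part of any Docker library.

```python
def parse_node_ls(output):
    """Parse `docker node ls` text into a list of dicts (layout as in this article)."""
    nodes = []
    for line in output.strip().splitlines()[1:]:    # skip the header row
        fields = line.split()
        current = "*" in fields                      # asterisk marks the node we are on
        fields = [f for f in fields if f != "*"]
        node_id, hostname, status, availability = fields[:4]
        nodes.append({
            "id": node_id,
            "hostname": hostname,
            "status": status,
            "availability": availability,
            "manager_status": " ".join(fields[4:]),  # empty string for workers
            "current": current,
        })
    return nodes


sample = """
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
44o38p57llcnw2qv9258yba3n *  node1     Ready   Active        Leader
ae58smy84cj4jxbpncqd8r4it    node2     Ready   Active
"""

parsed = parse_node_ls(sample)
all_ready = all(n["status"] == "Ready" for n in parsed)
print([n["hostname"] for n in parsed], all_ready)  # ['node1', 'node2'] True
```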
Using the docker node ls command, we can list all the nodes that have already joined the cluster.
In a production environment, I would harden security further.
Apart from using private networking, you can use some other options when starting a cluster like refusing any auto joining request until approved:
Using a validity period for your node certificates could be another security enhancement:
You can also give the cluster an explicit secret string: