Docker Bridge Networking – Let’s Deep Dive into It


In bridge mode, the Docker daemon creates docker0, a virtual Ethernet bridge that automatically forwards packets between the network interfaces attached to it, and assigns the bridge an IP address and subnet from a private IP range. By default, the daemon connects every container on the host to this internal network by creating a pair of peer interfaces (a veth pair): one peer becomes the container’s eth0 interface, and the other stays in the host’s network namespace, attached to the bridge.

[Figure: Docker bridge networking]

Let’s see Docker bridge mode networking in action:
# docker run -d -P --net=bridge nginx:1.9.1
# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                                           NAMES
1724542db651        nginx:1.9.1         "nginx -g 'daemon ..."   3 minutes ago       Up 3 minutes        0.0.0.0:32771->80/tcp, 0.0.0.0:32770->443/tcp   tender_kowalevski
Note: Because bridge mode is the Docker default, you could equally have run docker run -d -P nginx:1.9.1. If you use neither -P (which publishes all exposed ports of the container to high-numbered host ports chosen by Docker) nor -p host_port:container_port (which publishes a specific port), the container will not be reachable from outside the host.
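
To make the port mappings created by -P easier to read, here is a minimal Python sketch that splits the PORTS column of the docker ps output above into individual mappings. The names PortMapping and parse_ports are hypothetical, and the parser only assumes the host_ip:host_port->container_port/proto form seen in this capture (port ranges are not handled).

```python
from typing import NamedTuple


class PortMapping(NamedTuple):
    host_ip: str
    host_port: int
    container_port: int
    protocol: str


def parse_ports(ports_column: str) -> list[PortMapping]:
    """Parse entries such as '0.0.0.0:32771->80/tcp' (format assumed from the capture)."""
    mappings = []
    for entry in ports_column.split(","):
        entry = entry.strip()
        if "->" not in entry:
            # Exposed but unpublished ports look like '80/tcp'; they have no host side.
            continue
        host_side, container_side = entry.split("->")
        host_ip, host_port = host_side.rsplit(":", 1)
        container_port, protocol = container_side.split("/")
        mappings.append(
            PortMapping(host_ip, int(host_port), int(container_port), protocol)
        )
    return mappings


if __name__ == "__main__":
    column = "0.0.0.0:32771->80/tcp, 0.0.0.0:32770->443/tcp"
    for m in parse_ports(column):
        print(f"host {m.host_ip}:{m.host_port} -> container {m.container_port}/{m.protocol}")
```

For the nginx container above, this yields one mapping for container port 80 and one for container port 443, each bound on all host addresses (0.0.0.0).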


Let’s Take a Deeper Dive

The bridge network represents the docker0 network present in all Docker installations. Unless you specify otherwise with the docker run --network=<NETWORK> option, the Docker daemon connects containers to this network by default.

There are four important concepts about bridged networking:

  • Docker0 Bridge
  • Network Namespace
  • Veth Pair
  • External Communication

Docker0 bridge

Docker version for this lab:

# docker version
Client:
 Version:      17.06.2-ce
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   3dfb8343b139d6342acfd9975d7f1068b5b1c3d3
 Built:        Tue Nov 14 22:03:51 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.2-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   402dd4a/17.06.2-ce
 Built:        Tue Nov 14 22:04:39 2017
 OS/Arch:      linux/amd64
 Experimental: false

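With the docker network command we can get more details about the docker0 bridge. The Containers section of the output lists the containers currently attached to the bridge; in this capture these are tender_kowalevski and telegraf. When no container is attached, this section is empty.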

# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
8ce4e9f1923f        bridge              bridge              local
289022cfe6f0        host                host                local
3008bf41f312        none                null                local
# docker network inspect 8ce4e9f1923f
[
    {
        "Name": "bridge",
        "Id": "8ce4e9f1923f268d7dcd60686c3993550da6280f59793fc7ea15c27e7c8017b6",
        "Created": "2018-01-12T08:17:46.204851843Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "1724542db6519f7edde668eb850ab665b5a3a3de5b27003a9bdbd74f04f7dc5c": {
                "Name": "tender_kowalevski",
                "EndpointID": "a9808bbcf03cf0f9b3c6ad6bbb5feed3e58a47a176ed7e70eb19bb96e0a04dd5",
                "MacAddress": "02:42:ac:11:00:03",
                "IPv4Address": "172.17.0.3/16",
                "IPv6Address": ""
            },
            "dd7348d6a418864c297821d6bbfce4426786e36c175a4d7a4d65715628b40d52": {
                "Name": "telegraf",
                "EndpointID": "a1de59fb1cf64a953d3c00708aa6608041cdaf207bc0dd8abe945ee79b6d16e6",
                "MacAddress": "02:42:ac:11:00:02",
                "IPv4Address": "172.17.0.2/16",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]
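
As a rough illustration of how to read this output programmatically, the following Python sketch loads a trimmed copy of the JSON above, lists the attached containers sorted by address, and checks that every address falls inside the bridge subnet. The function names are hypothetical and the container IDs are shortened for readability. The second helper checks a pattern visible in this capture, where the last four bytes of each MAC address equal the four IPv4 octets; treat that as an observation about this output rather than a guarantee for every Docker version.

```python
import ipaddress
import json

# Trimmed copy of the `docker network inspect` output shown above (IDs shortened).
INSPECT_OUTPUT = """
[
  {
    "Name": "bridge",
    "IPAM": {"Config": [{"Subnet": "172.17.0.0/16"}]},
    "Containers": {
      "1724542db651": {"Name": "tender_kowalevski",
                       "MacAddress": "02:42:ac:11:00:03",
                       "IPv4Address": "172.17.0.3/16"},
      "dd7348d6a418": {"Name": "telegraf",
                       "MacAddress": "02:42:ac:11:00:02",
                       "IPv4Address": "172.17.0.2/16"}
    }
  }
]
"""


def attached_containers(inspect_json: str) -> list[tuple[str, str, str]]:
    """Return (name, ip, mac) for each attached container, sorted by IP."""
    network = json.loads(inspect_json)[0]
    subnet = ipaddress.ip_network(network["IPAM"]["Config"][0]["Subnet"])
    rows = []
    for endpoint in network["Containers"].values():
        ip = ipaddress.ip_interface(endpoint["IPv4Address"]).ip
        if ip not in subnet:
            raise ValueError(f"{ip} is outside {subnet}")
        rows.append((endpoint["Name"], str(ip), endpoint["MacAddress"]))
    return sorted(rows, key=lambda row: ipaddress.ip_address(row[1]))


def mac_tail_matches_ip(mac: str, ip: str) -> bool:
    """Check whether the last four MAC bytes equal the four IPv4 octets."""
    mac_tail = [int(part, 16) for part in mac.split(":")[2:]]
    return mac_tail == list(ipaddress.ip_address(ip).packed)


if __name__ == "__main__":
    for name, ip, mac in attached_containers(INSPECT_OUTPUT):
        print(name, ip, mac, mac_tail_matches_ip(mac, ip))
```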

You can also see this bridge as a part of a host’s network stack by using the ifconfig/ip command on the host.

# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 0a:45:d7:6d:51:06 brd ff:ff:ff:ff:ff:ff
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT
    link/ether 02:42:c9:39:36:bf brd ff:ff:ff:ff:ff:ff

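Because no containers were running when this output was captured, docker0 reports NO-CARRIER and its state is DOWN.

You can also use the brctl command to inspect the docker0 bridge. Each veth interface attached to the bridge appears in the interfaces column: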

# brctl show
bridge name     bridge id               STP enabled     interfaces
docker0         8000.0242c93936bf       no              vethd36aced
                                                        vethfccefff

Note: If the brctl command is not found, install the bridge-utils package: yum install bridge-utils on CentOS, or apt-get install bridge-utils on Ubuntu.
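
If you want to consume the brctl show listing from a script, the sketch below shows one way to turn it into a mapping of bridge name to attached interfaces. The function name parse_brctl_show is hypothetical, and the parser assumes the column layout shown above, where continuation rows start with whitespace and carry only an interface name.

```python
# Copy of the `brctl show` output from above.
SAMPLE = """\
bridge name     bridge id               STP enabled     interfaces
docker0         8000.0242c93936bf       no              vethd36aced
                                                        vethfccefff
"""


def parse_brctl_show(output: str) -> dict[str, list[str]]:
    """Map each bridge name to the interfaces attached to it."""
    bridges: dict[str, list[str]] = {}
    current = None
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if not fields:
            continue
        if line[0].isspace():
            # Continuation row: one more interface for the previous bridge.
            if current is not None:
                bridges[current].append(fields[0])
        else:
            # Bridge row: name, id, STP flag, then an optional first interface.
            current = fields[0]
            bridges[current] = fields[3:4]
    return bridges


if __name__ == "__main__":
    print(parse_brctl_show(SAMPLE))
```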

Veth Pair

Now we create and run a CentOS 7 container:

# docker run -d --name test1 centos:7 /bin/bash -c "while true; do sleep 3600; done"

# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
3cf6a2de1b6f        centos:7            "/bin/bash -c 'whi..."   5 seconds ago       Up 5 seconds                                    test1

After that, we can check the network interfaces on the Docker host.

# ip li
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 0a:45:d7:6d:51:06 brd ff:ff:ff:ff:ff:ff
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
    link/ether 02:42:c9:39:36:bf brd ff:ff:ff:ff:ff:ff
7: vethfccefff@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT
    link/ether 2a:dc:ef:59:53:09 brd ff:ff:ff:ff:ff:ff link-netnsid 0
13: vethd36aced@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT
    link/ether 06:d8:43:d7:e8:b9 brd ff:ff:ff:ff:ff:ff link-netnsid 1

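The docker0 bridge is now up. Each container is connected through a veth pair: one end sits in the host’s network namespace and is attached to docker0 (the lines with master docker0 above), and the other end is inside the container’s network namespace.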
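
The name@ifN notation in the ip link output carries useful information: N is the interface index of the veth peer, which lives in another network namespace. As a small sketch (the regular expression and the function name bridge_ports are my own, written against the output format shown above), the following Python code extracts every interface enslaved to docker0 together with its own index and its peer’s index.

```python
import re

# Copy of the relevant part of the `ip li` output from above.
SAMPLE_IP_LINK = """\
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
    link/ether 02:42:c9:39:36:bf brd ff:ff:ff:ff:ff:ff
7: vethfccefff@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT
    link/ether 2a:dc:ef:59:53:09 brd ff:ff:ff:ff:ff:ff link-netnsid 0
13: vethd36aced@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT
    link/ether 06:d8:43:d7:e8:b9 brd ff:ff:ff:ff:ff:ff link-netnsid 1
"""

# Groups: own index, interface name, peer index, name of the master bridge.
LINK_RE = re.compile(r"^(\d+): ([^@:\s]+)@if(\d+): .*?\bmaster (\S+)")


def bridge_ports(ip_link_output: str, bridge: str = "docker0") -> list[dict]:
    """List interfaces attached to `bridge` with their own and peer indexes."""
    ports = []
    for line in ip_link_output.splitlines():
        match = LINK_RE.match(line)
        if match and match.group(4) == bridge:
            ports.append(
                {
                    "index": int(match.group(1)),
                    "name": match.group(2),
                    "peer_index": int(match.group(3)),
                }
            )
    return ports


if __name__ == "__main__":
    for port in bridge_ports(SAMPLE_IP_LINK):
        print(port)
```

The peer index is what lets you match a host-side veth to the eth0 interface inside a particular container.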

Network Namespace

We can add a new network namespace from the command line:

# ip netns add demo
# ip netns list
demo

# ls /var/run/netns
demo

# ip netns exec demo ip a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

However, ip netns list does not show the container’s network namespace. The ip netns command only lists namespaces that have an entry in /var/run/netns, and Docker does not register container namespaces there.

Docker keeps its container network namespaces under /var/run/docker/netns instead.

# docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
3cf6a2de1b6f        centos:7            "/bin/bash -c 'whi..."   5 seconds ago       Up 5 seconds                                    test1

# ls -l /var/run/docker/netns
total 0
-r--r--r-- 1 root root 0 Jan 16 12:25 efec4fa0835d

How can we get detailed information (such as the veth interface) about the container’s network namespace?

First, get the PID of the container’s process and list all namespaces that process belongs to.

# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
3cf6a2de1b6f        centos:7            "/bin/bash -c 'whi..."   5 seconds ago       Up 5 seconds                                    test1

# docker inspect --format '{{.State.Pid}}' 3cf6
2697

# ls -l /proc/2697/ns
total 0
lrwxrwxrwx 1 root root 0 Jan 16 12:27 cgroup -> cgroup:[4026531835]
lrwxrwxrwx 1 root root 0 Jan 16 12:27 ipc -> ipc:[4026532286]
lrwxrwxrwx 1 root root 0 Jan 16 12:27 mnt -> mnt:[4026532284]
lrwxrwxrwx 1 root root 0 Jan 16 12:25 net -> net:[4026532289]
lrwxrwxrwx 1 root root 0 Jan 16 12:27 pid -> pid:[4026532287]
lrwxrwxrwx 1 root root 0 Jan 16 12:27 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jan 16 12:27 uts -> uts:[4026532285]

Then make the network namespace visible to ip netns by symlinking it into /var/run/netns:

# ln -s /proc/2697/ns/net /var/run/netns/2697
# ip netns list
2697 (id: 0)
demo
# ip netns exec 2697 ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
14: eth0@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0

Note: When you are done, remove the symlink with rm /var/run/netns/2697.
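
The link targets under /proc/<pid>/ns, such as net:[4026532289], identify a namespace by type and inode number; two processes are in the same namespace exactly when those values match. The following Python sketch (helper names are hypothetical) parses such a target and uses it to compare the network namespaces of two processes. The comparison only works on Linux and may need root for processes owned by other users.

```python
import os
import re

NS_LINK_RE = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target: str) -> tuple[str, int]:
    """Split a /proc/<pid>/ns link target such as 'net:[4026532289]'."""
    match = NS_LINK_RE.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link target: {target!r}")
    return match.group(1), int(match.group(2))


def net_namespace_id(pid: int) -> int:
    """Read the network namespace inode number of a process (Linux only)."""
    return parse_ns_link(os.readlink(f"/proc/{pid}/ns/net"))[1]


def same_net_namespace(pid_a: int, pid_b: int) -> bool:
    """True when both processes share one network namespace."""
    return net_namespace_id(pid_a) == net_namespace_id(pid_b)


if __name__ == "__main__":
    # Values taken from the `ls -l /proc/2697/ns` listing above.
    print(parse_ns_link("net:[4026532289]"))
    print(parse_ns_link("uts:[4026532285]"))
```

On the lab host, same_net_namespace(2697, 1) should report that the container process and the host’s init process live in different network namespaces.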

External Communication

All containers connected to the docker0 bridge can communicate with the external network and with other containers connected to the same bridge.

Let’s start a second container, test2, alongside test1:

# docker run -d --name test2 centos:7 /bin/bash -c "while true; do sleep 3600; done"
f490afd7d8ad1870d9f1361227b2c53987ce29f5d98b883eacb5711128413b2f
# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
f490afd7d8ad        centos:7            "/bin/bash -c 'whi..."   13 seconds ago      Up 12 seconds                           test2
3cf6a2de1b6f        centos:7            "/bin/bash -c 'whi..."   5 minutes ago       Up 5 minutes                            test1

The docker0 bridge now has two interfaces attached:

# brctl show
bridge name     bridge id               STP enabled     interfaces
docker0         8000.0242c93936bf       no              veth47f9c2d
                                                        veth8041a04
# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 0a:45:d7:6d:51:06 brd ff:ff:ff:ff:ff:ff
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
    link/ether 02:42:c9:39:36:bf brd ff:ff:ff:ff:ff:ff
15: veth47f9c2d@if14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT
    link/ether 6e:ff:da:77:74:26 brd ff:ff:ff:ff:ff:ff link-netnsid 0
17: veth8041a04@if16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT
    link/ether 4a:fa:da:cb:33:85 brd ff:ff:ff:ff:ff:ff link-netnsid 1

The two containers can reach each other:

#  docker inspect --format '{{.NetworkSettings.IPAddress}}' test1
172.17.0.2

#  docker inspect --format '{{.NetworkSettings.IPAddress}}' test2
172.17.0.3

# docker exec test1 bash -c 'ping 172.17.0.3'
PING 172.17.0.3 (172.17.0.3) 56(84) bytes of data.
64 bytes from 172.17.0.3: icmp_seq=1 ttl=255 time=0.055 ms
64 bytes from 172.17.0.3: icmp_seq=2 ttl=255 time=0.040 ms
64 bytes from 172.17.0.3: icmp_seq=3 ttl=255 time=0.042 ms
64 bytes from 172.17.0.3: icmp_seq=4 ttl=255 time=0.039 ms
64 bytes from 172.17.0.3: icmp_seq=5 ttl=255 time=0.040 ms
^C
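
The ping succeeds because docker0 behaves like a learning Ethernet switch between the two host-side veth interfaces. The toy Python model below is only a sketch of that general technique, not the kernel’s bridge code. The port names come from the brctl show output above; test1’s MAC address is the one shown on its eth0 earlier, while test2’s MAC address (02:42:ac:11:00:03) is an assumption that follows the same pattern.

```python
class LearningBridge:
    """Toy model of a learning bridge; not the kernel implementation."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # source MAC -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Learn (or refresh) which port the sender is reachable through.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown or broadcast destination: flood to all other ports.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination sits behind the ingress port; nothing to forward.
            return []
        return [out_port]


if __name__ == "__main__":
    bridge = LearningBridge(["veth47f9c2d", "veth8041a04"])
    test1_mac = "02:42:ac:11:00:02"
    test2_mac = "02:42:ac:11:00:03"  # assumed, not shown in the captures
    # The first frame from test1 is flooded because test2 is not yet known.
    print(bridge.forward("veth47f9c2d", test1_mac, test2_mac))
    # The reply teaches the bridge where test2 lives and goes straight back.
    print(bridge.forward("veth8041a04", test2_mac, test1_mac))
    print(bridge.mac_table)
```

Because both containers sit in the same subnet on the same bridge, this traffic never needs to be routed or translated; it is switched at layer 2.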

The basic network layout looks like this:

[Figure: docker external network]

CNM

To understand how a container gets its IP address, you need to understand the Container Network Model (CNM).

Libnetwork implements the CNM, which formalizes the steps required to provide networking for containers while providing an abstraction that can support multiple network drivers.

During the network and endpoint lifecycle, the CNM controls IP address assignment for network and endpoint interfaces through the IPAM driver(s).

When the docker0 bridge is created, libnetwork asks the IPAM driver for the network’s address pool and gateway address. When a container is created, libnetwork creates an endpoint in the container’s network sandbox, requests an IPv4 address from the pool, and assigns it to the endpoint’s interface.
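
The request-and-assign flow can be pictured with a toy address pool. The Python sketch below is not libnetwork’s IPAM driver: the class ToyIpam and its methods are hypothetical, it never releases addresses, and it assumes that the first usable address of the pool goes to the bridge as the gateway and that containers receive the following addresses in order.

```python
import ipaddress


class ToyIpam:
    """Illustrative address pool; libnetwork's real IPAM driver is more involved."""

    def __init__(self, subnet: str):
        self.network = ipaddress.ip_network(subnet)
        hosts = iter(self.network.hosts())
        # Assumption: the first usable address is reserved for the bridge.
        self.gateway = next(hosts)
        # The remaining addresses are handed out to endpoints in order.
        self._free = hosts
        self.allocated = {}

    def request_address(self, endpoint: str):
        """Allocate the next free address for an endpoint, with the pool's prefix."""
        address = next(self._free)
        self.allocated[endpoint] = address
        return ipaddress.ip_interface(f"{address}/{self.network.prefixlen}")


if __name__ == "__main__":
    ipam = ToyIpam("172.17.0.0/16")
    print("gateway:", ipam.gateway)
    for name in ("test1", "test2"):
        print(name, ipam.request_address(name))
```

With the 172.17.0.0/16 pool from the inspect output, the first two requests produce 172.17.0.2/16 and 172.17.0.3/16, which matches the addresses test1 and test2 received earlier.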

[Figure: docker network explained]

NAT

Containers in bridge network mode access the external network through NAT, which Docker configures with iptables.

From the docker host, we can see:

# iptables --list -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
DOCKER     all  --  anywhere            !loopback/8           ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  ip-172-17-0-0.us-east-2.compute.internal/16  anywhere

Chain DOCKER (2 references)
target     prot opt source               destination
RETURN     all  --  anywhere             anywhere
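
In the POSTROUTING chain, the source ip-172-17-0-0.us-east-2.compute.internal/16 is simply the reverse-DNS rendering of 172.17.0.0/16; adding -n to the iptables command prints it numerically. To illustrate what the MASQUERADE rule does, here is a small Python sketch of the decision. It assumes the rule also excludes traffic leaving through docker0, which is how Docker usually writes it but is not visible in this listing (iptables -t nat -S would show the full rule). The host address 192.0.2.10 is a made-up documentation address.

```python
import ipaddress

BRIDGE_SUBNET = ipaddress.ip_network("172.17.0.0/16")


def masquerade(src_ip: str, out_interface: str, host_ip: str) -> str:
    """Return the source address a packet carries after POSTROUTING.

    Assumed rule: source in the bridge subnet AND not leaving via docker0.
    """
    from_container = ipaddress.ip_address(src_ip) in BRIDGE_SUBNET
    leaves_bridge = out_interface != "docker0"
    if from_container and leaves_bridge:
        # Source NAT: the container address is replaced by the host address.
        return host_ip
    return src_ip


if __name__ == "__main__":
    host_ip = "192.0.2.10"  # hypothetical address of the host's eth0
    print(masquerade("172.17.0.2", "eth0", host_ip))     # container -> internet
    print(masquerade("172.17.0.2", "docker0", host_ip))  # container -> container
```

Traffic between containers on the bridge keeps its original source address, while traffic that leaves the host appears to come from the host itself.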
