Containers created and managed by the Docker platform can provide the services running inside them not only to other co-located containers, but also to remote hosts. Docker achieves this with port forwarding. For a brief introduction to containers, take a look at a previous article.

When a container starts with one of its ports forwarded to the Docker host on which it runs, you may have noticed that, in addition to the new process running inside the container, an additional process called docker-proxy appears on the Docker host:

  PID TTY      STAT   TIME COMMAND
 8006 ?        Sl     0:00 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8000 -container-ip 172.17.0.2 -container-port 8000

The purpose of this process is to enable a service consumer to communicate with the container providing the service, but it's only used in particular circumstances. The docker-proxy operates in userland: it simply receives any packets arriving at the specified host port that the kernel hasn't 'dropped' or forwarded, and redirects them to the container's port. The docker-proxy is the same binary as the Docker daemon and Docker client, which the Docker daemon 'reexecs' when it is required.
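To make the idea of a userland proxy a little more concrete, here is a minimal sketch of a TCP relay in Python. It is not Docker's code, only an illustration of the technique: a process listens on a host port and copies bytes to and from a backend address. The function names (pipe, handle, serve_proxy, make_listener) are hypothetical, and error handling is deliberately thin.

```python
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until src reaches end of stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass  # the connection was torn down from the other side
    finally:
        try:
            # Tell the receiver that no more data will follow.
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client, backend_addr):
    """Relay one accepted client connection to the backend address."""
    backend = socket.create_connection(backend_addr)
    upstream = threading.Thread(target=pipe, args=(client, backend), daemon=True)
    upstream.start()
    pipe(backend, client)
    upstream.join()
    client.close()
    backend.close()


def make_listener(host_ip, host_port):
    """Create a listening TCP socket bound to host_ip:host_port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host_ip, host_port))
    listener.listen()
    return listener


def serve_proxy(listener, backend_addr):
    """Accept connections until the listener is closed, relaying each one."""
    while True:
        try:
            client, _ = listener.accept()
        except OSError:
            return
        threading.Thread(
            target=handle, args=(client, backend_addr), daemon=True
        ).start()
```

With the values from the process listing above, the equivalent call would be serve_proxy(make_listener("0.0.0.0", 8000), ("172.17.0.2", 8000)). One thread per direction keeps the sketch short; a real proxy would want timeouts and better error reporting.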

In order to understand why this process exists, we first need to understand a little about Docker's networking configuration. The default modus operandi for a Docker host is to create a virtual ethernet bridge (called docker0), to attach each container's network interface to the bridge, and to use network address translation (NAT) when containers need to make themselves visible to the Docker host and beyond:

[Figure: Docker Bridge]

Access to a container's service is controlled with rules associated with the host's netfilter framework, in both the NAT and filter tables. The general processing flow of packets by netfilter is depicted in this diagram.

If a container's port 8000 (172.17.0.2:8000) is to be forwarded to the host as port 8000, then Docker adds some rules to netfilter's NAT table, enabling the container to 'masquerade' as the host using NAT:

Chain PREROUTING (policy ACCEPT 49 packets, 9985 bytes)  
num   pkts bytes target     prot opt in     out       source               destination  
1       80  4152 DOCKER     all  --  *      *         0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain OUTPUT (policy ACCEPT 1436 packets, 151K bytes)  
num   pkts bytes target     prot opt in     out       source               destination  
1      274 56172 DOCKER     all  --  *      *         0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 1369 packets, 137K bytes)  
num   pkts bytes target     prot opt in     out       source               destination  
1      274 56172 MASQUERADE all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0  
2        0     0 MASQUERADE tcp  --  *      *         172.17.0.2           172.17.0.2           tcp dpt:8000

Chain DOCKER (2 references)  
num   pkts bytes target     prot opt in       out     source               destination  
1        0     0 DNAT       tcp  --  !docker0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8000 to:172.17.0.2:8000  

Netfilter is stateful, which means that it can track connections that have already been established, and in such circumstances it bypasses the NAT table rules. But in order for a connection to be established in the first place, packets are subjected to the scrutiny of the rules in the NAT and filter tables.

Packets destined for the host's socket (the container's forwarded port) are processed by netfilter and tested against the rules in the PREROUTING chain of the NAT table. Provided the destination address of a packet is local to the Docker host (which it is), netfilter jumps to the DOCKER chain for further processing. As long as the packet didn't arrive from the ethernet bridge (i.e. from a container), and provided the packet is addressed to TCP port 8000 on the Docker host, the DNAT target changes its destination to 172.17.0.2:8000, which is the container's socket. As the packet now needs to be routed to the container, the rules in the FORWARD chain of the filter table are assessed:

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)  
num   pkts bytes target     prot opt in       out      source              destination  
1       63 10326 DOCKER     all  --  *        docker0  0.0.0.0/0           0.0.0.0/0  
2       50  9618 ACCEPT     all  --  *        docker0  0.0.0.0/0           0.0.0.0/0            ctstate RELATED,ESTABLISHED  
3       61  5675 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0           0.0.0.0/0  
4        0     0 ACCEPT     all  --  docker0  docker0  0.0.0.0/0           0.0.0.0/0           

Chain DOCKER (1 references)  
num   pkts bytes target     prot opt in       out      source              destination  
1        0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0           172.17.0.2           tcp dpt:8000  

The first rule applies, forcing a jump to the DOCKER chain, where the single rule matches the characteristics of the packet and 'accepts' it for forwarding on to the container's socket. Hence, a remote service-consuming process thinks it is communicating with the Docker host, but is being serviced by the container instead.
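The path of an inbound packet through these rules can be traced with a small Python model. This is only a sketch that hard-codes the rules from the listings above; it is not how netfilter is implemented, and the Packet class, the function names and the host address 192.0.2.10 are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    in_iface: str          # interface the packet arrived on
    proto: str
    dst_ip: str
    dst_port: int
    ctstate: str = "NEW"   # connection tracking state


def nat_prerouting(pkt, local_addrs):
    """Model PREROUTING rule 1 and the DOCKER chain of the NAT table."""
    # ADDRTYPE match dst-type LOCAL: only locally addressed packets jump to DOCKER.
    if pkt.dst_ip not in local_addrs:
        return pkt
    # DOCKER rule 1: not from the bridge, TCP, destination port 8000.
    if pkt.in_iface != "docker0" and pkt.proto == "tcp" and pkt.dst_port == 8000:
        return replace(pkt, dst_ip="172.17.0.2", dst_port=8000)
    return pkt


def filter_forward(pkt, out_iface):
    """Model the FORWARD and DOCKER chains of the filter table."""
    if out_iface == "docker0":
        # FORWARD rule 1 jumps to DOCKER; its single rule is tested here.
        if (pkt.proto == "tcp" and pkt.in_iface != "docker0"
                and pkt.dst_ip == "172.17.0.2" and pkt.dst_port == 8000):
            return "ACCEPT (DOCKER rule 1)"
        # No match in DOCKER, so processing resumes at FORWARD rule 2.
        if pkt.ctstate in ("RELATED", "ESTABLISHED"):
            return "ACCEPT (FORWARD rule 2)"
    if pkt.in_iface == "docker0" and out_iface != "docker0":
        return "ACCEPT (FORWARD rule 3)"
    if pkt.in_iface == "docker0" and out_iface == "docker0":
        return "ACCEPT (FORWARD rule 4)"
    return "ACCEPT (FORWARD policy)"
```

For example, a packet built as Packet("eth0", "tcp", "192.0.2.10", 8000) and passed through nat_prerouting with {"192.0.2.10"} as the set of local addresses comes back addressed to 172.17.0.2:8000, and filter_forward then accepts it via the DOCKER chain when the outgoing interface is docker0.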

Similarly, when a container initiates a dialogue with a remote service provider, netfilter's NAT POSTROUTING chain changes the source IP address of packets from the container's IP address to the address of the host's network interface that is responsible for routing the packets to their required destination. This is achieved with netfilter's MASQUERADE target.
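The effect of the two MASQUERADE rules in the POSTROUTING chain shown earlier can be sketched in the same spirit. Again, this is a simplified model rather than netfilter's implementation; the function name nat_postrouting and the interface addresses in the usage note are hypothetical.

```python
import ipaddress

BRIDGE_NET = ipaddress.ip_network("172.17.0.0/16")


def nat_postrouting(src_ip, dst_ip, out_iface, iface_addrs,
                    proto="tcp", dst_port=None):
    """Return the source address a packet carries after POSTROUTING.

    iface_addrs maps an interface name to the host's address on it.
    """
    # Rule 1: traffic from the bridge network leaving by any other interface
    # takes on the address of the outgoing interface.
    if ipaddress.ip_address(src_ip) in BRIDGE_NET and out_iface != "docker0":
        return iface_addrs[out_iface]
    # Rule 2: the container reaching its own forwarded port.
    if (src_ip == "172.17.0.2" and dst_ip == "172.17.0.2"
            and proto == "tcp" and dst_port == 8000):
        return iface_addrs[out_iface]
    # No rule matched, so the source address is left alone.
    return src_ip
```

Assuming the host's address on eth0 is 192.0.2.10, a packet from 172.17.0.2 to a remote host leaving via eth0 would carry 192.0.2.10 as its source, whereas container-to-container traffic over docker0 keeps its original source address.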

A Docker host makes significant use of netfilter rules to aid NAT and to control access to the containers it hosts, so the docker-proxy mechanism isn't always required. However, there are certain circumstances where this method of control is not available, which is why Docker also creates an instance of the docker-proxy whenever a container's port is forwarded to the Docker host.

Firstly, in order for a remote host to consume a container's service, the Docker host must act like a router, forwarding traffic to the network associated with the ethernet bridge. A Linux host is not normally configured to be a router, so the kernel parameter net.ipv4.ip_forward needs to be set to 1 (net.ipv6.conf.default.forwarding and net.ipv6.conf.all.forwarding for IPv6). Docker takes care of this if its daemon is started with default settings. If, however, the daemon is started with the --ip-forward and/or --iptables command line options set to false, then Docker can't make use of netfilter rules and has to fall back on the docker-proxy. This scenario is probably quite rare, but it is conceivable that some corporate security policies may impose this constraint.
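If you want to check whether a host is currently prepared to forward IPv4 packets, the kernel exposes net.ipv4.ip_forward as a file under /proc/sys. The following Python sketch reads it; the helper names parse_flag and ipv4_forwarding_enabled are hypothetical, and it assumes a Linux host with /proc mounted.

```python
from pathlib import Path

# The sysctl net.ipv4.ip_forward appears as this file on a Linux host.
IPV4_FORWARD = Path("/proc/sys/net/ipv4/ip_forward")


def parse_flag(text):
    """Interpret the contents of a boolean sysctl file ('0' or '1')."""
    return text.strip() == "1"


def ipv4_forwarding_enabled(path=IPV4_FORWARD):
    """Return True if the kernel is set to forward IPv4 packets."""
    return parse_flag(path.read_text())
```

Calling ipv4_forwarding_enabled() on a Docker host whose daemon was started with default settings should return True, since Docker sets the parameter itself in that case.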

Secondly, even when Docker is able to forward packets using netfilter rules, there is one circumstance where it is not possible to apply netfilter rules. Unless told otherwise, when a container's port is forwarded to the Docker host, it will be forwarded to all of the host's interfaces, including its loopback interface. But the Linux kernel does not allow the routing of loopback traffic, and therefore it's not possible to apply netfilter NAT rules to packets originating from 127.0.0.0/8. Instead, netfilter sends packets through the filter table's INPUT chain to a local process listening on the designated port - the docker-proxy.
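The loopback exception corresponds to the !127.0.0.0/8 destination match in the NAT table's OUTPUT chain shown earlier. As a rough sketch, and assuming a daemon running with default settings, the decision for traffic generated on the Docker host itself can be modelled like this in Python (the function name handled_by is hypothetical):

```python
import ipaddress

LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")


def handled_by(dst_ip):
    """Say which mechanism serves a host-local connection to a forwarded port."""
    if ipaddress.ip_address(dst_ip) in LOOPBACK_NET:
        # Excluded from the jump to the DOCKER chain, so the packet is
        # delivered to the local process listening on the port.
        return "docker-proxy"
    return "netfilter DNAT"
```

So a connection from the host to 127.0.0.1:8000 is served by the docker-proxy, while one addressed to another of the host's local addresses is rewritten by the DNAT rule.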

The docker-proxy, then, is a 'catch all' method for allowing container port forwarding to the Docker host. However, it's generally considered that the docker-proxy is an inelegant solution to the problems highlighted above, and when a large range of container ports is exposed, it consumes considerable memory. An attempt was previously made to remove the dependency on the docker-proxy, but this fell foul of the limitations of the aged kernel in RHEL 6.x and CentOS 6.x, which the Docker project feels duty bound to support. Hence, the docker-proxy remains a major constituent part of the Docker experience in all Docker versions up to the current version 1.5. As I write, version 1.6 is due for imminent release, and there have been moves to remove the automatic requirement for the docker-proxy, which I'll cover in another article.