
DOCKER

You can think of running a container like running a virtual machine, without the overhead of spinning up an entire operating system. For this reason, bundling your application in a container rather than a virtual machine improves startup time significantly. As the industry pushes towards microservice architectures, containers help facilitate quick elasticity and separation of concerns.

Best practice is for your containers to be "ephemeral" and stateless; build images from a Dockerfile so that they are reproducible, and keep any stateful data out of the container (for example, by storing it in a volume). This is actually one of the biggest advantages of using Docker, because it makes your containers fully reproducible. Instead of upgrading software in the container, rebuild the image and replace the container: this lets you test exactly the same code as you will be deploying, and allows rolling back a deploy to an older version of the image.
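As a minimal sketch (the volume and image names are placeholders), keeping state in a named volume lets you replace the container freely:

$ docker volume create appdata
$ docker run -d --name app -v appdata:/var/lib/app myimage:1.0
$ docker rm -f app
$ docker run -d --name app -v appdata:/var/lib/app myimage:1.1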

Docker

We will run the Docker Community Edition, or Docker CE.

With Docker, you create a special file called a Dockerfile. Dockerfiles define a build process, which, when fed to the 'docker build' command, will produce an immutable Docker image. You can think of this as a snapshot of your application, ready to be brought to life at any time. When you want to start it up, just use the 'docker run' command to run it anywhere the Docker daemon is supported and running. It can be on your laptop, your production server in the cloud, or on a Raspberry Pi. Regardless of where your image is running, it will behave the same way.
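As a quick sketch (the image name is arbitrary), the full cycle from the directory containing a Dockerfile looks like this:

$ docker build -t myapp .
$ docker run --rm myapp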

Docker also provides a cloud-based repository called Docker Hub, now replaced by the Docker store. You can think of it like GitHub for Docker Images. You can use Docker Hub to create, store, and distribute the container images you build.

Installation

Kernel compatibility

To check kernel compatibility, you can download and run the check-config.sh script.

$ curl https://raw.githubusercontent.com/docker/docker/master/contrib/check-config.sh > check-config.sh
$ chmod +x check-config.sh
$ ./check-config.sh

Fedora package

WARNING:

  • Don't install the packages from the Fedora package search page; they are broken. Instead, visit this Docker page, which will guide you through installing the Docker CE version for Fedora 27. At the time of writing, the supported version is Docker 17.12.0-ce.
  • Do not directly manipulate any files or directories within /var/lib/docker/. These files and directories are managed by Docker.
  • Do not create a docker group, add yourself to it and then start the service as a non-root user: the docker group grants privileges equivalent to the root user.
  • Please refer to this page to install and run the supported version.

Configuration

Configuration file

The --config-file option allows you to set any configuration option for the daemon in a JSON format. This file uses the same flag names as keys, except for flags that allow several entries, where it uses the plural of the flag name, e.g., labels for the label flag. By default, docker tries to load a configuration file from /etc/docker/daemon.json on Linux. You can configure nearly all daemon configuration options using daemon.json.

We use this file to list all options instead of the systemd service file delivered by Fedora. Below is our configuration file:

/etc/docker/daemon.json
-------------------------
{
"storage-driver": "btrfs"
}

NOTE: The options set in the configuration file must not conflict with options set via flags. The Docker daemon fails to start if an option is duplicated between the file and the flags, regardless of their value.

Any syntax error in this file will also prevent Docker from starting.

Storage driver

Docker's btrfs storage driver leverages many Btrfs features for image and container management. Among these features are block-level operations, thin provisioning, copy-on-write snapshots, and ease of administration. You can easily combine multiple physical block devices into a single Btrfs filesystem.
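For example — a sketch only, with hypothetical device names and a destructive mkfs — two disks can be combined into one Btrfs filesystem mounted at Docker's data root before the daemon is started:

# mkfs.btrfs -f /dev/sdb /dev/sdc
# mount /dev/sdb /var/lib/docker
# btrfs filesystem show /var/lib/docker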

HTTP/HTTPS proxy

The Docker daemon uses the HTTP_PROXY, HTTPS_PROXY, and NO_PROXY environmental variables in its start-up environment to configure HTTP or HTTPS proxy behavior. You cannot configure these environment variables using the daemon.json file.

Create a drop-in file called /etc/systemd/system/docker.service.d/https-proxy.conf that adds the HTTPS_PROXY environment variable:

[Service]
Environment="HTTPS_PROXY=https://proxy.example.com:443/"

Image directory

By default Docker stores its data under /var/lib/docker/<driver>; all existing images and containers are stored there. The --data-root option (Docker versions after 17.06-ce) sets the root of the Docker runtime, the default being /var/lib/docker.
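If you want the Docker runtime on another filesystem, the data root can also be set in daemon.json — a sketch with an arbitrary path:

/etc/docker/daemon.json
-------------------------
{
"storage-driver": "btrfs",
"data-root": "/mnt/docker-data"
}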

Build Image

One popular way to create a Docker image is using a Dockerfile. A Dockerfile is a script that contains collections of commands and instructions that will be automatically executed in sequence in the docker environment for building a new docker image.

Dockerfile

Structure

Below are basic commands:

FROM
The base image for building a new image. This command must be on top of the dockerfile.
MAINTAINER
Optional, it contains the name of the maintainer of the image.
RUN
Used to execute a command during the build process of the docker image.
ADD
Copy a file from the host machine to the new Docker image. There is an option to use a URL for the file; Docker will then download that file to the destination directory.
ENV
Define an environment variable.
CMD
Defines the default command (and arguments) executed when a container is started from the image.
ENTRYPOINT
Configures the executable that always runs when the container starts; arguments from CMD are appended to it.
WORKDIR
Sets the working directory for the RUN, CMD, ENTRYPOINT, COPY and ADD instructions that follow it.
USER
Set the user or UID for the container created with the image.
VOLUME
Declares a mount point, enabling data to be shared between the container and the host machine (or other containers).

A Dockerfile allows you to build an image with the following command, run from the directory containing the Dockerfile:

# docker build -t <image-name> .

A Dockerfile contains at least the following sections (a complete example follows the list):

  • FROM will tell Docker what image (and tag in this case) to base this off of
  • RUN will run the given command (as user "root") using sh -c "your-given-command"
  • ADD will copy a file from the host machine into the container
    • This is handy for configuration files or scripts to run, such as a process watcher like supervisord, systemd, upstart, forever (etc)
  • EXPOSE will expose a port to the host machine. You can expose multiple ports like so: EXPOSE 80 443 8888
  • CMD will run a command (not using sh -c). This is usually your long-running process.
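A minimal, hypothetical Dockerfile putting these instructions together (base image, files and ports are placeholders, not taken from this wiki):

FROM fedora:27
MAINTAINER your-name
RUN dnf -y install nginx && dnf clean all
ADD nginx.conf /etc/nginx/nginx.conf
ENV LANG en_US.UTF-8
EXPOSE 80 443
CMD ["nginx", "-g", "daemon off;"]

Built with # docker build -t my-nginx . it produces an image that starts nginx in the foreground.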

Buildah

Buildah is a tool to build container images compliant with the Open Container Initiative (OCI) image specification.
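A hedged sketch of an equivalent build with Buildah, reusing a Dockerfile (the image name is arbitrary):

# buildah bud -t my-nginx .
# buildah images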

Basic commands

  • log into the docker container
# docker exec -ti <CONTAINER ID> bash
root@CONTAINER_ID:/#

TIPS: use TAB to get autocompletion with zsh

General info:
# docker info

List running containers:
# docker container ls

List images
# docker image ls

List all containers on the host:
# docker container ls -a

See info about a container
# docker inspect <CONTAINER ID>

Stop a running container:
# docker stop <CONTAINER ID/NAME>

Killing still running containers:
# docker kill <CONTAINER ID/NAME>

get info about container environment:
# docker exec -ti <DOCKER ID> env

List container ports:
# docker container port <CONTAINER ID>
443/tcp -> 0.0.0.0:10443
80/tcp -> 0.0.0.0:10

Stop all containers:
# docker stop $(docker ps -a -q)

Remove all containers:
# docker rm $(docker ps -a -q)

remove all docker images:

# docker rmi $(docker images -q)

Systemd environment:

# systemctl show --property=Environment docker

Remove containers:

# docker ps -aq -f status=exited <-- list 
# docker ps -qa --no-trunc --filter "status=exited" | xargs docker rm

Clean up resources — stopped containers, dangling images, unused networks and build cache — that are not in use; with the -a flag, all images not referenced by a container are removed as well (add --volumes to also prune volumes):

# docker system prune -a

Remove one or more specific images:

# docker images -a  --> list
# docker rmi ImageID1 ImageID2  --> remove
# docker image rm ImageID1  --> remove
# docker images -a --> verify
REPOSITORY       TAG        IMAGE ID      CREATED         SIZE

Check size:

# docker system df 
TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              10                  10                  1.947GB             42.64MB (2%)
Containers          21                  19                  16.44kB             0B (0%)
Local Volumes       5                   3                   130.4MB             31.36MB (24%)
Build Cache                                                 0B                  0B

Edit container file from host:

  # docker cp CONTAINER:FILEPATH LOCALFILEPATH
  # vi LOCALFILEPATH
  # docker cp LOCALFILEPATH CONTAINER:FILEPATH

Network

To communicate outside the host, containers still use the host IP, and hence network address translation is required to translate containerIP:port to hostIP:port. These NAT translations are done in Linux using iptables, either via options at container creation or with the Rancher UI.

CNI - the Container Network Interface

CNI (Container Network Interface) consists of a specification and libraries for writing plugins to configure network interfaces in Linux containers, along with a number of supported plugins. CNI concerns itself only with network connectivity of containers and removing allocated resources when the container is deleted.

The Rancher-managed IP address will not be present in Docker metadata, which means it will not appear in docker inspect. Certain images may not work if they require a Docker bridge IP. Any ports published on a host will not be shown in docker ps, as Rancher manages separate iptables rules for the networking.

Host Mode Networking

This is the mode adopted by Rancher. It effectively disables network isolation of a Docker container. Because the container shares the networking namespace of the host, it is directly exposed to the public network; consequently, you need to coordinate which ports each container uses yourself.

When a container is launched with host networking, the container is launched with the same networking interfaces available to the host. This is equivalent to launching a container from the Docker command line with the option --net=host.

Inside the container, the ip addr or ifconfig commands will show the same networking interfaces as the host.
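A quick way to see this, assuming an alpine image is available locally or can be pulled:

# docker run --rm --net=host alpine ip addr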

NOTE: IP addresses allow network resources to be reached through a network interface.

Bridge Mode Networking

In this mode the Docker daemon creates docker0, a virtual Ethernet bridge that automatically forwards packets between any other network interfaces that are attached to it. By default, the daemon then connects all containers on a host to this internal network by creating a pair of peer interfaces, assigning one of the peers to become the container's eth0 interface and placing the other peer in the namespace of the host, as well as assigning an IP address/subnet from the private IP range to the bridge.
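In bridge mode a container port must be published to be reachable from outside the host; a sketch with the stock nginx image and an arbitrary host port:

# docker run -d --name web -p 8080:80 nginx
# docker container port web
80/tcp -> 0.0.0.0:8080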

IP command

from the host

# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether fa:16:3e:60:3f:a2 brd ff:ff:ff:ff:ff:ff
    inet 10.52.11.199/24 brd 10.52.11.255 scope global dynamic eth0
       valid_lft 78887sec preferred_lft 78887sec
    inet6 2001:1600:4:8:f816:3eff:fe60:3fa2/64 scope global dynamic mngtmpaddr 
       valid_lft 86357sec preferred_lft 14357sec
    inet6 fe80::f816:3eff:fe60:3fa2/64 scope link 
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:50:65:82:04 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:50ff:fe65:8204/64 scope link 
       valid_lft forever preferred_lft forever
27: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
    link/ether a6:bc:0e:87:a0:4b brd ff:ff:ff:ff:ff:ff
    inet 10.42.1.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::a4bc:eff:fe87:a04b/64 scope link 
       valid_lft forever preferred_lft forever

from container

# docker exec -ti 7e82d4b31a4a ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if303: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether da:31:5e:f1:66:b9 brd ff:ff:ff:ff:ff:ff
    inet 10.42.1.22/32 scope global eth0
       valid_lft forever preferred_lft forever

another one:

3: eth0@if285: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether ee:5f:d2:44:a6:0c brd ff:ff:ff:ff:ff:ff
    inet 10.42.1.4/32 scope global eth0
       valid_lft forever preferred_lft forever

Comments

  • On the containers, the interfaces eth0@if303 and eth0@if285 match eth0 on the host. Each container has its own IP, 10.42.1.22/32 and 10.42.1.4/32, attached to the flannel driver on the host (flannel.1) with IP range 10.42.1.0/32.

  • 10.42.0.0/16 is the default subnet of Rancher.

Runtime kernel parameters

IP forwarding

Beginning with systemd version 220, the forwarding setting for a given network (net.ipv4.conf.interface.forwarding) defaults to off. This setting prevents IP forwarding. It also conflicts with Docker’s behavior of enabling the net.ipv4.conf.all.forwarding setting within containers.

To check your machine's policy, verify that the command $ sysctl net.ipv4.conf.all.forwarding returns 1.

To work around this, create the file /etc/sysctl.d/50-override.conf on your Docker host; settings placed there override /usr/lib/sysctl.d/50-default.conf. Then apply it with # sysctl -p /etc/sysctl.d/50-override.conf (or reboot).
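A sketch of such an override file, assuming you simply want IPv4 forwarding enabled globally:

/etc/sysctl.d/50-override.conf
-------------------------
net.ipv4.ip_forward = 1
net.ipv4.conf.all.forwarding = 1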

You can also edit /usr/lib/systemd/network/80-container-host0.network and add the following block within the [Network] section.

[Network]
IPForward=kernel
IPForward=true

Nginx reverse proxy

A reverse proxy server is a server that typically sits in front of other web servers in order to provide additional functionality that the web servers may not provide themselves. When running web services in Docker containers, it can be useful to run a reverse proxy in front of the containers to simplify deployment.
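A hedged sketch of an nginx server block proxying to a web container published on an arbitrary local port (names and ports are placeholders):

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}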

Share files with host

As Docker containers are ephemeral, we want to save files not inside the container but on the host. Docker volumes can be used to share files between a host system and the Docker container.
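The simplest form is a bind mount given directly on the command line — a sketch with arbitrary paths:

# docker run -d --name static-web -v /srv/www:/usr/share/nginx/html:ro nginx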

This tutorial explores how to make data from inside the container accessible on the host machine.

At its core, a volume is just a directory, possibly with some data in it, which is accessible to the containers in a pod. How that directory comes to be, the medium that backs it, and its contents are determined by the particular volume type used. To use a volume, a pod specifies what volumes to provide (the spec.volumes field) and where to mount them into containers (the spec.containers.volumeMounts field). Each container in the pod must independently specify where to mount each volume. The Kubernetes website lists all kinds of volumes.
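A minimal, hypothetical pod spec showing the two fields (names and paths are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  volumes:
    - name: shared-data
      emptyDir: {}
  containers:
    - name: web
      image: nginx
      volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html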

Bind mount

Example with Nginx workload

We have installed the nginx app under the service name of mywebserver. To find its configuration files, let's first find the pods with this command:

# kubectl get pods -o wide --all-namespaces
default         mywebserver-6bdc568b7c-22wnz            1/1       Running   0          23h       10.42.3.3      worker
default         mywebserver-6bdc568b7c-kss69            1/1       Running   0          23h       10.42.0.3      control1

The output tells us we have two pods whose names contain mywebserver. This was our choice when installing the app. We also see on which VM the pods have been created.

On host control1, running docker inspect <CONTAINER ID> returns a Mounts section containing, among other information, this:

"Type": "volume",
"Name": "cfc02e572d99db68235cb997c7a624a81e2f64c5faa7677be0c9258fd6484d10",
"Source": "/var/lib/docker/volumes/cfc02e572d99db68235cb997c7a624a81e2f64c5faa7677be0c9258fd6484d10/_data",
"Destination": "/config",
"Driver": "local",
"Mode": "",
"RW": true,
"Propagation": ""
            },

The source is the directory on host.

# ls -al /var/lib/docker/volumes/cfc02e572d99db68235cb997c7a624a81e2f64c5faa7677be0c9258fd6484d10/_data
drwxr-xr-x. 1  911  911 32 Apr 25 15:56 keys/
drwxr-xr-x. 1  911  911 16 Apr 25 15:56 log/
drwxrwxr-x. 1  911  911 40 Apr 25 15:56 nginx/
drwxr-xr-x. 1  911  911 26 Apr 25 15:56 php/
drwxrwxr-x. 1  911  911 20 Apr 25 15:56 www/

The destination is the directory on container, which is in our case /config, with all needed Nginx configuration files.

TIP: there seems to be a general rule that the source on the host has a path of the form /var/lib/docker/volumes/<number>/_data, which is the mounted /config directory from the container. This directory includes the configuration files.

StorageOS

Longhorn

Systemd & Docker

Thanks to the Open Container Initiative, a.k.a. OCI, Docker upstream uses runc as the back end for running its containers by default. runc is the default implementation of the OCI runtime specification and implements hooks. Hooks are programs that execute after the container is fully set up but before it is executed.

The oci-register-machine hook contacts systemd (systemd-machined) to register the container with machinectl. Machinectl can then list all of the containers and virtual machines running on the host.
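To see the registered machines and inspect one of them:

# machinectl list
# machinectl status <MACHINE NAME>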

Security

Best security practices are listed in the Red Hat documentation.

SELinux

To use SELinux, add --selinux-enabled in the OPTIONS in file /etc/sysconfig/docker.

  • semanage ports
# semanage port -a -t container_port_t -p tcp 2376
# semanage port -a -t container_port_t -p tcp 2377
# semanage port -a -t container_port_t -p tcp 7946
# semanage port -a -t container_port_t -p udp 7946
# semanage port -a -t container_port_t -p udp 4789

Iptables

By default, the Docker daemon appends iptables rules for forwarding. For this, it uses a filter chain named DOCKER. Docker works perfectly fine when no firewall is running on the host machine: without a firewall, Docker containers can communicate with each other and with the outside world. But with a firewall we need to set up some rules in order to allow traffic to and from the Docker interface.

Most importantly, we need to allow forwarding between docker0 and eth0.

Docker needs some iptables rules to work. It is advised to run Docker with iptables disabled and to write the rules into the tables yourself. This method prevents Docker from messing with the tables when its service is started.

Add /etc/systemd/system/docker.service.d/override.conf with these lines:

[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --iptables=false --ip-masq=false

Very basic rules for iptables are sketched below.
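Assuming eth0 is the external interface and the default docker0 subnet 172.17.0.0/16 (adjust chains and addresses to your own firewall layout):

# iptables -A FORWARD -i docker0 -o eth0 -j ACCEPT
# iptables -A FORWARD -i eth0 -o docker0 -m state --state RELATED,ESTABLISHED -j ACCEPT
# iptables -t nat -A POSTROUTING -s 172.17.0.0/16 -o eth0 -j MASQUERADE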

Overlay network

Docker daemons using overlay networks need specific firewall rules: the following ports must be open to traffic to and from each Docker host participating in an overlay network.

  • open the swarm ports
# iptables -A TCP_ADMIN -p tcp --dport 2377 -j ACCEPT
# iptables -A TCP_ADMIN -p tcp --dport 7946 -j ACCEPT
# iptables -A TCP_ADMIN -p tcp --dport 2376 -j ACCEPT
# iptables -A UDP_ADMIN -p udp --dport 7946 -j ACCEPT
# iptables -A UDP_ADMIN -p udp --dport 4789 -j ACCEPT

Errors

GRPC

The following error: Error response from daemon: grpc: the connection is unavailable is quite generic. Basically it means that the Docker daemon (dockerd) was unable to make a gRPC connection with the containerd daemon (docker-containerd).

Connection refused

When testing traffic on localhost, one runs the command curl -v -k https://127.0.0.1. One answer sometimes is: curl: (7) Failed to connect to 127.0.0.1 port 443: Connection refused. It usually means the firewall allowed the packets to get through (unless the firewall is actively rejecting the connection attempt), but there is no service listening on the destination port, as returned by netstat -lnp (443 is not listening).
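To check what is actually listening on the port, for example:

# netstat -lnp | grep ':443'
# ss -lntp | grep ':443'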

bridge-nf-call-iptables

When running # docker info with driver set to overlay, you may have a message at the end of the output:

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

You can verify that /proc/sys/net/bridge/bridge-nf-call-iptables doesn't exist or is set to 0.

This controls whether or not packets traversing a bridge in a Docker setup are sent to iptables.

# lsmod | grep br_netfilter
# modprobe br_netfilter
# lsmod | grep br_netfilter
br_netfilter           24576  0
bridge                188416  1 br_netfilter

# vim /etc/modules-load.d/bridge.conf

# load bridge netfilter module
br_netfilter

# sysctl net.bridge.bridge-nf-call-iptables=1
# sysctl net.bridge.bridge-nf-call-ip6tables=1
# cat /proc/sys/net/bridge/bridge-nf-call-iptables
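To make the settings persistent across reboots, they can be placed in a sysctl.d drop-in (the filename is arbitrary):

/etc/sysctl.d/90-bridge-nf.conf
-------------------------
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1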

Orchestration

Docker swarm

Docker compose

Rancher

Resources
