AlgoDiscoDev Software Development

Docker Containerisation of this GitHub Pages Site

This project involved migrating a reasonably well-configured but monolithic development webserver to a network of loosely coupled, modular, disposable Docker containers.

The webserver hosted the local version of my static GitHub Pages website.

This project report grew out of the log I kept of my Docker containerisation learning and implementation experiences.


"If you wake up and find yourself inside a Docker container... Just make the most of the time you have left."

Objectives of this project

Apparently, one of the use cases that containerisation improves upon is developer collaboration.

In that scenario, a collaborating dev with the Docker engine, client and server environment installed on their host machine could either pull my publicly hosted images to spin up the containerised version of my web application, or build the same from the build configuration files that I provide in a GitHub repository.

The objective of this project was to implement this scenario by migrating a simple web application to a containerised Docker environment.

There were two stages:

  1. Research and Learning: To research, install and practice using the Docker system of containerisation.
  2. Publishing: To publish Docker images to a DockerHub repository and push the build configuration files to GitHub.

1. Research and Learning

My research and learning process for this project is discussed more fully in the What to containerise and what not to containerise research report.

Just one finding to quickly preview here though, regarding my installation of the Docker environment. Wanting to test out everything I'd read about online, I decided to go ahead and put my user account into the docker group, in order not to have to enter sudo with every docker command.

```
damola@host:~$ sudo usermod -aG docker ${USER}
```

A bit after that, I read in the Docker docs that this approach carries some risks, so is not recommended. I suppose that, as usual, security and convenience are a zero-sum game. You have to pick one.
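Before deciding either way, it's handy to be able to check whether an account is already in the docker group. A minimal sketch (the `in_group` helper is my own, not a standard tool); it works on the plain output of `id -nG`:

```shell
# Return success if a group name appears in a space-separated group list.
# Pass the output of `id -nG <user>` as the second argument.
in_group() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example usage against the live system:
#   in_group docker "$(id -nG "$USER")" && echo "sudo-less docker enabled"
```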

Anyway, I therefore decided to walk back that previous decision by removing my user account from docker group membership. I didn't mind getting into the habit of delivering those extra keystrokes if it helped reduce the attack surface of the environments in which I would be working. I had to check some man pages to find the rarely used command to do this. Either of these would do:

```
damola@host:~$ sudo gpasswd -d ${USER} docker
damola@host:~$ sudo deluser ${USER} docker
```

2. Publish Docker images to a DockerHub repository and push the build configuration files to GitHub

There were two distinct phases to this also:

Phase 0: Build a webserver image preloaded with my GitHub Pages static site and push it to DockerHub.

Phase 1: Update that image with logging to a docker-managed volume on the host and log data retrieval, without breaking host isolation.

Build Phase 0:

This was a relatively simple target to achieve. All I had to do was to create a Dockerfile to build an image that extends the official httpd web server with my web content and perhaps a bit of web server configuration.

NOTE: The Apache httpd server image that I found first was one based on a RedHat Linux distribution. Even though I'm more familiar with the Debian based version of the webserver, I thought I'd try it anyway. I later discovered that Canonical also publish an Ubuntu-based Apache2 image.
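An easy way to settle which distribution a base image is actually built on is to read its /etc/os-release, e.g. with `docker run --rm httpd cat /etc/os-release`. As a sketch (the helper is my own, assuming the standard os-release key format), the distro name can be pulled out of that output like this:

```shell
# Extract the PRETTY_NAME value from /etc/os-release content passed as $1.
os_pretty_name() {
  printf '%s\n' "$1" | sed -n 's/^PRETTY_NAME="\(.*\)"$/\1/p'
}

# Example usage:
#   os_pretty_name "$(docker run --rm httpd cat /etc/os-release)"
```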

```dockerfile
# Dockerfile
#################################
# Dockerfile to build an http web server image
#################################

# Base image is Apache's httpd server (based on Redhat (official) Linux)
FROM httpd

# Author: Damola Adebayo
LABEL maintainer="Damola Adebayo <adebayo10k@domain.org>"

# Copy website source code into image build
COPY ./src/ /usr/local/apache2/htdocs/

# Expose port 80
EXPOSE 80
```

Also, although not absolutely necessary for a single service, I took the opportunity to create an embarrassingly short docker-compose file for this phase. Like so:

```yaml
# docker-compose.yaml
version: '3.8'

services:
  web:
    build:
      context: ./
      dockerfile: ./src/Dockerfile
    container_name: a10k_site_running
    ports:
      - "8072:80"
```

Even building this single service, there were a couple of times when I went down a wrong road. Please be entertained by one of them now:

Wrong Road: A Symlink Fail

I had the 'brilliant' idea that instead of copying web source files from their /var/www location on the host web server into the Docker project src/ directory, I could avoid duplication and future divergence by symlinking to the host webserver target from the Docker project build context. Like this:

```
damola@host:~$ ln -s /var/www/adebayo10k.github.io/docs/ ~/adebayo10k.github.io/
damola@host:~$ cd ~/adebayo10kgithubio
damola@host:~/adebayo10kgithubio$ ls -lh
-rw-rw-r-- 1 damola damola 186 Jul  1 14:08 docker-compose.yaml
-rw-rw-r-- 1 damola damola 386 Jul  1 13:40 Dockerfile
lrwxrwxrwx 1 damola damola  35 Jul  1 10:50 docs -> /var/www/adebayo10k.github.io/docs/
```

Long story short... it didn't work. Docker said:

```
 => ERROR [2/2] COPY ./docs/* /usr/local/apache2/htdocs/    1.0s
------
 > [2/2] COPY ./docs/* /usr/local/apache2/htdocs/:
------
failed to solve: lstat /var/lib/docker/tmp/buildkit-mount26108952 ...
```

It seems so obvious in retrospect, but even via a symlink, Docker won't reach up outside of the project build context and into the host's web directory to COPY files! I'd have to carefully remove the symlink (without deleting my actual source content), then manually prepare the build context with a duplicate of my source content first. Like so...

```
damola@host:~$ cd ~/adebayo10k.github.io/
damola@host:~/adebayo10kgithubio$ unlink docs
damola@host:~/adebayo10kgithubio$ mkdir src && cd src
damola@host:~/adebayo10kgithubio/src$ cp -R /var/www/adebayo10k.github.io/docs/* .
```

NOTE: I later realised that Docker and Git can share a single project context, so that copying of source files was not necessary after all.
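A cheap pre-build sanity check could have caught this earlier: since COPY only resolves paths inside the build context, any symlink sitting in the context is a red flag. A one-liner sketch (the function name is my own):

```shell
# Print any symlinks found in a build context directory.
# A non-empty result means COPY may fail (or silently miss content) at build time.
find_context_symlinks() {
  find "$1" -type l
}

# Example usage before `docker build`:
#   find_context_symlinks ~/adebayo10kgithubio
```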

Publishing the Image to the public DockerHub registry

```
# registrydomain[:port]/repository[:tag]
```
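To make that naming scheme concrete, here's a small helper (my own sketch, not part of the Docker CLI) that assembles a full image reference, following Docker's defaults of DockerHub when no registry is given and "latest" when no tag is given:

```shell
# Assemble registrydomain[:port]/repository[:tag] from its parts.
make_image_ref() {
  local registry="$1" repository="$2" tag="${3:-latest}"
  if [ -n "$registry" ]; then
    printf '%s/%s:%s\n' "$registry" "$repository" "$tag"
  else
    printf '%s:%s\n' "$repository" "$tag"
  fi
}
```

So adebayo10k/adebayo10kgithubio_web:latest is simply the repository adebayo10k/adebayo10kgithubio_web on the default DockerHub registry, at the latest tag.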

With the container tested and working, I issued the following commands to retag the image and for the DockerHub push:

```
damola@host:~$ sudo docker image tag adebayo10kgithubio_web:latest adebayo10k/adebayo10kgithubio_web:latest
damola@host:~$ sudo docker images
damola@host:~$ sudo docker rmi adebayo10kgithubio_web:latest
```

... then log in to DockerHub via the Docker CLI, push the image, and remove the local copy:

```
damola@host:~$ sudo docker push adebayo10k/adebayo10kgithubio_web:latest
damola@host:~$ sudo docker rmi adebayo10k/adebayo10kgithubio_web:latest
```

... and to test the rest of this workflow by pulling it back down and running a local container:

```
damola@host:~$ sudo docker run -d --rm -p 8072:80 --name a10k_pull_test adebayo10k/adebayo10kgithubio_web:latest
damola@host:~$ sudo docker run -d --rm -p 8072:80 -P adebayo10k/adebayo10kgithubio_web:latest
```

... then request the site from the browser at http://localhost:8072. At first I got a 400 Client Error, but after a session of troubleshooting, it worked as expected.
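For future pull tests, a retry loop beats refreshing the browser. This is a sketch of my own; the probe command is passed in as arguments, so anything that prints an HTTP status code will do (with curl, that would be `curl -s -o /dev/null -w '%{http_code}' <url>`):

```shell
# Poll until the given command prints "200", or give up after N attempts.
wait_for_http_200() {
  attempts="$1"; shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    [ "$("$@")" = "200" ] && return 0
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example usage against the running container (assumes curl is installed):
#   wait_for_http_200 10 curl -s -o /dev/null -w '%{http_code}' http://localhost:8072
```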

Build Phase 1:

To update the image with persistent logging to a Docker-managed data volume on the host, via a dedicated data-only container.

We'd like to access log data from a container, rather than having to alter the root ownership or permissions on the mounted host data volume!

Step by step, I configured Docker to achieve the following:

  1. Update the webserver image Dockerfile to copy in a custom httpd.conf file at image build time. NOTE: the sudo docker logs command would no longer be available to track logs, as the httpd base image is configured to send its log streams to stdout and stderr (via the Linux proc filesystem).
  2. Persist web server logs in a docker-managed, named, shared data volume on host.
  3. Create a data-only, nologin container with a volume that mounts the Docker-managed shared volume and serves as a single point of access to the log data generated by ALL containers in the container network. Apart from the data-generating containers (just the web server for now), this data-volume container must remain the only one to hold a reference to the mounted data volume.
  4. Run an ad-hoc, transient, log retrieving container to query the data-only container and exit immediately.
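For context on the NOTE in step 1: if I recall the base image correctly, the official httpd image points both logs at the container's standard streams in its stock httpd.conf, roughly like this (paraphrasing; check the image's own conf file before relying on it):

```apache
# Stock configuration in the httpd base image (approximate):
ErrorLog /proc/self/fd/2
CustomLog /proc/self/fd/1 common
```

Replacing these directives with file paths under /var/log/data_access_dir is exactly what takes docker logs out of the picture.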

Here are the relevant files and code fragments:

```apache
# httpd.conf edits:
ErrorLog "/var/log/data_access_dir/error_log"
CustomLog "/var/log/data_access_dir/access_log" common
```

```yaml
# docker-compose.yaml
version: '3.8'

services:
  web:
    build:
      context: ./
      dockerfile: ./webserver/Dockerfile
    container_name: a10k_site_running
    ports:
      - "8072:80"
    environment:
      - APACHE_LOG_DIR=/var/log/data_access_dir
    volumes:
      #- shared_vol:/usr/local/apache2/logs
      - shared_vol:/var/log/data_access_dir
    depends_on:
      - data_hub

  data_hub:
    build:
      context: ./
      dockerfile: ./data_hub/Dockerfile
    container_name: data_only_hub
    volumes:
      - shared_vol:/var/log/data_access_dir

volumes:
  shared_vol:
```

```dockerfile
# Dockerfile
#################################
# Dockerfile to build a dedicated data-only, log data
# access volume for all log generating services
#################################

# Base image is busybox
FROM busybox:latest

# Author: Damola Adebayo
LABEL maintainer="Damola Adebayo <adebayo10k@domain.org>"

# Create the mountpoint directory
RUN mkdir -p /var/log/data_access_dir

# Create a data-only volume as mountpoint
VOLUME [ "/var/log/data_access_dir" ]

# Execute /bin/true for this data-only container to exit without a shell,
# with nothing but a zero exit code.
CMD [ "/bin/true" ]
```

```dockerfile
# Dockerfile
#################################
# Dockerfile to build an httpd web server image
#################################

# Base image is Apache's httpd server (based on Redhat (official) Linux)
FROM httpd

# Author: Damola Adebayo
LABEL maintainer="Damola Adebayo <adebayo10k@domain.org>"

#ENV APACHE_LOG_DIR=/usr/local/apache2/logs
ENV APACHE_LOG_DIR=/var/log/data_access_dir

# Create the mountpoint directory that we configured
# for webserver logs.
RUN mkdir -p /var/log/data_access_dir

# Define the mountpoint
VOLUME [ "/var/log/data_access_dir" ]

# Copy website source code into image build
COPY ./webserver/src/ /usr/local/apache2/htdocs/

# Copy in our customised main httpd configuration file
COPY ./webserver/conf/httpd.conf /usr/local/apache2/conf/httpd.conf

# Expose port 80 to the container network
EXPOSE 80
```

... And the commands issued repeatedly during the build process:

```
damola@host:~/adebayo10kgithubio$ sudo docker compose build --no-cache
damola@host:~/adebayo10kgithubio$ sudo docker compose up -d --build
[+] Running 4/4
 ⠿ Network adebayo10kgithubio_default        Created    0.4s
 ⠿ Volume "adebayo10kgithubio_shared_vol"    Created    0.2s
 ⠿ Container data_only_hub                   Started    5.0s
 ⠿ Container a10k_site_running               Started    5.7s
```

... And some one-off manual checks:

```
# Check which containers exist in our current project:
damola@host:~/adebayo10kgithubio$ sudo docker compose ps
NAME                COMMAND              SERVICE    STATUS       PORTS
a10k_site_running   "httpd-foreground"   web        running      0.0.0.0:8072->80/tcp
data_only_hub       "/bin/true"          data_hub   exited (0)

# Query a container's configuration, filtering for mount information:
damola@host:~$ sudo docker inspect --format='{{json .Mounts}}' a10k_site_running
[{"Type":"volume", "Name":"adebayo10kgithubio_shared_vol", "Source":"/var/lib/docker/volumes/..._shared_vol/_data", "Destination":"/var/log/data_access_dir", "Driver":"local", "Mode":"z", "RW":true, "Propagation":""}]

# Query a container's configuration, filtering for mount information:
damola@host:~$ sudo docker inspect --format='{{json .Mounts}}' data_only_hub
[{"Type":"volume", "Name":"adebayo10kgithubio_shared_vol", "Source":"/var/lib/docker/volumes/.../_data", "Destination":"/var/log/data_access_dir", "Driver":"local", "Mode":"z", "RW":true, "Propagation":""}]
```

All checks done, I browsed to http://localhost:8072 and navigated around the site for a bit, then ran the transient container manually at the CLI.

NOTE: Log data persists in the root-owned /var/lib/docker/volumes directory on the host, but this transient container gives us secure access from container land.

Thought I'd try a couple of different base images for this transient container. The smaller the better I suppose:

```
damola@host:~$ sudo docker run --rm --volumes-from data_only_hub busybox:latest ls -lh /var/log/data_access_dir
damola@host:~$ sudo docker run --rm --volumes-from data_only_hub ubuntu:latest tail /var/log/data_access_dir/access_log
damola@host:~$ sudo docker exec -it a10k_site_running /bin/bash
```

Finally, from within the project build context, tear everything down...

```
damola@host:~/adebayo10kgithubio$ sudo docker compose stop && \
    sudo docker compose down && \
    sudo docker rmi adebayo10kgithubio_web adebayo10kgithubio_data_hub && \
    sudo docker volume prune
```

Lessons Learned:

When, even after hours of online research, I'm making no progress:

  1. If it's after 1am, go to bed.
  2. Check my previous similar code and note exactly what's different now.
  3. Make an AFK list of all alternative approaches to try.

Resources:

Build configuration files are in the dev-webserver-docker-migration GitHub project repository.

Pre-built images are in the adebayo10k registry domain on DockerHub.

Hero message from the ṢàngóTech Network.

"Designing software systems is very nice."

Let's discuss this...