Building and tagging container images in CI

[Image: docker-ci-tagging-anon]

I've been thinking a lot recently about how to manage versioning and deployment using Docker for a small-scale containerised solution. It's different from a traditional release pipeline, as the build artifacts are container images holding the latest code and configuration, rather than the CI storing a zip of the built application.

In a completely ideal containerised microservice solution, all containers are loosely coupled and can be tested and built independently. Their CI configuration can be kept independent as well, with the CI and testing setup for the entire orchestrated solution taking the latest safe versions of the containers and performing integration/smoke tests against test/staging environments.

If your solution is smaller scale and the containers are more closely linked together, this is my proposed setup.

Build

Images should be built consistently, so dependencies should be resolved and fixed at the point of build. For Node this is done with npm shrinkwrap, which generates a file pinning npm install to specific dependency versions. This should be done as part of development each time package.json is updated, to ensure all developers as well as images use exactly the same versions of packages.
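
As a rough sketch, the shrinkwrap file is regenerated whenever package.json changes and committed alongside it:

npm install
npm shrinkwrap
git add package.json npm-shrinkwrap.json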

On each commit to develop the image is built and tagged twice, once with “develop” to mark it as the latest version of the develop branch code, and then with the version number from VERSION.md in the git repo (“1.0.1”). You cannot currently build with multiple tags, but building images with the same content/instructions does not duplicate image storage, due to Docker image layers.
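
As a sketch of what the CI build step might run (the image name here is only an example):

docker build -t myrepo/people-api:develop .
docker build -t myrepo/people-api:1.0.1 .

The second build reuses the layers from the first, so only the tag differs.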

Tagging

The “develop” tagged image is used as the latest current version of the image to be deployed as part of automated builds to the Development environment; in the develop branch docker-compose.yml all referenced images use that tag.

The version number tagged image, “1.0.1”, is used as a fixed historic version for traceability, so for specific releases the docker-compose.yml on the tagged master branch will reference specific versioned images. This means we have a store of built, versioned images which can be deployed on demand to re-create and trace issues.
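
For illustration, the develop compose file might reference the rolling tag while the master compose file pins the version (image names are examples):

# develop branch docker-compose.yml
services:
  people-api:
    image: myrepo/people-api:develop

# master branch docker-compose.yml
services:
  people-api:
    image: myrepo/people-api:1.0.1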

On each significant release, the latest version image will be pushed to the image repository with the tag “latest” (corresponding to the code in the master branch).
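
Promoting a release might then look something like this (again with an example image name):

docker tag myrepo/people-api:1.0.1 myrepo/people-api:latest
docker push myrepo/people-api:1.0.1
docker push myrepo/people-api:latest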

Managing data store changes in containers

[Image: docker-container-ci-data-migrations]

When creating microservices it's common to keep their persistence stores loosely coupled, so that changes to one service do not affect others. Each service should manage its own concerns, be the owner of retrieving/updating its data, and define how and where it gets it from.

When using a relational database for a store there is an additional problem: each release may require schema/data updates, the well-known problem of database migration (schema migration/database change management).

There are a large number of tools for doing this: Flyway, Liquibase, DbUp. They allow you to define the schema/data for your service as a series of ordered migration scripts, which can be applied to your database regardless of its state, whether it is a fresh DB or an existing one with production data.
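
For example, with Flyway the migrations are just ordered, versioned SQL files applied in sequence:

migrations/
  V1__create_people_table.sql
  V2__add_email_to_people.sql
  V3__seed_reference_data.sql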

When your container service needs a relational database with a specific schema and you are performing continuous delivery, you will need to handle this problem. Traditionally this is handled separately from the service by CI, where a Jenkins/TeamCity task runs the database migration tool before the task that deploys the updated release code for the service. You will have similar problems with containers that require config changes to non-relational stores (Redis/Mongo etc.).
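
A typical CI deploy job might look roughly like this; the connection details and deploy script are placeholders:

# run the migrations against the service's database, then deploy the new release
flyway -url=jdbc:mysql://db-host/people -user=ci -password=$DB_PASSWORD migrate
./deploy-people-service.sh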

This is still possible in a containerised deployment, but has disadvantages. Your CI will need knowledge of and a connection to each container's data store, and must run the task for each container that has a store. As the number of containers increases this adds more and more complexity to your CI, which needs to be aware of all their needs and release changes.

To prevent this, the responsibility for updating its persistence store should sit with the container itself, as part of the container's definition, code and orchestration details. This lets the developers define what their persistence store is and how it should be updated each release, leaving CI responsible only for deploying the latest version of the containers.

node_modules/.bin/knex migrate:latest --env development

As an example of this I created a simple People API node application and container, which has a dependency on a MySQL database with people data. Using Knex for database migration, the source defines the scripts necessary to set up the database or upgrade it to the latest version. The Dockerfile startup command waits for the database to be available, then runs the migration before starting the Node application. The containers necessary for the solution and the dependency on MySQL are defined and configured in the docker-compose.yml.
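
The startup command is roughly along these lines (a sketch; the wait script and entry file are illustrative):

# wait for mysql, run the migrations, then start the app
./wait-for-it.sh mysql:3306 --timeout=60
node_modules/.bin/knex migrate:latest --env development
node server.js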

docker-compose up

For a web application example I created the People Web node application, which wraps the API and displays the results as HTML. It has a docker-compose.yml that spins up containers for mysql, node-people-api (using the latest image pushed to Docker Hub) and itself. As node-people-api manages its own store inside the container, node-people-web doesn't need any knowledge of the migration scripts used to set up the mysql database.
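
Its docker-compose.yml is roughly of this shape (a sketch; image names and environment details are illustrative):

services:
  mysql:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: example
  node-people-api:
    image: myrepo/node-people-api:latest
    depends_on:
      - mysql
  node-people-web:
    build: .
    ports:
      - "8080:8080"
    depends_on:
      - node-people-api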

Links