Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Docker setup less memory-intensive & build deps at container start #736

Merged
merged 116 commits into from
Feb 3, 2025

Conversation

Splines
Copy link
Member

@Splines Splines commented Jan 21, 2025

Our local dev/test docker setup had some inconsistencies and annoyances that I fix in this PR. It touches on many different config settings that are (in my opinion) sensible to modify in one PR.

Changes to local setup

  • Only build mampf once locally. Previously, mampf was built 4 times (!) locally: for the dev mampf container, for the dev webpacker container, for the test mampf container and for the test webpacker container. And in each build, we build and stored all Ruby and Node.js dependencies, therefore consuming unnecessary bandwidth and sooo many precious GBs of my SSD.
    • Instead, we now just build mampf once when you start the MaMpf dev container and store the respective image of 2GB size locally, which will then be reused by the test mampf container. Therefore, you have to start the dev containers at least once to be able to start the test container.
    • Additionally, we don't have a separate webpacker container anymore. Instead it is directly spun up in the mampf container once at the start. This way, we have one container for MaMpf and its assets locally easing recreation (e.g. mampf and webpacker containers were destroyed or created in our justfiles only together already)
  • Build Ruby gems at container start (& cache). Remember those annoying Bundler errors: Gem not found? This forced you to completely rebuild the container, which downloads the whole ruby image again and on top consumes your time. So why not just install the dependencies every time you start the container? If they are already installed, bundle install will return after less than 1s. And we now store those gems in a new volume: bundle_cache of around 350MB. Should you still have to rebuild the MaMpf container (not necessary anymore when you switch branches with different dependencies on them), you can do so more quickly since dependencies don't have to be refetched and installed again from scratch (due to the bundle_cache volume which acts as a local cache).
  • Build Node.js gems at container start (& cache). The same thing goes for Node.js modules. We now simply install them via yarn whenever the container starts. Therefore, you should always have those dependencies available from the PR you've checked out, which eases jumping around between branches.
  • Same db image. Previously, two different postgres images where downloaded for dev and test (namely 13 and latest). This is now unified and saves some more storage + network time.
  • We also fix that docker tries to connect to the Docker registry to find out that our local package is not existing there, by explicitly setting the pull_policy to never.
  • We let the initialize.sh and entrypoint.sh scripts fail directly (set -e) if any error occurs. Those scripts are also renamed and improved to make more clear what they actually do.
    -Furthermore, we get rid of the workaround for creating a new MaMpf database at the beginning (the workaround was necessary since we renamed the default database user from mampf to localroot). Instead, we use a special docker file that is automatically executed at the start of postgres, see here. Note that you will still encounter the error PG::UndefinedTable: ERROR: relation "media" does not exist in a completely empty database, but this issue is another one that won't be fixed in this PR.
    The previously used recipe create-empty-mampf-db-if-not-exists in the docker.justfile is just left there as reference and might be used manually should something fail. But it is expected, that postgres now reads our custom startup file to just execute a CREATE DATABASE mampf;.

Important

You should completely wipe your Docker setup once. This will also give you some storage back ;) To do so, run the following commands (at your own risk):

docker system prune -a --volumes  # ⚠ Will prune everything you have in Docker

# Additionally clear docker's build cache
docker builder prune -af

# On WSL: you also have to clean via the Docker Desktop app manually to reclaim the deleted space in Windows.

# To start your container again and fill it with some data:
just docker up-reseed -d

If you run into the weird issue error: RPC failed; curl 92 HTTP/2 stream 0 was not closed cleanly, just tell git to use HTTP/1.1 for the mampf repo via git config http.version HTTP/1.1. See also this comment.

Changes to CI/CD

I've reworked the Dockerfiles and docker-compose files to make more transparent what is actually going on and especially what differences between the CI/CD runs and the local setup are. This is achieved by creating new files: docker-compose.cicd.yml and docker-compose.cicd.build.yml. (Note that testing also uses the docker/development/Dockerfile. There is no docker/test/Dockerfile anymore.)

I've reworked our tests.yml GitHub workflow:

  • We now have a new Build MaMpf job. It will use a buildcache to speed up the build. We now also use mode=max to be able to cache multi-stage builds (we have two FROM statements in our docker/development/Dockerfile, that's why it is "multi-stage").
  • After that build jobs, the Unit tests and cypress tests are run as usual in parallel. While this setup with a separate Build job is not necessarily faster, it aims for a better separation of tasks. Additionally, we don't have to precompile assets twice for unit tests and cypress tests. Instead they are only precompiled once, then baked into the mampftest:image and finally downloaded by the test jobs. Note that this asset precompilation is one difference to the local setup, where we don't precompile them and instead use the webpack dev server.
  • In the mampftest container that is pushed to GHCR (privately), you will find two tags buildcache and ìmage. See also Cache storage backends. We use the registry type: "embeds the build cache into a separate image, and pushes to a dedicated location separate from the main output.". I.e. the actual MaMpf image with the precompiled assets is called mampftest:image. This is pulled in the test jobs via docker pull. The buildcache (mampftest:buildcache) is not an actual Docker image; instead its a special cache format that will be used by the build job. Trying to docker pull from it will fail as docker pull expects an actual image.

This PDF showcases some pipeline run times before and after this PR. This is not a thorough research. You will notice that speed benefits are marginal. However the separation is good for a better conceptualization of what is going on and eases modifying caching behavior in the future.

Reviewers

The Docker documentation is actually really good and it describes very clear what config options do. I've included some links in the code. Nevertheless, please ask for any clarification if I made a modification is not 100% clear. I had to read a lot, so maybe I can answer the question.

Pleaes also report any warning you get during execution of any just command. There should be NO warnings after this PR. Please report those warnings with a PR review comment at the respective file, such that we can resolve those comments later on. (For the only warning that you could get, I explicitly print "Ignore the error" beforehand).

  • Test that your local setup is working as expected. Webpacker should also work, so you shouldn't perceive any speed degradations when browsing localhost. Note that compile is set to true for local dev and testing, that is, we have some lazy-loading going on, i.e. webpacker will only compile the assets if you really need them. This is why on the first visit of a new page, it might take a few seconds longer to load.
  • Please also test that the test setup continues to work, i.e. just test cypress. Note that cypress is the biggest image of all with a size of 3GB.
  • Also try out just docker rspec for that to see explicitly what is going on during container start, e.g. bundler will install all gems once such that they are stored in the test bundle_cache volume. When you then rerun, you will notice how the cached gems will be used. After that, also make sure you can run the rspec tests in VSCode via the test runner.
  • (I haven't found a way to reuse a volume between different docker compose files. So we get a development_bundle_cache and a test_bundle_cache, both with the same content of ~340MB size. This is just a space, not a speed problem. Let me know if you have an idea how to solve this. Otherwise, this is still better than having to rebuild everything from scratch every time.)
  • Please also take a look at the latest GitHub Actions runs in this PR and make sure that every step is self-explanatory and makes very clear what is going on.

As a sidenote: working with the webpacker config here was such a pain as it was not at all clear which of the many config files it is actually executing and which env variables it respects. Thus a LOT of workflow runs where needed to test different settings (see the number of commits on this PR 😅). I hope I don't have to touch this setup again before we migrate away from it to Vite ;)

TODO:

Splines and others added 3 commits January 22, 2025 13:31
…731)

* Init first attempt to download latest database dump

* Overhaul just command to download the latest database dump

* Repurpose the download command to just download the dump

* Use existing /db/backups/ folder for storage

* Init dummy reseed from file command

* Check if preseed is url or file

* Move download db command to new `prod.justfile`

* Fix punctuation

* Init basic up-reseed-from-dump command (not tested yet)

* Tear down db, unzip dump, copy over and try to restore

(restoring is not working yet at this point)

* Get local restoring of the database to work

* Tear down database before up-reseed & fix pgadmin config

* Restart containers afterwards

* Make handling of empty database more robust

- Create mampf db if necessary
- Start db and wait for postgres automatically

* Move wait for postgres to db-tear-down method

* Don't wait for postgres at start of db-tear-down

* Clarify that up command assumes a valid database
@Splines Splines changed the base branch from release/1.15.2 to release/1.15.3 February 1, 2025 15:34
Copy link
Collaborator

@fosterfarrell9 fosterfarrell9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. For later reference: Discussions took place on basecamp.

@Splines Splines merged commit 9eec5ba into release/1.15.3 Feb 3, 2025
5 checks passed
@Splines Splines deleted the deps/bundle-local-gems branch February 3, 2025 22:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants