dockerignore speeds up builds¶
Many of Caliban’s commands begin their work by triggering a docker build
command; this command has a side effect of bundling up the entire directory
where you run the command into a “build context”, which is zipped up and sent
off to the Docker build process on your machine.
In a directory containing machine learning code, it’s not unusual that you might also have subdirectories that contain, for example:
large datasets that you’ve cached locally
tensorboard output from local runs
metrics
If you don’t want to include any of these things in the Docker container that
caliban builds for you, you can significantly speed up your builds by creating a
file called .dockerignore
in the directory of your project.
Here’s an example .dockerignore
file, with comments explaining each line:
# ignore the git repository info and the pip installation cache
.git
.cache
# this is huge - ignore the virtualenv we've created inside the folder!
env
# tests don't belong inside the repo.
tests
# no need to package info about the packaged-up code in egg form.
*.egg-info
# These files are here for local development, but have nothing
# to do with the code itself, and don't belong on the docker image.
Makefile
pylintrc
setup.cfg
__pycache__
.coverage
.pytest_cache
As a starting point, you might take your project’s .gitignore
file, copy
everything other to .dockerignore
and then delete any entries that you
actually DO need inside your Docker container. An example might be some data you
don’t control with git
, but that you do want to include in the container using
Caliban’s -d
flag.