Latest DockerHub Images: https://hub.docker.com/orgs/pangeo/repositories
Image | Description | Size | Pulls |
---|---|---|---|
base-image | Foundational Dockerfile for builds | ||
base-notebook | minimally functional image for pangeo hubs | ||
pangeo-notebook | above + core earth science analysis packages | ||
ml-notebook | above + GPU-enabled tensorflow2 | ||
Click on the image name in the table above for a current list of installed packages and versions.
This repository uses GitHub Actions to build images, run tests, and push images to DockerHub.
- Pull requests from forks trigger rebuilding all images.
- `pangeo/base-notebook:master` corresponds to the current "staging" image, in sync with the master branch. It is built with every commit to master and is also tagged with the short GitHub SHA, for example `pangeo/base-notebook:2639bd3`.
- Tags pushed to GitHub manually represent "production" releases with corresponding tags on DockerHub, for example `pangeo/pangeo-notebook:2020.03.11`. The `latest` tag also corresponds to the most recent GitHub tag.
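The tagging scheme above can be summarized as a small shell sketch. The function `classify_tag` is a hypothetical helper written for illustration and is not part of this repository; it only encodes the tag shapes mentioned above (`master`, `latest`, a date-style tag such as `2020.03.11`, and a seven-character short SHA such as `2639bd3`).

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): report which kind of image
# a tag refers to, based only on the tag shapes described in this README.
classify_tag() {
  case "$1" in
    master)
      echo "staging" ;;
    latest)
      # "latest" follows the most recent GitHub tag, i.e. a production release
      echo "production" ;;
    [0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9])
      # date-style tag such as 2020.03.11
      echo "production" ;;
    [0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f])
      # seven hex characters, assumed to be a short commit SHA
      echo "commit" ;;
    *)
      echo "unknown" ;;
  esac
}
```

For reproducible work, a date-style tag or a commit tag is the safer choice, since `master` and `latest` move over time.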
A common need is to update conda package versions in these images. To do so: 1) fork this repo, 2) edit `pangeo-notebook/environment.yml` on your fork, 3) create a PR. Compatible package versions are solved with conda-lock, and a lock file is automatically added as a commit in your PR.
You'll need at least Conda installed, and Docker if you want to build and test locally.
# create a fork of this repo and clone it locally
git clone https://github.com/mygithub/pangeo-docker-images
cd pangeo-docker-images
# Install conda-lock
conda env create -f environment-condalock.yml
git checkout -b change-pangeo-notebook
Edit `pangeo-notebook/environment.yml` to change packages. Note that `make pangeo-notebook` is a convenient shortcut to build and test. See the Makefile for the specific commands that are run; for example, you can run only conda-lock without running Docker to build and test locally.
make pangeo-notebook
git commit -a -m "added x packages, changed x version"
git push
# go to github to create PR, or use github cli https://cli.github.com
https://github.com/pangeo-data/pangeo-binder-template
docker run -it --rm -p 8888:8888 pangeo/base-notebook:latest jupyter lab --ip 0.0.0.0
- compatible with Pangeo BinderHubs and JupyterHubs
- compatible with Repo2Docker Python configuration files
- reproducible build process and explicit conda package lists
- small size, fast build
- easy to customize
Everything stems from the `Dockerfile` in the `base-image` folder. The `base-image` configures default settings for Conda and Dask with `condarc.yml` and `dask_config.yml` files. The `base-image` is not meant to run on its own; it is the common foundation for `-notebook` images that install Python packages, including JupyterLab and lab extensions. Lists of Conda packages for each image are specified in an `environment.yml` in each `-notebook` folder, and compatible Dask and Jupyter packages are guaranteed by specifying the `pangeo-notebook` conda metapackage.
You can pre-solve for compatible environments locally with conda-lock to convert the `environment.yml` file to a `conda-linux-64.lock` file, which is an explicit list of compatible packages solved by Conda. The major advantage of doing this is that if you rebuild at a later date, the resulting Conda environment is identical, which improves reproducibility. For this reason, when building off of the `base-image`, any existing `conda-linux-64.lock` file takes precedence over the `environment.yml` file.
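The precedence rule can be illustrated with a minimal shell sketch. The real selection happens inside the `base-image` build; `pick_env_spec` below is a hypothetical helper that only mirrors the rule as described here, printing the path of the file that would be used for a given image folder.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): print which environment
# specification would be used for the folder given as $1.
# An explicit lock file wins over environment.yml, as described above.
pick_env_spec() {
  spec_dir="$1"
  if [ -f "$spec_dir/conda-linux-64.lock" ]; then
    echo "$spec_dir/conda-linux-64.lock"
  elif [ -f "$spec_dir/environment.yml" ]; then
    echo "$spec_dir/environment.yml"
  else
    # neither file exists: nothing to build from
    return 1
  fi
}
```

Calling `pick_env_spec pangeo-notebook` from the repository root would print the lock file path whenever one has been committed, and fall back to `environment.yml` otherwise.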
The runtime environment sets two variables by default:
- `$PANGEO_ENV`: name of the conda environment.
- `$PANGEO_SCRATCH`: a URL like `gcs://pangeo-scratch/username/` that points to a cloud storage bucket for temporary storage. This is set if the variables `$PANGEO_SCRATCH_PREFIX` and `JUPYTERHUB_USER` are detected. The prefix should be like `s3://pangeo-scratch`.
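As a rough illustration of how the scratch URL relates to the prefix and the user name, here is a minimal shell sketch. It assumes the URL is simply the prefix, the user name, and a trailing slash joined together, which matches the example shapes above; the exact logic in the images may differ, and `scratch_url` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): build a scratch URL from a
# prefix ($1, e.g. the value of PANGEO_SCRATCH_PREFIX) and a user name
# ($2, e.g. the value of JUPYTERHUB_USER).
# Returns non-zero when either value is empty, mirroring the rule that the
# variable is only set when both inputs are detected.
scratch_url() {
  prefix="$1"
  user="$2"
  if [ -z "$prefix" ] || [ -z "$user" ]; then
    return 1
  fi
  # drop one trailing slash from the prefix so the result has no "//" join
  printf '%s/%s/\n' "${prefix%/}" "$user"
}
```

For example, `scratch_url "$PANGEO_SCRATCH_PREFIX" "$JUPYTERHUB_USER"` prints nothing and fails when either variable is unset.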
- Since 2020.10.16, mamba is installed into the base-image and conda-lock environment and is used by default to solve for a compatible environment (see #146)
- For a simple list of packages for a given image, you can use a link like this: https://github.com/pangeo-data/pangeo-docker-images/blob/2020.10.08/pangeo-notebook/packages.txt
- To compare changes between two images, you can use a link like this: https://github.com/pangeo-data/pangeo-docker-images/compare/2020.10.03..2020.10.08
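The two link patterns above can be generated for any image and tag with a short shell sketch. The helper names `packages_url` and `compare_url` are hypothetical; the URL shapes are copied from the example links in the list.

```shell
#!/bin/sh
# Repository URL taken from the example links in this README.
REPO_URL="https://github.com/pangeo-data/pangeo-docker-images"

# Hypothetical helper: link to the package list of image $1 at tag $2.
packages_url() {
  printf '%s/blob/%s/%s/packages.txt\n' "$REPO_URL" "$2" "$1"
}

# Hypothetical helper: link to the diff between tag $1 (older) and $2 (newer).
compare_url() {
  printf '%s/compare/%s..%s\n' "$REPO_URL" "$1" "$2"
}
```

For instance, `packages_url pangeo-notebook 2020.10.08` reproduces the first example link, and `compare_url 2020.10.03 2020.10.08` reproduces the second.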
The primary use of these Docker images is running on Pangeo Cloud deployments with dask-gateway. Generally, the dask-gateway library version built into the image must match the dask-gateway version deployed in the cloud environment. The following table keeps track of the first time a new dask-gateway version appears in a tagged image:
dask-gateway | Image tag |
---|---|
0.9 | 2020.11.06 |
0.8 | 2020.07.28 |
0.7 | 2020.04.22 |
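To pick an image for a given deployment, the table can be turned into a small lookup. This is a sketch only: `first_tag_for_gateway` is a hypothetical helper that hard-codes the three rows above and knows nothing about dask-gateway versions outside the table.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): print the first image tag
# that ships the dask-gateway version given as $1, per the table above.
# Returns non-zero for versions the table does not list.
first_tag_for_gateway() {
  case "$1" in
    0.9) echo "2020.11.06" ;;
    0.8) echo "2020.07.28" ;;
    0.7) echo "2020.04.22" ;;
    *)   return 1 ;;
  esac
}
```

A deployment running dask-gateway 0.8 could then use something like `docker pull "pangeo/pangeo-notebook:$(first_tag_for_gateway 0.8)"` as a starting point, keeping in mind that later tags may carry the same version until the next row in the table.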