Mick Tarsel edited this page Apr 4, 2019 · 8 revisions

Project Summary

The Container Analysis project gathers information about Docker containers hosted in public repositories. Starting from a Helm index chart file (index.yaml), this Python project crawls, curates, and outputs information about Applications, each of which is composed of one or more containers.

Background Information

Since the inception of IBM Cloud Private (ICP), Applications have been built from one or more containers, and there has been a strong need to know which hardware architectures each container can run on. For every container, it is important to know where it is hosted, which versions are available, and which architectures it supports.

Project Methodology

When get-image-info.py runs in debug mode, the following steps occur:

  1. Start the Python script with the Helm index chart as its only argument
  2. Parse Application names from index.yaml
  3. Locate link to Application's tarball
  4. Download tarball of Application
  5. Extract values.yaml and Chart.yaml
  6. Parse values.yaml for images needed by Application
  7. Parse values.yaml for image repos and image versions (tags)
  8. Crawl Docker Hub using the image, repo, and tag information
  9. Output Application name, images, tags, and list of supported architectures to CSV
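
The offline parts of the steps above (5, 6, 7, and 9) can be sketched as follows. This is a minimal illustration, not the code in get-image-info.py: the function names and CSV column names are hypothetical, the `image: {repository, tag}` layout is a common Helm convention rather than something every chart follows, and `find_images` assumes values.yaml has already been parsed into Python dicts and lists (for example with a YAML library). The download and Docker Hub crawl steps are left out on purpose.

```python
import csv
import io
import tarfile

# Files the methodology extracts from each Application tarball (step 5).
WANTED = ("values.yaml", "Chart.yaml")


def extract_chart_files(tarball_bytes):
    """Return {filename: text} for the chart's top-level values.yaml and Chart.yaml."""
    found = {}
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes), mode="r:*") as tar:
        for member in tar.getmembers():
            parts = member.name.split("/")
            # Packaged charts store files as <chart-name>/<file>;
            # deeper paths (templates, subcharts) are skipped.
            if member.isfile() and len(parts) == 2 and parts[1] in WANTED:
                found[parts[1]] = tar.extractfile(member).read().decode("utf-8")
    return found


def find_images(values):
    """Walk parsed values.yaml data and collect (repository, tag) pairs (steps 6-7).

    Assumption: images are declared as `image: {repository: ..., tag: ...}`.
    A missing tag falls back to "latest".
    """
    images = []
    if isinstance(values, dict):
        image = values.get("image")
        if isinstance(image, dict) and "repository" in image:
            images.append((image["repository"], str(image.get("tag", "latest"))))
        for child in values.values():
            images.extend(find_images(child))
    elif isinstance(values, list):
        for item in values:
            images.extend(find_images(item))
    return images


def write_rows(rows, out):
    """Write (application, image, tag, architectures) rows as CSV (step 9).

    The column names are hypothetical; architectures are joined with spaces.
    """
    writer = csv.writer(out)
    writer.writerow(["application", "image", "tag", "architectures"])
    for app, image, tag, arches in rows:
        writer.writerow([app, image, tag, " ".join(arches)])
```

Reading the tarball from memory avoids writing temporary files, and walking the parsed values recursively picks up images declared under nested keys such as sidecars.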

Development Needed

  • Ensure the project is Python 3 compatible
  • Enable support for other Docker repositories (registries) besides Docker Hub
  • Gather more information about containers for cross-validation:
    • Utilize 'keywords' associated with Applications
    • Crawl git repos parsing READMEs
  • Host this project
  • Change the output from CSV to a more interactive GUI
  • Create a Wiki or some other type of in-depth documentation
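
As a starting point for supporting hosts other than Docker Hub, the sketch below splits an image reference into its host, repository, and tag. It is a simplified illustration under stated assumptions, not part of the existing project: `ImageRef` and `parse_image_ref` are hypothetical names, the host is recognized by the usual Docker convention (a first path component containing `.` or `:`, or equal to `localhost`), and digests (`@sha256:...`) are simply dropped.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_image_ref(ref):
    """Split an image reference such as 'quay.io/coreos/etcd:v3.3' into parts."""
    # Simplification: ignore any digest suffix.
    name = ref.split("@", 1)[0]

    # A first component that looks like a hostname names the registry host;
    # otherwise the image is assumed to live on Docker Hub (docker.io).
    first, sep, rest = name.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = "docker.io", name

    # A tag is a ':' suffix on the last path component; default to 'latest'.
    colon = path.rfind(":")
    if colon > path.rfind("/"):
        path, tag = path[:colon], path[colon + 1:]
    else:
        tag = "latest"

    # Docker Hub official images live under the implicit 'library/' namespace.
    if registry == "docker.io" and "/" not in path:
        path = "library/" + path
    return ImageRef(registry, path, tag)
```

With the host separated out, the crawl step could choose a different lookup per host instead of assuming Docker Hub for every image.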