Lecture Overview
In this session, we'll explore Docker Hub — Docker's official registry for container images — and learn how to find, use, and work with public images. Rather than building every image from scratch, we can leverage the vast ecosystem of pre-built images to accelerate our development. By the end of this session, you'll understand how to find appropriate images, evaluate their quality and security, and incorporate them into your projects effectively.
Introduction to Docker Hub
Docker Hub is the world's largest library and community for container images. It serves as the default registry for Docker, meaning that when you run commands like docker pull python, Docker automatically looks for images on Docker Hub.
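To make the "default registry" behavior concrete, the sketch below expands a short image name into the fully qualified form Docker resolves it to. The function name normalize_image_ref is hypothetical, and the rules are a simplified approximation of what the CLI does, not its actual implementation.

```shell
# Expand a short image reference into registry/namespace/name:tag.
# The body runs in a subshell, so its variables stay local.
normalize_image_ref() (
  ref=$1
  registry=docker.io
  remainder=$ref
  case $ref in
    */*)
      first=${ref%%/*}
      # A first component containing "." or ":" (or equal to "localhost")
      # is treated as a registry hostname.
      case $first in
        *.*|*:*|localhost)
          registry=$first
          remainder=${ref#*/}
          ;;
      esac
      ;;
  esac
  # Official images live in the implicit "library" namespace on Docker Hub.
  if [ "$registry" = docker.io ]; then
    case $remainder in
      */*) ;;
      *) remainder=library/$remainder ;;
    esac
  fi
  # Append the default tag when the last component has no tag or digest.
  case ${remainder##*/} in
    *:*|*@*) ;;
    *) remainder=${remainder}:latest ;;
  esac
  printf '%s/%s\n' "$registry" "$remainder"
)
```

For example, normalize_image_ref python prints docker.io/library/python:latest, while a reference that already names a registry, such as ghcr.io/username/image:tag, is printed unchanged.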
What is Docker Hub?
- A cloud-based registry service for Docker images
- A central repository where developers can share and find container images
- Both a public registry for open-source images and a private registry for teams and organizations
- A hub for official images maintained by Docker and software vendors
Analogy: Docker Hub is like a massive public library for container images. Just as a library catalogs books by different authors on different subjects, Docker Hub catalogs images from different providers for different applications. Some books are written by renowned authors (official images), some by community members (community images), and some are in special collections with restricted access (private repositories).
Key Concepts
- Image Repository: A collection of related images, usually representing the same application with different versions
- Official Images: Curated, well-documented images maintained by Docker
- Verified Publisher Images: Created and maintained by commercial entities that partner with Docker
- Community Images: Created and maintained by individual Docker Hub users
- Tags: Identifiers for specific versions of an image
Let's head over to Docker Hub and explore its features.
Accessing Docker Hub
You can access Docker Hub through your web browser at https://hub.docker.com/. When you visit, you'll see a search bar, featured content, and various categories of images.
While you can browse Docker Hub without an account, creating a free account allows you to:
- Push your own images to Docker Hub
- Create private repositories
- Star and follow your favorite images
- Join Docker teams and organizations
Finding Images on Docker Hub
Docker Hub contains millions of images, so finding the right one is an important skill. Let's explore different methods for finding images.
Using the Docker CLI
You can search for images directly from your terminal using the docker search command:
docker search nginx
This returns a list of images related to the search term, along with information such as:
- The name of the repository
- A brief description
- The number of stars (indicating popularity)
- Whether it's an official image
- Whether it's automated (built automatically from a linked source repository; this column is deprecated and may not appear in recent Docker versions)
Example output:
NAME                       DESCRIPTION                                        STARS   OFFICIAL   AUTOMATED
nginx                      Official build of Nginx.                           16831   [OK]
jwilder/nginx-proxy        Automated Nginx reverse proxy for docker con...    2122               [OK]
richarvey/nginx-php-fpm    Container running Nginx + PHP-FPM capable of...    818                [OK]
jc21/nginx-proxy-manager   Docker container for managing Nginx proxy ho...    248
...
While convenient, this method provides limited information. For more detailed information, it's better to use the Docker Hub website.
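If you do stay in the terminal, the search can at least be narrowed. The following is a minimal sketch, assuming a Docker CLI whose docker search supports the --filter, --limit, and --format options; the wrapper name search_official is made up for illustration.

```shell
# Show only official images for a search term, as "name: stars" rows.
# Usage: search_official nginx
search_official() {
  docker search \
    --filter is-official=true \
    --limit 5 \
    --format '{{.Name}}: {{.StarCount}}' \
    "$1"
}
```

Changing the filter to something like stars=100 would instead keep only repositories with at least that many stars.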
Using the Docker Hub Website
The Docker Hub website provides a more comprehensive search experience with more details about each image:
- Go to https://hub.docker.com/search
- Enter your search term in the search bar
- Use filters like "Official Images" or "Verified Publishers" to narrow results
- Sort results by relevance, stars, or recency
Understanding Search Results
When you search on Docker Hub, you'll see several pieces of information that help you evaluate an image:
- Official/Verified Badge: Indicates trusted images
- Pull Count: How many times the image has been downloaded
- Star Count: How many users have starred (favorited) the repository
- Last Updated: When the repository was last updated
- Short Description: Brief summary of what the image contains
Example search: If you search for "PostgreSQL" on Docker Hub, you'll see multiple results including:
- postgres (Official Image): The official PostgreSQL image
- bitnami/postgresql (Verified Publisher): Bitnami's PostgreSQL image
- Various community images with specific configurations
Evaluating Image Quality and Security
Not all images are created equal. When selecting an image, you should consider several factors to ensure quality and security.
Image Trust Hierarchy
Docker Hub has a hierarchy of image trustworthiness:
- Official Images: Most trustworthy, maintained by Docker and upstream vendors
- Verified Publisher Images: Created by trusted partners with a verified badge
- Community Images: Created by individual users, varying in quality and security
Analogy: Think of this like medication sources. Official images are like FDA-approved medications from established pharmaceutical companies. Verified publisher images are like supplements from reputable brands with quality certifications. Community images range from carefully formulated products by knowledgeable herbalists to unknown substances mixed in someone's garage — they require more scrutiny before use.
Key Evaluation Criteria
When evaluating an image, consider:
- Maintainer Reputation: Is it from an official source or trusted publisher?
- Documentation Quality: Are the usage and configuration well documented?
- Update Frequency: How recently was the image updated?
- Community Engagement: High pull counts and stars suggest wider usage
- Docker Hub Comments: Look for feedback from other users
- Open Source: Can you see how the image is built? Is there a Dockerfile?
- Security Scanning: Does the image have any known vulnerabilities?
Image Details Page
Clicking on an image in Docker Hub takes you to its details page, which provides:
- A detailed description of the image
- Usage instructions
- Available tags (versions)
- Dockerfile source (for some images)
- Environment variables and other configuration options
Best Practice: Always prefer official images when available. They're maintained by Docker and the software vendors, follow best practices, are regularly updated for security, and provide clear documentation.
Understanding Image Tags
Tags are how Docker identifies specific versions of an image. They're crucial for reproducibility and stability in your projects.
What are Tags?
A tag is a label applied to an image in a repository, indicating a specific version or variant. When you pull an image without specifying a tag, Docker uses the latest tag by default.
Common Tagging Conventions
Many repositories follow these conventions:
- latest: The default tag, used when none is specified; by convention it usually points to the most recent stable release, but nothing guarantees this
- Version numbers: Major.Minor.Patch (e.g., 13.0.1, 3.9)
- Date-based: Using dates as tags (e.g., 20210701)
- Variant indicators: Often appended to version (e.g., 3.9-slim, 13-alpine)
Common Image Variants
Many official images offer different variants with varying tradeoffs:
- alpine: Based on Alpine Linux, very small footprint but may have compatibility issues
- slim: Smaller than default but larger than alpine, good balance of compatibility and size
- buster/bullseye/etc.: Based on specific Debian releases
- windowsservercore/nanoserver: Windows-based variants
Example tags for Python:
- python:3.9 - Python 3.9 on Debian
- python:3.9-slim - Smaller variant of Python 3.9
- python:3.9-alpine - Python 3.9 on Alpine Linux (smallest)
- python:3.9-windowsservercore - Python 3.9 on Windows Server Core
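The version-plus-variant convention can be illustrated with a small helper that splits a tag at its first hyphen. This is only a sketch of the naming convention: split_tag is a hypothetical name, and repositories are free to use tags that do not follow this pattern.

```shell
# Split a tag such as "3.9-slim" into its version and variant parts.
# Prints "version variant"; the variant is "default" when none is given.
split_tag() (
  tag=$1
  case $tag in
    *-*) printf '%s %s\n' "${tag%%-*}" "${tag#*-}" ;;
    *)   printf '%s %s\n' "$tag" default ;;
  esac
)
```

With the Python tags above, split_tag 3.9-alpine prints "3.9 alpine" and split_tag 3.9 prints "3.9 default".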
Viewing Available Tags
You can see all available tags for an image on its Docker Hub page. For example, for Python:
- Go to https://hub.docker.com/_/python
- Click on the "Tags" tab
You'll see a list of all available tags along with their size and architecture support.
Best Practice: Always use specific version tags in production environments, never latest. This ensures reproducibility and prevents unexpected changes when images are updated.
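A version tag can still be moved to a different image by its publisher, so a stricter form of pinning is to reference the image by its content digest. The sketch below assumes the image has already been pulled locally; pinned_ref is an illustrative name, not a Docker command.

```shell
# Print the digest-pinned reference (name@sha256:...) of a pulled image.
# Usage: pinned_ref python:3.9-slim
pinned_ref() {
  docker image inspect --format '{{index .RepoDigests 0}}' "$1"
}
```

The printed reference can be passed to docker pull or docker run in place of the tag, which ensures the same image content is used every time.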
Pulling and Using Public Images
Now that we understand how to find and evaluate images, let's look at how to pull and use them effectively.
Pulling Images
To download an image from Docker Hub, use the docker pull command:
docker pull [repository]:[tag]
Examples:
# Pull the latest version of nginx
docker pull nginx
# Pull a specific version of Python
docker pull python:3.9-slim
# Pull PostgreSQL version 13 with Alpine Linux
docker pull postgres:13-alpine
The image will be downloaded to your local Docker environment, where it can be used to run containers.
Viewing Local Images
To see which images you have downloaded locally:
docker images
Output example:
REPOSITORY   TAG         IMAGE ID       CREATED      SIZE
nginx        latest      605c77e624dd   3 days ago   142MB
python       3.9-slim    8c7051081f50   5 days ago   124MB
postgres     13-alpine   87180a7e49e8   1 week ago   213MB
Running Containers from Images
Once you've pulled an image, you can run a container from it:
docker run [options] [repository]:[tag] [command]
If you haven't explicitly pulled the image, docker run will automatically pull it for you.
Example running containers from public images:
# Run an nginx web server
docker run -d -p 8080:80 --name my-nginx nginx
# Run a PostgreSQL database
docker run -d -p 5432:5432 \
-e POSTGRES_PASSWORD=mysecretpassword \
--name my-postgres \
postgres:13-alpine
# Run a Python container with an interactive shell
docker run -it --rm python:3.9-slim python
Exploring Popular Official Images
Let's explore some of the most popular official images on Docker Hub and how they can be used in your projects.
NGINX
NGINX is a high-performance web server and reverse proxy.
Basic usage:
docker run -d -p 8080:80 --name webserver nginx
Serving custom content:
docker run -d -p 8080:80 -v "$(pwd)/html:/usr/share/nginx/html" nginx
Custom configuration:
docker run -d -p 8080:80 -v "$(pwd)/nginx.conf:/etc/nginx/nginx.conf:ro" nginx
Real-world application: You can use NGINX as a reverse proxy in front of your Python web applications. It can handle SSL termination, static file serving, and load balancing, allowing your application to focus on business logic.
PostgreSQL
PostgreSQL is a powerful, open-source relational database.
Basic usage:
docker run -d \
-p 5432:5432 \
-e POSTGRES_PASSWORD=mysecretpassword \
--name my-postgres \
postgres
Data persistence:
docker run -d \
-p 5432:5432 \
-e POSTGRES_PASSWORD=mysecretpassword \
-e POSTGRES_USER=myuser \
-e POSTGRES_DB=mydb \
-v postgres_data:/var/lib/postgresql/data \
--name my-postgres \
postgres
Initialization scripts:
docker run -d \
-p 5432:5432 \
-e POSTGRES_PASSWORD=mysecretpassword \
-v postgres_data:/var/lib/postgresql/data \
-v "$(pwd)/init.sql:/docker-entrypoint-initdb.d/init.sql" \
--name my-postgres \
postgres
Real-world application: In a typical web application, you might use a PostgreSQL container for your database, connected to your Python application container. The database data can be persisted using a Docker volume.
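When an application container depends on the database, it helps to wait until PostgreSQL is actually accepting connections before starting it. A minimal sketch, assuming the pg_isready utility that ships with PostgreSQL is available inside the container; the function name wait_for_postgres and the 30-attempt default are arbitrary choices.

```shell
# Poll pg_isready inside the container until PostgreSQL accepts connections.
# $1 = container name, $2 = maximum attempts (defaults to 30).
wait_for_postgres() (
  name=$1
  attempts=${2:-30}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if docker exec "$name" pg_isready >/dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      exit 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "gave up after $attempts attempt(s)" >&2
  exit 1
)
```

With the containers above, wait_for_postgres my-postgres returns success once the server responds. pg_isready reports server status without needing a valid user name or password, so no credentials are passed.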
Python
The official Python image provides Python runtimes.
Interactive Python shell:
docker run -it --rm python:3.9 python
Running a Python script:
docker run -it --rm -v "$(pwd):/app" -w /app python:3.9 python script.py
Running a Flask web application:
docker run -it --rm \
-p 5000:5000 \
-v "$(pwd):/app" \
-w /app \
-e FLASK_APP=app.py \
-e FLASK_DEBUG=1 \
python:3.9-slim \
sh -c "pip install -r requirements.txt && flask run --host=0.0.0.0"
Real-world application: The Python image is typically used as a base for creating your own application images. You would start with a Python base image, add your application code and dependencies, and build a custom image.
Redis
Redis is an in-memory data structure store, used as a database, cache, and message broker.
Basic usage:
docker run -d -p 6379:6379 --name my-redis redis
With persistence:
docker run -d \
-p 6379:6379 \
-v redis_data:/data \
--name my-redis \
redis redis-server --appendonly yes
With custom configuration:
docker run -d \
-p 6379:6379 \
-v "$(pwd)/redis.conf:/usr/local/etc/redis/redis.conf" \
--name my-redis \
redis redis-server /usr/local/etc/redis/redis.conf
Real-world application: Redis is commonly used alongside web applications for caching, session storage, and background job queues. For example, you might use Redis with Celery to handle asynchronous tasks in a Python web application.
Understanding Docker Image Size and Optimization
Image size is an important consideration for performance, network transfer times, and resource usage. Let's explore this aspect of public images.
Why Image Size Matters
- Download time: Smaller images are faster to pull from registries
- Startup time: Smaller images can lead to faster container startup
- Resource usage: Smaller images use less disk space
- Security surface: Smaller images often have fewer unnecessary packages
Size Comparison of Different Variants
Let's compare the sizes of different Python image variants:
docker pull python:3.9
docker pull python:3.9-slim
docker pull python:3.9-alpine
docker images
You might see output like:
REPOSITORY   TAG          IMAGE ID       CREATED       SIZE
python       3.9          f88b2f81f83a   2 weeks ago   915MB
python       3.9-slim     8c705081f50d   2 weeks ago   124MB
python       3.9-alpine   d4d9c6317a1a   2 weeks ago   45MB
The size differences are substantial! But what are the tradeoffs?
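To put numbers on the difference, the sketch below reads an image's size in bytes and computes the percentage saved by a smaller variant. Both function names are hypothetical, and the sizes in the usage note are the illustrative figures from the listing above, not measurements.

```shell
# Print the size of a local image in decimal megabytes.
image_size_mb() (
  bytes=$(docker image inspect --format '{{.Size}}' "$1") || exit 1
  echo $((bytes / 1000000))
)

# Print the whole-number percentage saved going from size $1 to size $2.
saving_percent() {
  echo $(( ($1 - $2) * 100 / $1 ))
}
```

Using the figures above, saving_percent 915 124 prints 86 and saving_percent 915 45 prints 95, so the slim variant is roughly 86% smaller than the full image and the Alpine variant roughly 95% smaller.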
Tradeoffs for Different Variants
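| Variant | Pros | Cons |
|---|---|---|
| Full (e.g., python:3.9) | Most complete set of packages; useful for complex build environments | Largest image (about 915MB in the comparison above) |
| Slim (e.g., python:3.9-slim) | Good balance of compatibility and size | Larger than Alpine; omits some packages found in the full image |
| Alpine (e.g., python:3.9-alpine) | Smallest footprint | May have compatibility issues with some dependencies |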
Best Practice: For Python applications in development, python:3.x-slim is often a good balance of size and compatibility. For production, Alpine-based images are excellent if you've tested thoroughly with your dependencies. The full image is rarely necessary but can be useful for complex build environments.
Working with Image Layers
Understanding how Docker images are constructed from layers helps you optimize image size and reuse.
What are Image Layers?
Docker images are composed of multiple layers, each representing a set of filesystem changes. Layers are:
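- Created by instructions in a Dockerfile (instructions that change the filesystem, such as RUN, COPY, and ADD, add content; others, such as CMD or EXPOSE, only record metadata)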
- Read-only once created
- Cached and reused when possible
- Stacked on top of each other to form the complete image
Analogy: Image layers are like transparencies stacked on top of each other. Each transparency adds something to the final image, but you can see through to the layers below. When Docker builds an image, it's as if it's laying down these transparencies one at a time, with each new layer potentially modifying what's visible in the layers below.
Inspecting Image Layers
To see the layers that make up an image, use the docker history command:
docker history nginx:latest
You'll see output like:
IMAGE          CREATED      CREATED BY                                      SIZE      COMMENT
605c77e624dd   7 days ago   /bin/sh -c #(nop) CMD ["nginx" "-g" "daemon…    0B
<missing>      7 days ago   /bin/sh -c #(nop) STOPSIGNAL SIGQUIT            0B
<missing>      7 days ago   /bin/sh -c #(nop) EXPOSE 80                     0B
<missing>      7 days ago   /bin/sh -c #(nop) ENTRYPOINT ["/docker-ent…     0B
<missing>      7 days ago   /bin/sh -c #(nop) COPY file:09a214a3e07c919a…   4.61kB
<missing>      7 days ago   /bin/sh -c #(nop) COPY file:0fd5fca330dcd6a7…   1.04kB
<missing>      7 days ago   /bin/sh -c #(nop) COPY file:0b866ff3fc1ef5b0…   1.96kB
<missing>      7 days ago   /bin/sh -c #(nop) COPY file:65504f71f5855ca0…   1.2kB
<missing>      7 days ago   /bin/sh -c set -x && addgroup --system -…       61.1MB
...
Each row represents one build step, showing when it was created, the command that created it, and its size. Rows with a size of 0B (such as CMD or EXPOSE) only record metadata and add no filesystem content.
Layer Sharing and Caching
One of the powerful features of Docker's layer system is the ability to share and reuse layers between images. For example:
- If two images are based on Ubuntu, they share the Ubuntu base layers
- When you build an image, Docker reuses cached layers if the instructions haven't changed
- This sharing makes pulls faster and reduces disk usage
Example of layer sharing: If you have both nginx:latest and nginx:1.21 images, they likely share many layers. Docker only stores the unique layers for each tag, saving disk space.
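Layer sharing can be checked locally by comparing the layer digests of two images. The following is a rough sketch, assuming both images are already pulled; image_layers and shared_layer_count are illustrative names, and the approach simply counts digests that appear in both lists.

```shell
# Print the layer digests of a local image, one per line.
image_layers() {
  docker image inspect \
    --format '{{range .RootFS.Layers}}{{println .}}{{end}}' "$1" |
    sed '/^$/d'
}

# Count the layers that two local images have in common.
shared_layer_count() (
  tmp=$(mktemp) || exit 1
  image_layers "$1" > "$tmp"
  # grep -F -x matches whole lines literally; -c prints the match count.
  count=$(image_layers "$2" | grep -Fxc -f "$tmp")
  rm -f "$tmp"
  echo "$count"
)
```

Running shared_layer_count nginx:latest nginx:1.21 prints how many layers the two tags share; a result of 0 means nothing is shared and each image is stored in full.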
Multi-Architecture Images
Docker Hub supports multi-architecture images, which allows the same image name to work across different CPU architectures.
Understanding Multi-Architecture Images
Multi-architecture images are actually a collection of images for different architectures, tied together with a manifest list. When you pull an image, Docker automatically selects the version that matches your system's architecture.
Common architectures include:
- amd64: Standard 64-bit x86 PCs and servers
- arm64/aarch64: 64-bit ARM (Apple M1/M2, AWS Graviton, etc.)
- arm/v7: 32-bit ARM (older Raspberry Pi, etc.)
- windows/amd64: Windows on 64-bit x86
Checking Architecture Support
To see which architectures an image supports, look at its Docker Hub page under the Tags section. For official images, you'll often see multiple architectures listed for each tag.
Practical application: Multi-architecture support is increasingly important with the growing adoption of ARM-based servers and Apple Silicon Macs. Most official images now support multiple architectures without any special configuration on your part.
Dealing with Architecture Mismatches
Sometimes you might need to run an image that doesn't have a build for your architecture. In these cases:
- Docker Desktop for Mac supports transparent emulation of x86_64 images on Apple Silicon
- You can explicitly request an image for a specific architecture using the --platform flag:
docker run --platform linux/amd64 -d nginx
This forces Docker to pull and run the amd64 version, even on an ARM machine (with emulation if available).
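To see which platform value corresponds to your own machine, the output of uname -m can be mapped to the names Docker uses. This is a simplified sketch that assumes Linux containers and covers only the architectures listed earlier; docker_platform is a hypothetical helper.

```shell
# Map a machine name (default: the output of `uname -m`) to a
# Docker --platform value for Linux containers.
docker_platform() (
  machine=${1:-$(uname -m)}
  case $machine in
    x86_64|amd64)  echo linux/amd64 ;;
    aarch64|arm64) echo linux/arm64 ;;
    armv7l)        echo linux/arm/v7 ;;
    *) echo "unknown machine type: $machine" >&2; exit 1 ;;
  esac
)
```

Called with no argument it inspects the current machine; docker_platform arm64, for instance, prints linux/arm64.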
Advanced Docker Hub Features
Docker Hub offers several advanced features that can enhance your development workflow.
Automated Builds
Docker Hub can automatically build images from source code repositories:
- Link your GitHub or Bitbucket account to Docker Hub
- Set up a repository with a Dockerfile
- Configure build rules (which branches/tags to build)
- Docker Hub will automatically build and publish images when you push changes
Using Docker Hub for Your Own Images
To push your own images to Docker Hub:
- Create an account and log in:
docker login
- Tag your image with your username:
docker tag my-app username/my-app:1.0
- Push the image to Docker Hub:
docker push username/my-app:1.0
Organization Accounts
For team projects, Docker Hub offers organization accounts that allow:
- Shared access to repositories
- Team management
- Role-based access control
- Private repositories
Real-world usage: In a professional setting, your organization might have a Docker Hub organization account where all your custom images are stored. CI/CD pipelines can automatically build and push images to this account, and developers can pull these images as needed.
Alternatives to Docker Hub
While Docker Hub is the default and most popular registry, there are several alternatives worth knowing about:
Public Registries
- GitHub Container Registry (ghcr.io): Integrated with GitHub accounts and actions
- Quay.io: Red Hat's container registry with advanced security features
- Google Container Registry (gcr.io): Integrated with Google Cloud
- Amazon Elastic Container Registry (ECR Public): AWS's public registry
Private Registry Options
- Azure Container Registry: Microsoft's private registry service
- Amazon ECR (Private): AWS's private registry service
- Google Container Registry (Private): Google Cloud's private registry
- Harbor: Open-source registry with security scanning
- Self-hosted Docker Registry: Run your own registry server
Using Alternative Registries
To pull from an alternative registry, include the registry hostname in the image name:
# Pull from GitHub Container Registry
docker pull ghcr.io/username/image:tag
# Pull from Google Container Registry
docker pull gcr.io/project-id/image:tag
Best Practice: In enterprise environments, it's common to use a private registry for your custom images. This gives you more control over security, access, and availability.
Practical Examples
Let's put our knowledge of Docker Hub and public images to practical use with some real-world examples.
Example 1: Setting Up a Web Development Environment
Let's create a simple web development environment with NGINX, Python, and PostgreSQL:
# Create a network for the containers to communicate
docker network create webdev
# Start a PostgreSQL database
docker run -d \
--name postgres \
--network webdev \
-e POSTGRES_PASSWORD=devpassword \
-e POSTGRES_USER=devuser \
-e POSTGRES_DB=devdb \
-v pg_data:/var/lib/postgresql/data \
postgres:13-alpine
# Start a Python container for development
docker run -it --rm \
--name python \
--network webdev \
-v "$(pwd)/app:/app" \
-w /app \
-p 5000:5000 \
python:3.9-slim \
bash
# In the Python container, you can now install dependencies and run your app
# pip install -r requirements.txt
# python app.py
# In a separate terminal, start NGINX as a reverse proxy
docker run -d \
--name nginx \
--network webdev \
-p 8080:80 \
-v "$(pwd)/nginx.conf:/etc/nginx/conf.d/default.conf" \
nginx:alpine
With this setup, NGINX can proxy requests to your Python application, which can connect to the PostgreSQL database.
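The nginx.conf mounted in the last command is not shown above. One possible minimal version is sketched here, assuming the application listens on port 5000 inside the container named python; the helper name write_proxy_conf and the choice of forwarded headers are illustrative.

```shell
# Write a minimal reverse-proxy config that forwards requests to the
# container named "python" on port 5000 (names taken from the example above).
# The quoted 'EOF' keeps $host and $remote_addr literal for NGINX to expand.
write_proxy_conf() {
  cat > "$1" <<'EOF'
server {
    listen 80;

    location / {
        proxy_pass http://python:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
EOF
}
```

Running write_proxy_conf nginx.conf in the project directory before starting the NGINX container creates the file. The Python container should already be running at that point, because NGINX resolves the upstream name when it starts, and the application needs to listen on 0.0.0.0 rather than only on localhost.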
Example 2: Data Analysis Environment
Let's set up a data analysis environment using the Jupyter image:
docker run -it --rm \
-p 8888:8888 \
-v "$(pwd)/notebooks:/home/jovyan/work" \
jupyter/datascience-notebook
This launches a Jupyter notebook with scientific computing libraries. You can access it by opening the URL displayed in the console (typically http://localhost:8888 with a token).
Example 3: WordPress Blog
Let's set up a WordPress blog with MySQL using official images:
# Create a network
docker network create wordpress
# Start MySQL
docker run -d \
--name wordpress-db \
--network wordpress \
-e MYSQL_ROOT_PASSWORD=rootpassword \
-e MYSQL_DATABASE=wordpress \
-e MYSQL_USER=wordpress \
-e MYSQL_PASSWORD=wordpress \
-v mysql_data:/var/lib/mysql \
mysql:5.7
# Start WordPress
docker run -d \
--name wordpress \
--network wordpress \
-p 8080:80 \
-e WORDPRESS_DB_HOST=wordpress-db \
-e WORDPRESS_DB_USER=wordpress \
-e WORDPRESS_DB_PASSWORD=wordpress \
-e WORDPRESS_DB_NAME=wordpress \
-v wordpress_data:/var/www/html \
wordpress
You can then access your WordPress site at http://localhost:8080.
Security Considerations
When using public images, security should be a top consideration:
Image Security Best Practices
- Use official or verified images whenever possible
- Prefer specific tags over latest to ensure you know what you're running
- Keep images updated regularly to get security patches
- Use minimal images when possible (alpine/slim variants) to reduce attack surface
- Scan images for vulnerabilities using tools like Docker Scout, Snyk, or Trivy
- Don't embed secrets in images or containers; use environment variables or secrets management tools
Vulnerability Scanning
Docker Desktop includes Docker Scout, which can scan images for known vulnerabilities:
docker scout quickview nginx:latest
This shows a summary of known vulnerabilities in the image.
docker scout cves nginx:latest
This shows more detailed information about the CVEs (Common Vulnerabilities and Exposures) in the image.
Best Practice: Integrate image scanning into your CI/CD pipeline to automatically check for vulnerabilities before deployment.
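A pipeline step for this could look like the sketch below. It assumes a Docker Scout version whose cves command supports the --exit-code and --only-severity options; the function name scan_gate and the severity threshold are illustrative choices.

```shell
# Return a non-zero status when an image has critical or high severity CVEs.
# Usage in a pipeline step: scan_gate username/my-app:1.0 || exit 1
scan_gate() {
  docker scout cves --exit-code --only-severity critical,high "$1"
}
```

Because the function passes through the exit status of the scan, a CI job that calls it stops before the deployment step whenever matching vulnerabilities are reported.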
Key Takeaways
- Docker Hub is the default registry for Docker images, with millions of available images
- Official images are curated by Docker and are the most trustworthy option
- Tags specify image versions and variants, with conventions like version, version-slim, and version-alpine
- When evaluating images, consider maintainer reputation, documentation quality, update frequency, and community engagement
- Different image variants offer tradeoffs between size, features, and compatibility
- Multi-architecture images allow the same image to work across different CPU architectures
- Images are composed of layers, which can be shared and reused between images
- Security is a critical consideration when using public images: prefer official images and scan for vulnerabilities
With this knowledge, you're well-equipped to effectively find, evaluate, and use public Docker images in your projects!
Looking Ahead
In our next session, we'll learn how to create our own custom Docker images by writing Dockerfiles. This will allow us to package our applications into containers with precisely the dependencies and configuration we need.
Discussion Questions
- What criteria would you use to decide between the full, slim, and alpine variants of an image for your project?
- How might you approach evaluating a community image that doesn't have an official alternative?
- What are the security implications of using public images in a production environment? How would you mitigate these risks?
- How could the layered architecture of Docker images help optimize the build and deployment process in a CI/CD pipeline?
- In what scenarios might you prefer a specialized image over a more generic base image that you customize yourself?
Additional Resources
- Docker Hub Documentation - Official guide to using Docker Hub
- Docker Pull Reference - Detailed information on the pull command
- Docker Image Management - Best practices for managing images
- Docker Search Reference - How to search for images from the CLI
- Official Images Repository - GitHub repository for official Docker images