Building and Running Your Own Image

Lecture Overview

Now that we understand the basics of Dockerfiles, let's explore the complete process of building and running your own custom Docker images. In this session, we'll cover the entire workflow from creating a Dockerfile to running and debugging containers based on your custom images. We'll build several practical examples and explore strategies for optimizing your Docker images for different use cases.

The Image Building Process

Before diving into specific examples, let's understand how Docker builds images from Dockerfiles.

Understanding Docker Build Context

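When you run docker build, the Docker client sends the "build context" to the Docker daemon. The build context is the set of files and directories at the path or URL given at the end of the build command. The classic builder transfers the whole context up front, while BuildKit transfers only the files the build actually uses.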

docker build -t myimage:tag .

In this example, the . represents the current directory, which becomes the build context. This is important to understand because:

A large build context can slow down the build process
COPY and ADD can only read local files that are inside the build context (ADD can also fetch remote URLs)
You can use .dockerignore to exclude files from the build context

Analogy: The build context is like packing a suitcase for a trip. You gather all the items you might need (files in your directory), but you can also decide to leave certain things behind (.dockerignore files). Once packed, you hand your suitcase to the airline (Docker daemon). If your suitcase is too large or contains prohibited items, you'll have problems – just like with Docker builds.
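
To get a feel for how much a .dockerignore file trims the context, the sketch below walks a directory and sums the sizes of the files that would still be sent. It is an approximation under stated assumptions: the helper names (load_patterns, matches, is_ignored, context_size) are made up for this illustration, and the matcher handles plain per-segment wildcards only, not the ** or ! exception syntax that Docker also supports.

```python
import fnmatch
import os


def load_patterns(context_dir):
    """Read .dockerignore, skipping blank lines and # comments."""
    path = os.path.join(context_dir, ".dockerignore")
    if not os.path.exists(path):
        return []
    with open(path) as fh:
        lines = [line.strip() for line in fh]
    return [ln.strip("/") for ln in lines if ln and not ln.startswith("#")]


def matches(pattern, rel_path):
    """Compare segment by segment so that * never crosses a directory separator."""
    p_segs = pattern.split("/")
    f_segs = rel_path.split("/")
    return len(p_segs) == len(f_segs) and all(
        fnmatch.fnmatchcase(f, p) for p, f in zip(p_segs, f_segs)
    )


def is_ignored(rel_path, patterns):
    """A file is excluded if a pattern matches it or any of its parent directories."""
    parts = rel_path.split("/")
    prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return any(matches(pat, prefix) for pat in patterns for prefix in prefixes)


def context_size(context_dir):
    """Sum the sizes (in bytes) of the files that are not excluded."""
    patterns = load_patterns(context_dir)
    total = 0
    for root, _dirs, files in os.walk(context_dir):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, context_dir).replace(os.sep, "/")
            if not is_ignored(rel, patterns):
                total += os.path.getsize(full)
    return total
```

Calling context_size(".") before and after adding an entry such as node_modules to .dockerignore gives a rough idea of how much smaller the context becomes.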

The Build Process in Detail

When Docker builds an image with the classic builder, it follows these steps (BuildKit, the default builder in current Docker releases, keeps the same layer and cache model but implements the individual steps differently):

1. Send the build context to the Docker daemon
2. Process the instructions in the Dockerfile sequentially
3. For each instruction:
- Check if there's a cached layer that can be reused
- If not, create a temporary container from the previous layer
- Execute the instruction in the temporary container
- Commit the changes as a new layer
- Remove the temporary container
4. Tag the final image with the specified name and tag
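
The cache check in the steps above can be pictured with a toy model. This is a sketch for intuition only, not Docker's actual cache key algorithm: the functions layer_keys and rebuilt_steps and the checksum strings are invented for the illustration, and COPY parsing is simplified to the first argument. The point it shows is that each layer's key depends on its parent, so once one step changes, every later step is rebuilt too.

```python
import hashlib


def layer_keys(instructions, file_digests):
    """Return one cache key per instruction.

    instructions: Dockerfile lines in order
    file_digests: maps a COPY source to a checksum of its content
    """
    keys, parent = [], ""
    for line in instructions:
        material = parent + "\n" + line
        if line.startswith("COPY "):
            source = line.split()[1]
            # Copied file content is part of the key, not just the instruction text
            material += "\n" + file_digests.get(source, "")
        parent = hashlib.sha256(material.encode()).hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(old_keys, new_keys):
    """1-based step numbers whose cache key changed between two builds."""
    pairs = zip(old_keys, new_keys)
    return [i for i, (old, new) in enumerate(pairs, start=1) if old != new]


DOCKERFILE = [
    "FROM python:3.9-slim",
    "WORKDIR /app",
    "COPY requirements.txt .",
    "RUN pip install --no-cache-dir -r requirements.txt",
    "COPY . .",
    'CMD ["python", "app.py"]',
]

before = layer_keys(DOCKERFILE, {"requirements.txt": "deps-v1", ".": "code-v1"})
after = layer_keys(DOCKERFILE, {"requirements.txt": "deps-v1", ".": "code-v2"})
print(rebuilt_steps(before, after))  # prints [5, 6]
```

Only the application code changed here, so the dependency installation in step 4 stays cached. Changing requirements.txt instead would invalidate steps 3 through 6.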

Best practice: Keep your build context as small as possible by using .dockerignore files to exclude unnecessary files and directories. This speeds up the build process and reduces the risk of accidentally including sensitive information.

Essential Build Commands and Options

Basic Build Command

The basic syntax for building a Docker image is:

docker build [OPTIONS] PATH | URL

Common options include:

-t, --tag: Name and optionally tag the image (e.g., name:tag)
-f, --file: Specify the Dockerfile to use (default is PATH/Dockerfile)
--no-cache: Do not use cache when building the image
--pull: Always attempt to pull a newer version of the base image
--build-arg: Set build-time variables (to be used with ARG in the Dockerfile)

Tagging Strategies

Tags help you identify and version your images. Common tagging strategies include:

Semantic versioning: myapp:1.0.0, myapp:1.0.1, etc.
Environment tags: myapp:development, myapp:production
Git commit/branch: myapp:git-abc123, myapp:feature-login
Date-based: myapp:20230415
Multiple tags: You can apply multiple tags to the same image

# Apply multiple tags to the same build
docker build -t myapp:1.0 -t myapp:latest .
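
When tags are computed by a script rather than typed by hand, the same command can be assembled programmatically. The following is a minimal sketch, assuming the docker CLI is on the PATH; build_command and build are hypothetical helper names, not part of any Docker tooling.

```python
import subprocess


def build_command(image, tags, context="."):
    """Return the docker build argv that applies every tag in a single build."""
    if not tags:
        raise ValueError("at least one tag is required")
    cmd = ["docker", "build"]
    for tag in tags:
        cmd += ["-t", f"{image}:{tag}"]
    cmd.append(context)
    return cmd


def build(image, tags, context="."):
    """Run the build; raises CalledProcessError if docker exits non-zero."""
    subprocess.run(build_command(image, tags, context), check=True)
```

For example, build_command("myapp", ["1.0", "latest"]) produces the same arguments as the command above. Passing a list to subprocess.run avoids shell quoting problems with unusual tag values.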

Using Build Arguments

Build arguments allow you to pass variables to the build process using ARG instructions in your Dockerfile:

# In your Dockerfile
ARG VERSION=3.9
FROM python:${VERSION}-slim

# Build with a custom version
docker build --build-arg VERSION=3.10 -t myapp .

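Note: Build arguments are only available during the build and are not set as environment variables in containers started from the image. Their values can still show up in the image history (docker history), so do not pass secrets this way. For values that need to be available at runtime, use ENV instructions instead.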

Building from Different Sources

You can build images from various sources:

Local directory: docker build -t myapp .
Git repository: docker build -t myapp https://github.com/username/repo.git
Tarball: docker build -t myapp http://server/context.tar.gz
Standard input: docker build -t myapp - < Dockerfile (Dockerfile read from stdin, with no build context)

Example 1: Building a Python Web Application

Let's build a simple Flask web application image, step by step.

Step 1: Create Project Structure

First, create a new directory for your project and set up the basic files:

mkdir flask_app
cd flask_app

Create a file named app.py with this content:

from flask import Flask, render_template
import os

app = Flask(__name__)

@app.route('/')
def hello():
    environment = os.environ.get('FLASK_ENV', 'development')
    return render_template('index.html', environment=environment)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Create a templates directory and an index.html file inside it:

mkdir templates

Add this content to templates/index.html:

<!DOCTYPE html>
<html>
<head>
    <title>Docker Flask App</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            margin: 40px;
            line-height: 1.6;
        }
        .container {
            max-width: 800px;
            margin: 0 auto;
            padding: 20px;
            border: 1px solid #ddd;
            border-radius: 5px;
        }
        h1 {
            color: #333;
        }
        .environment {
            display: inline-block;
            padding: 5px 10px;
            background-color: #f0f0f0;
            border-radius: 3px;
        }
    </style>
</head>
<body>
    <div class="container">
        <h1>Hello from Docker!</h1>
        <p>This is a Flask application running inside a Docker container.</p>
        <p>Current environment: <span class="environment">{{ environment }}</span></p>
    </div>
</body>
</html>

Create a requirements.txt file with these dependencies:

flask==2.0.1
# Flask 2.0.x does not work with Werkzeug 3, so pin a compatible release
werkzeug==2.0.3
gunicorn==20.1.0

Step 2: Create a Dockerfile

Now, create a Dockerfile in the project root:

# Use Python 3.9 slim as the base image
FROM python:3.9-slim

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    FLASK_APP=app.py

# Set working directory
WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Expose port
EXPOSE 5000

# Run the application
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]

Step 3: Create a .dockerignore File

Create a .dockerignore file to exclude unnecessary files from the build context:

.git
.gitignore
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
ENV/
.env
.venv
.idea/
.vscode/
*.swp
Dockerfile
docker-compose.yml
README.md

Step 4: Build the Image

Now, let's build the image:

docker build -t flask-app:1.0 .

This command will:

Send the build context to the Docker daemon
Process each instruction in the Dockerfile
Create a new image named flask-app with the tag 1.0

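With the classic builder, you should see output showing the progress of each step (BuildKit, the default builder in current Docker releases, prints a different, more compact format):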

Sending build context to Docker daemon  6.656kB
Step 1/8 : FROM python:3.9-slim
 ---> 8c705081f50d
Step 2/8 : ENV PYTHONDONTWRITEBYTECODE=1     PYTHONUNBUFFERED=1     FLASK_APP=app.py
 ---> Running in 6ce8e46488fd
Removing intermediate container 6ce8e46488fd
 ---> 9d53d0d5b1fd
Step 3/8 : WORKDIR /app
...
Successfully built f8b2f81f83a
Successfully tagged flask-app:1.0

Step 5: Verify the Image

Check that your image was created successfully:

docker images

You should see your new image listed:

REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
flask-app    1.0       f8b2f81f83a   2 minutes ago   128MB
python       3.9-slim  8c705081f50d   2 weeks ago     124MB

You can also inspect the image to see its metadata:

docker inspect flask-app:1.0

Step 6: Run a Container from Your Image

Now, let's run a container from the image we just built:

docker run -d -p 5000:5000 --name flask-container flask-app:1.0

This command:

-d: Runs the container in detached mode (background)
-p 5000:5000: Maps port 5000 in the container to port 5000 on the host
--name flask-container: Names the container "flask-container"
flask-app:1.0: Specifies the image to use

Step 7: Test Your Application

Open your web browser and navigate to http://localhost:5000. You should see your Flask application running with the "Hello from Docker!" message.
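
The same check can be scripted, which is handy because Gunicorn may need a moment before it accepts connections. Below is a small sketch using only the standard library; wait_for_http is a hypothetical helper name, and the timeout values are arbitrary choices.

```python
import time
import urllib.request


def wait_for_http(url, timeout=30.0, interval=0.5):
    """Poll url until it answers with HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # Covers refused connections, timeouts, and HTTP error responses
            pass
        time.sleep(interval)
    return False
```

After docker run, calling wait_for_http("http://localhost:5000/") returns True once the application responds and False if it never does within the timeout.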

Step 8: Check Container Logs

To see the logs from your container:

docker logs flask-container

You should see output from Gunicorn showing that your application is running.

Step 9: Stop and Remove the Container

When you're done, stop and remove the container:

docker stop flask-container
docker rm flask-container

Real-world application: This pattern of building a custom image for a web application is extremely common in professional development. Teams typically create Dockerfiles for their applications, build images during CI/CD processes, and deploy those images to staging and production environments. The image contains everything the application needs to run, ensuring consistency across environments.

Example 2: Multi-Stage Build for a React Application

Let's create a more complex example using multi-stage builds to create an optimized image for a React frontend application.

Step 1: Create Project Structure

For this example, let's assume we have a basic React application created with Create React App. We'll focus on the Dockerfile and building process:

mkdir react_app
cd react_app

For brevity, we won't create an entire React app structure here. In a real scenario, you'd have the standard React application files.

Step 2: Create a Multi-Stage Dockerfile

Create a Dockerfile with multi-stage build:

# Stage 1: Build the React application
FROM node:16-alpine AS build

# Set working directory
WORKDIR /app

# Copy package.json and package-lock.json
COPY package*.json ./

# Install dependencies
RUN npm ci

# Copy application code
COPY . .

# Build the application
RUN npm run build

# Stage 2: Serve the built application with NGINX
FROM nginx:alpine

# Copy NGINX configuration
COPY nginx.conf /etc/nginx/conf.d/default.conf

# Copy built files from the build stage
COPY --from=build /app/build /usr/share/nginx/html

# Expose port
EXPOSE 80

# Start NGINX
CMD ["nginx", "-g", "daemon off;"]

Step 3: Create NGINX Configuration

Create a file named nginx.conf:

server {
    listen 80;
    server_name localhost;

    root /usr/share/nginx/html;
    index index.html;

    # Serve static files
    location / {
        try_files $uri $uri/ /index.html;
    }

    # Cache static assets
    location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
        expires 30d;
        add_header Cache-Control "public, no-transform";
    }
}

Step 4: Create .dockerignore

Create a .dockerignore file:

node_modules
build
.git
.gitignore
README.md
Dockerfile
.dockerignore

Step 5: Build the Image

Build the multi-stage image:

docker build -t react-app:1.0 .

During the build process, you'll see Docker executing both stages:

First building the React application with Node.js
Then creating a smaller final image with just NGINX and the built files

Step 6: Run a Container from the Image

Now run a container from your optimized image:

docker run -d -p 80:80 --name react-container react-app:1.0

Visit http://localhost in your browser to see the React application running.

Benefits of this multi-stage approach:

Smaller final image: The final image only contains NGINX and the built files, not Node.js or development dependencies
Improved security: Fewer components means less attack surface
Better performance: NGINX is optimized for serving static files
Separation of concerns: Build environment and runtime environment are separate

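The final image might be around 25MB, whereas a single-stage image that also carries Node.js and node_modules can easily reach several hundred megabytes, or more than 1GB on the full Debian-based node image.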

Real-world application: This multi-stage build pattern is widely used for frontend applications. In production, it keeps images small and reduces the attack surface while letting NGINX serve the static files efficiently. Many teams extend it with CDN integration and automated deployments.

Example 3: Building a Microservices Application

Let's look at a more complex example where we build a simple microservices application with a backend API and a database.

Project Structure

For this example, we'll create a simplified structure with an API service and a database service:

mkdir microservices_demo
cd microservices_demo
mkdir api

Step 1: Create the API Service

In the api directory, create the following files:

api/app.py:

from flask import Flask, jsonify
import os
import psycopg2

app = Flask(__name__)

def get_db_connection():
    conn = psycopg2.connect(
        host=os.environ.get('DB_HOST', 'db'),
        database=os.environ.get('DB_NAME', 'postgres'),
        user=os.environ.get('DB_USER', 'postgres'),
        password=os.environ.get('DB_PASSWORD', 'postgres')
    )
    return conn

@app.route('/api/health')
def health():
    return jsonify({"status": "healthy"})

@app.route('/api/items')
def get_items():
    conn = get_db_connection()
    cur = conn.cursor()
    cur.execute('SELECT * FROM items;')
    items = [{"id": row[0], "name": row[1]} for row in cur.fetchall()]
    cur.close()
    conn.close()
    return jsonify(items)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

api/requirements.txt:

flask==2.0.1
# Flask 2.0.x does not work with Werkzeug 3, so pin a compatible release
werkzeug==2.0.3
psycopg2-binary==2.9.1
gunicorn==20.1.0

api/Dockerfile:

FROM python:3.9-slim

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 5000

CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]

Step 2: Create Docker Compose File

In the root directory, create a docker-compose.yml file to orchestrate the services:

version: '3'

services:
  api:
    build: ./api
    ports:
      - "5000:5000"
    environment:
      - DB_HOST=db
      - DB_NAME=postgres
      - DB_USER=postgres
      - DB_PASSWORD=postgres
    depends_on:
      - db

  db:
    image: postgres:13-alpine
    environment:
      - POSTGRES_PASSWORD=postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql

volumes:
  postgres_data:

Step 3: Create Database Initialization Script

Create an init.sql file in the root directory to initialize the database:

CREATE TABLE IF NOT EXISTS items (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

INSERT INTO items (name) VALUES ('Item 1');
INSERT INTO items (name) VALUES ('Item 2');
INSERT INTO items (name) VALUES ('Item 3');

Step 4: Build and Run with Docker Compose

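Now, build and run the entire application with Docker Compose (with Compose v2, which ships as a Docker CLI plugin, write docker compose instead of docker-compose):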

docker-compose up --build

This command will:

Build the API service image using its Dockerfile
Pull the Postgres image from Docker Hub
Create the defined volumes and networks
Start the services in dependency order (depends_on starts db before api, but does not wait for Postgres to be ready to accept connections)

Step 5: Test the Application

Open your browser or use curl to test the API endpoints:

curl http://localhost:5000/api/health
curl http://localhost:5000/api/items

You should see the health status and the list of items from the database.
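
If the items request fails right after startup, Postgres may still be initializing: depends_on only orders container start-up and does not wait for the database to accept connections. One common mitigation is to retry the connection in the application. The sketch below is generic and assumes nothing about psycopg2 beyond "the connect call raises while the database is down"; connect_with_retry and its parameters are hypothetical names.

```python
import time


def connect_with_retry(connect, attempts=10, delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts are used up."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except Exception as exc:  # e.g. psycopg2.OperationalError in the API service
            last_error = exc
            if attempt < attempts:
                sleep(delay)  # give the database time to finish starting
    raise RuntimeError(
        f"database not reachable after {attempts} attempts"
    ) from last_error
```

In api/app.py this could wrap the existing function, for example conn = connect_with_retry(get_db_connection). The sleep parameter exists only so the delay can be replaced in tests.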

Step 6: Stop the Application

To stop the application and clean up:

docker-compose down

To remove the volumes as well:

docker-compose down -v

Best practice: In a microservices architecture, it's common to have a separate Dockerfile for each service, allowing independent development and deployment. Docker Compose is a great tool for local development, while Kubernetes or other orchestration systems are typically used in production.

Image Optimization Strategies

Let's explore strategies to optimize your Docker images for different requirements.

Minimizing Image Size

Smaller images have several advantages:

Faster downloads and deployments
Reduced storage costs
Smaller attack surface
Improved container startup time

Strategies to minimize image size:

Use slim or alpine base images:

# Standard Python image: ~900MB
FROM python:3.9

# Slim variant: ~125MB
FROM python:3.9-slim

# Alpine variant: ~45MB
FROM python:3.9-alpine

Multi-stage builds to separate build and runtime environments

Clean up in the same layer you create files:

RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc && \
    pip install mypackage && \
    apt-get purge -y --auto-remove gcc && \
    rm -rf /var/lib/apt/lists/*

Use .dockerignore to exclude unnecessary files
Minimize the number of layers by combining related operations

Example: Optimizing a Python application image

# Before optimization
FROM python:3.9
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

# After optimization
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]

The optimized version:

Uses the slim variant of Python
Copies only the requirements file first to leverage caching
Uses --no-cache-dir to avoid storing package cache

Alpine vs. Debian-based Images

Alpine-based images are much smaller but come with tradeoffs:

Aspect	Alpine-based	Debian-based (slim)
Size	Very small (~5-50MB base)	Small-medium (~100-200MB base)
C libraries	musl libc	glibc
Package manager	apk	apt
Binary compatibility	Sometimes problematic	Generally good
Build complexity	May require build dependencies	Usually simpler

Best practice: For Python applications, python:3.x-slim is often a good default choice. It provides a good balance of size and compatibility. Use Alpine if size is critical and you've tested thoroughly with your dependencies.

Build Speed Optimization

To optimize build speed and leverage caching effectively:

Order instructions from least to most frequently changed
Separate dependency installation from code copying

Use BuildKit for improved performance (it is the default builder since Docker Engine 23.0; on older versions, enable it explicitly):

DOCKER_BUILDKIT=1 docker build -t myapp .

For large projects, consider reusing the layers of a previously built image as a cache source:

# In docker-compose.yml
services:
  app:
    build:
      context: .
      cache_from:
        - myapp:latest

Running and Managing Images

Once you've built your images, it's important to understand how to effectively run and manage them.

Run Options in Detail

The docker run command has many options to customize container behavior:

docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

Common options include:

-d, --detach: Run in background
-p, --publish: Map ports (host:container)
-v, --volume: Mount volumes
-e, --env: Set environment variables
--name: Assign a name to the container
--rm: Remove container when it exits
--network: Connect to a network
--restart: Restart policy (e.g., always, on-failure)
--memory, --cpus: Resource constraints

Example: Running a container with various options

docker run \
  -d \
  --name app \
  -p 8080:5000 \
  -v data:/app/data \
  -e DEBUG=true \
  -e API_KEY=secret \
  --restart always \
  --memory 512m \
  --cpus 0.5 \
  myapp:latest

This command:

Runs the container in detached mode
Names it "app"
Maps host port 8080 to container port 5000
Mounts a volume named "data" to /app/data
Sets environment variables
Configures automatic restart
Limits memory to 512MB and CPU to half a core

Environment Variables and Configuration

Environment variables are a key mechanism for configuring containerized applications:

# Setting variables in the Dockerfile
ENV API_URL=https://api.example.com
ENV DEBUG=false

# Overriding at runtime
docker run -e DEBUG=true -e API_URL=https://staging-api.example.com myapp

For sensitive configuration, consider:

Docker secrets (in Docker Swarm)

Environment variable files:

# env-file.txt
API_KEY=secret_key
DATABASE_URL=postgres://user:pass@db:5432/dbname

# Using the file
docker run --env-file env-file.txt myapp

Mounting configuration files as volumes

Best practice: Never hardcode sensitive information in your Dockerfile. Use environment variables, secrets management, or mounted configuration files to provide sensitive values at runtime.
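
On the application side, these values arrive as ordinary environment variables. The following sketch shows one way to read them with defaults and type conversion. The variable names DEBUG, API_URL, and API_KEY come from the examples above; the helper names env_bool and load_config and the accepted truthy strings are assumptions made for this illustration.

```python
import os


def env_bool(name, default=False, environ=os.environ):
    """Interpret common truthy strings; environment values are always text."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_config(environ=os.environ):
    """Collect runtime settings in one place instead of scattering lookups."""
    return {
        "api_url": environ.get("API_URL", "https://api.example.com"),
        "debug": env_bool("DEBUG", False, environ),
        # No default on purpose: the secret must be injected at runtime
        "api_key": environ.get("API_KEY"),
    }
```

Running the container with -e DEBUG=true or --env-file env-file.txt changes what load_config() returns without rebuilding the image.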

Container Lifecycle Management

Understanding container lifecycle commands:

docker start/stop/restart: Control running state
docker pause/unpause: Temporarily freeze container
docker kill: Send SIGKILL to force stop
docker rm: Remove a container
docker logs: View container logs
docker exec: Run commands in a running container
docker inspect: View container details
docker stats: Monitor resource usage

Common management patterns:

# Start a stopped container
docker start container_name

# View logs with follow (-f)
docker logs -f container_name

# Execute a command in a running container (use sh if the image has no bash, e.g. Alpine-based images)
docker exec -it container_name bash

# Remove all stopped containers
docker container prune

# Get detailed information about a container
docker inspect container_name

# View resource usage
docker stats

Data Management with Volumes

For data that needs to persist beyond container lifecycles, use volumes:

# Create a named volume
docker volume create mydata

# Run a container with the volume
docker run -v mydata:/app/data myapp

# Use bind mounts for development (quote the path in case it contains spaces)
docker run -v "$(pwd)":/app myapp

# Inspect volume
docker volume inspect mydata

# Remove volume
docker volume rm mydata

# Remove all unused volumes
docker volume prune

Best practice: Use named volumes for production data and bind mounts for development. Regularly back up important volumes, and include volume cleanup in your maintenance procedures.

Debugging and Troubleshooting

Even with well-designed images, issues can arise. Let's explore debugging strategies.

Build-time Debugging

When your build fails, try these approaches:

Examine build output carefully for error messages

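Debug from the last layer that built successfully (a failed build does not produce a tagged image):

# Classic builder: each completed step prints an image ID (e.g. ---> 9d53d0d5b1fd).
# Start a shell from the last ID printed before the failure
docker run -it --rm <image_id> bash
# BuildKit does not print intermediate IDs. In a multi-stage Dockerfile,
# build only up to a named stage that succeeds and open a shell there
docker build --target <stage_name> -t debug-image .
docker run -it --rm debug-image bash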

Add diagnostic commands to your Dockerfile:
```
RUN ls -la /app
RUN pip list
RUN env
```
Try a different base image if facing compatibility issues

Runtime Debugging

To troubleshoot running containers:

Check container status: docker ps -a
View logs: docker logs container_name

Execute commands inside the container:

docker exec -it container_name bash
# Inside the container (minimal images may lack ps or netstat; install them if needed)
ps aux
cat /var/log/app.log
netstat -tulpn

Inspect container configuration: docker inspect container_name
Check resource usage: docker stats container_name

Common Issues and Solutions

Issue	Possible Causes	Solutions
Container exits immediately	No foreground process, command error	Check CMD/ENTRYPOINT; run with -it and a shell to debug; check application logs
Port binding fails	Port already in use, permissions	Check if another container/process is using the port; try a different host port
Volume mount issues	Path permissions, path doesn't exist	Check file permissions; verify paths on host and container; use absolute paths
Network connectivity issues	DNS issues, network isolation	Check with ping/curl inside container; verify DNS configuration; check network settings
Resource constraints	Out of memory, CPU throttling	Monitor with docker stats; increase resource limits; optimize application

Best practice: Build good observability into your containers with proper logging and health checks. This makes troubleshooting much easier when problems arise.
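
For logging, the main container-specific point is that docker logs shows what the main process writes to stdout and stderr, so the application should log there rather than to a file inside the container. A minimal sketch follows; the LOG_LEVEL variable and the configure_logging helper are assumptions for this example, not something the earlier applications define.

```python
import logging
import os
import sys

LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def configure_logging(stream=None, environ=os.environ):
    """Send all log records to stdout so that docker logs can show them."""
    level = LEVELS.get(environ.get("LOG_LEVEL", "INFO").upper(), logging.INFO)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    root = logging.getLogger()
    root.handlers[:] = [handler]  # replace any handlers configured earlier
    root.setLevel(level)
    return root
```

Calling configure_logging() once at startup and running the container with -e LOG_LEVEL=debug makes verbose output available through docker logs -f without changing the image.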

Moving to Production

When moving from development to production, consider these additional aspects:

Image Tagging and Versioning

Implement a consistent tagging strategy:

Semantic versioning: myapp:1.2.3
Git-based: myapp:git-abcdef1 (commit hash)
Build identifiers: myapp:build-123
Environment tags: myapp:production

# Example tagging script in CI
VERSION=$(cat VERSION)
GIT_HASH=$(git rev-parse --short HEAD)
BUILD_ID=${CI_BUILD_NUMBER}

docker build -t myapp:${VERSION} \
  -t myapp:${VERSION}-${GIT_HASH} \
  -t myapp:build-${BUILD_ID} \
  -t myapp:latest .

Security Considerations

Scan images for vulnerabilities (the older docker scan command has been retired in favor of Docker Scout):
```
docker scout cves myapp:1.0
```
Use non-root users in containers (the user must exist in the image before USER switches to it):
```
RUN useradd --create-home app
USER app
```
Apply the principle of least privilege
Keep base images updated with security patches
Use read-only filesystems where possible:
```
docker run --read-only myapp
```

CI/CD Integration

Automate image building and testing in your CI/CD pipeline:

Build images on each commit
Run automated tests against the images
Scan for vulnerabilities
Push to a registry
Deploy to staging/production

Example GitHub Actions workflow:

name: Build and Push Docker Image

on:
  push:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Build Image
        run: docker build -t myapp:${{ github.sha }} .
      
      - name: Test Image
        run: |
          docker run --rm myapp:${{ github.sha }} pytest
      
      - name: Login to Registry
        run: echo "${{ secrets.DOCKER_PASSWORD }}" | docker login -u "${{ secrets.DOCKER_USERNAME }}" --password-stdin
      
      - name: Push Image
        run: |
          docker tag myapp:${{ github.sha }} username/myapp:latest
          docker tag myapp:${{ github.sha }} username/myapp:${{ github.sha }}
          docker push username/myapp:latest
          docker push username/myapp:${{ github.sha }}

Container Orchestration

For production, consider using orchestration tools like:

Kubernetes: For complex, scalable deployments
Docker Swarm: Simpler alternative built into Docker
Amazon ECS/EKS: AWS container services
Google Kubernetes Engine: Google Cloud's Kubernetes service
Azure Container Instances/AKS: Microsoft Azure's container services

Note: Orchestration tools handle deployment, scaling, networking, load balancing, and self-healing for containerized applications.

Practical Exercises

Exercise 1: Flask Application with Environment Configuration

Extend the Flask application from Example 1 to support different environments:

Modify the Dockerfile to accept a build argument for the environment
Use a runtime environment variable to control debug mode
Build the image for both development and production
Run containers from both images and observe the differences

Modified Dockerfile:

FROM python:3.9-slim

ARG ENVIRONMENT=development
ENV FLASK_ENV=${ENVIRONMENT} \
    PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    FLASK_APP=app.py

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 5000

CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]

Building for different environments:

docker build -t flask-app:dev --build-arg ENVIRONMENT=development .
docker build -t flask-app:prod --build-arg ENVIRONMENT=production .

Running the containers:

docker run -d -p 5001:5000 --name flask-dev flask-app:dev
docker run -d -p 5002:5000 --name flask-prod flask-app:prod

Exercise 2: Multi-Stage Build for a Python Application

Create a multi-stage Dockerfile for a Python application:

First stage: Build and test the application
Second stage: Create a minimal runtime image
Ensure only the necessary files are included in the final image

Multi-stage Dockerfile for Python:

# Stage 1: Build and test
FROM python:3.9 AS builder

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Run the tests (assumes pytest is listed in requirements.txt)
RUN pytest

# Stage 2: Runtime
FROM python:3.9-slim

WORKDIR /app

COPY --from=builder /app/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY --from=builder /app/*.py /app/
COPY --from=builder /app/templates /app/templates
# Drop the next line if the project has no static/ directory, or the build fails
COPY --from=builder /app/static /app/static

EXPOSE 5000

CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]

Exercise 3: Optimizing Image Size

Take an existing Dockerfile and optimize it for size:

Start with a basic Dockerfile that installs dependencies
Optimize it using the techniques we've discussed
Compare the size before and after optimization

Before optimization:

FROM python:3.9

WORKDIR /app

COPY . .

RUN apt-get update
RUN apt-get install -y gcc
RUN pip install -r requirements.txt
RUN apt-get install -y curl

CMD ["python", "app.py"]

After optimization:

FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .

RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc curl && \
    pip install --no-cache-dir -r requirements.txt && \
    apt-get purge -y --auto-remove gcc && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

COPY app.py .

CMD ["python", "app.py"]

The optimized version:

Uses a slim base image
Combines apt-get commands to reduce layers
Uses --no-install-recommends to reduce dependencies
Cleans up build dependencies and apt cache
Copies only the necessary files

Key Takeaways

The Docker build context is all files in the specified path, which can be optimized with .dockerignore
Docker builds images layer by layer, caching when possible for efficiency
Optimizing image size improves download speed, security, and resource usage
Multi-stage builds are powerful for creating efficient, production-ready images
Different base images (full, slim, alpine) offer tradeoffs between size and compatibility
Container runtime configuration can be controlled with environment variables and runtime options
Volumes provide persistent storage that survives container lifecycle
Debugging tools include logs, exec, inspect, and stats commands
Production deployments require additional consideration for security, tagging, and orchestration

With these skills, you're now equipped to build, optimize, and run Docker images for a wide range of applications!

Looking Ahead

In our next session, we'll explore Docker Compose in more depth, learning how to orchestrate multi-container applications for development and testing. We'll see how to define complex application stacks with networking, volumes, and environment configuration.

Discussion Questions

How would you approach containerizing an existing application? What factors would you consider first?
When would you choose to use multi-stage builds, and how might they benefit different types of applications?
What strategies would you use to minimize image size for a Python web application while ensuring good performance?
How might container security considerations differ between development and production environments?
What advantages might a microservices approach with multiple containers have over a monolithic container? When might it not be appropriate?

Additional Resources

Docker Build Reference - Complete documentation for the build command
Multi-Stage Builds - Detailed guide to multi-stage builds
Docker Run Reference - All run command options
Resource Constraints - How to manage container resources
Docker Image Security Best Practices - Guide to securing your Docker images