Building and Running Your Own Image

Week 1, Wednesday - Afternoon Session

Lecture Overview

Now that we understand the basics of Dockerfiles, let's explore the complete process of building and running your own custom Docker images. In this session, we'll cover the entire workflow from creating a Dockerfile to running and debugging containers based on your custom images. We'll build several practical examples and explore strategies for optimizing your Docker images for different use cases.

The Image Building Process

Before diving into specific examples, let's understand how Docker builds images from Dockerfiles.

Understanding Docker Build Context

When you run docker build, Docker sends the entire "build context" to the Docker daemon. The build context includes all files and directories in the path or URL specified at the end of the build command.

docker build -t myimage:tag .

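In this example, the . represents the current directory, which becomes the build context. This is important to understand because COPY and ADD can only reference files inside the context, a large context slows down every build, and anything left in the context can end up in the image by accident.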

Analogy: The build context is like packing a suitcase for a trip. You gather all the items you might need (files in your directory), but you can also decide to leave certain things behind (.dockerignore files). Once packed, you hand your suitcase to the airline (Docker daemon). If your suitcase is too large or contains prohibited items, you'll have problems – just like with Docker builds.

The Build Process in Detail

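When Docker builds an image with the classic builder, it follows these steps (BuildKit, the default builder in current Docker releases, applies the same instruction-by-instruction caching but prints different output and can run independent stages in parallel):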

  1. Send the build context to the Docker daemon
  2. Process instructions in the Dockerfile sequentially
  3. For each instruction:
    • Check if there's a cached layer that can be reused
    • If not, create a temporary container from the previous layer
    • Execute the instruction in the temporary container
    • Commit the changes as a new layer
    • Remove the temporary container
  4. Tag the final image with the specified name and tag

Best practice: Keep your build context as small as possible by using .dockerignore files to exclude unnecessary files and directories. This speeds up the build process and reduces the risk of accidentally including sensitive information.
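
To get a feel for how much data a build could send, a rough size check can help. The sketch below sums file sizes under a directory. The function name context_size is hypothetical, and it deliberately does not apply .dockerignore rules, so the result is an upper bound rather than the exact context size.

```python
import os
import tempfile


def context_size(path):
    """Return the total size in bytes of all regular files under path.

    Upper bound on the build context: .dockerignore rules are NOT applied.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so that link targets are not counted
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


# Small demonstration on a throwaway directory
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "app.py"), "wb") as f:
        f.write(b"print('hi')\n")     # 12 bytes
    os.mkdir(os.path.join(demo_dir, "data"))
    with open(os.path.join(demo_dir, "data", "big.bin"), "wb") as f:
        f.write(b"\0" * 1000)         # 1000 bytes
    demo_total = context_size(demo_dir)

print(demo_total)  # 1012
```

Running context_size(".") in a project root before and after adding a .dockerignore entry (and deleting or moving the excluded files) gives a quick sense of what dominates the context.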

Essential Build Commands and Options

Basic Build Command

The basic syntax for building a Docker image is:

docker build [OPTIONS] PATH | URL

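Common options include:

  • -t, --tag: Name and optionally tag the image (name:tag)
  • -f, --file: Path to the Dockerfile (defaults to PATH/Dockerfile)
  • --build-arg: Set a build-time variable declared with ARG
  • --no-cache: Build without reusing cached layers
  • --target: Stop at a named stage of a multi-stage build
  • --pull: Always try to pull a newer version of the base image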

Tagging Strategies

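Tags help you identify and version your images. Common tagging strategies include version numbers (1.0), a moving latest tag, and tags derived from the Git commit or CI build number: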

# Apply multiple tags to the same build
docker build -t myapp:1.0 -t myapp:latest .
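
One common convention is to publish a release under progressively broader version tags, so that consumers can pin as tightly as they like. The following sketch shows that expansion; the helper names version_tags and build_command are hypothetical, and the MAJOR.MINOR.PATCH scheme is an assumption rather than something Docker enforces.

```python
def version_tags(image, version, include_latest=True):
    """Expand a MAJOR.MINOR.PATCH version into progressively broader tags."""
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    tags = [
        f"{image}:{'.'.join(parts[:n])}"  # 1.2.3, then 1.2, then 1
        for n in (3, 2, 1)
    ]
    if include_latest:
        tags.append(f"{image}:latest")
    return tags


def build_command(image, version, context="."):
    """Return a docker build argument list with one -t flag per tag."""
    cmd = ["docker", "build"]
    for tag in version_tags(image, version):
        cmd += ["-t", tag]
    cmd.append(context)
    return cmd


print(" ".join(build_command("myapp", "1.2.3")))
# docker build -t myapp:1.2.3 -t myapp:1.2 -t myapp:1 -t myapp:latest .
```

The returned list can be passed to subprocess.run in a build script; nothing here calls Docker by itself.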

Using Build Arguments

Build arguments allow you to pass variables to the build process using ARG instructions in your Dockerfile:

# In your Dockerfile
ARG VERSION=3.9
FROM python:${VERSION}-slim

# Build with a custom version
docker build --build-arg VERSION=3.10 -t myapp .

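Note: Build arguments are only available during the build and are not set as environment variables in containers started from the image. They can still show up in the image history (docker history), so do not pass secrets through them. For values that need to be available at runtime, use ENV instructions instead.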

Building from Different Sources

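You can build images from various sources: a local directory (PATH), a Git repository URL, or a Dockerfile or tar archive piped through standard input (for example, docker build - < Dockerfile).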

Example 1: Building a Python Web Application

Let's build a simple Flask web application image, step by step.

Step 1: Create Project Structure

First, create a new directory for your project and set up the basic files:

mkdir flask_app
cd flask_app

Create a file named app.py with this content:

from flask import Flask, render_template
import os

app = Flask(__name__)

@app.route('/')
def hello():
    environment = os.environ.get('FLASK_ENV', 'development')
    return render_template('index.html', environment=environment)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Create a templates directory and an index.html file inside it:

mkdir templates

Add this content to templates/index.html:

<!DOCTYPE html>
<html>
<head>
    <title>Docker Flask App</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            margin: 40px;
            line-height: 1.6;
        }
        .container {
            max-width: 800px;
            margin: 0 auto;
            padding: 20px;
            border: 1px solid #ddd;
            border-radius: 5px;
        }
        h1 {
            color: #333;
        }
        .environment {
            display: inline-block;
            padding: 5px 10px;
            background-color: #f0f0f0;
            border-radius: 3px;
        }
    </style>
</head>
<body>
    <div class="container">
        <h1>Hello from Docker!</h1>
        <p>This is a Flask application running inside a Docker container.</p>
        <p>Current environment: <span class="environment">{{ environment }}</span></p>
    </div>
</body>
</html>

Create a requirements.txt file with these dependencies:

flask==2.0.1
gunicorn==20.1.0

Step 2: Create a Dockerfile

Now, create a Dockerfile in the project root:

# Use Python 3.9 slim as the base image
FROM python:3.9-slim

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    FLASK_APP=app.py

# Set working directory
WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Expose port
EXPOSE 5000

# Run the application
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]

Step 3: Create a .dockerignore File

Create a .dockerignore file to exclude unnecessary files from the build context:

.git
.gitignore
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
ENV/
.env
.venv
.idea/
.vscode/
*.swp
Dockerfile
docker-compose.yml
README.md

Step 4: Build the Image

Now, let's build the image:

docker build -t flask-app:1.0 .

This command will:

  1. Send the build context to the Docker daemon
  2. Process each instruction in the Dockerfile
  3. Create a new image named flask-app with the tag 1.0

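You should see output showing the progress of each step. The exact format depends on the builder; the sample below is from the classic builder, and BuildKit prints a different layout: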

Sending build context to Docker daemon  6.656kB
Step 1/8 : FROM python:3.9-slim
 ---> 8c705081f50d
Step 2/8 : ENV PYTHONDONTWRITEBYTECODE=1     PYTHONUNBUFFERED=1     FLASK_APP=app.py
 ---> Running in 6ce8e46488fd
Removing intermediate container 6ce8e46488fd
 ---> 9d53d0d5b1fd
Step 3/8 : WORKDIR /app
...
Successfully built f8b2f81f83a
Successfully tagged flask-app:1.0

Step 5: Verify the Image

Check that your image was created successfully:

docker images

You should see your new image listed:

REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
flask-app    1.0       f8b2f81f83a   2 minutes ago   128MB
python       3.9-slim  8c705081f50d   2 weeks ago     124MB

You can also inspect the image to see its metadata:

docker inspect flask-app:1.0

Step 6: Run a Container from Your Image

Now, let's run a container from the image we just built:

docker run -d -p 5000:5000 --name flask-container flask-app:1.0

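This command:

  • Runs the container in detached mode (-d)
  • Maps host port 5000 to container port 5000 (-p 5000:5000)
  • Names the container flask-container (--name)
  • Starts it from the flask-app:1.0 image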

Step 7: Test Your Application

Open your web browser and navigate to http://localhost:5000. You should see your Flask application running with the "Hello from Docker!" message.

Step 8: Check Container Logs

To see the logs from your container:

docker logs flask-container

You should see output from Gunicorn showing that your application is running.

Step 9: Stop and Remove the Container

When you're done, stop and remove the container:

docker stop flask-container
docker rm flask-container

Real-world application: This pattern of building a custom image for a web application is extremely common in professional development. Teams typically create Dockerfiles for their applications, build images during CI/CD processes, and deploy those images to staging and production environments. The image contains everything the application needs to run, ensuring consistency across environments.

Example 2: Multi-Stage Build for a React Application

Let's create a more complex example using multi-stage builds to create an optimized image for a React frontend application.

Step 1: Create Project Structure

For this example, let's assume we have a basic React application created with Create React App. We'll focus on the Dockerfile and building process:

mkdir react_app
cd react_app

For brevity, we won't create an entire React app structure here. In a real scenario, you'd have the standard React application files.

Step 2: Create a Multi-Stage Dockerfile

Create a Dockerfile with multi-stage build:

# Stage 1: Build the React application
FROM node:16-alpine AS build

# Set working directory
WORKDIR /app

# Copy package.json and package-lock.json
COPY package*.json ./

# Install dependencies
RUN npm ci

# Copy application code
COPY . .

# Build the application
RUN npm run build

# Stage 2: Serve the built application with NGINX
FROM nginx:alpine

# Copy NGINX configuration
COPY nginx.conf /etc/nginx/conf.d/default.conf

# Copy built files from the build stage
COPY --from=build /app/build /usr/share/nginx/html

# Expose port
EXPOSE 80

# Start NGINX
CMD ["nginx", "-g", "daemon off;"]

Step 3: Create NGINX Configuration

Create a file named nginx.conf:

server {
    listen 80;
    server_name localhost;

    root /usr/share/nginx/html;
    index index.html;

    # Serve static files
    location / {
        try_files $uri $uri/ /index.html;
    }

    # Cache static assets
    location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
        expires 30d;
        add_header Cache-Control "public, no-transform";
    }
}

Step 4: Create .dockerignore

Create a .dockerignore file:

node_modules
build
.git
.gitignore
README.md
Dockerfile
.dockerignore

Step 5: Build the Image

Build the multi-stage image:

docker build -t react-app:1.0 .

During the build process, you'll see Docker executing both stages:

  1. First building the React application with Node.js
  2. Then creating a smaller final image with just NGINX and the built files

Step 6: Run a Container from the Image

Now run a container from your optimized image:

docker run -d -p 80:80 --name react-container react-app:1.0

Visit http://localhost in your browser to see the React application running.

Benefits of this multi-stage approach:

  • Smaller final image: The final image only contains NGINX and the built files, not Node.js or development dependencies
  • Improved security: Fewer components means less attack surface
  • Better performance: NGINX is optimized for serving static files
  • Separation of concerns: Build environment and runtime environment are separate

The final image is typically a few tens of megabytes, whereas a single-stage image that keeps Node.js and node_modules is often several hundred megabytes and can exceed 1GB.

Real-world application: This multi-stage build pattern is widely used for building frontend applications. In production environments, this approach keeps images small and secure while optimizing performance. Many companies further enhance this pattern with CDN integration and automated deployments.

Example 3: Building a Microservices Application

Let's look at a more complex example where we build a simple microservices application with a backend API and a database.

Project Structure

For this example, we'll create a simplified structure with an API service and a database service:

mkdir microservices_demo
cd microservices_demo
mkdir api

Step 1: Create the API Service

In the api directory, create the following files:

api/app.py:

from flask import Flask, jsonify
import os
import psycopg2

app = Flask(__name__)

def get_db_connection():
    conn = psycopg2.connect(
        host=os.environ.get('DB_HOST', 'db'),
        database=os.environ.get('DB_NAME', 'postgres'),
        user=os.environ.get('DB_USER', 'postgres'),
        password=os.environ.get('DB_PASSWORD', 'postgres')
    )
    return conn

@app.route('/api/health')
def health():
    return jsonify({"status": "healthy"})

@app.route('/api/items')
def get_items():
    conn = get_db_connection()
    cur = conn.cursor()
    cur.execute('SELECT * FROM items;')
    items = [{"id": row[0], "name": row[1]} for row in cur.fetchall()]
    cur.close()
    conn.close()
    return jsonify(items)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

api/requirements.txt:

flask==2.0.1
psycopg2-binary==2.9.1
gunicorn==20.1.0

api/Dockerfile:

FROM python:3.9-slim

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 5000

CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]

Step 2: Create Docker Compose File

In the root directory, create a docker-compose.yml file to orchestrate the services:

version: '3'

services:
  api:
    build: ./api
    ports:
      - "5000:5000"
    environment:
      - DB_HOST=db
      - DB_NAME=postgres
      - DB_USER=postgres
      - DB_PASSWORD=postgres
    depends_on:
      - db

  db:
    image: postgres:13-alpine
    environment:
      - POSTGRES_PASSWORD=postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql

volumes:
  postgres_data:

Step 3: Create Database Initialization Script

Create an init.sql file in the root directory to initialize the database:

CREATE TABLE IF NOT EXISTS items (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

INSERT INTO items (name) VALUES ('Item 1');
INSERT INTO items (name) VALUES ('Item 2');
INSERT INTO items (name) VALUES ('Item 3');

Step 4: Build and Run with Docker Compose

Now, build and run the entire application with Docker Compose:

docker-compose up --build

This command will:

  1. Build the API service image using its Dockerfile
  2. Pull the Postgres image from Docker Hub
  3. Create the defined volumes and networks
  4. Start the services in dependency order (db first, then api)
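
One caveat: depends_on in this form only controls start order. It does not wait until Postgres is ready to accept connections, so the first request to /api/items can fail while the database is still initializing. A minimal sketch of a readiness wait the API could run at startup, assuming a hypothetical helper named wait_for_port (it checks only that the TCP port accepts connections, not that the database has finished running init.sql):

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, else False.

    Retries every `interval` seconds until `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, host not resolvable yet, or timed out
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

In the API service this could be called as wait_for_port(os.environ.get('DB_HOST', 'db'), 5432) before the first query, 5432 being the default Postgres port.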

Step 5: Test the Application

Open your browser or use curl to test the API endpoints:

curl http://localhost:5000/api/health
curl http://localhost:5000/api/items

You should see the health status and the list of items from the database.

Step 6: Stop the Application

To stop the application and clean up:

docker-compose down

To remove the volumes as well:

docker-compose down -v

Best practice: In a microservices architecture, it's common to have a separate Dockerfile for each service, allowing independent development and deployment. Docker Compose is a great tool for local development, while Kubernetes or other orchestration systems are typically used in production.

Image Optimization Strategies

Let's explore strategies to optimize your Docker images for different requirements.

Minimizing Image Size

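Smaller images have several advantages: they are faster to push, pull, and deploy, they use less registry and disk space, and they expose a smaller attack surface.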

Strategies to minimize image size:

  1. Use slim or alpine base images:
    # Standard Python image: ~900MB
    FROM python:3.9
    
    # Slim variant: ~125MB
    FROM python:3.9-slim
    
    # Alpine variant: ~45MB
    FROM python:3.9-alpine
  2. Multi-stage builds to separate build and runtime environments
  3. Clean up in the same layer you create files:
    RUN apt-get update && \
        apt-get install -y --no-install-recommends gcc && \
        pip install mypackage && \
        apt-get purge -y --auto-remove gcc && \
        rm -rf /var/lib/apt/lists/*
  4. Use .dockerignore to exclude unnecessary files
  5. Minimize the number of layers by combining related operations

Example: Optimizing a Python application image

# Before optimization
FROM python:3.9
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

# After optimization
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]

The optimized version:

  • Uses the slim variant of Python
  • Copies only the requirements file first to leverage caching
  • Uses --no-cache-dir to avoid storing package cache

Alpine vs. Debian-based Images

Alpine-based images are much smaller but come with tradeoffs:

                       Alpine-based                    Debian-based (slim)
Size                   Very small (~5-50MB base)       Small-medium (~100-200MB base)
C libraries            musl libc                       glibc
Package manager        apk                             apt
Binary compatibility   Sometimes problematic           Generally good
Build complexity       May require build dependencies  Usually simpler

Best practice: For Python applications, python:3.x-slim is often a good default choice. It provides a good balance of size and compatibility. Use Alpine if size is critical and you've tested thoroughly with your dependencies.

Build Speed Optimization

To optimize build speed and leverage caching effectively:

  1. Order instructions from least to most frequently changed
  2. Separate dependency installation from code copying
  3. Use BuildKit for improved performance (it is the default builder since Docker Engine 23.0; on older versions, enable it explicitly):
    DOCKER_BUILDKIT=1 docker build -t myapp .
  4. For large projects, consider reusing layers from a previously built image as a cache source:
    # In docker-compose.yml
    services:
      app:
        build:
          context: .
          cache_from:
            - myapp:latest

Running and Managing Images

Once you've built your images, it's important to understand how to effectively run and manage them.

Run Options in Detail

The docker run command has many options to customize container behavior:

docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

Common options include:

Example: Running a container with various options

docker run \
  -d \
  --name app \
  -p 8080:5000 \
  -v data:/app/data \
  -e DEBUG=true \
  -e API_KEY=secret \
  --restart always \
  --memory 512m \
  --cpus 0.5 \
  myapp:latest

This command:

  • Runs the container in detached mode
  • Names it "app"
  • Maps host port 8080 to container port 5000
  • Mounts a volume named "data" to /app/data
  • Sets environment variables
  • Configures automatic restart
  • Limits memory to 512MB and CPU to half a core

Environment Variables and Configuration

Environment variables are a key mechanism for configuring containerized applications:

# Setting variables in the Dockerfile
ENV API_URL=https://api.example.com
ENV DEBUG=false

# Overriding at runtime
docker run -e DEBUG=true -e API_URL=https://staging-api.example.com myapp

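For sensitive configuration, consider supplying values only at runtime, for example with docker run --env-file, Docker secrets, or a mounted configuration file.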

Best practice: Never hardcode sensitive information in your Dockerfile. Use environment variables, secrets management, or mounted configuration files to provide sensitive values at runtime.
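
On the application side, this usually means reading settings from the environment with sensible defaults and no baked-in secrets. A minimal sketch, assuming hypothetical helpers env_bool and load_config and reusing the DEBUG, API_URL, and API_KEY names from the examples in this section:

```python
import os


TRUE_VALUES = {"1", "true", "yes", "on"}


def env_bool(name, default=False, env=None):
    """Read a boolean flag such as DEBUG=true from the environment."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


def load_config(env=None):
    """Collect settings from environment variables, with safe defaults."""
    env = os.environ if env is None else env
    return {
        "debug": env_bool("DEBUG", default=False, env=env),
        "api_url": env.get("API_URL", "https://api.example.com"),
        # No default for the secret: it must be supplied at runtime
        "api_key": env.get("API_KEY"),
    }


config = load_config({"DEBUG": "true",
                      "API_URL": "https://staging-api.example.com"})
print(config["debug"], config["api_url"], config["api_key"])
# True https://staging-api.example.com None
```

Passing a plain dictionary instead of os.environ, as in the last lines, keeps the loader easy to test without touching the real environment.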

Container Lifecycle Management

The following commands cover common container lifecycle and management tasks:

# Start a stopped container
docker start container_name

# View logs with follow (-f)
docker logs -f container_name

# Execute a command in a running container
# (use sh instead of bash if the image does not include bash, as with Alpine)
docker exec -it container_name bash

# Remove all stopped containers
docker container prune

# Get detailed information about a container
docker inspect container_name

# View resource usage
docker stats

Data Management with Volumes

For data that needs to persist beyond container lifecycles, use volumes:

# Create a named volume
docker volume create mydata

# Run a container with the volume
docker run -v mydata:/app/data myapp

# Use bind mounts for development (quote the path in case it contains spaces)
docker run -v "$(pwd)":/app myapp

# Inspect volume
docker volume inspect mydata

# Remove volume
docker volume rm mydata

# Remove all unused volumes
docker volume prune

Best practice: Use named volumes for production data and bind mounts for development. Regularly back up important volumes, and include volume cleanup in your maintenance procedures.

Debugging and Troubleshooting

Even with well-designed images, issues can arise. Let's explore debugging strategies.

Build-time Debugging

When your build fails, try these approaches:

  1. Examine build output carefully for error messages
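  2. Debug intermediate stages (a failed build produces no tagged image, so start from the last layer that did build):
    # With the classic builder, each successful step prints a layer ID (for example, ---> 9d53d0d5b1fd)
    DOCKER_BUILDKIT=0 docker build -t debug-image .
    # Start a shell in the last successful layer, using the ID from your own build output
    docker run -it --rm 9d53d0d5b1fd bash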
  3. Add diagnostic commands to your Dockerfile:
    RUN ls -la /app
    RUN pip list
    RUN env
  4. Try a different base image if facing compatibility issues

Runtime Debugging

To troubleshoot running containers:

  1. Check container status: docker ps -a
  2. View logs: docker logs container_name
  3. Execute commands inside the container:
    docker exec -it container_name bash
    # Inside the container (slim and Alpine images may not ship ps or netstat; install them if needed)
    ps aux
    cat /var/log/app.log
    netstat -tulpn
  4. Inspect container configuration: docker inspect container_name
  5. Check resource usage: docker stats container_name

Common Issues and Solutions

Issue: Container exits immediately
Possible causes: No foreground process, command error
Solutions:
  • Check CMD/ENTRYPOINT
  • Run with -it and a shell to debug
  • Check application logs

Issue: Port binding fails
Possible causes: Port already in use, permissions
Solutions:
  • Check if another container/process is using the port
  • Try a different host port

Issue: Volume mount issues
Possible causes: Path permissions, path doesn't exist
Solutions:
  • Check file permissions
  • Verify paths on host and container
  • Use absolute paths

Issue: Network connectivity issues
Possible causes: DNS issues, network isolation
Solutions:
  • Check with ping/curl inside container
  • Verify DNS configuration
  • Check network settings

Issue: Resource constraints
Possible causes: Out of memory, CPU throttling
Solutions:
  • Monitor with docker stats
  • Increase resource limits
  • Optimize application

Best practice: Build good observability into your containers with proper logging and health checks. This makes troubleshooting much easier when problems arise.
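
As one possible shape for a health check, the sketch below probes an HTTP endpoint and reports the result as an exit status, which is the form a Dockerfile HEALTHCHECK command expects (0 for healthy, 1 for unhealthy). The function names and the default URL are assumptions; the URL matches the /api/health route from Example 3.

```python
import urllib.error
import urllib.request


def check_health(url, timeout=2.0):
    """Return True if url answers with HTTP 200 within timeout seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL
        return False


def main(argv):
    """Return 0 when the service is healthy and 1 otherwise."""
    url = argv[1] if len(argv) > 1 else "http://localhost:5000/api/health"
    return 0 if check_health(url) else 1
```

Saved as a script inside the image, this would end with sys.exit(main(sys.argv)) so that Docker can read the result from the exit code. Using the Python standard library avoids having to install curl in a slim image just for the probe.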

Moving to Production

When moving from development to production, consider these additional aspects:

Image Tagging and Versioning

Implement a consistent tagging strategy:

# Example tagging script in CI
VERSION=$(cat VERSION)
GIT_HASH=$(git rev-parse --short HEAD)
BUILD_ID=${CI_BUILD_NUMBER}

docker build -t myapp:${VERSION} \
  -t myapp:${VERSION}-${GIT_HASH} \
  -t myapp:build-${BUILD_ID} \
  -t myapp:latest .

Security Considerations

CI/CD Integration

Automate image building and testing in your CI/CD pipeline:

  1. Build images on each commit
  2. Run automated tests against the images
  3. Scan for vulnerabilities
  4. Push to a registry
  5. Deploy to staging/production

Example GitHub Actions workflow:

name: Build and Push Docker Image

on:
  push:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Build Image
        run: docker build -t myapp:${{ github.sha }} .
      
      - name: Test Image
        run: |
          docker run --rm myapp:${{ github.sha }} pytest
      
      - name: Login to Registry
        run: echo "${{ secrets.DOCKER_PASSWORD }}" | docker login -u "${{ secrets.DOCKER_USERNAME }}" --password-stdin
      
      - name: Push Image
        run: |
          docker tag myapp:${{ github.sha }} username/myapp:latest
          docker tag myapp:${{ github.sha }} username/myapp:${{ github.sha }}
          docker push username/myapp:latest
          docker push username/myapp:${{ github.sha }}

Container Orchestration

For production, consider using orchestration tools such as Kubernetes or Docker Swarm.

Note: Orchestration tools handle deployment, scaling, networking, load balancing, and self-healing for containerized applications.

Practical Exercises

Exercise 1: Flask Application with Environment Configuration

Extend the Flask application from Example 1 to support different environments:

  1. Modify the Dockerfile to accept a build argument for the environment
  2. Use a runtime environment variable to control debug mode
  3. Build the image for both development and production
  4. Run containers from both images and observe the differences

Modified Dockerfile:

FROM python:3.9-slim

ARG ENVIRONMENT=development
ENV FLASK_ENV=${ENVIRONMENT} \
    PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    FLASK_APP=app.py

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 5000

CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]

Building for different environments:

docker build -t flask-app:dev --build-arg ENVIRONMENT=development .
docker build -t flask-app:prod --build-arg ENVIRONMENT=production .

Running the containers:

docker run -d -p 5001:5000 --name flask-dev flask-app:dev
docker run -d -p 5002:5000 --name flask-prod flask-app:prod

Exercise 2: Multi-Stage Build for a Python Application

Create a multi-stage Dockerfile for a Python application:

  1. First stage: Build and test the application
  2. Second stage: Create a minimal runtime image
  3. Ensure only the necessary files are included in the final image

Multi-stage Dockerfile for Python:

# Stage 1: Build and test
FROM python:3.9 AS builder

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

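# Run the test suite; a failing test aborts the build
# (assumes pytest is listed in requirements.txt)
RUN pytest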

# Stage 2: Runtime
FROM python:3.9-slim

WORKDIR /app

COPY --from=builder /app/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

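# Copy only runtime files (templates/ and static/ must exist in the project, or these COPY steps fail)
COPY --from=builder /app/*.py /app/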
COPY --from=builder /app/templates /app/templates
COPY --from=builder /app/static /app/static

EXPOSE 5000

CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]

Exercise 3: Optimizing Image Size

Take an existing Dockerfile and optimize it for size:

  1. Start with a basic Dockerfile that installs dependencies
  2. Optimize it using the techniques we've discussed
  3. Compare the size before and after optimization

Before optimization:

FROM python:3.9

WORKDIR /app

COPY . .

RUN apt-get update
RUN apt-get install -y gcc
RUN pip install -r requirements.txt
RUN apt-get install -y curl

CMD ["python", "app.py"]

After optimization:

FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .

RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc curl && \
    pip install --no-cache-dir -r requirements.txt && \
    apt-get purge -y --auto-remove gcc && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

COPY app.py .

CMD ["python", "app.py"]

The optimized version:

  • Uses a slim base image
  • Combines apt-get commands to reduce layers
  • Uses --no-install-recommends to reduce dependencies
  • Cleans up build dependencies and apt cache
  • Copies only the necessary files

Key Takeaways

With these skills, you're now equipped to build, optimize, and run Docker images for a wide range of applications!

Looking Ahead

In our next session, we'll explore Docker Compose in more depth, learning how to orchestrate multi-container applications for development and testing. We'll see how to define complex application stacks with networking, volumes, and environment configuration.

Discussion Questions

  1. How would you approach containerizing an existing application? What factors would you consider first?
  2. When would you choose to use multi-stage builds, and how might they benefit different types of applications?
  3. What strategies would you use to minimize image size for a Python web application while ensuring good performance?
  4. How might container security considerations differ between development and production environments?
  5. What advantages might a microservices approach with multiple containers have over a monolithic container? When might it not be appropriate?

Additional Resources