Python Full Stack Web Developer Course

Week 2: Python Fundamentals (Part 1)

Friday Morning: Docker and Dependency Management

Introduction to Docker and Dependency Management

Welcome to our session on Docker and dependency management! This morning, we'll explore how Docker containers can change the way you manage dependencies in Python projects. We've already covered virtual environments; Docker takes isolation and reproducibility further by packaging the system libraries and runtime that a virtual environment leaves to the host.

Think of Docker as more than a tool: it is a different way of approaching development environments. If a virtual environment is like having a separate toolbox for each project, Docker is like having a complete, isolated workshop for each project, with its own tools, materials, and even its own operating system userland (containers still share the host's kernel).

Quick Docker Review

Earlier this week we covered the basics of Docker. Today, we'll focus specifically on how Docker addresses the complex challenge of dependency management in Python applications.

The Dependency Management Challenge

Before diving into Docker solutions, let's understand the key challenges in Python dependency management:

1. The "Works on My Machine" Problem

Despite using virtual environments, differences in operating systems and system libraries can still cause inconsistent behavior.

Example: A library that depends on a C extension might compile differently on macOS vs. Linux, or fail entirely on Windows.

2. System-Level Dependencies

Python packages often rely on system libraries that virtual environments don't manage.

Example: Packages like Pillow (imaging), NumPy (with optimized BLAS), or psycopg2 (PostgreSQL) require specific system libraries installed.

3. Version Conflicts and Dependency Hell

Complex applications with many dependencies can lead to irreconcilable version conflicts.

Example: Package A requires library X version 1.x, while Package B requires library X version 2.x.

4. Environment Parity

Development, testing, and production environments need to be identical to avoid surprises.

Example: Your application works in development but fails in production due to subtle differences in environment configuration.

5. Onboarding Friction

New team members often spend days setting up a development environment with the right versions of everything.

Example: A new developer joins and spends their first week just trying to get the app running locally.

Real-World Analogy: Traditional dependency management is like giving someone a list of ingredients for a complex recipe—they might get slightly different brands, qualities, or preparations. Docker is like delivering a pre-measured, ready-to-mix ingredient kit with exactly what's needed.

How Docker Solves Dependency Challenges

1. Complete Environment Isolation

Docker containers include not just Python packages, but the entire runtime environment: the Python interpreter, system libraries, and operating system userland.

2. Consistent Environments Everywhere

The same image behaves the same way on any host with a compatible Docker runtime and CPU architecture, which removes most causes of the "works on my machine" problem.

3. Declarative Configuration

A Dockerfile declares exactly what's in your environment, making it self-documenting and version-controllable.

4. Layered Caching

Docker's layer caching makes rebuilding environments fast after small changes to dependencies.

5. Isolation Without Performance Penalty

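On Linux, containers share the host kernel, so they run at near-native performance while maintaining isolation. Docker Desktop on macOS and Windows runs containers inside a lightweight VM, which adds some overhead, mostly for file sharing.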

6. Service Composition

Docker Compose lets you define and run multi-container applications (e.g., your app plus its database, cache, etc.).

Real-World Analogy: If traditional development is like each chef preparing ingredients differently, Docker is like a professional kitchen where every station is identical and prepared exactly to specification—the same tools, same ingredients, same procedures, no matter who's cooking.

Dockerfile Deep Dive for Python Applications

The Dockerfile is the blueprint for your application's environment. Let's examine the key components for Python applications:

Base Image Selection

Choosing the right base image is crucial for Python applications:

# Official Python images (pick one; comments must be on their own line,
# because a trailing "#" on an instruction line is parsed as an argument)

# Full image with common build tools and system libraries
FROM python:3.9
# Smaller image, fewer system packages
FROM python:3.9-slim
# Minimal Alpine-based image, smallest size
FROM python:3.9-alpine

# Alternative: use a specific OS image, then install Python yourself
FROM ubuntu:20.04

Decision Factors:

Image Type        | Pros                                  | Cons                                           | Best For
python:3.x        | Complete, works with most packages    | Large size (~900MB)                            | Complex applications with C extensions
python:3.x-slim   | Smaller (~150MB), most libraries work | May need extra packages for some extensions    | Web applications, good default choice
python:3.x-alpine | Very small (~50MB)                    | Uses musl instead of glibc, compilation issues | Microservices, simple scripts, constrained environments

Working Directory

Set a dedicated working directory for your application:

# Create and set working directory
WORKDIR /app

Dependency Installation

Efficiently installing Python dependencies with pip:

# Copy just the requirements file first (for better caching)
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

Advanced Technique: Use multi-stage builds to compile wheels in a build stage, so that compilers and headers stay out of the final image:

# Build stage
FROM python:3.9-slim AS builder

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential gcc

WORKDIR /build
COPY requirements.txt .

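# Build wheels for every dependency, including transitive ones
# (with --no-deps, the offline install in the final stage fails unless
# requirements.txt pins every sub-dependency)
RUN pip wheel --no-cache-dir --wheel-dir /build/wheels -r requirements.txt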

# Final stage
FROM python:3.9-slim

WORKDIR /app

# Copy built wheels and install
COPY --from=builder /build/wheels /wheels
COPY --from=builder /build/requirements.txt .
RUN pip install --no-cache-dir --no-index --find-links=/wheels -r requirements.txt

# Clean up
RUN rm -rf /wheels

System Dependencies

Installing system packages for common Python libraries:

# Example for Debian-based images (including python:slim)
RUN apt-get update && apt-get install -y --no-install-recommends \
    # for Pillow
    libjpeg-dev zlib1g-dev libpng-dev \
    # for psycopg2
    libpq-dev \
    # for lxml
    libxml2-dev libxslt-dev \
    # Cleanup to reduce image size
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

Alpine Example:

# Example for Alpine images
RUN apk add --no-cache \
    jpeg-dev zlib-dev libpng-dev \
    postgresql-dev \
    libxml2-dev libxslt-dev

Application Code

Copying application code and setting up the container:

# Copy application code
COPY . .

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

# Expose port
EXPOSE 8000

# Run the application
CMD ["gunicorn", "app:app", "--bind", "0.0.0.0:8000"]

User Setup

Running as a non-root user for security:

# Create a non-root user
RUN addgroup --system app && adduser --system --group app

# Set ownership
RUN chown -R app:app /app

# Switch to non-root user
USER app

A Complete Example

Bringing it all together for a Flask application:

# syntax=docker/dockerfile:1

FROM python:3.9-slim

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1

# Create and set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    libpq-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Create non-root user
RUN addgroup --system app && adduser --system --group app

# Copy application code
COPY . .

# Set ownership
RUN chown -R app:app /app

# Switch to non-root user
USER app

# Expose port
EXPOSE 8000

# Run the application
CMD ["gunicorn", "wsgi:app", "--bind", "0.0.0.0:8000"]

Managing Dependencies with Docker Compose

Docker Compose is extremely powerful for managing service dependencies in your application, like databases, caches, and other services.

Basic Compose File Structure

version: '3'

services:
  web:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - .:/app
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/mydatabase
    depends_on:
      - db
  
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_DB=mydatabase

volumes:
  postgres_data:
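
Note that depends_on only controls start order: Compose starts db before web, but it does not wait until PostgreSQL accepts connections. The following is a minimal Python sketch of a readiness check the web container could run at startup. The function name wait_for_database and its retries and delay parameters are hypothetical, and a successful TCP connection only shows that the port is open, not that the database is fully initialized.

```python
import socket
import time
from urllib.parse import urlsplit


def wait_for_database(url, retries=5, delay=1.0):
    """Return True once a TCP connection to the URL's host and port succeeds.

    Returns False if every attempt fails. The default port 5432 is assumed
    when the URL does not specify one.
    """
    parts = urlsplit(url)
    host, port = parts.hostname, parts.port or 5432
    for attempt in range(1, retries + 1):
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            print(f"Database not ready (attempt {attempt}/{retries})")
            time.sleep(delay)
    return False
```

Inside the web container this could be called as wait_for_database(os.environ["DATABASE_URL"]) before the application starts. The host name db in the URL resolves because Compose puts both services on the same network.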

Development-Focused Compose Configuration

For development, you'll want hot reloading and easier debugging:

version: '3'

services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.dev  # Development-specific Dockerfile
    command: flask run --host=0.0.0.0 --port=8000  # Development server; FLASK_ENV=development (below) enables debug mode and auto-reload in Flask 2.0
    ports:
      - "8000:8000"
    volumes:
      - .:/app  # Mount current directory for live code changes
    environment:
      - FLASK_ENV=development
      - FLASK_APP=app.py
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/mydatabase
    depends_on:
      - db
  
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_DB=mydatabase
    ports:
      - "5432:5432"  # Expose port for local database tools
  
  # Additional development services
  adminer:  # Database management UI
    image: adminer:latest
    ports:
      - "8080:8080"
    depends_on:
      - db

volumes:
  postgres_data:

Multiple Environment Configurations

You can create different Compose files for different environments:

Base configuration: docker-compose.yml

version: '3'

services:
  web:
    build: .
    depends_on:
      - db
  
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Development overrides: docker-compose.override.yml (applied automatically)

version: '3'

services:
  web:
    command: flask run --host=0.0.0.0 --port=8000
    ports:
      - "8000:8000"
    volumes:
      - .:/app
    environment:
      - FLASK_ENV=development
      - FLASK_APP=app.py
  
  db:
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_DB=mydatabase

Production overrides: docker-compose.prod.yml

version: '3'

services:
  web:
    restart: always
    command: gunicorn wsgi:app --bind 0.0.0.0:8000 --workers 4
    expose:
      - "8000"
    environment:
      - FLASK_ENV=production
    volumes:
      - static_data:/app/static
      - media_data:/app/media
  
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
      - static_data:/static
      - media_data:/media
    depends_on:
      - web
  
  db:
    restart: always
    environment:
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_USER=${DB_USER}
      - POSTGRES_DB=${DB_NAME}

volumes:
  static_data:
  media_data:

Running with a specific configuration:

# Development (default)
docker-compose up

# Production
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up
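
To build intuition for what passing several -f files does, here is a rough Python approximation of the merge. It is a simplification and not Compose's actual implementation: it assumes mappings are merged key by key, lists are combined, and single values are replaced by the later file. The function name merge_compose is hypothetical, and the files are represented as plain dictionaries rather than parsed YAML.

```python
def merge_compose(base, override):
    """Approximate a Compose file merge: mappings merge, lists extend, scalars replace."""
    merged = dict(base)
    for key, value in override.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            merged[key] = merge_compose(current, value)
        elif isinstance(current, list) and isinstance(value, list):
            # Keep existing entries and append new ones without duplicates
            merged[key] = current + [item for item in value if item not in current]
        else:
            merged[key] = value
    return merged


base = {
    "services": {
        "web": {"build": ".", "depends_on": ["db"]},
        "db": {"image": "postgres:13"},
    }
}
prod = {
    "services": {
        "web": {
            "restart": "always",
            "command": "gunicorn wsgi:app --bind 0.0.0.0:8000 --workers 4",
            "expose": ["8000"],
        },
        "db": {"restart": "always"},
    }
}

merged = merge_compose(base, prod)
print(merged["services"]["web"])
print(merged["services"]["db"])
```

The web service keeps build and depends_on from the base file and gains restart, command, and expose from the production file. To see the real merged result for your own files, docker-compose config prints the effective configuration.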

Dependencies Beyond Services

Docker Compose can also help manage supporting processes such as background workers, task queues, caches, and schedulers:

# Example with additional components
services:
  # ...other services...
  
  # Worker for background tasks
  worker:
    build: .
    command: celery -A app.tasks worker --loglevel=info
    volumes:
      - .:/app
    depends_on:
      - web
      - redis
      - db
  
  # Redis for task queue and caching
  redis:
    image: redis:alpine
    volumes:
      - redis_data:/data
  
  # Scheduled task processor
  scheduler:
    build: .
    command: celery -A app.tasks beat --loglevel=info
    volumes:
      - .:/app
    depends_on:
      - worker
      - redis

volumes:
  redis_data:

Python Dependency Management Strategies in Docker

Pinning Versions

Always pin dependency versions for reproducible builds:

# requirements.txt with pinned versions
flask==2.0.1
sqlalchemy==1.4.23
psycopg2-binary==2.9.1
gunicorn==20.1.0

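Warning: Be careful with piping pip freeze straight into requirements.txt. It records every installed package, including sub-dependencies and anything unrelated that happens to be in the environment, without showing which ones you depend on directly. That makes later upgrades harder to reason about.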
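
A small check can catch unpinned entries before they reach an image build. The sketch below is a simplified illustration, not a full requirements parser: the function name unpinned_requirements is hypothetical, and the pattern deliberately ignores environment markers, URLs, and hashes.

```python
import re

# name, optional [extras], then an exact "==" pin
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned with `==`."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options such as -r
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


sample = """\
flask==2.0.1
sqlalchemy>=1.4.0
psycopg2-binary
gunicorn==20.1.0
"""
print(unpinned_requirements(sample))
```

For the sample above, the check reports sqlalchemy>=1.4.0 and psycopg2-binary. A script like this could run in CI so that a build fails early when someone adds a loose version range.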

Requirements Files Organization

Splitting requirements files by environment or purpose:

requirements/
├── base.txt       # Core dependencies
├── development.txt
├── production.txt
└── testing.txt

Example of requirements/base.txt:

# Core dependencies with pinned versions
flask==2.0.1
sqlalchemy==1.4.23
psycopg2-binary==2.9.1

Example of requirements/development.txt:

# Include base requirements
-r base.txt

# Development-specific packages
pytest==6.2.5
black==21.9b0
flake8==3.9.2
ipython==7.27.0
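
The -r base.txt line is resolved relative to the directory of the file that contains it, so base.txt has to sit next to development.txt wherever the files are used. The Python sketch below mimics that lookup to make the behavior concrete. The function name collect_requirements is hypothetical, and the comment handling is simplified.

```python
import tempfile
from pathlib import Path


def collect_requirements(path):
    """Return requirement lines from a file, following `-r` includes.

    Includes are resolved relative to the directory of the including file,
    mirroring how pip treats nested `-r` lines.
    """
    path = Path(path)
    collected = []
    for raw in path.read_text().splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("-r "):
            included = path.parent / line[3:].strip()
            collected.extend(collect_requirements(included))
        else:
            collected.append(line)
    return collected


# Demo with throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    req_dir = Path(tmp) / "requirements"
    req_dir.mkdir()
    (req_dir / "base.txt").write_text("flask==2.0.1\nsqlalchemy==1.4.23\n")
    (req_dir / "development.txt").write_text("-r base.txt\n\n# Dev tools\npytest==6.2.5\n")
    resolved = collect_requirements(req_dir / "development.txt")

print(resolved)
```

If base.txt were missing from the directory, read_text would raise FileNotFoundError, which is the same kind of failure pip reports when an included file is not available inside the image.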

In your Dockerfile, use the appropriate requirements file. Copy the whole requirements/ directory so that the -r base.txt include can be resolved:

# For development
COPY requirements/ requirements/
RUN pip install -r requirements/development.txt

# For production
COPY requirements/ requirements/
RUN pip install -r requirements/production.txt

Using pip-tools

pip-tools helps manage dependencies more precisely:

Create a requirements.in file with your direct dependencies:

# requirements.in - Direct dependencies only
flask>=2.0.0
sqlalchemy>=1.4.0
psycopg2-binary

Compile it to a pinned requirements.txt:

pip-compile requirements.in

This generates a requirements.txt with all dependencies (including sub-dependencies) pinned:

# requirements.txt
#
# This file is autogenerated by pip-compile
#
click==8.0.1
    # via flask
flask==2.0.1
    # via -r requirements.in
itsdangerous==2.0.1
    # via flask
jinja2==3.0.1
    # via flask
markupsafe==2.0.1
    # via jinja2
psycopg2-binary==2.9.1
    # via -r requirements.in
sqlalchemy==1.4.23
    # via -r requirements.in
werkzeug==2.0.1
    # via flask
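
The "# via" annotations record why each pin is present, which is what a plain pip freeze lacks. As an illustration, the Python sketch below separates direct from transitive pins in output shaped like the listing above. It assumes the single-line "# via" format shown here; the function name split_direct_and_transitive is hypothetical.

```python
def split_direct_and_transitive(compiled_text, source="requirements.in"):
    """Classify pins in pip-compile output using its '# via' annotations."""
    direct, transitive = [], []
    current = None
    for raw in compiled_text.splitlines():
        stripped = raw.strip()
        if not stripped:
            continue
        if not stripped.startswith("#"):
            current = stripped  # a pin such as 'flask==2.0.1'
        elif stripped.startswith("# via") and current is not None:
            target = direct if f"-r {source}" in stripped else transitive
            target.append(current)
            current = None
    return direct, transitive


compiled = """\
# This file is autogenerated by pip-compile
click==8.0.1
    # via flask
flask==2.0.1
    # via -r requirements.in
werkzeug==2.0.1
    # via flask
"""
direct, transitive = split_direct_and_transitive(compiled)
print("direct:", direct)
print("transitive:", transitive)
```

Here flask is reported as direct, while click and werkzeug are transitive because they are only present through flask.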

In your Dockerfile:

# Install pip-tools
RUN pip install pip-tools

# Copy requirements files
COPY requirements.in .
COPY requirements.txt .

# Install dependencies
RUN pip-sync

Poetry for Modern Dependency Management

Poetry offers a more modern approach to dependency management:

Create a pyproject.toml file:

[tool.poetry]
name = "myapp"
version = "0.1.0"
description = "My awesome app"
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.9"
flask = "^2.0.1"
sqlalchemy = "^1.4.23"
psycopg2-binary = "^2.9.1"
gunicorn = "^20.1.0"

[tool.poetry.dev-dependencies]
pytest = "^6.2.5"
black = "^21.9b0"
flake8 = "^3.9.2"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

Dockerfile for Poetry:

FROM python:3.9-slim

# Install dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    build-essential \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install Poetry
RUN curl -sSL https://install.python-poetry.org | python3 -

# Add Poetry to PATH
ENV PATH="${PATH}:/root/.local/bin"

# Set working directory
WORKDIR /app

# Copy Poetry configuration
COPY pyproject.toml poetry.lock* ./

# Configure Poetry not to use virtual environments
RUN poetry config virtualenvs.create false

# Install dependencies only (--no-root: the project's own code has not been copied yet)
RUN poetry install --no-interaction --no-ansi --no-root

# Copy application code
COPY . .

# Run the application
CMD ["gunicorn", "wsgi:app", "--bind", "0.0.0.0:8000"]

Pipenv Approach

Pipenv is another option for managing dependencies:

With Pipenv, you use Pipfile and Pipfile.lock:

# Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
flask = "*"
sqlalchemy = "*"
psycopg2-binary = "*"
gunicorn = "*"

[dev-packages]
pytest = "*"
black = "*"
flake8 = "*"

[requires]
python_version = "3.9"

Dockerfile for Pipenv:

FROM python:3.9-slim

# Install dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install pipenv
RUN pip install pipenv

# Set working directory
WORKDIR /app

# Copy Pipfile and Pipfile.lock
COPY Pipfile Pipfile.lock ./

# Install dependencies system-wide (no virtualenv)
RUN pipenv install --system --deploy

# Copy application code
COPY . .

# Run the application
CMD ["gunicorn", "wsgi:app", "--bind", "0.0.0.0:8000"]

Development vs. Production Docker Configurations

Different environments have different requirements for dependency handling:

Development Environment Priorities

Production Environment Priorities

Development Dockerfile (Dockerfile.dev)

FROM python:3.9

# Install development tools and dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    libpq-dev \
    default-libmysqlclient-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    FLASK_ENV=development \
    FLASK_DEBUG=1

# Install dev dependencies first (development.txt includes base.txt via -r,
# so copy the whole requirements/ directory)
COPY requirements/ requirements/
RUN pip install -r requirements/development.txt

# During development, we'll mount the code as a volume
# so we don't need to copy it here

# Default command for development
CMD ["flask", "run", "--host=0.0.0.0", "--port=8000"]

Production Dockerfile (Dockerfile or Dockerfile.prod)

FROM python:3.9-slim AS builder

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    libpq-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /build

# Copy requirements (production.txt includes base.txt via -r)
COPY requirements/ requirements/

# Build wheels for every dependency, including transitive ones
RUN pip wheel --no-cache-dir --wheel-dir /build/wheels -r requirements/production.txt

# Final stage
FROM python:3.9-slim

# Set working directory
WORKDIR /app

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    FLASK_ENV=production

# Install runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    libpq5 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Copy built wheels and install
COPY --from=builder /build/wheels /wheels
COPY --from=builder /build/requirements /tmp/requirements
RUN pip install --no-cache-dir --no-index --find-links=/wheels -r /tmp/requirements/production.txt \
    && rm -rf /wheels /tmp/requirements

# Create non-root user
RUN addgroup --system app && adduser --system --group app

# Copy application code
COPY . .

# Set ownership
RUN chown -R app:app /app

# Switch to non-root user
USER app

# Run the application
CMD ["gunicorn", "wsgi:app", "--bind", "0.0.0.0:8000", "--workers", "4"]

Using Different Configurations

For development:

docker build -f Dockerfile.dev -t myapp:dev .
docker run -p 8000:8000 -v "$(pwd)":/app myapp:dev

Or with Docker Compose (docker-compose.override.yml is applied automatically):

docker-compose up

For production:

docker build -t myapp:prod .
docker run -p 8000:8000 myapp:prod

Or with Docker Compose:

docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d

Advanced Dependency Management Techniques

1. Layer Caching Optimization

Docker builds images in layers. Optimize your Dockerfile to leverage caching for dependencies:

# BAD: Changes to code trigger reinstallation of all dependencies
COPY . .
RUN pip install -r requirements.txt

# GOOD: Dependencies only reinstalled when requirements change
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .

2. Build Arguments for Environment Selection

Use build arguments to selectively install dependencies:

# Dockerfile
ARG ENVIRONMENT=production

# Base dependencies
COPY requirements/base.txt .
RUN pip install -r base.txt

# Environment-specific dependencies
COPY requirements/${ENVIRONMENT}.txt ./requirements-env.txt
RUN if [ -s requirements-env.txt ]; then pip install -r requirements-env.txt; fi

# Build with:
# docker build --build-arg ENVIRONMENT=development -t myapp:dev .

3. Scanning for Vulnerabilities

Regularly scan dependencies for security vulnerabilities:

# Install safety
RUN pip install safety

# Run security check
RUN safety check -r requirements.txt

Or as part of CI/CD:

stage: security
script:
  - pip install safety
  - safety check -r requirements.txt

4. Custom Package Indexes

For organizations with private packages:

# Using private PyPI server
RUN pip install --index-url https://pypi.example.com/simple/ \
    --extra-index-url https://pypi.org/simple \
    -r requirements.txt

5. Vendoring Dependencies

For air-gapped environments or strict controls:

# Download dependencies (on a connected machine)
pip download -d ./vendor -r requirements.txt

# In Dockerfile
COPY vendor /vendor
RUN pip install --no-index --find-links=/vendor -r requirements.txt

6. Dependency Health Checks

Verify dependency compatibility in the container:

# Add health check in Dockerfile
RUN pip check

# In CI pipeline
docker run --rm myapp:latest pip check

7. Managing Native Dependencies

Handling packages with complex C extensions:

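# For packages requiring specific compilation flags
# (flags only apply when pip builds from source instead of using a prebuilt wheel).
# Avoid -march=native here: it ties the image to the build machine's CPU.
ENV CFLAGS="-O3"
RUN pip install numpy scipy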

8. Shrinking Final Images

Reduce image size by skipping pip's download cache and deleting bytecode caches:

# Remove pip cache and unnecessary files
RUN pip install --no-cache-dir -r requirements.txt \
    && find /usr/local -type d -name __pycache__ -exec rm -rf {} +

Practical Workflows for Development

Local Development Workflow

A practical workflow for day-to-day development:

  1. Initial Setup
    git clone https://github.com/example/myapp.git
    cd myapp
    docker-compose up -d
  2. Adding a New Dependency
    # Add to requirements.txt
    echo "new-package==1.0.0" >> requirements.txt
    
    # Rebuild the container
    docker-compose build
    docker-compose up -d
  3. Development Cycle
    # Edit code locally (changes are synced via volume mount)
    # View logs
    docker-compose logs -f
    
    # Run tests
    docker-compose exec web pytest
    
    # Restart after major changes
    docker-compose restart web
  4. Database Migrations
    # Create migration
    docker-compose exec web flask db migrate -m "Add new table"
    
    # Apply migration
    docker-compose exec web flask db upgrade
  5. Cleanup
    # Stop containers
    docker-compose down
    
    # Remove volumes (caution: destroys data)
    docker-compose down -v

Working with Multiple Projects

When juggling multiple projects:

  1. Unique Naming
    # Use project-specific container names
    # docker-compose.yml
    services:
      web:
        container_name: myapp_web
  2. Port Allocation
    # Assign different ports to different projects
    # Project A
    services:
      web:
        ports:
          - "8000:8000"
      db:
        ports:
          - "5432:5432"
    
    # Project B
    services:
      web:
        ports:
          - "8001:8000"
      db:
        ports:
          - "5433:5432"
  3. Resource Constraints
    # Limit resources to prevent one project from consuming everything
    services:
      web:
        deploy:
          resources:
            limits:
              cpus: '0.5'
              memory: 500M

Advanced Development Conveniences

Make development more convenient with these techniques:

  1. Development Helper Scripts
    # In your package.json (a Makefile with equivalent targets works too)
    "scripts": {
      "start": "docker-compose up",
      "test": "docker-compose exec web pytest",
      "shell": "docker-compose exec web bash",
      "db-shell": "docker-compose exec db psql -U postgres",
      "logs": "docker-compose logs -f",
      "migrate": "docker-compose exec web flask db upgrade"
    }
  2. Debugger Integration
    # Allow remote debugging
    # docker-compose.yml
    services:
      web:
        ports:
          - "8000:8000"
          - "5678:5678"  # Port for debugger
        command: python -m debugpy --listen 0.0.0.0:5678 -m flask run --host=0.0.0.0 --port=8000
  3. Hot Reloading
    # Use development server with debug mode
    # For Flask
    ENV FLASK_ENV=development
    
    # For Django (assuming settings.py reads DEBUG from the environment)
    ENV DEBUG=1

Managing Complex Environments

Microservices Architecture

When your application consists of multiple services:

# docker-compose.yml for microservices
version: '3'

services:
  auth-service:
    build: ./auth-service
    environment:
      - DB_HOST=db
      - REDIS_HOST=redis
    depends_on:
      - db
      - redis
  
  user-service:
    build: ./user-service
    environment:
      - DB_HOST=db
      - AUTH_SERVICE_URL=http://auth-service:8000
    depends_on:
      - db
      - auth-service
  
  product-service:
    build: ./product-service
    environment:
      - DB_HOST=db
    depends_on:
      - db
  
  api-gateway:
    build: ./api-gateway
    ports:
      - "8000:8000"
    environment:
      - AUTH_SERVICE_URL=http://auth-service:8000
      - USER_SERVICE_URL=http://user-service:8000
      - PRODUCT_SERVICE_URL=http://product-service:8000
    depends_on:
      - auth-service
      - user-service
      - product-service
  
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
  
  redis:
    image: redis:alpine
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:

Hybrid Virtual Environment + Docker Approach

For some teams, combining virtual environments and Docker works well:

  1. Local Development: Use virtual environments for fast iteration
  2. Integration Testing: Use Docker Compose for complete environment
  3. CI/CD Pipeline: Use Docker for consistent builds
  4. Production: Deploy Docker containers

Script to synchronize between virtual env and Docker:

#!/bin/bash
# sync_deps.sh

# Update requirements.txt from virtual environment
if [[ "$1" == "freeze" ]]; then
  echo "Updating requirements.txt from virtual environment..."
  pip freeze > requirements.txt
  echo "Done."

# Install from requirements.txt to virtual environment
elif [[ "$1" == "install" ]]; then
  echo "Installing requirements.txt to virtual environment..."
  pip install -r requirements.txt
  echo "Done."

# Rebuild Docker containers with new requirements
elif [[ "$1" == "docker" ]]; then
  echo "Rebuilding Docker containers..."
  docker-compose build
  echo "Done."

# Freeze the virtual environment, then rebuild the Docker containers
elif [[ "$1" == "sync" ]]; then
  echo "Syncing between virtual environment and Docker..."
  pip freeze > requirements.txt
  docker-compose build
  echo "Done."

else
  echo "Usage: $0 [freeze|install|docker|sync]"
  echo "  freeze:  Update requirements.txt from virtual environment"
  echo "  install: Install from requirements.txt to virtual environment"
  echo "  docker:  Rebuild Docker containers with current requirements.txt"
  echo "  sync:    Update requirements.txt and rebuild Docker containers"
  exit 1
fi

Handling Large Dependencies

Some Python packages (like ML libraries) are extremely large. Strategies for managing them:

  1. Selective Installation
    # Install only what you need
    RUN pip install tensorflow-cpu  # Instead of full tensorflow
  2. Pre-built Images
    # Use specialized base images
    FROM tensorflow/tensorflow:2.6.0
    # Now you have TensorFlow pre-installed
  3. Layer Caching
    # Split requirements by change frequency
    COPY requirements-stable.txt .
    RUN pip install -r requirements-stable.txt
    
    # More frequently changing deps later
    COPY requirements-changing.txt .
    RUN pip install -r requirements-changing.txt

Real-World Example: Flask Application with Comprehensive Dependency Management

Let's bring everything together with a complete example of a Flask application that demonstrates best practices for dependency management with Docker.

Project Structure

flask_app/
├── .dockerignore           # Files to exclude from Docker build
├── .gitignore              # Files to exclude from Git
├── Dockerfile              # Production Dockerfile
├── Dockerfile.dev          # Development Dockerfile
├── README.md               # Project documentation
├── docker-compose.yml      # Base Docker Compose config
├── docker-compose.override.yml  # Development overrides
├── docker-compose.prod.yml # Production overrides
├── app/                    # Application code
│   ├── __init__.py
│   ├── config.py           # Configuration
│   ├── models.py           # Database models
│   ├── routes.py           # API routes
│   ├── templates/          # Jinja2 templates
│   └── static/             # Static assets
├── migrations/             # Database migrations
├── requirements/
│   ├── base.txt            # Base dependencies
│   ├── development.txt     # Development dependencies
│   └── production.txt      # Production dependencies
├── scripts/
│   ├── entrypoint.sh       # Docker entrypoint script
│   └── start-dev.sh        # Development startup script
└── tests/                  # Application tests

Dependency Files

requirements/base.txt

# Core dependencies
Flask==2.0.1
Flask-SQLAlchemy==2.5.1
Flask-Migrate==3.1.0
SQLAlchemy==1.4.23
psycopg2-binary==2.9.1
gunicorn==20.1.0
python-dotenv==0.19.0
werkzeug==2.0.1
click==8.0.1
itsdangerous==2.0.1
jinja2==3.0.1
markupsafe==2.0.1

requirements/development.txt

# Include base requirements
-r base.txt

# Development packages
pytest==6.2.5
pytest-cov==2.12.1
black==21.8b0
flake8==3.9.2
ipython==7.27.0
debugpy==1.4.1

requirements/production.txt

# Include base requirements
-r base.txt

# Production packages
sentry-sdk==1.3.1
blinker==1.4

Docker Configuration

Dockerfile.dev

FROM python:3.9-slim

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    FLASK_APP=app \
    FLASK_ENV=development

# Set working directory
WORKDIR /app

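# Install system dependencies
# (postgresql-client provides psql, which entrypoint.sh uses to wait for the database)
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    libpq-dev \
    postgresql-client \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*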

# Install Python dependencies (development.txt includes base.txt via -r,
# so copy the whole requirements/ directory)
COPY requirements/ requirements/
RUN pip install --no-cache-dir -r requirements/development.txt

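# Install scripts outside /app, because the development volume mount (.:/app)
# hides anything copied into /app at build time
COPY scripts/entrypoint.sh scripts/start-dev.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/entrypoint.sh /usr/local/bin/start-dev.sh

# Set entrypoint
ENTRYPOINT ["entrypoint.sh"]

# Default command
CMD ["start-dev.sh"]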

Dockerfile (for production)

# Build stage
FROM python:3.9-slim AS builder

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    libpq-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /build

# Copy requirements (production.txt includes base.txt via -r)
COPY requirements/ requirements/
# Build wheels for every dependency, including transitive ones
RUN pip wheel --no-cache-dir --wheel-dir /build/wheels -r requirements/production.txt

# Final stage
FROM python:3.9-slim

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    FLASK_APP=app \
    FLASK_ENV=production

# Create non-root user
RUN addgroup --system app && adduser --system --group app

# Set working directory
WORKDIR /app

# Install runtime dependencies
# (postgresql-client provides psql, which entrypoint.sh uses to wait for the database)
RUN apt-get update && apt-get install -y --no-install-recommends \
    libpq5 \
    postgresql-client \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Copy wheels and install dependencies
COPY --from=builder /build/wheels /wheels
COPY --from=builder /build/requirements /tmp/requirements
RUN pip install --no-cache-dir --no-index --find-links=/wheels -r /tmp/requirements/production.txt \
    && rm -rf /wheels /tmp/requirements

# Copy application code
COPY . .

# Make scripts executable and fix ownership
RUN chmod +x scripts/entrypoint.sh && \
    chown -R app:app /app

# Switch to non-root user
USER app

# Set entrypoint
ENTRYPOINT ["./scripts/entrypoint.sh"]

# Run gunicorn
CMD ["gunicorn", "app:app", "--bind", "0.0.0.0:8000", "--workers", "4"]

Docker Compose Files

docker-compose.yml (base configuration)

version: '3'

services:
  web:
    build: .
    depends_on:
      - db
    restart: unless-stopped
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/flask_app
  
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=flask_app
    restart: unless-stopped

volumes:
  postgres_data:

docker-compose.override.yml (development overrides, applied automatically)

version: '3'

services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.dev
    ports:
      - "8000:8000"
      - "5678:5678"  # For remote debugging
    volumes:
      - .:/app
    environment:
      - FLASK_ENV=development
      - FLASK_DEBUG=1
  
  db:
    ports:
      - "5432:5432"
  
  # Additional development services
  adminer:
    image: adminer:latest
    ports:
      - "8080:8080"
    depends_on:
      - db
    restart: unless-stopped

docker-compose.prod.yml (production overrides)

version: '3'

services:
  web:
    build:
      context: .
      dockerfile: Dockerfile
    expose:
      - "8000"
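    environment:
      - FLASK_ENV=production
      - LOG_LEVEL=INFO
      # Keep the connection string in sync with the db password set below
      - DATABASE_URL=postgresql://postgres:${DB_PASSWORD:-postgres}@db:5432/flask_app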
    volumes:
      - static_data:/app/app/static
    deploy:
      replicas: 2
      restart_policy:
        condition: any
  
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
      - static_data:/static
    depends_on:
      - web
  
  db:
    environment:
      - POSTGRES_PASSWORD=${DB_PASSWORD:-postgres}
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./scripts/init-db.sh:/docker-entrypoint-initdb.d/init-db.sh

volumes:
  postgres_data:
  static_data:

Helper Scripts

scripts/entrypoint.sh

#!/bin/bash
set -e

# Wait for database to be ready
if [ -n "$DATABASE_URL" ]; then
  echo "Waiting for database..."

  RETRIES=5
  until psql "$DATABASE_URL" -c "select 1" > /dev/null 2>&1; do
    # Stop instead of continuing silently when the database never comes up
    if [ "$RETRIES" -le 0 ]; then
      echo "Database not available, giving up" >&2
      exit 1
    fi
    echo "Waiting for database to be available... $((RETRIES--)) remaining attempts..."
    sleep 1
  done
fi

# Run database migrations
if [ "$FLASK_ENV" = "production" ]; then
  echo "Running migrations..."
  flask db upgrade
fi

# Execute the command passed to docker run
exec "$@"
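
The same wait-and-retry idea can also be sketched in Python with only the standard library, which avoids needing psql inside the image. This is a minimal sketch and not part of the project above: the function names parse_db_address and wait_for_port are hypothetical, and a successful TCP connection only shows that the port is open, not that PostgreSQL is ready to answer queries.

```python
import socket
import time
from urllib.parse import urlparse


def parse_db_address(database_url, default_port=5432):
    """Extract (host, port) from a URL such as postgresql://user:pw@db:5432/name."""
    parsed = urlparse(database_url)
    return parsed.hostname, parsed.port or default_port


def wait_for_port(host, port, retries=5, delay=1.0, timeout=1.0):
    """Return True once a TCP connection succeeds, False after all retries fail."""
    for attempt in range(1, retries + 1):
        try:
            # create_connection raises OSError if the port is closed or unreachable
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            print(f"Database not reachable (attempt {attempt}/{retries})")
            time.sleep(delay)
    return False
```

A startup script could call parse_db_address on the DATABASE_URL environment variable, pass the result to wait_for_port, and exit with a non-zero status when it returns False.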

scripts/start-dev.sh

#!/bin/bash
set -e

# Run any development setup (e.g., migrations)
echo "Running development setup..."
flask db upgrade

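# Start development server with debugger enabled.
# --wait-for-client blocks startup until a debugger attaches on port 5678;
# drop that flag if the server should start without one.
echo "Starting development server..."
python -m debugpy --listen 0.0.0.0:5678 --wait-for-client -m flask run --host=0.0.0.0 --port=8000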

Using the Application

Development workflow:

# Start development environment
docker-compose up -d

# View logs
docker-compose logs -f

# Run tests
docker-compose exec web pytest

# Add a new dependency
# 1. Add to requirements/base.txt or requirements/development.txt
# 2. Rebuild the container
docker-compose build
docker-compose up -d

Production deployment:

# Build and start production environment
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d --build

# Scale web service
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d --scale web=3

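This example demonstrates a multi-stage production build that keeps compilers out of the final image, separate development and production Dockerfiles, Compose override files layered on a shared base configuration, a non-root runtime user, and an entrypoint script that waits for the database and runs migrations.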

By following these patterns, you can create a robust, scalable workflow for managing dependencies in your Python applications.

Best Practices Summary

Dependency Management

Development Workflow

Production Deployment

Exercise: Converting a Python Project to Use Docker

Let's apply what we've learned with a practical exercise. You'll convert an existing Python project to use Docker with best practices for dependency management.

Starting Point: A Simple Flask Application

Imagine you have this basic Flask application structure:

my_flask_app/
├── app.py
├── requirements.txt
└── templates/
    └── index.html

With these files:

app.py:

from flask import Flask, render_template
import os

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0')

requirements.txt:

flask==2.0.1

templates/index.html:

<!DOCTYPE html>
<html>
<head>
    <title>Flask App</title>
</head>
<body>
    <h1>Hello, Docker!</h1>
    <p>This Flask application is running in a Docker container.</p>
</body>
</html>

Exercise Steps

  1. Restructure the Project

    Create a more comprehensive project structure:

    my_flask_app/
    ├── .dockerignore
    ├── .gitignore
    ├── Dockerfile
    ├── Dockerfile.dev
    ├── docker-compose.yml
    ├── requirements/
    │   ├── base.txt
    │   ├── development.txt
    │   └── production.txt
    ├── app/
    │   ├── __init__.py
    │   ├── app.py
    │   └── templates/
    │       └── index.html
    └── scripts/
        └── entrypoint.sh
  2. Split Requirements

    Create separate requirements files:

    requirements/base.txt:

    flask==2.0.1
    # Flask 2.0.x does not cap Werkzeug, and Werkzeug 3.x is incompatible with it
    werkzeug==2.0.3
    gunicorn==20.1.0

    requirements/development.txt:

    -r base.txt
    pytest==6.2.5
    black==21.8b0

    requirements/production.txt:

    -r base.txt
  3. Create Dockerfiles

    Create a development and production Dockerfile:

    Dockerfile.dev:

    FROM python:3.9-slim
    
    WORKDIR /app
    
    ENV PYTHONDONTWRITEBYTECODE=1 \
        PYTHONUNBUFFERED=1 \
        FLASK_APP=app \
        FLASK_ENV=development
    
    # Copy the whole directory: development.txt includes base.txt via "-r base.txt"
    COPY requirements/ requirements/
    RUN pip install --no-cache-dir -r requirements/development.txt
    
    # During development, we'll mount the code as a volume
    # so we don't need to copy it here
    
    CMD ["flask", "run", "--host=0.0.0.0", "--port=5000"]

    Dockerfile (for production):

    FROM python:3.9-slim
    
    WORKDIR /app
    
    ENV PYTHONDONTWRITEBYTECODE=1 \
        PYTHONUNBUFFERED=1 \
        FLASK_APP=app \
        FLASK_ENV=production
    
    # Copy the whole directory: production.txt includes base.txt via "-r base.txt"
    COPY requirements/ requirements/
    RUN pip install --no-cache-dir -r requirements/production.txt
    
    COPY . .
    
    RUN addgroup --system app && adduser --system --group app
    USER app
    
    EXPOSE 5000
    
    CMD ["gunicorn", "app:app", "--bind", "0.0.0.0:5000"]
  4. Set Up Docker Compose

    Create the Docker Compose file:

    docker-compose.yml:

    version: '3'
    
    services:
      web:
        build:
          context: .
          dockerfile: Dockerfile.dev
        ports:
          - "5000:5000"
        volumes:
          - .:/app
        environment:
          - FLASK_ENV=development
          - FLASK_APP=app
  5. Create Entry Point Script

    scripts/entrypoint.sh:

    #!/bin/bash
    set -e
    
    # Execute command
    exec "$@"
  6. Update Application Structure

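    app/__init__.py:

    # Re-export the Flask instance so that "app:app" (gunicorn) and
    # FLASK_APP=app both resolve to it. Importing it from the submodule
    # also ensures the routes in app/app.py are registered.
    from app.app import app

    app/app.py (updated from the original):

    from flask import Flask, render_template
    
    # Templates are resolved relative to this module, i.e. app/templates/
    app = Flask(__name__)
    
    @app.route('/')
    def index():
        return render_template('index.html')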
  7. Create .dockerignore File

    .dockerignore:

    __pycache__
    *.pyc
    *.pyo
    *.pyd
    .Python
    env/
    venv/
    .venv/
    .git/
    .gitignore
    .env
    .vscode
    *.log
  8. Build and Run the Application
    # Start development environment
    docker-compose up
    
    # Access the application at http://localhost:5000

Challenge Extensions

  1. Add a database service (e.g., PostgreSQL) to Docker Compose
  2. Create a production override file (docker-compose.prod.yml)
  3. Add health checks to the containers
  4. Implement a multi-stage build for the production Dockerfile
  5. Add a Docker Compose service for running tests
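
For challenge 3, one possible approach is a small Python script that the container runs as its health check command. The sketch below uses only the standard library. The names check_health and main and the default URL are assumptions for illustration, and wiring the script into a Dockerfile HEALTHCHECK instruction or a Compose healthcheck entry is left as part of the exercise.

```python
import urllib.error
import urllib.request


def check_health(url, timeout=2.0):
    """Return True if the URL answers with an HTTP 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses
        return False


def main(url="http://localhost:5000/"):
    """Map the check to an exit code: Docker treats 0 as healthy and 1 as unhealthy."""
    return 0 if check_health(url) else 1
```

Saved as a script that ends with sys.exit(main()), this can serve as the health check command without installing curl in the image.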

This exercise will give you hands-on experience with Docker and dependency management in a Python application.

Conclusion

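Docker has fundamentally changed how we manage dependencies in Python applications. By containerizing your applications, you can eliminate "works on my machine" discrepancies, ship system-level libraries together with your Python packages, and isolate conflicting dependency versions from one another.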

As we've seen throughout this session, Docker offers powerful solutions to the complex challenge of dependency management in Python applications. Whether you're working on a simple Flask application or a complex microservices architecture, Docker provides tools and patterns that make dependency management more reliable and reproducible.

In our upcoming sessions, we'll build on this foundation as we explore more advanced Python concepts and start building real-world applications that leverage Docker for consistent environments across development and deployment.

Remember: "It works on my machine" is no longer an excuse in the Docker era!

Appendix: Docker Security Considerations

When using Docker for dependency management, security should be a top consideration:

1. Base Image Selection

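Choose trusted base images from official repositories, such as the official python images used throughout this session, and pin a specific tag (for example python:3.9-slim) rather than relying on latest.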

2. Dependency Scanning

Integrate security scanning into your workflow:

# Using safety in your Dockerfile
RUN pip install safety && \
    safety check -r requirements.txt && \
    pip uninstall -y safety

Or as part of CI/CD:

# GitLab CI example
dependency-scan:
  image: python:3.9-slim
  script:
    - pip install safety
    - safety check -r requirements.txt

3. Non-Root Users

Avoid running containers as root:

# Create a non-root user
RUN addgroup --system app && \
    adduser --system --group app

# Set ownership
RUN chown -R app:app /app

# Switch to non-root user
USER app

4. Minimal Images

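Keep images as small as possible to reduce attack surface. The production Dockerfile above does this with a slim base image, a multi-stage build that leaves compilers out of the final image, and apt-get install --no-install-recommends.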

5. Secret Management

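Never hardcode secrets in Dockerfiles or images. Supply them at runtime instead, for example through environment variables (as with ${DB_PASSWORD} in docker-compose.prod.yml) or through mounted secret files.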
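
On the application side, a small helper can read a secret at runtime rather than baking it into the image. The sketch below is one possible approach, not a prescribed API: the function name read_secret is hypothetical, and the convention of a lowercase file name and an uppercase environment variable is an assumption. Docker and Compose secrets are mounted as files under /run/secrets by default.

```python
import os
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets", default=None):
    """Look up a secret from a mounted file first, then from the environment."""
    secret_file = Path(secrets_dir) / name.lower()
    if secret_file.is_file():
        # Strip the trailing newline that editors and echo usually add
        return secret_file.read_text().strip()
    return os.environ.get(name.upper(), default)
```

With this helper, read_secret("DB_PASSWORD") returns the contents of /run/secrets/db_password when that file exists and falls back to the DB_PASSWORD environment variable otherwise.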

6. Image Signing and Verification

Consider signing your Docker images to ensure integrity:

# Sign an image with Docker Content Trust
export DOCKER_CONTENT_TRUST=1
docker push mycompany/myapp:1.0.0

7. Regular Updates

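Keep base images and dependencies updated to patch security vulnerabilities. Rebuild images regularly so they pick up patched base layers, and let a tool such as Dependabot (see the CI/CD appendix) propose dependency updates.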

8. Container Runtime Security

Secure your containers at runtime:

# Docker Compose example with security constraints
services:
  web:
    image: myapp:1.0
    read_only: true
    tmpfs:
      - /tmp
      - /var/run
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE

Appendix: CI/CD Integration for Dependency Management

Automating dependency management with CI/CD pipelines ensures consistent handling across your development workflow:

GitHub Actions Example

name: Docker CI

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.9
      
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pip-tools safety
          
      - name: Compile dependencies
        run: |
          pip-compile requirements.in
          
      - name: Check for security vulnerabilities
        run: |
          safety check -r requirements.txt
      
      - name: Build and test Docker image
        run: |
          docker build -t myapp:test .
          docker run --rm myapp:test pytest
      
      - name: Push to registry
        if: github.event_name != 'pull_request'
        run: |
          echo "${{ secrets.DOCKER_PASSWORD }}" | docker login -u "${{ secrets.DOCKER_USERNAME }}" --password-stdin
          docker tag myapp:test mycompany/myapp:${{ github.sha }}
          docker push mycompany/myapp:${{ github.sha }}

Dependency Update Automation

GitHub example with Dependabot for automatic dependency updates:

# .github/dependabot.yml
version: 2
updates:
  # Python dependencies
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
    allow:
      # Allow only direct dependencies
      - dependency-type: "direct"
    commit-message:
      prefix: "pip"
    open-pull-requests-limit: 10
    
  # Docker dependencies
  - package-ecosystem: "docker"
    directory: "/"
    schedule:
      interval: "weekly"
    commit-message:
      prefix: "docker"
    open-pull-requests-limit: 5

Matrix Testing

Test against multiple Python versions and dependency sets:

name: Matrix Testing

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # Quote the versions: unquoted 3.10 is parsed by YAML as the number 3.1
        python-version: ["3.8", "3.9", "3.10"]
        dependency-set: [minimal, latest]
    
    steps:
      - uses: actions/checkout@v2
      
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          if [ "${{ matrix.dependency-set }}" == "minimal" ]; then
            pip install -r requirements/minimal.txt
          else
            pip install -r requirements/latest.txt
          fi
      
      - name: Run tests
        run: |
          pytest

Dependency Locking in CI

Ensure dependencies are locked and up-to-date:

name: Dependency Check

on:
  push:
    paths:
      - 'requirements/**'
      - 'pyproject.toml'
      - 'Pipfile'

jobs:
  check-deps:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.9
      
      - name: Check if dependencies are up-to-date
        run: |
          # For pip-tools: recompile and fail if the committed lock file changes
          pip install pip-tools
          pip-compile --quiet requirements.in
          git diff --exit-code requirements.txt
          
          # OR for Poetry (1.6 and later)
          # pip install poetry
          # poetry check --lock
          
          # OR for Pipenv
          # pip install pipenv
          # pipenv verify
          # pipenv requirements > requirements.txt
          # pipenv requirements --dev-only > dev-requirements.txt
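
Alongside a lock-file check, a CI step can also verify that every requirement is pinned to an exact version. The following is a minimal sketch using only the standard library. The function name find_unpinned is hypothetical, and the parsing is deliberately simplified: it skips option lines such as "-r base.txt" and does not handle URLs or every form that pip accepts.

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+")


def find_unpinned(requirements_text):
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options like "-r base.txt"
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

A CI job could read each file under requirements/, call find_unpinned on its contents, and fail the build when the returned list is not empty.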

Appendix: Monitoring Dependencies in Production

Once your application is deployed, ongoing monitoring of dependencies is crucial:

1. Dependency Tracking

Export a manifest of installed packages in production:

# Add to your container startup
pip freeze > /app/installed_packages.txt

# Or for more detail
pip list --format=json > /app/package_details.json
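
The same manifest can be produced from inside the application with the standard library module importlib.metadata (Python 3.8 and later), which is useful when pip itself is not available in the runtime image. This is a sketch under that assumption; the function names installed_packages and write_manifest are hypothetical.

```python
import json
from importlib import metadata


def installed_packages():
    """Return a sorted list of {"name", "version"} dicts for the current environment."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that lack a Name field
            packages[name.lower()] = dist.version
    return [{"name": n, "version": v} for n, v in sorted(packages.items())]


def write_manifest(path):
    """Write the package list as JSON, similar in spirit to pip list --format=json."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(installed_packages(), fh, indent=2)
```

Calling write_manifest("/app/package_details.json") at startup records what is actually installed in the running container.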

2. Security Monitoring

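Continuously monitor for vulnerabilities, for example by running safety check on a schedule against the requirements of the deployed version rather than only at build time.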

3. Dependency Visualization

Visualize dependency relationships to understand impact:

# Install pipdeptree
pip install pipdeptree

# Generate visualization (graph output requires Graphviz to be installed)
pipdeptree --graph-output png > dependencies.png

# OR generate in DOT format for further processing
pipdeptree --graph-output dot > dependencies.dot

4. Automating Dependency Updates

Create automated processes for safe dependency updates:

  1. Automatically create PRs for dependency updates
  2. Run test suite against the updates
  3. Deploy to staging environment
  4. Perform integration tests
  5. Promote to production if successful

5. Dependency Drift Detection

Detect when actual installed dependencies differ from expected:

#!/bin/bash
# check_drift.sh: script to check for dependency drift

# Generate current dependencies
pip freeze | sort > current_deps.txt

# Compare with requirements (sorted so that line order alone is not reported as drift)
sort requirements.txt | diff -u - current_deps.txt > drift.patch

if [ -s drift.patch ]; then
    echo "Warning: Dependency drift detected"
    cat drift.patch
    exit 1
else
    echo "All dependencies match requirements"
    exit 0
fi
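
A plain diff is sensitive to name casing and to packages that were installed but never listed. As an alternative, the comparison can be sketched in Python with the standard library. The function names below are hypothetical, and the sketch makes simplifying assumptions: only "name==version" lines are considered, versions are compared as strings, and installed packages that are not pinned are ignored.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a project name as PEP 503 does (case and -, _, . are equivalent)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(requirements_text):
    """Collect {name: version} from 'name==version' lines, ignoring everything else."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop comments and environment markers before looking for the pin
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" in line and not line.startswith("-"):
            name, version = line.split("==", 1)
            name = name.split("[", 1)[0]  # drop extras such as [binary]
            pins[normalize(name.strip())] = version.strip()
    return pins


def installed_versions():
    """Map normalized names of installed distributions to their versions."""
    return {
        normalize(d.metadata["Name"]): d.version
        for d in metadata.distributions()
        if d.metadata["Name"]
    }


def find_drift(pins, installed):
    """Return {name: (expected, actual)} for pins that are missing or differ."""
    return {
        name: (expected, installed.get(name))
        for name, expected in pins.items()
        if installed.get(name) != expected
    }
```

A check script could read requirements.txt, call find_drift(parse_pins(text), installed_versions()), and exit with a non-zero status when the result is not empty.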

Additional Resources

Documentation

Tools

Articles and Guides

Books