Introduction to Docker and Dependency Management
Welcome to our session on Docker and dependency management! This morning, we'll explore how Docker containers can change the way you manage dependencies in Python projects. We've already covered virtual environments; Docker extends isolation and reproducibility to the operating-system level, addressing many limitations of Python-only environments.
Think of Docker as not just a tool, but a paradigm shift in how we approach development environments. If a virtual environment is like having a separate toolbox for each project, Docker is like having a complete, isolated workshop for each project—with its own tools, materials, and even operating system.
Quick Docker Review
Let's briefly recap what we learned about Docker earlier this week:
- Containers: Lightweight, isolated environments that package an application and its dependencies
- Images: Read-only templates used to create containers
- Dockerfile: Instructions for building a Docker image
- Docker Compose: Tool for defining and running multi-container applications
Today, we'll focus specifically on how Docker addresses the complex challenge of dependency management in Python applications.
The Dependency Management Challenge
Before diving into Docker solutions, let's understand the key challenges in Python dependency management:
1. The "Works on My Machine" Problem
Despite using virtual environments, differences in operating systems and system libraries can still cause inconsistent behavior.
Example: A library that depends on a C extension might compile differently on macOS vs. Linux, or fail entirely on Windows.
2. System-Level Dependencies
Python packages often rely on system libraries that virtual environments don't manage.
Example: Packages like Pillow (imaging), NumPy (with optimized BLAS), or psycopg2 (PostgreSQL) require specific system libraries installed.
3. Version Conflicts and Dependency Hell
Complex applications with many dependencies can lead to irreconcilable version conflicts.
Example: Package A requires library X version 1.x, while Package B requires library X version 2.x.
4. Environment Parity
Development, testing, and production environments need to be identical to avoid surprises.
Example: Your application works in development but fails in production due to subtle differences in environment configuration.
5. Onboarding Friction
New team members often spend days setting up a development environment with the right versions of everything.
Example: A new developer joins and spends their first week just trying to get the app running locally.
Real-World Analogy: Traditional dependency management is like giving someone a list of ingredients for a complex recipe—they might get slightly different brands, qualities, or preparations. Docker is like delivering a pre-measured, ready-to-mix ingredient kit with exactly what's needed.
How Docker Solves Dependency Challenges
1. Complete Environment Isolation
Docker containers include not just Python packages, but the entire runtime environment:
- Specific Python version
- System libraries and dependencies
- Environment variables
- File system state
2. Consistent Environments Everywhere
The same image produces the same environment on any host with Docker installed and a compatible CPU architecture, which removes most causes of the "works on my machine" problem.
3. Declarative Configuration
A Dockerfile declares exactly what's in your environment, making it self-documenting and version-controllable.
4. Layered Caching
Docker's layer caching makes rebuilding environments fast after small changes to dependencies.
5. Isolation Without Performance Penalty
Containers share the host kernel instead of emulating hardware, so on Linux hosts they run at near-native speed while maintaining isolation.
6. Service Composition
Docker Compose lets you define and run multi-container applications (e.g., your app plus its database, cache, etc.).
Real-World Analogy: If traditional development is like each chef preparing ingredients differently, Docker is like a professional kitchen where every station is identical and prepared exactly to specification—the same tools, same ingredients, same procedures, no matter who's cooking.
Dockerfile Deep Dive for Python Applications
The Dockerfile is the blueprint for your application's environment. Let's examine the key components for Python applications:
Base Image Selection
Choosing the right base image is crucial for Python applications:
# Official Python images (pick one; Dockerfile comments must be on their own line)
# Full Debian-based image with common build tools and libraries
FROM python:3.9
# Smaller image, fewer system packages
FROM python:3.9-slim
# Minimal Alpine-based image, smallest size
FROM python:3.9-alpine
# Alternative: start from a specific OS image and install Python yourself
FROM ubuntu:20.04
Decision Factors:
| Image Type | Pros | Cons | Best For |
|---|---|---|---|
| python:3.x | Complete, works with most packages | Large size (~900MB) | Complex applications with C extensions |
| python:3.x-slim | Smaller (~150MB), most libraries work | May need extra packages for some extensions | Web applications, good default choice |
| python:3.x-alpine | Very small (~50MB) | Uses musl instead of glibc, compilation issues | Microservices, simple scripts, constrained environments |
Working Directory
Set a dedicated working directory for your application:
# Create and set working directory
WORKDIR /app
Dependency Installation
Efficiently installing Python dependencies with pip:
# Copy just the requirements file first (for better caching)
COPY requirements.txt .
# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt
Advanced Technique: Use multi-stage builds to keep compilers and other build tools out of the final image:
# Build stage
FROM python:3.9-slim AS builder
# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential gcc
WORKDIR /build
COPY requirements.txt .
# Build wheels for every requirement and its sub-dependencies
# (adding --no-deps is only safe when requirements.txt pins the full dependency tree)
RUN pip wheel --no-cache-dir --wheel-dir /build/wheels -r requirements.txt
# Final stage
FROM python:3.9-slim
WORKDIR /app
# Copy built wheels and install
COPY --from=builder /build/wheels /wheels
COPY --from=builder /build/requirements.txt .
# Install from the local wheels only, then remove them in the same layer
# (a separate RUN rm -rf would not shrink the image)
RUN pip install --no-cache-dir --no-index --find-links=/wheels -r requirements.txt \
&& rm -rf /wheels
System Dependencies
Installing system packages for common Python libraries:
# Example for Debian-based images (including python:slim)
RUN apt-get update && apt-get install -y --no-install-recommends \
# for Pillow
libjpeg-dev zlib1g-dev libpng-dev \
# for psycopg2
libpq-dev \
# for lxml
libxml2-dev libxslt-dev \
# Cleanup to reduce image size
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
Alpine Example:
# Example for Alpine images
RUN apk add --no-cache \
jpeg-dev zlib-dev libpng-dev \
postgresql-dev \
libxml2-dev libxslt-dev
Application Code
Copying application code and setting up the container:
# Copy application code
COPY . .
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1
# Expose port
EXPOSE 8000
# Run the application
CMD ["gunicorn", "app:app", "--bind", "0.0.0.0:8000"]
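The `app:app` argument in the CMD above asks gunicorn to import a module named `app` and serve the WSGI callable named `app` inside it. The sketch below is a minimal stand-in for such a module, not the course's Flask application; the response text is made up for illustration, and it uses only the standard library.

```python
# app.py: a minimal WSGI callable of the kind "gunicorn app:app" expects.
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Answer every request with a short plain-text body."""
    body = b"hello from the container\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_once(wsgi_app):
    """Invoke a WSGI app with a fake request and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid WSGI environ
    captured = {}

    def start_response(status, response_headers):
        captured["status"] = status

    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], body


print(call_once(app))
```

A Flask application object is itself a WSGI callable, which is why the same `module:callable` form works for it.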
User Setup
Running as a non-root user for security:
# Create a non-root user
RUN addgroup --system app && adduser --system --group app
# Set ownership
RUN chown -R app:app /app
# Switch to non-root user
USER app
A Complete Example
Bringing it all together for a Flask application:
# syntax=docker/dockerfile:1
FROM python:3.9-slim
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
PIP_NO_CACHE_DIR=1 \
PIP_DISABLE_PIP_VERSION_CHECK=1
# Create and set working directory
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
libpq-dev \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Create non-root user
RUN addgroup --system app && adduser --system --group app
# Copy application code
COPY . .
# Set ownership
RUN chown -R app:app /app
# Switch to non-root user
USER app
# Expose port
EXPOSE 8000
# Run the application
CMD ["gunicorn", "wsgi:app", "--bind", "0.0.0.0:8000"]
Managing Dependencies with Docker Compose
Docker Compose is extremely powerful for managing service dependencies in your application, like databases, caches, and other services.
Basic Compose File Structure
version: '3'
services:
  web:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - .:/app
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/mydatabase
    depends_on:
      - db
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_DB=mydatabase
volumes:
  postgres_data:
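Inside the web container, the application reads `DATABASE_URL` from its environment; the host name `db` is the Compose service name, which resolves on the project's network. As a rough sketch of the consuming side, the following splits that URL with the standard library. The function name `parse_database_url` and the fallback value are assumptions for illustration; a real project would usually hand the URL straight to SQLAlchemy or the database driver.

```python
import os
from urllib.parse import urlsplit


def parse_database_url(url):
    """Split a postgresql:// URL into the pieces a driver usually needs."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 5432,  # PostgreSQL's default port
        "dbname": parts.path.lstrip("/"),
    }


# The fallback mirrors the value in the Compose file; in the container,
# Compose sets the real variable.
DEFAULT_URL = "postgresql://postgres:postgres@db:5432/mydatabase"
settings = parse_database_url(os.environ.get("DATABASE_URL", DEFAULT_URL))
print(settings["host"], settings["port"], settings["dbname"])
```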
Development-Focused Compose Configuration
For development, you'll want hot reloading and easier debugging:
version: '3'
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.dev # Development-specific Dockerfile
    # Development server; FLASK_ENV=development (set below) turns on debug mode and auto-reload.
    # The --debug flag on "flask run" needs a newer Flask release than the 2.0.x pinned in this lesson.
    command: flask run --host=0.0.0.0 --port=8000
    ports:
      - "8000:8000"
    volumes:
      - .:/app # Mount current directory for live code changes
    environment:
      - FLASK_ENV=development
      - FLASK_APP=app.py
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/mydatabase
    depends_on:
      - db
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_DB=mydatabase
    ports:
      - "5432:5432" # Expose port for local database tools
  # Additional development services
  adminer: # Database management UI
    image: adminer:latest
    ports:
      - "8080:8080"
    depends_on:
      - db
volumes:
  postgres_data:
Multiple Environment Configurations
You can create different Compose files for different environments:
Base configuration: docker-compose.yml
version: '3'
services:
  web:
    build: .
    depends_on:
      - db
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
volumes:
  postgres_data:
Development overrides: docker-compose.override.yml (applied automatically)
version: '3'
services:
  web:
    # FLASK_ENV=development (set below) turns on debug mode and auto-reload
    command: flask run --host=0.0.0.0 --port=8000
    ports:
      - "8000:8000"
    volumes:
      - .:/app
    environment:
      - FLASK_ENV=development
      - FLASK_APP=app.py
  db:
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_DB=mydatabase
Production overrides: docker-compose.prod.yml
version: '3'
services:
  web:
    restart: always
    command: gunicorn wsgi:app --bind 0.0.0.0:8000 --workers 4
    expose:
      - "8000"
    environment:
      - FLASK_ENV=production
    volumes:
      - static_data:/app/static
      - media_data:/app/media
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
      - static_data:/static
      - media_data:/media
    depends_on:
      - web
  db:
    restart: always
    environment:
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_USER=${DB_USER}
      - POSTGRES_DB=${DB_NAME}
volumes:
  static_data:
  media_data:
Running with a specific configuration:
# Development (default)
docker-compose up
# Production
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up
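Passing several `-f` files layers each one on top of the previous. As a simplified mental model only (Compose's real rules are more detailed, for example it merges `environment` entries by variable name), the sketch below merges nested mappings, replaces single values, and appends list entries. The `merge` function and the sample dictionaries are illustrative assumptions, not Compose's implementation.

```python
def merge(base, override):
    """Simplified model of layering one Compose file on top of another."""
    if isinstance(base, dict) and isinstance(override, dict):
        merged = dict(base)
        for key, value in override.items():
            merged[key] = merge(base[key], value) if key in base else value
        return merged
    if isinstance(base, list) and isinstance(override, list):
        # Append override entries, skipping exact duplicates.
        return base + [item for item in override if item not in base]
    # Single values (image, command, restart, ...) are replaced.
    return override


base = {
    "services": {
        "web": {"build": ".", "depends_on": ["db"]},
        "db": {"image": "postgres:13"},
    }
}
prod = {
    "services": {
        "web": {"restart": "always", "expose": ["8000"]},
        "nginx": {"image": "nginx:alpine", "depends_on": ["web"]},
    }
}

combined = merge(base, prod)
print(sorted(combined["services"]))
print(combined["services"]["web"])
```

To inspect what Compose actually produces for a set of files, `docker-compose -f docker-compose.yml -f docker-compose.prod.yml config` prints the merged result.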
Dependencies Beyond Services
Docker Compose can also help manage:
- Initialization Scripts: Run database migrations or seed data
- Scheduled Tasks: Add services for Celery workers or cron jobs
- Development Tools: Include services for documentation, testing, or monitoring
# Example with additional components
services:
  # ...other services...
  # Worker for background tasks
  worker:
    build: .
    command: celery -A app.tasks worker --loglevel=info
    volumes:
      - .:/app
    depends_on:
      - web
      - redis
      - db
  # Redis for task queue and caching
  redis:
    image: redis:alpine
    volumes:
      - redis_data:/data
  # Scheduled task processor
  scheduler:
    build: .
    command: celery -A app.tasks beat --loglevel=info
    volumes:
      - .:/app
    depends_on:
      - worker
      - redis
volumes:
  redis_data:
Python Dependency Management Strategies in Docker
Pinning Versions
Always pin dependency versions for reproducible builds:
# requirements.txt with pinned versions
flask==2.0.1
sqlalchemy==1.4.23
psycopg2-binary==2.9.1
gunicorn==20.1.0
Warning: Be careful with using pip freeze directly, as it will include all sub-dependencies, which can cause problems when updating packages.
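One way to keep the pinning rule honest is a small check, run locally or in CI, that flags any requirement line without an exact `==` version. This is a minimal sketch: the function name `unpinned_requirements` is made up for illustration, and the pattern deliberately ignores environment markers and URL requirements, which it would report as unpinned.

```python
import re

# name, optional [extras], then an exact == pin
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned with '=='."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


sample = """\
flask==2.0.1
sqlalchemy>=1.4.0
psycopg2-binary
gunicorn==20.1.0  # production server
"""
print(unpinned_requirements(sample))
```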
Requirements Files Organization
Splitting requirements files by environment or purpose:
requirements/
├── base.txt # Core dependencies
├── development.txt
├── production.txt
└── testing.txt
Example of requirements/base.txt:
# Core dependencies with pinned versions
flask==2.0.1
sqlalchemy==1.4.23
psycopg2-binary==2.9.1
Example of requirements/development.txt:
# Include base requirements
-r base.txt
# Development-specific packages
pytest==6.2.5
black==21.9b0
flake8==3.9.2
ipython==7.27.0
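pip resolves a nested `-r base.txt` relative to the file that contains it, so `base.txt` has to sit next to `development.txt` wherever the install runs. The sketch below mimics that lookup on throwaway files to make the behavior concrete; `expand_requirements` is a hypothetical helper that handles only the short `-r` form, not pip's own parser.

```python
import tempfile
from pathlib import Path


def expand_requirements(path):
    """Return requirement lines from `path`, following nested -r includes."""
    path = Path(path)
    lines = []
    for raw in path.read_text().splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.startswith("-r "):
            # Like pip, resolve the include relative to the including file.
            lines.extend(expand_requirements(path.parent / line[3:].strip()))
        else:
            lines.append(line)
    return lines


# Demo with temporary files that mirror the requirements/ layout.
with tempfile.TemporaryDirectory() as tmp:
    req_dir = Path(tmp) / "requirements"
    req_dir.mkdir()
    (req_dir / "base.txt").write_text("flask==2.0.1\nsqlalchemy==1.4.23\n")
    (req_dir / "development.txt").write_text("-r base.txt\npytest==6.2.5\n")
    resolved = expand_requirements(req_dir / "development.txt")

print(resolved)
```

If `base.txt` is missing from the directory, the helper raises `FileNotFoundError`, which corresponds to pip failing when only one of the files has been copied into the image.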
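In your Dockerfile, use the appropriate requirements file. Copy the whole requirements/ directory, because pip resolves the -r base.txt include relative to the file that contains it:
COPY requirements/ requirements/
# For development
RUN pip install -r requirements/development.txt
# For production (use this instead of the development line)
RUN pip install -r requirements/production.txt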
Using pip-tools
pip-tools helps manage dependencies more precisely:
Create a requirements.in file with your direct dependencies:
# requirements.in - Direct dependencies only
flask>=2.0.0
sqlalchemy>=1.4.0
psycopg2-binary
Compile it to a pinned requirements.txt:
pip-compile requirements.in
This generates a requirements.txt with all dependencies (including sub-dependencies) pinned:
# requirements.txt
#
# This file is autogenerated by pip-compile
#
click==8.0.1
# via flask
flask==2.0.1
# via -r requirements.in
itsdangerous==2.0.1
# via flask
jinja2==3.0.1
# via flask
markupsafe==2.0.1
# via jinja2
psycopg2-binary==2.9.1
# via -r requirements.in
sqlalchemy==1.4.23
# via -r requirements.in
werkzeug==2.0.1
# via flask
In your Dockerfile:
# Install pip-tools
RUN pip install pip-tools
# Copy requirements files
COPY requirements.in .
COPY requirements.txt .
# Install dependencies
RUN pip-sync
Poetry for Modern Dependency Management
Poetry offers a more modern approach to dependency management:
Create a pyproject.toml file:
[tool.poetry]
name = "myapp"
version = "0.1.0"
description = "My awesome app"
authors = ["Your Name "]
[tool.poetry.dependencies]
python = "^3.9"
flask = "^2.0.1"
sqlalchemy = "^1.4.23"
psycopg2-binary = "^2.9.1"
gunicorn = "^20.1.0"
[tool.poetry.dev-dependencies]
pytest = "^6.2.5"
black = "^21.9b0"
flake8 = "^3.9.2"
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
Dockerfile for Poetry:
FROM python:3.9-slim
# Install dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
build-essential \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Install Poetry
RUN curl -sSL https://install.python-poetry.org | python3 -
# Add Poetry to PATH
ENV PATH="${PATH}:/root/.local/bin"
# Set working directory
WORKDIR /app
# Copy Poetry configuration
COPY pyproject.toml poetry.lock* ./
# Configure Poetry not to use virtual environments
RUN poetry config virtualenvs.create false
# Install dependencies only; --no-root skips installing the project itself, whose code has not been copied yet
RUN poetry install --no-interaction --no-ansi --no-root
# Copy application code
COPY . .
# Run the application
CMD ["gunicorn", "wsgi:app", "--bind", "0.0.0.0:8000"]
Pipenv Approach
Pipenv is another option for managing dependencies:
With Pipenv, you use Pipfile and Pipfile.lock:
# Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
flask = "*"
sqlalchemy = "*"
psycopg2-binary = "*"
gunicorn = "*"
[dev-packages]
pytest = "*"
black = "*"
flake8 = "*"
[requires]
python_version = "3.9"
Dockerfile for Pipenv:
FROM python:3.9-slim
# Install dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Install pipenv
RUN pip install pipenv
# Set working directory
WORKDIR /app
# Copy Pipfile and Pipfile.lock
COPY Pipfile Pipfile.lock ./
# Install dependencies system-wide (no virtualenv)
RUN pipenv install --system --deploy
# Copy application code
COPY . .
# Run the application
CMD ["gunicorn", "wsgi:app", "--bind", "0.0.0.0:8000"]
Development vs. Production Docker Configurations
Different environments have different requirements for dependency handling:
Development Environment Priorities
- Quick iteration: Fast rebuilds when dependencies change
- Debugging tools: Include development packages
- Code synchronization: Live code reloading
- Local persistence: Data should persist between container rebuilds
Production Environment Priorities
- Security: Minimal attack surface, no dev tools
- Performance: Optimized dependencies and configurations
- Reliability: Stable, pinned versions of everything
- Size: Smaller images for faster deployments
Development Dockerfile (Dockerfile.dev)
FROM python:3.9
# Install development tools and dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
libpq-dev \
default-libmysqlclient-dev \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /app
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
FLASK_ENV=development \
FLASK_DEBUG=1
# Install dev dependencies first (development.txt includes base.txt via -r, so copy the whole directory)
COPY requirements/ requirements/
RUN pip install -r requirements/development.txt
# During development, we'll mount the code as a volume
# so we don't need to copy it here
# Default command for development
CMD ["flask", "run", "--host=0.0.0.0", "--port=8000"]
Production Dockerfile (Dockerfile or Dockerfile.prod)
FROM python:3.9-slim AS builder
# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
libpq-dev \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /build
# Copy requirements (production.txt includes base.txt via -r, so copy the whole directory)
COPY requirements/ requirements/
# Build wheels for all dependencies, including sub-dependencies
RUN pip wheel --no-cache-dir --wheel-dir /build/wheels -r requirements/production.txt
# Final stage
FROM python:3.9-slim
# Set working directory
WORKDIR /app
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
FLASK_ENV=production
# Install runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
libpq5 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Copy built wheels and install, then remove them in the same layer
COPY --from=builder /build/wheels /wheels
COPY --from=builder /build/requirements /tmp/requirements
RUN pip install --no-cache-dir --no-index --find-links=/wheels -r /tmp/requirements/production.txt \
&& rm -rf /wheels /tmp/requirements
# Create non-root user
RUN addgroup --system app && adduser --system --group app
# Copy application code
COPY . .
# Set ownership
RUN chown -R app:app /app
# Switch to non-root user
USER app
# Run the application
CMD ["gunicorn", "wsgi:app", "--bind", "0.0.0.0:8000", "--workers", "4"]
Using Different Configurations
For development:
docker build -f Dockerfile.dev -t myapp:dev .
docker run -p 8000:8000 -v $(pwd):/app myapp:dev
Or with Docker Compose:
docker-compose -f docker-compose.dev.yml up
For production:
docker build -t myapp:prod .
docker run -p 8000:8000 myapp:prod
Or with Docker Compose:
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d
Advanced Dependency Management Techniques
1. Layer Caching Optimization
Docker builds images in layers. Optimize your Dockerfile to leverage caching for dependencies:
# BAD: Changes to code trigger reinstallation of all dependencies
COPY . .
RUN pip install -r requirements.txt
# GOOD: Dependencies only reinstalled when requirements change
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
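To see why the order matters, here is a deliberately simplified model of the build cache: each layer's key depends on its parent's key, the instruction text and, for COPY, the contents of the copied files. Once one key changes, every later layer is rebuilt. The function names and file contents are illustrative assumptions; Docker's real cache also considers file metadata and other details.

```python
import hashlib


def layer_keys(instructions, file_contents):
    """Return one cache key per instruction in a simplified cache model."""
    keys, parent = [], ""
    for instruction, sources in instructions:
        digest = hashlib.sha256(parent.encode())
        digest.update(instruction.encode())
        for name in sources:  # only COPY steps list source files
            digest.update(file_contents[name].encode())
        parent = digest.hexdigest()
        keys.append(parent)
    return keys


def reused_layers(instructions, before, after):
    """List the instructions whose cache key survives the change."""
    old = layer_keys(instructions, before)
    new = layer_keys(instructions, after)
    return [text for (text, _), a, b in zip(instructions, old, new) if a == b]


bad = [
    ("COPY . .", ["requirements.txt", "app.py"]),
    ("RUN pip install -r requirements.txt", []),
]
good = [
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY . .", ["requirements.txt", "app.py"]),
]

before = {"requirements.txt": "flask==2.0.1\n", "app.py": "print('v1')\n"}
after = {**before, "app.py": "print('v2')\n"}  # only application code changed

print("bad order reuses:", reused_layers(bad, before, after))
print("good order reuses:", reused_layers(good, before, after))
```

In the bad ordering nothing is reused after a code edit, while in the good ordering the copy of `requirements.txt` and the `pip install` layer both stay cached.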
2. Build Arguments for Environment Selection
Use build arguments to selectively install dependencies:
# Dockerfile
ARG ENVIRONMENT=production
# Base dependencies
COPY requirements/base.txt .
RUN pip install -r base.txt
# Environment-specific dependencies
COPY requirements/${ENVIRONMENT}.txt ./requirements-env.txt
RUN if [ -s requirements-env.txt ]; then pip install -r requirements-env.txt; fi
# Build with:
# docker build --build-arg ENVIRONMENT=development -t myapp:dev .
3. Scanning for Vulnerabilities
Regularly scan dependencies for security vulnerabilities:
# Install safety
RUN pip install safety
# Run security check
RUN safety check -r requirements.txt
Or as part of CI/CD:
stage: security
script:
- pip install safety
- safety check -r requirements.txt
4. Custom Package Indexes
For organizations with private packages:
# Using private PyPI server
RUN pip install --index-url https://pypi.example.com/simple/ \
--extra-index-url https://pypi.org/simple \
-r requirements.txt
5. Vendoring Dependencies
For air-gapped environments or strict controls:
# Download dependencies (on a connected machine)
pip download -d ./vendor -r requirements.txt
# In Dockerfile
COPY vendor /vendor
RUN pip install --no-index --find-links=/vendor -r requirements.txt
6. Dependency Health Checks
Verify dependency compatibility in the container:
# In the Dockerfile: fail the build if installed packages have conflicting requirements
RUN pip check
# In CI pipeline
docker run --rm myapp:latest pip check
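`pip check` verifies that installed packages satisfy each other's declared requirements. A complementary check, sketched here under the assumption that your pins are available as a simple name-to-version mapping, compares those pins with what is actually installed in the container. The helper name `pin_mismatches` is hypothetical; `importlib.metadata` is part of the standard library from Python 3.8.

```python
from importlib import metadata


def pin_mismatches(pins):
    """Compare {'name': 'version'} pins with the installed distributions.

    Returns {name: (wanted, installed)} for every mismatch; `installed`
    is None when the package is not installed at all.
    """
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            problems[name] = (wanted, installed)
    return problems


# Pins taken from the requirements.txt example; results depend on the
# environment this runs in.
print(pin_mismatches({"flask": "2.0.1", "gunicorn": "20.1.0"}))
```

Run inside the image (for example with `docker run --rm myapp:latest python check_pins.py`, where the script name is an assumption), an empty dictionary means every pin matches.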
7. Managing Native Dependencies
Handling packages with complex C extensions:
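# For packages requiring specific compilation flags.
# The flags only take effect when pip compiles from source; prebuilt wheels ignore them.
# Avoid -march=native in images: it ties the binaries to the CPU of the build host.
ENV CFLAGS="-O3"
RUN pip install numpy scipy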
8. Shrinking Final Images
Reduce image size by skipping pip's download cache and deleting leftover bytecode caches:
# Remove pip cache and unnecessary files
RUN pip install --no-cache-dir -r requirements.txt \
&& find /usr/local -type d -name __pycache__ -exec rm -rf {} +
Practical Workflows for Development
Local Development Workflow
A practical workflow for day-to-day development:
- Initial Setup
git clone https://github.com/example/myapp.git
cd myapp
docker-compose up -d
- Adding a New Dependency
# Add to requirements.txt
echo "new-package==1.0.0" >> requirements.txt
# Rebuild the container
docker-compose build
docker-compose up -d
- Development Cycle
# Edit code locally (changes are synced via volume mount)
# View logs
docker-compose logs -f
# Run tests
docker-compose exec web pytest
# Restart after major changes
docker-compose restart web
- Database Migrations
# Create migration
docker-compose exec web flask db migrate -m "Add new table"
# Apply migration
docker-compose exec web flask db upgrade
- Cleanup
# Stop containers
docker-compose down
# Remove volumes (caution: destroys data)
docker-compose down -v
Working with Multiple Projects
When juggling multiple projects:
- Unique Naming
# Use project-specific container names
# docker-compose.yml
services:
  web:
    container_name: myapp_web
- Port Allocation
# Assign different ports to different projects
# Project A
services:
  web:
    ports:
      - "8000:8000"
  db:
    ports:
      - "5432:5432"
# Project B
services:
  web:
    ports:
      - "8001:8000"
  db:
    ports:
      - "5433:5432"
- Resource Constraints
# Limit resources to prevent one project from consuming everything
services:
  web:
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 500M
Advanced Development Conveniences
Make development more convenient with these techniques:
- Development Helper Scripts
# In your package.json or Makefile
"scripts": {
  "start": "docker-compose up",
  "test": "docker-compose exec web pytest",
  "shell": "docker-compose exec web bash",
  "db-shell": "docker-compose exec db psql -U postgres",
  "logs": "docker-compose logs -f",
  "migrate": "docker-compose exec web flask db upgrade"
}
- Debugger Integration
# Allow remote debugging
# docker-compose.yml
services:
  web:
    ports:
      - "8000:8000"
      - "5678:5678" # Port for debugger
    command: python -m debugpy --listen 0.0.0.0:5678 -m flask run --host=0.0.0.0 --port=8000
- Hot Reloading
# Use development server with debug mode
# For Flask
ENV FLASK_ENV=development
# For Django
ENV DEBUG=1
Managing Complex Environments
Microservices Architecture
When your application consists of multiple services:
# docker-compose.yml for microservices
version: '3'
services:
  auth-service:
    build: ./auth-service
    environment:
      - DB_HOST=db
      - REDIS_HOST=redis
    depends_on:
      - db
      - redis
  user-service:
    build: ./user-service
    environment:
      - DB_HOST=db
      - AUTH_SERVICE_URL=http://auth-service:8000
    depends_on:
      - db
      - auth-service
  product-service:
    build: ./product-service
    environment:
      - DB_HOST=db
    depends_on:
      - db
  api-gateway:
    build: ./api-gateway
    ports:
      - "8000:8000"
    environment:
      - AUTH_SERVICE_URL=http://auth-service:8000
      - USER_SERVICE_URL=http://user-service:8000
      - PRODUCT_SERVICE_URL=http://product-service:8000
    depends_on:
      - auth-service
      - user-service
      - product-service
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
  redis:
    image: redis:alpine
    volumes:
      - redis_data:/data
volumes:
  postgres_data:
  redis_data:
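The `depends_on` entries in this file form a dependency graph, and Compose starts services in an order that respects it. The sketch below derives one such order with the standard library's `graphlib` (Python 3.9+). It only models ordering: plain `depends_on` starts containers in sequence but does not wait for a service to be ready to accept connections.

```python
from graphlib import TopologicalSorter

# Each service maps to the services it depends on, copied from the file above.
depends_on = {
    "auth-service": {"db", "redis"},
    "user-service": {"db", "auth-service"},
    "product-service": {"db"},
    "api-gateway": {"auth-service", "user-service", "product-service"},
    "db": set(),
    "redis": set(),
}

# static_order() yields every service after all of its dependencies.
start_order = list(TopologicalSorter(depends_on).static_order())
print(start_order)
```

Several valid orders exist (for example, `db` and `redis` can start in either order), so the exact output may vary; `api-gateway` always comes last because it depends, directly or indirectly, on everything else.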
Hybrid Virtual Environment + Docker Approach
For some teams, combining virtual environments and Docker works well:
- Local Development: Use virtual environments for fast iteration
- Integration Testing: Use Docker Compose for complete environment
- CI/CD Pipeline: Use Docker for consistent builds
- Production: Deploy Docker containers
Script to synchronize between virtual env and Docker:
#!/bin/bash
# sync_deps.sh
# Update requirements.txt from virtual environment
if [[ "$1" == "freeze" ]]; then
echo "Updating requirements.txt from virtual environment..."
pip freeze > requirements.txt
echo "Done."
# Install from requirements.txt to virtual environment
elif [[ "$1" == "install" ]]; then
echo "Installing requirements.txt to virtual environment..."
pip install -r requirements.txt
echo "Done."
# Rebuild Docker containers with new requirements
elif [[ "$1" == "docker" ]]; then
echo "Rebuilding Docker containers..."
docker-compose build
echo "Done."
# Freeze the virtual environment, then rebuild Docker containers from the result
elif [[ "$1" == "sync" ]]; then
echo "Syncing between virtual environment and Docker..."
pip freeze > requirements.txt
docker-compose build
echo "Done."
else
echo "Usage: $0 [freeze|install|docker|sync]"
echo " freeze: Update requirements.txt from virtual environment"
echo " install: Install from requirements.txt to virtual environment"
echo " docker: Rebuild Docker containers with current requirements.txt"
echo " sync: Update requirements.txt and rebuild Docker containers"
exit 1
fi
Handling Large Dependencies
Some Python packages (like ML libraries) are extremely large. Strategies for managing them:
- Selective Installation
# Install only what you need (tensorflow-cpu instead of full tensorflow)
RUN pip install tensorflow-cpu
- Pre-built Images
# Use specialized base images
FROM tensorflow/tensorflow:2.6.0
# Now you have TensorFlow pre-installed
- Layer Caching
# Split requirements by change frequency
COPY requirements-stable.txt .
RUN pip install -r requirements-stable.txt
# More frequently changing deps later
COPY requirements-changing.txt .
RUN pip install -r requirements-changing.txt
Real-World Example: Flask Application with Comprehensive Dependency Management
Let's bring everything together with a complete example of a Flask application that demonstrates best practices for dependency management with Docker.
Project Structure
flask_app/
├── .dockerignore # Files to exclude from Docker build
├── .gitignore # Files to exclude from Git
├── Dockerfile # Production Dockerfile
├── Dockerfile.dev # Development Dockerfile
├── README.md # Project documentation
├── docker-compose.yml # Base Docker Compose config
├── docker-compose.override.yml # Development overrides
├── docker-compose.prod.yml # Production overrides
├── app/ # Application code
│ ├── __init__.py
│ ├── config.py # Configuration
│ ├── models.py # Database models
│ ├── routes.py # API routes
│ ├── templates/ # Jinja2 templates
│ └── static/ # Static assets
├── migrations/ # Database migrations
├── requirements/
│ ├── base.txt # Base dependencies
│ ├── development.txt # Development dependencies
│ └── production.txt # Production dependencies
├── scripts/
│ ├── entrypoint.sh # Docker entrypoint script
│ └── start-dev.sh # Development startup script
└── tests/ # Application tests
Dependency Files
requirements/base.txt
# Core dependencies
Flask==2.0.1
Flask-SQLAlchemy==2.5.1
Flask-Migrate==3.1.0
SQLAlchemy==1.4.23
psycopg2-binary==2.9.1
gunicorn==20.1.0
python-dotenv==0.19.0
werkzeug==2.0.1
click==8.0.1
itsdangerous==2.0.1
jinja2==3.0.1
markupsafe==2.0.1
requirements/development.txt
# Include base requirements
-r base.txt
# Development packages
pytest==6.2.5
pytest-cov==2.12.1
black==21.8b0
flake8==3.9.2
ipython==7.27.0
debugpy==1.4.1
requirements/production.txt
# Include base requirements
-r base.txt
# Production packages
sentry-sdk==1.3.1
blinker==1.4
Docker Configuration
Dockerfile.dev
FROM python:3.9-slim
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
FLASK_APP=app \
FLASK_ENV=development
# Set working directory
WORKDIR /app
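# Install system dependencies (postgresql-client provides the psql command used by entrypoint.sh)
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
libpq-dev \
postgresql-client \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Install Python dependencies (development.txt includes base.txt via -r, so copy the whole directory)
COPY requirements/ requirements/
RUN pip install --no-cache-dir -r requirements/development.txt
# Copy scripts outside /app so the development bind mount (.:/app) does not hide them
COPY scripts/entrypoint.sh scripts/start-dev.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/entrypoint.sh /usr/local/bin/start-dev.sh
# Set entrypoint
ENTRYPOINT ["entrypoint.sh"]
# Default command
CMD ["start-dev.sh"]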
Dockerfile (for production)
# Build stage
FROM python:3.9-slim AS builder
# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
libpq-dev \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /build
# Copy requirements (production.txt includes base.txt via -r, so copy the whole directory)
COPY requirements/ requirements/
# Build wheels for all dependencies, including sub-dependencies
RUN pip wheel --no-cache-dir --wheel-dir /build/wheels -r requirements/production.txt
# Final stage
FROM python:3.9-slim
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
FLASK_APP=app \
FLASK_ENV=production
# Create non-root user
RUN addgroup --system app && adduser --system --group app
# Set working directory
WORKDIR /app
# Install runtime dependencies (postgresql-client provides the psql command used by entrypoint.sh)
RUN apt-get update && apt-get install -y --no-install-recommends \
libpq5 \
postgresql-client \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Copy wheels and install dependencies, then remove them in the same layer
COPY --from=builder /build/wheels /wheels
COPY --from=builder /build/requirements /tmp/requirements
RUN pip install --no-cache-dir --no-index --find-links=/wheels -r /tmp/requirements/production.txt \
&& rm -rf /wheels /tmp/requirements
# Copy application code
COPY . .
# Make scripts executable and fix ownership
RUN chmod +x scripts/entrypoint.sh && \
chown -R app:app /app
# Switch to non-root user
USER app
# Set entrypoint
ENTRYPOINT ["./scripts/entrypoint.sh"]
# Run gunicorn
CMD ["gunicorn", "app:app", "--bind", "0.0.0.0:8000", "--workers", "4"]
Docker Compose Files
docker-compose.yml (base configuration)
version: '3'
services:
  web:
    build: .
    depends_on:
      - db
    restart: unless-stopped
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/flask_app
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=flask_app
    restart: unless-stopped
volumes:
  postgres_data:
docker-compose.override.yml (development overrides, applied automatically)
version: '3'
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.dev
    ports:
      - "8000:8000"
      - "5678:5678" # For remote debugging
    volumes:
      - .:/app
    environment:
      - FLASK_ENV=development
      - FLASK_DEBUG=1
  db:
    ports:
      - "5432:5432"
  # Additional development services
  adminer:
    image: adminer:latest
    ports:
      - "8080:8080"
    depends_on:
      - db
    restart: unless-stopped
docker-compose.prod.yml (production overrides)
version: '3'
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile
    expose:
      - "8000"
    environment:
      - FLASK_ENV=production
      - LOG_LEVEL=INFO
    volumes:
      - static_data:/app/app/static
    deploy:
      replicas: 2
      restart_policy:
        condition: any
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
      - static_data:/static
    depends_on:
      - web
  db:
    environment:
      - POSTGRES_PASSWORD=${DB_PASSWORD:-postgres}
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./scripts/init-db.sh:/docker-entrypoint-initdb.d/init-db.sh
volumes:
  postgres_data:
  static_data:
Helper Scripts
scripts/entrypoint.sh
#!/bin/bash
set -e
# Wait for database to be ready (needs the psql client installed in the image)
if [ -n "$DATABASE_URL" ]; then
  echo "Waiting for database..."
  RETRIES=5
  # Quote the URL so special characters in the password are not split by the shell
  until psql "$DATABASE_URL" -c "select 1" > /dev/null 2>&1 || [ "$RETRIES" -eq 0 ]; do
    echo "Waiting for database to be available... $((RETRIES--)) remaining attempts..."
    sleep 1
  done
fi
# Run database migrations
if [ "$FLASK_ENV" = "production" ]; then
echo "Running migrations..."
flask db upgrade
fi
# Execute the command passed to docker run
exec "$@"
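The wait loop above relies on the psql client being installed in the image. As a rough alternative, the same retry idea can be sketched in Python using only the standard library. This is an illustration rather than part of the example project: the function name wait_for_port is hypothetical, and it only checks that the port accepts TCP connections, not that PostgreSQL is ready to run queries.

```python
import socket
import time


def wait_for_port(host, port, retries=5, delay=1.0):
    """Return True once a TCP connection succeeds, False after all retries fail."""
    for attempt in range(1, retries + 1):
        try:
            # create_connection raises OSError if the host or port is unreachable
            with socket.create_connection((host, port), timeout=delay):
                return True
        except OSError:
            print(f"Database not reachable (attempt {attempt}/{retries})")
            if attempt < retries:
                time.sleep(delay)
    return False
```

In the Compose setup above, the call would be something like wait_for_port("db", 5432), with the script exiting non-zero when it returns False.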
scripts/start-dev.sh
#!/bin/bash
set -e
# Run any development setup (e.g., migrations)
echo "Running development setup..."
flask db upgrade
# Start development server with debugger enabled
echo "Starting development server..."
python -m debugpy --listen 0.0.0.0:5678 --wait-for-client -m flask run --host=0.0.0.0 --port=8000
Using the Application
Development workflow:
# Start development environment
docker-compose up -d
# View logs
docker-compose logs -f
# Run tests
docker-compose exec web pytest
# Add a new dependency
# 1. Add to requirements/base.txt or requirements/development.txt
# 2. Rebuild the container
docker-compose build
docker-compose up -d
Production deployment:
# Build and start production environment
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d
# Scale web service
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d --scale web=3
This example demonstrates:
- Separate Dockerfiles for development and production
- Multi-stage builds for production to minimize image size
- Organized requirements files by environment
- Docker Compose configurations for different environments
- Helper scripts for common tasks
- Environment-specific configuration
- Database migration handling
- Development aids like debugger support
By following these patterns, you can create a robust, scalable workflow for managing dependencies in your Python applications.
Best Practices Summary
Dependency Management
- Always pin versions in requirements files
- Organize requirements by environment (base, development, production)
- Use multi-stage builds for production images
- Leverage layer caching by ordering Dockerfile instructions properly
- Consider security by regularly scanning dependencies for vulnerabilities
- Minimize image size by removing build dependencies and caches
- Use non-root users for running applications
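To make the first point concrete, here is a minimal sketch of a check that flags requirement lines that are not pinned with ==. The function name find_unpinned and the regular expression are assumptions for illustration; the check deliberately ignores option lines such as -r base.txt and does not understand URLs or environment markers.

```python
import re

# Matches "name==version", optionally with extras such as "package[extra]==1.0"
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,-]+\])?==[^=\s]+")


def find_unpinned(lines):
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options like "-r base.txt"
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

A CI step could read requirements/base.txt, pass its lines to this function, and fail the build if the returned list is not empty.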
Development Workflow
- Use volume mounts for fast development cycles
- Include debugging tools in development environments
- Create helper scripts for common tasks
- Document procedures for adding dependencies
- Use Docker Compose for managing multiple services
- Implement health checks to verify container readiness
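As an illustration of the last point, a health check can be as small as one HTTP request. The sketch below uses only the standard library; the function name is_healthy and the /health endpoint mentioned afterwards are assumptions, since the example application does not define such a route.

```python
import urllib.request


def is_healthy(url, timeout=2.0):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError and HTTPError are subclasses of OSError, so connection
        # failures, timeouts, and 4xx/5xx responses all end up here
        return False
```

A container health check could call is_healthy("http://localhost:8000/health") and exit with status 1 when it returns False, assuming the application exposes that route.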
Production Deployment
- Use production-optimized images (slim, multi-stage builds)
- Implement proper logging for observability
- Set up health monitoring for containers
- Plan for scaling with stateless application design
- Consider orchestration for complex deployments (Kubernetes, ECS)
- Implement robust error handling for application resilience
Exercise: Converting a Python Project to Use Docker
Let's apply what we've learned with a practical exercise. You'll convert an existing Python project to use Docker with best practices for dependency management.
Starting Point: A Simple Flask Application
Imagine you have this basic Flask application structure:
my_flask_app/
├── app.py
├── requirements.txt
└── templates/
    └── index.html
With these files:
app.py:
from flask import Flask, render_template
import os

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0')
requirements.txt:
flask==2.0.1
templates/index.html:
<!DOCTYPE html>
<html>
<head>
<title>Flask App</title>
</head>
<body>
<h1>Hello, Docker!</h1>
<p>This Flask application is running in a Docker container.</p>
</body>
</html>
Exercise Steps
- Restructure the Project
Create a more comprehensive project structure:
my_flask_app/
├── .dockerignore
├── .gitignore
├── Dockerfile
├── Dockerfile.dev
├── docker-compose.yml
├── requirements/
│   ├── base.txt
│   ├── development.txt
│   └── production.txt
├── app/
│   ├── __init__.py
│   ├── routes.py
│   └── templates/
│       └── index.html
└── scripts/
    └── entrypoint.sh
- Split Requirements
Create separate requirements files:
requirements/base.txt:
flask==2.0.1
gunicorn==20.1.0
requirements/development.txt:
-r base.txt
pytest==6.2.5
black==21.8b0
requirements/production.txt:
-r base.txt
- Create Dockerfiles
Create a development and a production Dockerfile:
Dockerfile.dev:
FROM python:3.9-slim
WORKDIR /app
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    FLASK_APP=app \
    FLASK_ENV=development
# Copy the whole requirements directory, because development.txt includes base.txt
COPY requirements/ requirements/
RUN pip install --no-cache-dir -r requirements/development.txt
# During development, we'll mount the code as a volume
# so we don't need to copy it here
CMD ["flask", "run", "--host=0.0.0.0", "--port=5000"]
Dockerfile (for production):
FROM python:3.9-slim
WORKDIR /app
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    FLASK_APP=app \
    FLASK_ENV=production
# Copy the whole requirements directory, because production.txt includes base.txt
COPY requirements/ requirements/
RUN pip install --no-cache-dir -r requirements/production.txt
COPY . .
RUN addgroup --system app && adduser --system --group app
USER app
EXPOSE 5000
CMD ["gunicorn", "app:app", "--bind", "0.0.0.0:5000"]
- Set Up Docker Compose
Create the Docker Compose file:
docker-compose.yml:
version: '3'
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.dev
    ports:
      - "5000:5000"
    volumes:
      - .:/app
    environment:
      - FLASK_ENV=development
      - FLASK_APP=app
- Create Entry Point Script
scripts/entrypoint.sh:
#!/bin/bash
set -e
# Execute command
exec "$@"
- Update Application Structure
app/__init__.py:
from flask import Flask

app = Flask(__name__)

# Import the routes after creating the app so the view functions are registered.
# The module is named routes.py so it does not shadow the app object above.
from app import routes
app/routes.py (moved from the original app.py):
from flask import render_template
from app import app

@app.route('/')
def index():
    return render_template('index.html')
- Create .dockerignore File
.dockerignore:
__pycache__
*.pyc
*.pyo
*.pyd
.Python
env/
venv/
.venv/
.git/
.gitignore
.env
.vscode
*.log
- Build and Run the Application
# Start development environment
docker-compose up
# Access the application at http://localhost:5000
Challenge Extensions
- Add a database service (e.g., PostgreSQL) to Docker Compose
- Create a production override file (docker-compose.prod.yml)
- Add health checks to the containers
- Implement a multi-stage build for the production Dockerfile
- Add a Docker Compose service for running tests
This exercise will give you hands-on experience with Docker and dependency management in a Python application.
Conclusion
Docker has fundamentally changed how we manage dependencies in Python applications. By containerizing your applications, you can:
- Ensure consistency across development, testing, and production environments
- Simplify onboarding for new team members
- Manage complex dependencies more effectively
- Isolate applications to avoid conflicts
- Scale deployments more easily
- Version entire environments, not just code
- Improve security through isolation and minimal images
As we've seen throughout this session, Docker offers powerful solutions to the complex challenge of dependency management in Python applications. Whether you're working on a simple Flask application or a complex microservices architecture, Docker provides tools and patterns that make dependency management more reliable and reproducible.
In our upcoming sessions, we'll build on this foundation as we explore more advanced Python concepts and start building real-world applications that leverage Docker for consistent environments across development and deployment.
Remember: "It works on my machine" is no longer an excuse in the Docker era!
Appendix: Docker Security Considerations
When using Docker for dependency management, security should be a top consideration:
1. Base Image Selection
Choose trusted base images from official repositories:
- Use official Python images (python:3.x) rather than arbitrary base images
- Consider slim variants (python:3.x-slim) to reduce attack surface
- Pin to specific versions using SHA digests for immutability:
FROM python:3.9-slim@sha256:1c4b7c9bade4c1c8418e38a8e606642aeefb87c2060ec4ba6d7cc8cb0c3fff57
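Digest pinning is easy to forget when a Dockerfile gains new stages. The following sketch shows one way a small lint step might flag FROM lines without a sha256 digest. The function name unpinned_base_images is hypothetical, and the parser is deliberately simple: it ignores ARG substitution and line continuations.

```python
import re

# A digest-pinned reference ends in "@sha256:" followed by 64 hex characters
DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_base_images(dockerfile_text):
    """Return image references from FROM lines that lack a sha256 digest."""
    unpinned = []
    stage_names = set()
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        # Skip optional flags such as --platform=linux/amd64
        refs = [p for p in parts[1:] if not p.startswith("--")]
        if not refs:
            continue
        image = refs[0]
        # Remember stage aliases ("FROM image AS builder") for later FROM lines
        if len(refs) >= 3 and refs[1].upper() == "AS":
            stage_names.add(refs[2])
        if image == "scratch" or image in stage_names:
            continue
        if not DIGEST.search(image):
            unpinned.append(image)
    return unpinned
```

Running this over a Dockerfile's text returns an empty list when every external base image is pinned by digest.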
2. Dependency Scanning
Integrate security scanning into your workflow:
# Using safety in your Dockerfile
RUN pip install safety && \
safety check -r requirements.txt && \
pip uninstall -y safety
Or as part of CI/CD:
# GitLab CI example
dependency-scan:
  image: python:3.9-slim
  script:
    - pip install safety
    - safety check -r requirements.txt
3. Non-Root Users
Avoid running containers as root:
# Create a non-root user
RUN addgroup --system app && \
adduser --system --group app
# Set ownership
RUN chown -R app:app /app
# Switch to non-root user
USER app
4. Minimal Images
Keep images as small as possible to reduce attack surface:
- Use multi-stage builds to exclude build tools from final image
- Remove package manager caches and temporary files
- Avoid installing unnecessary packages
5. Secret Management
Never hardcode secrets in Dockerfiles or images:
- Use environment variables for configuration
- Consider Docker secrets or dedicated secrets management tools
- Don't use build arguments for secrets (they're visible in image history)
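On the application side, these options can be combined in a small helper. The sketch below prefers a secret file and falls back to an environment variable. The helper name read_secret and the secret name db_password are assumptions for illustration; /run/secrets is the directory where Docker mounts secrets by default.

```python
import os
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets"):
    """Return a secret from a mounted file if present, else from an environment variable."""
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        return secret_file.read_text().strip()
    # Fall back to an upper-cased environment variable, e.g. DB_PASSWORD
    value = os.environ.get(name.upper())
    if value is None:
        raise RuntimeError(f"Secret {name!r} not found in {secrets_dir} or environment")
    return value
```

With this in place, read_secret("db_password") works both with Docker secrets and with a plain DB_PASSWORD variable during local development.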
6. Image Signing and Verification
Consider signing your Docker images to ensure integrity:
# Sign an image with Docker Content Trust
export DOCKER_CONTENT_TRUST=1
docker push mycompany/myapp:1.0.0
7. Regular Updates
Keep base images and dependencies updated to patch security vulnerabilities:
- Implement automated workflows to rebuild images regularly
- Use tools like Dependabot to notify of dependency updates
- Balance stability with security needs
8. Container Runtime Security
Secure your containers at runtime:
- Limit container capabilities and resources
- Use read-only file systems where possible
- Implement network policies to restrict container communication
# Docker Compose example with security constraints
services:
  web:
    image: myapp:1.0
    read_only: true
    tmpfs:
      - /tmp
      - /var/run
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
Appendix: CI/CD Integration for Dependency Management
Automating dependency management with CI/CD pipelines ensures consistent handling across your development workflow:
GitHub Actions Example
name: Docker CI
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.9
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pip-tools safety
      - name: Compile dependencies
        run: |
          pip-compile requirements.in
      - name: Check for security vulnerabilities
        run: |
          safety check -r requirements.txt
      - name: Build and test Docker image
        run: |
          docker build -t myapp:test .
          docker run --rm myapp:test pytest
      - name: Push to registry
        if: github.event_name != 'pull_request'
        run: |
          echo ${{ secrets.DOCKER_PASSWORD }} | docker login -u ${{ secrets.DOCKER_USERNAME }} --password-stdin
          docker tag myapp:test mycompany/myapp:${{ github.sha }}
          docker push mycompany/myapp:${{ github.sha }}
Dependency Update Automation
GitHub example with Dependabot for automatic dependency updates:
# .github/dependabot.yml
version: 2
updates:
  # Python dependencies
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
    allow:
      # Allow only direct dependencies
      - dependency-type: "direct"
    commit-message:
      prefix: "pip"
    open-pull-requests-limit: 10
  # Docker dependencies
  - package-ecosystem: "docker"
    directory: "/"
    schedule:
      interval: "weekly"
    commit-message:
      prefix: "docker"
    open-pull-requests-limit: 5
Matrix Testing
Test against multiple Python versions and dependency sets:
name: Matrix Testing
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # Quote the versions: an unquoted 3.10 is parsed by YAML as the number 3.1
        python-version: ["3.8", "3.9", "3.10"]
        dependency-set: [minimal, latest]
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          if [ "${{ matrix.dependency-set }}" == "minimal" ]; then
            pip install -r requirements/minimal.txt
          else
            pip install -r requirements/latest.txt
          fi
      - name: Run tests
        run: |
          pytest
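A matrix expands to the Cartesian product of its axes, so three Python versions and two dependency sets produce six jobs. The short sketch below reproduces that expansion; the function name expand_matrix and the dictionary keys are chosen for illustration only. It keeps the versions as strings, because a version written as a bare number loses its trailing zero.

```python
from itertools import product

# Versions are strings so that "3.10" is not read as the number 3.1
python_versions = ["3.8", "3.9", "3.10"]
dependency_sets = ["minimal", "latest"]


def expand_matrix(versions, dep_sets):
    """Return one job description per (Python version, dependency set) pair."""
    return [
        {"python-version": v, "requirements": f"requirements/{d}.txt"}
        for v, d in product(versions, dep_sets)
    ]


jobs = expand_matrix(python_versions, dependency_sets)
for job in jobs:
    print(job)
```

Each printed dictionary corresponds to one CI job that installs the named requirements file and runs the test suite.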
Dependency Locking in CI
Ensure dependencies are locked and up-to-date:
name: Dependency Check
on:
  push:
    paths:
      - 'requirements/**'
      - 'pyproject.toml'
      - 'Pipfile'
jobs:
  check-deps:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.9
      - name: Check if dependencies are up-to-date
        run: |
          # For pip-tools: regenerate the lock file and fail if it changed
          pip install pip-tools
          pip-compile --quiet requirements.in
          git diff --exit-code requirements.txt
          # OR for Poetry
          # pip install poetry
          # poetry check --lock
          # OR for Pipenv
          # pip install pipenv
          # pipenv requirements > requirements.txt
          # pipenv requirements --dev-only > dev-requirements.txt
Appendix: Monitoring Dependencies in Production
Once your application is deployed, ongoing monitoring of dependencies is crucial:
1. Dependency Tracking
Export a manifest of installed packages in production:
# Add to your container startup
pip freeze > /app/installed_packages.txt
# Or for more detail
pip list --format=json > /app/package_details.json
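The same manifest can also be produced from inside the application with the standard library, which is handy when pip itself is not available in a minimal image. This is a sketch under that assumption; the function names installed_packages and write_manifest are hypothetical.

```python
import json
from importlib.metadata import distributions


def installed_packages():
    """Return a sorted list of {"name", "version"} entries for the current environment."""
    packages = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with missing or broken metadata
            packages[name] = dist.version
    return [{"name": n, "version": packages[n]} for n in sorted(packages, key=str.lower)]


def write_manifest(path):
    """Write the package list to a JSON file and return the number of entries."""
    entries = installed_packages()
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(entries, handle, indent=2)
    return len(entries)
```

Calling write_manifest("/app/package_details.json") at startup would mirror the pip list command above.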
2. Security Monitoring
Continuously monitor for vulnerabilities:
- Container scanning: Tools like Trivy, Clair, or Snyk
- Runtime monitoring: Tools like Falco
- Software composition analysis: Track open source components
3. Dependency Visualization
Visualize dependency relationships to understand impact:
# Install pipdeptree
pip install pipdeptree
# Generate visualization
pipdeptree --graph-output png > dependencies.png
# OR generate in DOT format for further processing
pipdeptree --graph-output dot > dependencies.dot
4. Automating Dependency Updates
Create automated processes for safe dependency updates:
- Automatically create PRs for dependency updates
- Run test suite against the updates
- Deploy to staging environment
- Perform integration tests
- Promote to production if successful
5. Dependency Drift Detection
Detect when actual installed dependencies differ from expected:
#!/bin/bash
# check_drift.sh: script to check for dependency drift
# Assumes requirements.txt is a complete lock file in pip freeze format
# Generate current dependencies
pip freeze > current_deps.txt
# Compare with requirements
diff -u requirements.txt current_deps.txt > drift.patch
if [ -s drift.patch ]; then
    echo "Warning: Dependency drift detected"
    cat drift.patch
    exit 1
else
    echo "All dependencies match requirements"
    exit 0
fi
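A plain diff reports drift whenever the two files differ in ordering or list extra packages. As a sketch of a more targeted check, the Python version below compares only the packages pinned in the requirements file against what is actually installed. The function names parse_pins and find_drift are hypothetical, and the parser only understands simple name==version lines without extras or environment markers.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(lines):
    """Map package name -> pinned version for lines of the form 'name==version'."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line and not line.startswith("-"):
            name, pinned = line.split("==", 1)
            pins[name.strip().lower()] = pinned.strip()
    return pins


def find_drift(pins, installed_version=version):
    """Return {name: (expected, actual)} for packages whose installed version differs."""
    drift = {}
    for name, expected in pins.items():
        try:
            actual = installed_version(name)
        except PackageNotFoundError:
            actual = None  # pinned but not installed
        if actual != expected:
            drift[name] = (expected, actual)
    return drift
```

The installed_version parameter defaults to importlib.metadata.version and exists mainly so the comparison can be tested with a fake lookup.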
Additional Resources
Documentation
- Docker Official: Dockerfile Best Practices
- Docker Compose Documentation
- Python Packaging User Guide
- pip User Guide
- Poetry Documentation
Tools
- pip-tools - For dependency management
- safety - For vulnerability scanning
- Trivy - For container scanning
- pipdeptree - For dependency visualization
- hadolint - For Dockerfile linting
Articles and Guides
- Python in Production with Docker
- Docker Best Practices for Python Development
- 10 Best Practices to Containerize Python Applications
- Real Python: Packaging Python Applications with Docker
Books
- "Docker for Python Developers" by Joshua Bernstein
- "Python for DevOps" by Noah Gift et al. (O'Reilly)
- "Docker Deep Dive" by Nigel Poulton
- "Python Packaging" by Tarek Ziadé