Session Overview
Welcome to our deep dive into Python and Docker! Today, we'll explore how containerization revolutionizes Python development and deployment. We'll learn why Docker has become essential in modern Python workflows, how to set up Python environments in containers, and best practices for Python-based Docker applications.
Understanding Docker and Containerization
Before diving into Python-specific aspects, let's ensure we have a solid understanding of what Docker provides:
What is Docker?
Docker is a platform that packages applications and their dependencies into standardized units called containers. These containers are isolated, lightweight, and contain everything needed to run an application, including code, runtime, system tools, and libraries.
Container vs. Virtual Machine
Containers are often compared to virtual machines, but they operate differently:
- Containers share the host system's kernel but run in isolated user spaces
- Virtual Machines virtualize the entire operating system, including the kernel
This makes containers significantly lighter and faster to start than VMs. Isolation is strong enough for most workloads, although the shared kernel makes it weaker than a VM's.
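One quick way to observe the shared kernel is to print the kernel release from Python on the host and again inside a container. This is a minimal sketch that assumes a Linux host (with Docker Desktop on macOS or Windows, the container reports the kernel of Docker's helper VM instead); `kernel_info` is an illustrative helper name, not part of any library:

```python
import platform


def kernel_info():
    """Return the OS name and kernel release that the interpreter sees."""
    return {
        "system": platform.system(),    # e.g. "Linux"
        "release": platform.release(),  # kernel release string
    }


if __name__ == "__main__":
    info = kernel_info()
    print(f"{info['system']} kernel {info['release']}")
```

On a Linux host, running this script directly and then through `docker run` should print the same release string, whereas a virtual machine would report its own kernel.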
Analogy: Apartments vs. Houses
Think of the difference between containers and VMs like apartments versus houses:
- Containers (Apartments) share core infrastructure (foundation, plumbing, electrical systems) but have their own private living spaces
- VMs (Houses) have completely independent infrastructure, making them larger and more resource-intensive
Containers are more efficient when you need many isolated environments that can share core resources.
Why Use Docker for Python Development?
Docker solves several persistent challenges in Python development:
The "Works on My Machine" Problem
One of the most common issues in software development is code that runs perfectly on one machine but fails on another. Typical causes include:
- Different Python versions
- Missing or conflicting dependencies
- System-level library differences
- Operating system variations
Docker containers package the entire runtime environment, ensuring consistent behavior across development, testing, and production.
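To see which of these differences actually exist between two machines, it can help to capture a small "fingerprint" of each environment and compare them. The sketch below uses only the standard library; the function names `environment_fingerprint` and `diff_packages` are illustrative assumptions:

```python
import platform
import sys
from importlib import metadata


def environment_fingerprint():
    """Collect the details that most often differ between machines."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            packages[name.lower()] = dist.version
    return {
        "python": platform.python_version(),
        "implementation": sys.implementation.name,
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


def diff_packages(mine, theirs):
    """Return {name: (my_version, their_version)} for every mismatch."""
    names = set(mine) | set(theirs)
    return {
        name: (mine.get(name), theirs.get(name))
        for name in sorted(names)
        if mine.get(name) != theirs.get(name)
    }
```

Running `environment_fingerprint()` on two laptops and passing the two `packages` dictionaries to `diff_packages` lists every version mismatch. Inside containers built from the same image, that diff should be empty.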
Python-Specific Benefits
- Version Management: Run applications with specific Python versions (2.7, 3.6, 3.10, etc.) without conflicts
- Dependency Isolation: Avoid conflicts between projects requiring different versions of the same library
- System Dependencies: Package system-level libraries that Python modules might depend on (like C compilers, image processing libraries, etc.)
- Reproducible Environments: Guarantee that everyone on the team works with identical setups
- Clean Testing: Test in pristine environments without accumulated artifacts
Real-World Example: Data Science Workflows
Data scientists often face the "dependency nightmare" when collaborating on models. One team member might use NumPy 1.18 with Python 3.7, while another uses NumPy 1.20 with Python 3.9, leading to subtle bugs. With Docker, they can specify:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "model_training.py"]
This ensures that everyone builds from the same base image and installs the same dependencies, provided requirements.txt pins exact versions.
Analogy: Docker as Recipe Boxes
Think of Docker containers like recipe boxes that include not just the recipe (your code) but also all the ingredients (dependencies), cooking tools (runtime), and even the cooking environment (system libraries):
- Without Docker: "Make this cake" but everyone has different flour, ovens at different temperatures, and varying measuring cups
- With Docker: "Here's a complete box with the exact ingredients, tools, and instructions" - ensuring the same cake every time
Getting Started with Python in Docker
Official Python Docker Images
Official Python images are published on Docker Hub and maintained by the Docker community. They come in several variants:
- python:3.x - Full Debian-based images that include common build tools and libraries
- python:3.x-slim - Smaller images with minimal packages
- python:3.x-alpine - Ultra-minimal images based on Alpine Linux
- python:3.x-windowsservercore - Windows-based images
Running Python in a Docker Container
Let's start with the simplest possible example - running a Python interpreter in a container:
# Pull the Python image (if not already present)
docker pull python:3.10
# Run an interactive Python shell
docker run -it python:3.10
# You should now see the Python REPL
>>> print("Hello from containerized Python!")
Hello from containerized Python!
>>> exit()
Running a Python Script in a Container
Create a file named hello_docker.py with the following content:
import platform
import sys
print("Hello from Python in Docker!")
print(f"Python version: {sys.version}")
print(f"Platform: {platform.platform()}")
Now run this script in a container:
# Assuming hello_docker.py is in your current directory
docker run -v "$(pwd):/app" -w /app python:3.10 python hello_docker.py
This command:
- -v "$(pwd):/app" mounts your current directory to /app in the container
- -w /app sets the working directory inside the container to /app
- python:3.10 specifies which image to use
- python hello_docker.py is the command to run inside the container
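The same command can be assembled from Python, which makes the role of each argument explicit and is handy in helper scripts. This is a sketch under assumptions: `docker_run_command` is a hypothetical helper, and it adds `--rm` (not used in the command above) so the container is removed after it exits.

```python
import shlex
from pathlib import Path


def docker_run_command(script, image="python:3.10", workdir="/app", host_dir=None):
    """Build the argument list for running a script in a throwaway container."""
    host_dir = Path(host_dir) if host_dir else Path.cwd()
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir.resolve()}:{workdir}",  # bind-mount the host directory
        "-w", workdir,                            # working directory in the container
        image,                                    # image to start from
        "python", script,                         # command run inside the container
    ]


if __name__ == "__main__":
    cmd = docker_run_command("hello_docker.py")
    # Print the command; pass cmd to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

Building the command as a list avoids shell quoting problems when the current directory contains spaces.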
Creating a Python Dockerfile
While running commands directly is useful for simple cases, most projects need a custom Docker image. This is done by creating a Dockerfile.
Basic Python Dockerfile
Create a file named Dockerfile (no extension) in your project directory:
# Use an official Python runtime as a parent image
FROM python:3.10-slim
# Set the working directory in the container
WORKDIR /app
# Copy requirements first so the dependency layer stays cached between code changes
COPY requirements.txt /app/
# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the current directory contents into the container at /app
COPY . /app/
# Document that the app listens on port 5000 (publish it with -p at run time)
EXPOSE 5000
# Define environment variable
ENV NAME=World
# Run app.py when the container launches
CMD ["python", "app.py"]
Create a simple app.py:
import os

from flask import Flask

app = Flask(__name__)


@app.route('/')
def hello():
    name = os.environ.get('NAME', 'World')
    return f'Hello, {name}!'


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
And a requirements.txt file:
flask==2.0.1
Building and Running the Docker Image
# Build the image
docker build -t my-python-app .
# Run the container
docker run -p 5000:5000 my-python-app
Your Flask application should now be running at http://localhost:5000
Best Practices for Python Dockerfiles
- Use Specific Versions: Always specify exact versions in requirements.txt (e.g., flask==2.0.1 not just flask)
- Layer Caching: Copy and install requirements before copying the rest of the code to leverage Docker's caching mechanism
- Non-Root User: For production, run as a non-root user for security
- Multi-Stage Builds: Use for compiling extensions or reducing image size
- Environment Variables: Use ENV for configuration that can change
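The first practice is easy to check automatically. The following sketch scans the text of a requirements file and reports entries that are not pinned with `==`. It is deliberately simplified: `find_unpinned` is an illustrative name, and URL-based requirements are not handled.

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+")


def find_unpinned(requirements_text):
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e)
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

Calling `find_unpinned(open("requirements.txt").read())` in a CI step and failing the build on a non-empty result is one way to keep images reproducible.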
Here's an improved version of our Dockerfile with these practices:
FROM python:3.10-slim
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV PIP_NO_CACHE_DIR=1
# Create a non-root user
RUN useradd -m appuser
# Set the working directory
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy the application
COPY . .
# Change ownership to non-root user
RUN chown -R appuser:appuser /app
# Switch to non-root user
USER appuser
# Expose port
EXPOSE 5000
# Run the application
CMD ["python", "app.py"]
Python with Docker Compose
For applications with multiple services (e.g., web server, database, cache), Docker Compose simplifies management.
Creating a Docker Compose File
Create a file named docker-compose.yml:
version: '3'
services:
  web:
    build: .
    ports:
      - "5000:5000"
    volumes:
      - .:/app
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/postgres
    depends_on:
      - db
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_DB=postgres
    ports:
      - "5432:5432"
volumes:
  postgres_data:
Using Docker Compose
# Start all services
docker-compose up
# Start in detached mode
docker-compose up -d
# Stop all services
docker-compose down
# Rebuild images and start
docker-compose up --build
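One caveat with the Compose file above: the short form of depends_on starts the db container before web, but it does not wait until PostgreSQL is ready to accept connections. A common workaround is to have the application wait for the port itself before connecting. The sketch below shows the idea using only the standard library; `wait_for_port` is an illustrative helper, and an open port is only a rough proxy for "database ready":

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, host not resolvable yet, or connect timed out.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

In the web service, calling `wait_for_port("db", 5432)` at startup and exiting with an error when it returns False keeps the application from crashing on its first query.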
Real-World Example: Python Web Application Stack
A production-ready Python web application might include:
version: '3'
services:
  nginx:
    image: nginx:latest
    ports:
      - "80:80"
    volumes:
      - ./nginx/conf.d:/etc/nginx/conf.d
      - static_volume:/app/static
      - media_volume:/app/media
    depends_on:
      - web
  web:
    build: .
    command: gunicorn myapp.wsgi:application --bind 0.0.0.0:8000
    volumes:
      - .:/app
      - static_volume:/app/static
      - media_volume:/app/media
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/postgres
      - REDIS_URL=redis://redis:6379/0
    depends_on:
      - db
      - redis
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_DB=postgres
  redis:
    image: redis:6
volumes:
  postgres_data:
  static_volume:
  media_volume:
This setup includes:
- Nginx as a reverse proxy and static file server
- Gunicorn as a WSGI application server
- PostgreSQL database
- Redis for caching or as a message broker
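Inside the web container, the application learns where the database and Redis live from the DATABASE_URL and REDIS_URL variables, whose hostnames are the Compose service names. A minimal sketch of reading them with the standard library follows; the helper names `parse_service_url` and `load_settings` are assumptions, and a real project would usually hand the URL straight to its database or Redis client:

```python
import os
from urllib.parse import urlsplit


def parse_service_url(url):
    """Split a service URL into the pieces a client library usually needs."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,          # the Compose service name, e.g. "db"
        "port": parts.port,
        "name": parts.path.lstrip("/"),  # database name or Redis database index
    }


def load_settings(environ=os.environ):
    """Read connection settings from the variables set in the Compose file."""
    return {
        "database": parse_service_url(environ["DATABASE_URL"]),
        "redis": parse_service_url(environ["REDIS_URL"]),
    }
```

Because the hosts are service names rather than IP addresses, the same code works unchanged on any machine that runs the stack.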
Analogy: Docker Compose as an Orchestra Conductor
If individual Docker containers are like musicians, Docker Compose is like an orchestra conductor:
- Each container (musician) knows how to play its own part
- Docker Compose (conductor) coordinates when each starts, stops, and how they work together
- The conductor ensures everyone is playing in the right order and at the right time
- The score (docker-compose.yml) defines exactly how everything should work together
Just as a conductor makes it easier to manage a complex orchestra, Docker Compose makes it easier to manage complex multi-container applications.
Development Workflow with Python and Docker
Hot Reloading for Development
One challenge when developing with Docker is seeing code changes reflected immediately. Here's how to set up hot reloading:
version: '3'
services:
  web:
    build: .
    command: python -m flask run --host=0.0.0.0 --port=5000
    volumes:
      - .:/app
    ports:
      - "5000:5000"
    environment:
      - FLASK_ENV=development
      - FLASK_APP=app.py
With this setup:
- The local directory is mounted into the container
- Flask's development server automatically reloads when files change
- Changes made on your host machine are immediately reflected
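Reloaders of this kind typically detect edits by watching file modification times, which is why the bind mount matters: the container sees the new timestamps as soon as you save. The sketch below illustrates that polling idea in simplified form. It is not the development server's actual implementation, and both function names are illustrative:

```python
from pathlib import Path


def snapshot_mtimes(root, pattern="*.py"):
    """Map every matching file under root to its last-modified time (ns)."""
    return {str(p): p.stat().st_mtime_ns for p in Path(root).rglob(pattern)}


def changed_files(before, after):
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        path
        for path in set(before) | set(after)
        if before.get(path) != after.get(path)
    )
```

A reloader built on this would take a snapshot once per second and restart the server process whenever `changed_files` returns a non-empty list.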
Debugging Python in Docker
For debugging, you can:
- Use simple print statements
- Mount debugger configurations
- Use remote debugging
For a remote debugging setup with debugpy:
# In your code
import debugpy
# Enable debugger attachment
debugpy.listen(("0.0.0.0", 5678))
print("Waiting for debugger to attach...")
debugpy.wait_for_client()
And in your docker-compose.yml:
services:
  web:
    # ...
    ports:
      - "5000:5000"
      - "5678:5678"  # Debugger port
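Because `debugpy.wait_for_client()` blocks until an IDE attaches, it is usually safer to make the debugger opt-in. The following sketch enables it only when an environment variable is set. The variable names DEBUGPY and DEBUGPY_WAIT and the helper `maybe_enable_debugger` are assumptions for illustration, not debugpy conventions:

```python
import os


def maybe_enable_debugger(environ=os.environ, port=5678):
    """Start a debugpy listener only when DEBUGPY=1 (hypothetical flag)."""
    if environ.get("DEBUGPY") != "1":
        return False
    import debugpy  # imported lazily so production images need not install it

    debugpy.listen(("0.0.0.0", port))
    if environ.get("DEBUGPY_WAIT") == "1":
        debugpy.wait_for_client()  # block until an IDE attaches
    return True
```

With this in place, adding `- DEBUGPY=1` under the web service's environment section turns debugging on for development, while the same image runs normally everywhere else.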
Testing in Docker
Creating a separate service for testing ensures isolation:
services:
  web:
    # ... your web service config
  test:
    build: .
    command: pytest
    volumes:
      - .:/app
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/test_db
    depends_on:
      - db
Run tests with:
docker-compose run test
Advanced Python Docker Patterns
Multi-Stage Builds for Python Applications
Multi-stage builds can significantly reduce image size, especially for applications with build dependencies:
# Build stage
FROM python:3.10 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /app/wheels -r requirements.txt
# Final stage
FROM python:3.10-slim
WORKDIR /app
# Copy built wheels from builder stage
COPY --from=builder /app/wheels /wheels
COPY --from=builder /app/requirements.txt .
# Install packages from wheels
RUN pip install --no-cache /wheels/*
COPY . .
CMD ["python", "app.py"]
Python Applications with C Extensions
Many Python packages (NumPy, pandas, etc.) contain C extensions. Prebuilt wheels cover most common platforms, but when pip has to compile a package from source, the image must include build tools:
FROM python:3.10
# Install build dependencies
RUN apt-get update && apt-get install -y \
    build-essential \
    python3-dev \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
Using Alpine-based Images
Alpine-based images are much smaller but require special handling:
FROM python:3.10-alpine
# Install build dependencies
RUN apk add --no-cache \
    gcc \
    musl-dev \
    python3-dev
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
Be aware that Alpine uses musl libc instead of glibc, which can cause subtle compatibility issues with some Python packages.
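When a package misbehaves on only one image, a first step is to check which C library the interpreter reports. This sketch relies on `platform.libc_ver()` from the standard library. The interpretation of an empty result as "probably not glibc" is an assumption and should be treated as a hint rather than a definitive test; `describe_libc` is an illustrative name:

```python
import platform


def describe_libc():
    """Report which C library the interpreter appears to be linked against."""
    name, version = platform.libc_ver()
    if name:
        return f"{name} {version}".strip()
    # Empty result: not detected as glibc (possibly musl on Alpine),
    # or the script is running on a non-Linux platform.
    return "unknown (not detected as glibc)"


if __name__ == "__main__":
    print(describe_libc())
```

Running it in both a python:3.10-slim and a python:3.10-alpine container makes the difference between the two base images visible.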
Production Optimizations
- Use .dockerignore: Create a .dockerignore file to exclude unnecessary files (like .git, __pycache__, etc.)
- Pin Dependencies: Use exact versions for all dependencies
- Set Python Environment Variables:
  - PYTHONDONTWRITEBYTECODE=1 - Prevents Python from writing .pyc files
  - PYTHONUNBUFFERED=1 - Prevents Python from buffering stdout and stderr
- Use Non-Root User: Run containers as a non-root user for security
- Health Checks: Add Docker health checks to monitor application status
FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PIP_NO_CACHE_DIR=1
# Create user
RUN useradd -m appuser
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
RUN chown -R appuser:appuser /app
USER appuser
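# Add health check (uses Python because the slim image does not include curl;
# assumes the app serves a /health route)
HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:5000/health', timeout=5)" || exit 1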
EXPOSE 5000
CMD ["python", "app.py"]
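A health check only needs a command that exits with a non-zero status when the application is unhealthy, so the probe can be written in Python itself. That is convenient on slim images, which do not ship curl. The following is a sketch using only the standard library; `is_healthy` is an illustrative name and the /health route is assumed to exist in the application:

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=5.0):
    """Return True when the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and 4xx/5xx responses (HTTPError).
        return False
```

Saved as a hypothetical healthcheck.py that ends with `sys.exit(0 if is_healthy("http://localhost:5000/health") else 1)`, it could be wired in with `HEALTHCHECK CMD python healthcheck.py`.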
Real-World Python Docker Applications
Data Science and Machine Learning
Data science workflows benefit greatly from containerization:
FROM python:3.10
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Install Jupyter
RUN pip install jupyterlab
# Copy project files
COPY . .
# Expose Jupyter port
EXPOSE 8888
# Start Jupyter (the empty token disables authentication; only do this for local development)
CMD ["jupyter", "lab", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root", "--NotebookApp.token=''"]
With Docker Compose, you can create a complete data science environment:
version: '3'
services:
  jupyter:
    build: .
    ports:
      - "8888:8888"
    volumes:
      - .:/app
      - ./data:/app/data
  postgres:
    image: postgres:13
    environment:
      - POSTGRES_PASSWORD=postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
  mlflow:
    image: ghcr.io/mlflow/mlflow:v2.1.1
    ports:
      - "5000:5000"
    volumes:
      - ./mlruns:/mlruns
    command: mlflow server --host 0.0.0.0
volumes:
  postgres_data:
Django Web Applications
Django projects often include multiple services:
version: '3'
services:
  web:
    build: .
    command: gunicorn myproject.wsgi:application --bind 0.0.0.0:8000
    volumes:
      - .:/app
      - static_volume:/app/static
      - media_volume:/app/media
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgres://postgres:postgres@db:5432/postgres
      - CELERY_BROKER_URL=redis://redis:6379/0
    depends_on:
      - db
      - redis
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
  redis:
    image: redis:6
  celery:
    build: .
    command: celery -A myproject worker -l info
    volumes:
      - .:/app
    environment:
      - DATABASE_URL=postgres://postgres:postgres@db:5432/postgres
      - CELERY_BROKER_URL=redis://redis:6379/0
    depends_on:
      - web
      - db
      - redis
volumes:
  postgres_data:
  static_volume:
  media_volume:
Microservices with Python
In a microservices architecture, each service can be its own Python container:
version: '3'
services:
  auth_service:
    build: ./auth_service
    ports:
      - "5000:5000"
    environment:
      - DATABASE_URL=postgres://postgres:postgres@db:5432/auth_db
  product_service:
    build: ./product_service
    ports:
      - "5001:5000"
    environment:
      - DATABASE_URL=postgres://postgres:postgres@db:5432/product_db
      - AUTH_SERVICE_URL=http://auth_service:5000
  order_service:
    build: ./order_service
    ports:
      - "5002:5000"
    environment:
      - DATABASE_URL=postgres://postgres:postgres@db:5432/order_db
      - PRODUCT_SERVICE_URL=http://product_service:5000
      - AUTH_SERVICE_URL=http://auth_service:5000
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=postgres
volumes:
  postgres_data:
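Within the Compose network, each service reaches the others through the URLs injected as environment variables, always on the container port (5000) rather than the published host port. Here is a small sketch of how a service might build a request URL from those variables; the helper `service_endpoint` and the `/verify` path used in the note below are hypothetical:

```python
import os
from urllib.parse import urljoin


def service_endpoint(service_var, path, environ=os.environ):
    """Build a full URL for another service from its *_SERVICE_URL variable."""
    base = environ[service_var]  # e.g. "http://auth_service:5000"
    return urljoin(base.rstrip("/") + "/", path.lstrip("/"))
```

For example, in order_service the call `service_endpoint("AUTH_SERVICE_URL", "/verify")` would yield http://auth_service:5000/verify, which an HTTP client can then use to reach the auth service.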
Analogy: Microservices as Specialized Shops
Think of a microservices architecture like a shopping center with specialized stores instead of a single department store:
- Each shop (microservice) specializes in one thing and does it well
- Shops can be updated or replaced individually without affecting others
- Shops communicate with each other when necessary (e.g., the tailor might send customers to the shoe store)
- The shopping center can add new stores or remove underperforming ones easily
Docker makes it easy to manage all these specialized services, ensuring they can work together while remaining independently maintainable.
Verifying Python in Docker
Core Verification Steps
When setting up Python in Docker, verify the following:
- Python Version: Ensure the container uses the expected Python version
- Package Installation: Verify dependencies are correctly installed
- Environment Variables: Check that environment variables are properly set
- Filesystem Access: Confirm volumes are mounted correctly
- Network Connectivity: Test that services can communicate
Verification Script
Create a file named verify_environment.py:
#!/usr/bin/env python3
import importlib
import importlib.util
import os
import platform
import socket
import sys
from urllib.parse import urlsplit


def verify_python_version():
    print(f"Python version: {sys.version}")
    print(f"Python executable: {sys.executable}")
    print(f"Platform: {platform.platform()}")


def verify_packages(required_packages):
    print("\nPackage verification:")
    for package in required_packages:
        try:
            spec = importlib.util.find_spec(package)
            if spec is None:
                print(f"❌ {package} is NOT installed")
            else:
                module = importlib.import_module(package)
                version = getattr(module, '__version__', 'unknown')
                print(f"✅ {package} is installed (version: {version})")
        except ImportError:
            print(f"❌ {package} is NOT installed")


def verify_environment_variables(required_vars):
    print("\nEnvironment variables:")
    for var in required_vars:
        value = os.environ.get(var)
        if value:
            print(f"✅ {var} is set to: {value}")
        else:
            print(f"❌ {var} is NOT set")


def verify_filesystem_access(paths):
    print("\nFilesystem access:")
    for path in paths:
        if os.path.exists(path):
            print(f"✅ {path} exists and is accessible")
            if os.path.isdir(path):
                try:
                    test_file = os.path.join(path, 'test_write.txt')
                    with open(test_file, 'w') as f:
                        f.write('test')
                    os.remove(test_file)
                    print(f"✅ {path} is writable")
                except Exception as e:
                    print(f"❌ {path} is NOT writable: {e}")
        else:
            print(f"❌ {path} does NOT exist or is NOT accessible")


def verify_network_connectivity(endpoints):
    print("\nNetwork connectivity:")
    for endpoint in endpoints:
        # Check plain TCP reachability: this also works for non-HTTP services
        # such as PostgreSQL and does not require curl in the image.
        parts = urlsplit(endpoint)
        port = parts.port or (443 if parts.scheme == 'https' else 80)
        try:
            with socket.create_connection((parts.hostname, port), timeout=5):
                print(f"✅ {endpoint} is reachable")
        except OSError as e:
            print(f"❌ {endpoint} is NOT reachable: {e}")


if __name__ == "__main__":
    verify_python_version()
    # Customize these lists for your application
    verify_packages(['flask', 'requests', 'sqlalchemy', 'numpy'])
    verify_environment_variables(['DATABASE_URL', 'FLASK_ENV'])
    verify_filesystem_access(['/app', '/app/data', '/tmp'])
    verify_network_connectivity(['http://localhost:5000', 'tcp://db:5432', 'https://pypi.org'])
Run this script in your container:
docker run -it --rm myapp python verify_environment.py
Using Docker Compose for Verification
Add a verification service to your docker-compose.yml:
services:
  # ... your existing services
  verify:
    build: .
    command: python verify_environment.py
    depends_on:
      - web
      - db
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db:5432/postgres
      - FLASK_ENV=development
Run it with:
docker-compose run verify
Troubleshooting Python in Docker
Common Issues and Solutions
| Issue | Possible Causes | Solutions |
|---|---|---|
| Module Not Found Error | | |
| Permission Denied | | |
| Connection Refused | | |
| Memory Errors | | |
Debugging Commands
# View container logs
docker logs container_name
# Enter a running container
docker exec -it container_name bash
# Inspect a container
docker inspect container_name
# Check resource usage
docker stats
# View networks
docker network ls
Interactive Debugging Session
To debug a failing container:
# Start the container with a different command
docker run -it --entrypoint=bash myapp
# Or for a failed container, commit its state to a new image and debug
docker commit failed_container debug_image
docker run -it --entrypoint=bash debug_image
Wrapping Up and Next Steps
Today we've covered the fundamentals of using Python in Docker containers, from basic setup to production-ready configurations. Docker containers have revolutionized Python development by providing consistent, isolated environments that solve many traditional deployment challenges.
Key Takeaways
- Docker solves the "works on my machine" problem for Python applications
- Official Python Docker images provide a solid foundation for your projects
- A well-crafted Dockerfile is essential for reproducible environments
- Docker Compose simplifies multi-service Python applications
- Proper verification ensures your containerized Python environment works as expected
Practice Exercises
- Create a Dockerfile for a simple Flask application
- Set up a development environment with code hot-reloading
- Build a multi-container application with Python, PostgreSQL, and Redis
- Implement the verification script to test your Docker environment
- Optimize your Docker image size using multi-stage builds
Additional Resources
- Official Python Docker Images
- Docker Compose Documentation
- Dockerizing Flask with Postgres, Gunicorn, and Nginx
- Python Speed - Docker Tips
In our next session, we'll build on these containerization concepts to explore how to effectively manage Python dependencies in Docker and implement best practices for production deployments.