Lecture Overview
Today we'll explore multi-container applications, a fundamental concept in modern application development. Understanding how to architect, connect, and orchestrate multiple containers is essential for building scalable, maintainable, and deployable applications in today's cloud-native world.
Understanding Multi-container Applications
Most real-world applications aren't monoliths running in a single container. Instead, they're composed of multiple specialized services, each running in its own container. This architectural approach brings numerous benefits but also introduces complexity in management and coordination.
Why Use Multiple Containers?
- Separation of concerns: Each container handles a specific function
- Independent scaling: Scale services based on their individual needs
- Isolation: Failures in one container don't directly affect others
- Technology diversity: Use the best tool/language for each component
- Simplified development: Teams can work on components independently
- Targeted updates: Update components individually without affecting the entire system
Analogy: Restaurant Operations
Think of a multi-container application like a restaurant:
- Front-of-house (Frontend container): Takes orders, presents dishes, interacts with customers
- Kitchen (Backend container): Processes orders, prepares food, manages inventory
- Refrigerator (Database container): Stores ingredients and supplies
- Dishwashing station (Cache container): Provides clean plates quickly for reuse
- Delivery service (Message queue container): Handles takeout orders and delivery
Each "department" operates independently but communicates seamlessly to deliver the complete dining experience. If the restaurant gets busier, you might add more kitchen staff (scale the backend) without needing more hosts (frontend).
Common Components in Multi-container Applications
Modern applications typically include several types of containers working together:
Frontend Services
- Web servers: Nginx, Apache
- Static content: HTML, CSS, JavaScript
- Frontend frameworks: React, Vue, Angular
Backend Services
- Application servers: Flask, Django, Express
- API gateways: Managing API requests
- Authentication services: Handling user identity
Data Services
- Databases: PostgreSQL, MySQL, MongoDB
- Caching layers: Redis, Memcached
- Search engines: Elasticsearch
Supporting Services
- Message queues: RabbitMQ, Kafka
- Worker processes: Background job handlers
- Monitoring tools: Prometheus, Grafana
Real-world Example: E-commerce Application
Consider a typical e-commerce application with these containers:
- Nginx container: Web server, handles HTTPS, static content, load balancing
- Frontend container: React application serving the user interface
- API container: Python Flask service providing API endpoints
- Auth container: Service handling user authentication and sessions
- Database container: PostgreSQL storing product and user data
- Redis container: Caching frequently accessed data
- Worker container: Processing background tasks like email notifications
- Search container: Elasticsearch for product search functionality
Each container has a specific role, and together they create a complete application.
Challenges in Multi-container Applications
While powerful, multi-container architectures introduce several challenges:
Communication Between Containers
Containers need to discover and communicate with each other. This raises questions like:
- How does container A find container B?
- How do they securely exchange data?
- What happens if a container is restarted with a new IP address?
Data Persistence and Sharing
Data often needs to persist beyond the container lifecycle or be shared between containers:
- Database data must survive container restarts
- Uploaded files may need to be accessible to multiple services
- Configuration may need to be shared across containers
Startup Order and Dependencies
Some containers depend on others being ready:
- Application server needs database to be ready
- Frontend might need API services available
- Workers may need message queues operational
Configuration Management
Different containers need different environment variables and configurations:
- Database credentials
- API endpoints and authentication tokens
- Feature flags and environment-specific settings
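One common way to handle these settings is to read them from environment variables once at startup. The following is a minimal sketch of that idea, not a prescribed layout: the variable names DB_PORT and ENABLE_SEARCH and the Settings class are hypothetical and only serve the illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Configuration for one service, read once at startup."""
    db_host: str
    db_port: int
    db_password: str
    enable_search: bool


def load_settings(env=None):
    """Build Settings from a mapping of environment variables.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    DB_PORT and ENABLE_SEARCH are hypothetical names used for illustration.
    """
    env = os.environ if env is None else env
    # Fail fast on a missing secret instead of falling back to a default.
    if "DB_PASSWORD" not in env:
        raise RuntimeError("DB_PASSWORD must be set")
    flag = env.get("ENABLE_SEARCH", "false").lower()
    return Settings(
        db_host=env.get("DB_HOST", "db"),
        db_port=int(env.get("DB_PORT", "5432")),
        db_password=env["DB_PASSWORD"],
        # Feature flag: only "1", "true" or "yes" switch it on.
        enable_search=flag in {"1", "true", "yes"},
    )
```

Each container can then receive different values for the same variable names, so the same image runs unchanged in development and production.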
Analogy: Symphony Orchestra
Managing multiple containers is like conducting an orchestra:
- Each instrument (container) plays its specialized part
- Musicians (services) must stay in sync with each other
- Some instruments start playing before others (dependency order)
- Sheet music (configuration) must be distributed correctly
- The conductor (container orchestration) coordinates everything
Without proper orchestration, you get noise instead of music.
Container Communication Patterns
In multi-container applications, services communicate through various patterns:
HTTP/REST Communication
The most common pattern for synchronous service-to-service communication:
# Python Example: Flask service calling another service
import requests

def get_user_data(user_id):
    # Service discovery through container name
    response = requests.get(f"http://user-service:8080/users/{user_id}", timeout=5)
    response.raise_for_status()  # Raise on 4xx/5xx instead of parsing an error body
    return response.json()
Message Queues
For asynchronous communication between services:
# Python Example: Sending message to queue
import json
import pika

def send_email_notification(user_id, message):
    connection = pika.BlockingConnection(pika.ConnectionParameters('rabbitmq'))
    channel = connection.channel()
    # Declare a queue
    channel.queue_declare(queue='email_notifications')
    # Send message to queue; json.dumps escapes quotes inside the message
    channel.basic_publish(
        exchange='',
        routing_key='email_notifications',
        body=json.dumps({"user_id": str(user_id), "message": message})
    )
    connection.close()
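The producer above returns as soon as the message is queued; a separate worker picks it up later. The sketch below illustrates only that decoupling. It uses the standard-library queue.Queue as an in-process stand-in for the broker (it is not RabbitMQ), and the function names publish, worker and run_demo are hypothetical.

```python
import json
import queue
import threading


def publish(q, user_id, message):
    """Producer: enqueue a JSON message and return immediately."""
    q.put(json.dumps({"user_id": str(user_id), "message": message}))


def worker(q, handled):
    """Consumer: process messages until a None sentinel arrives."""
    while True:
        body = q.get()
        if body is None:
            break
        payload = json.loads(body)
        # A real worker would send the email here; this one records the payload.
        handled.append((payload["user_id"], payload["message"]))


def run_demo():
    """Run one producer and one worker thread and return what was handled."""
    q = queue.Queue()  # in-process stand-in for the message broker
    handled = []
    thread = threading.Thread(target=worker, args=(q, handled))
    thread.start()
    publish(q, 42, 'Your order "A1" has shipped')
    publish(q, 43, "Welcome!")
    q.put(None)  # tell the worker to stop
    thread.join()
    return handled
```

With a real broker the worker would run in its own container, so a slow or crashed worker delays emails without blocking the service that published them.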
Shared Volumes
For sharing files between containers:
# Docker Compose example
services:
  web:
    image: nginx
    volumes:
      - uploaded_files:/usr/share/nginx/html/uploads
  processor:
    image: image-processor
    volumes:
      - uploaded_files:/app/files
volumes:
  uploaded_files:
Database as Integration Point
Using a database for communication between services:
# Python Example: Service A writes to DB
def create_order(order_data):
    db = get_database_connection()
    cursor = db.cursor()
    cursor.execute(
        "INSERT INTO orders (user_id, product_id, quantity) VALUES (%s, %s, %s)",
        (order_data['user_id'], order_data['product_id'], order_data['quantity'])
    )
    db.commit()

# Python Example: Service B reads from DB
def process_new_orders():
    db = get_database_connection()
    cursor = db.cursor()
    # Select id first: the default cursor returns tuples, so it is read by position
    cursor.execute("SELECT id, user_id, product_id, quantity FROM orders WHERE processed = FALSE")
    orders = cursor.fetchall()
    for order in orders:
        # Process the order
        process_order(order)
        # Mark as processed
        cursor.execute("UPDATE orders SET processed = TRUE WHERE id = %s", (order[0],))
        db.commit()
Communication Architecture Example
Consider a social media application with these communication patterns:
- Frontend → API: HTTP/REST requests for data
- API → Database: SQL queries for persistent storage
- User posts → Queue: New posts sent to processing queue
- Worker ← Queue: Processes images, generates notifications
- API → Cache: Stores frequently accessed data
- Upload Service → Shared Volume → Image Processor: For image processing
Service Discovery in Multi-container Applications
For containers to communicate, they need to find each other. This is called service discovery.
Basic Service Discovery with Docker
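Docker provides basic service discovery through the DNS server built into user-defined networks, including the default network that Docker Compose creates for each project: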
- Containers on the same network can reach each other by service name
- Docker automatically updates DNS entries when containers start/stop
- No external service discovery system needed for simple applications
# Example: Connecting to a database in another container
import psycopg2
# Use the service name as hostname
conn = psycopg2.connect(
    host="db",  # Service name, resolved by Docker's DNS
    database="myapp",
    user="postgres",
    password="secretpassword"
)
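To see what a name resolves to, the lookup can be done explicitly. This is a small sketch using only the standard library; resolve_service is a hypothetical helper name.

```python
import socket


def resolve_service(name, port):
    """Return the sorted, de-duplicated IP addresses that `name` resolves to."""
    infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Run inside a container on the Compose network, resolve_service("db", 5432) would return the current address of the database container, and it raises socket.gaierror if the name cannot be resolved. Because the address can change after a restart, code should connect by name rather than store the IP.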
More Advanced Service Discovery
For complex applications, more sophisticated service discovery might be needed:
- Consul: Service discovery and configuration
- etcd: Distributed key-value store
- Kubernetes Service: A stable DNS name and virtual IP in front of a changing set of pods
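Conceptually, these tools keep a registry that maps service names to live addresses and drops entries that stop renewing themselves. The toy in-memory sketch below shows only that idea; it is not how Consul, etcd or Kubernetes are implemented, and the class and method names are hypothetical.

```python
import time


class ServiceRegistry:
    """Toy in-memory registry: service name -> {address: expiry time}."""

    def __init__(self, ttl=10.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time
        self._entries = {}

    def register(self, name, address):
        """Register an instance, or refresh it (a heartbeat)."""
        self._entries.setdefault(name, {})[address] = self._clock() + self._ttl

    def lookup(self, name):
        """Return the addresses whose registration has not expired."""
        now = self._clock()
        instances = self._entries.get(name, {})
        # Drop instances that missed their heartbeat window.
        for address in [a for a, expiry in instances.items() if expiry <= now]:
            del instances[address]
        return sorted(instances)
```

An instance that crashes simply stops calling register, and it disappears from lookup results once its TTL has passed.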
Analogy: Phone Directory
Service discovery is like a phone directory:
- Each service (person) registers its address and capabilities
- Other services can look up who they need to call
- If someone moves (container restarts), the directory is updated
- You don't need to know someone's exact address, just their name
Managing Container Dependencies
In multi-container applications, services often depend on other services being available first.
Dependency Ordering with Docker Compose
Docker Compose provides the depends_on feature to express startup order:
services:
  web:
    build: ./web
    depends_on:
      - db
      - redis
  db:
    image: postgres:13
  redis:
    image: redis:alpine
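However, in the list form shown here, depends_on only controls start order: it waits for the dependency containers to be started, not for the services inside them to be ready to accept connections.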
Wait Scripts and Health Checks
For more sophisticated dependency management, use wait scripts or health checks:
# Example wait-for script in a Dockerfile
FROM python:3.9
# Install wait-for-it script
COPY wait-for-it.sh /usr/local/bin/wait-for-it
RUN chmod +x /usr/local/bin/wait-for-it
COPY . /app
WORKDIR /app
CMD ["wait-for-it", "db:5432", "--", "python", "app.py"]
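What a wait script does can also be expressed in a few lines of Python, which may be useful when the image has no shell utilities. This is a sketch with a hypothetical function name, wait_for_port, using only the standard library.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection succeeds, or False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Opening and immediately closing a connection proves something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A call such as wait_for_port("db", 5432) at the top of the application would block until the database port is open. Note that an open port only shows that a process is listening; it does not guarantee the service has finished initializing.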
Application-Level Retry Logic
Another approach is to build retry logic into your application:
import time
import psycopg2

def get_database_connection(max_retries=30, retry_interval=2):
    retries = 0
    while retries < max_retries:
        try:
            conn = psycopg2.connect(
                host="db",
                database="myapp",
                user="postgres",
                password="secretpassword"
            )
            print("Database connection established")
            return conn
        except psycopg2.OperationalError:
            retries += 1
            print(f"Database connection attempt {retries} failed. Retrying in {retry_interval} seconds...")
            time.sleep(retry_interval)
    raise Exception("Could not connect to database after maximum retries")
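The same idea can be factored into a reusable helper with exponential backoff, so the delay grows while a dependency stays unavailable. This is one possible sketch, not the only way to do it; the helper name retry and its parameters are hypothetical.

```python
import time


def retry(operation, max_retries=5, base_delay=0.5, max_delay=8.0,
          exceptions=(Exception,), sleep=time.sleep):
    """Call operation() until it succeeds or max_retries attempts have failed."""
    for attempt in range(1, max_retries + 1):
        try:
            return operation()
        except exceptions:
            if attempt == max_retries:
                raise  # out of attempts: surface the last error
            # Delay doubles each time (base, 2*base, 4*base, ...) up to max_delay.
            sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
```

For example, retry(connect_to_db, exceptions=(psycopg2.OperationalError,)) would wrap a hypothetical connect_to_db function. Passing `sleep` as a parameter keeps the helper testable without real waiting.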
Real-world Example: Handling Dependencies
In a typical web application stack:
- Database container starts first
- Redis cache starts (can be parallel with database)
- Backend API waits for database and cache to be ready
- Worker processes wait for backend API readiness
- Frontend web server starts last
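A startup order like this is a topological sort of the dependency graph. The sketch below computes one with the standard library (Python 3.9+); the exact dependency edges are an assumption made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each service maps to the set of services it must wait for (assumed edges).
dependencies = {
    "db": set(),
    "redis": set(),
    "api": {"db", "redis"},
    "worker": {"api"},
    "frontend": {"api"},
}


def startup_order(deps):
    """Return one valid startup order; raises graphlib.CycleError on circular dependencies."""
    return list(TopologicalSorter(deps).static_order())
```

Services with no edge between them, such as db and redis, may appear in either order, which matches the note that they can start in parallel.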
Data Persistence in Multi-container Applications
Containers are ephemeral by design, but data often needs to persist. In multi-container applications, this becomes even more important as data may be shared between services.
Types of Docker Volumes
- Named volumes: Persistent storage managed by Docker
- Bind mounts: Map host directories to container paths
- tmpfs mounts: Temporary file storage in memory
Using Volumes in Multi-container Applications
services:
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
  backup:
    image: backup-service
    volumes:
      - postgres_data:/backup/data:ro # Read-only access
volumes:
  postgres_data: # Named volume defined at the bottom
Data Sharing Patterns
Common patterns for data sharing between containers:
- Database container with persistent volume: Primary data storage
- Shared volume for file-based data: For uploads, exports, etc.
- Cache container: For ephemeral shared data
Analogy: Office Filing System
Container data management is like an office filing system:
- Named volumes: Official filing cabinets that remain even when staff change
- Bind mounts: Documents brought from home but used at work
- tmpfs: Sticky notes that get thrown away at the end of the day
- Shared volumes: Department files that multiple teams need access to
Practical Example: Building a Multi-container Application
Let's walk through creating a simple multi-container application: a web app with a Python backend, PostgreSQL database, and Redis cache.
Step 1: Define the Application Architecture
- Web: Flask application serving API endpoints
- Database: PostgreSQL for persistent storage
- Cache: Redis for temporary data and session storage
Step 2: Create the Directory Structure
multi_container_app/
├── docker-compose.yml
├── web/
│ ├── Dockerfile
│ ├── requirements.txt
│ ├── app.py
│ └── wait-for-it.sh
├── database/
│ └── init.sql
└── .env
Step 3: Create the Flask Application
In web/app.py:
from flask import Flask, jsonify
import psycopg2
import redis
import json
import os
import time

app = Flask(__name__)

# Connect to Redis
def get_redis_connection():
    redis_host = os.environ.get('REDIS_HOST', 'redis')
    redis_port = int(os.environ.get('REDIS_PORT', 6379))
    retry_count = 0
    max_retries = 30
    while retry_count < max_retries:
        try:
            r = redis.Redis(host=redis_host, port=redis_port, decode_responses=True)
            r.ping()  # Test connection
            return r
        except (redis.exceptions.ConnectionError, redis.exceptions.BusyLoadingError):
            retry_count += 1
            print(f"Redis connection attempt {retry_count} failed. Retrying...")
            time.sleep(1)
    raise Exception("Could not connect to Redis")

# Connect to PostgreSQL
def get_db_connection():
    db_host = os.environ.get('DB_HOST', 'db')
    db_name = os.environ.get('DB_NAME', 'myapp')
    db_user = os.environ.get('DB_USER', 'postgres')
    db_password = os.environ.get('DB_PASSWORD', 'password')
    retry_count = 0
    max_retries = 30
    while retry_count < max_retries:
        try:
            conn = psycopg2.connect(
                host=db_host,
                database=db_name,
                user=db_user,
                password=db_password
            )
            conn.autocommit = True
            return conn
        except psycopg2.OperationalError:
            retry_count += 1
            print(f"Database connection attempt {retry_count} failed. Retrying...")
            time.sleep(1)
    raise Exception("Could not connect to database")

@app.route('/')
def hello():
    return jsonify({"message": "Hello from Flask!"})

@app.route('/items')
def get_items():
    # Try to get from cache first
    r = get_redis_connection()
    cached_items = r.get('items')
    if cached_items:
        print("Returning cached data")
        # Parse with json.loads; never eval() data read from a shared cache
        return jsonify({"source": "cache", "items": json.loads(cached_items)})
    # If not in cache, get from database
    conn = get_db_connection()
    cursor = conn.cursor()
    cursor.execute('SELECT id, name FROM items')
    db_items = [{"id": row[0], "name": row[1]} for row in cursor.fetchall()]
    cursor.close()
    conn.close()
    # Store in cache for next time
    r.setex('items', 30, json.dumps(db_items))  # Cache for 30 seconds
    return jsonify({"source": "database", "items": db_items})
@app.route('/add/<name>')
def add_item(name):
    conn = get_db_connection()
    cursor = conn.cursor()
    cursor.execute('INSERT INTO items (name) VALUES (%s) RETURNING id', (name,))
    item_id = cursor.fetchone()[0]
    conn.commit()
    cursor.close()
    conn.close()
    # Invalidate cache
    r = get_redis_connection()
    r.delete('items')
    return jsonify({"id": item_id, "name": name})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)
Step 4: Create the Dockerfile
In web/Dockerfile:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
RUN chmod +x wait-for-it.sh
EXPOSE 5000
CMD ["./wait-for-it.sh", "db:5432", "--", "python", "app.py"]
Step 5: Create Requirements File
In web/requirements.txt:
flask==2.0.1
psycopg2-binary==2.9.1
redis==3.5.3
Step 6: Database Initialization Script
In database/init.sql:
CREATE TABLE IF NOT EXISTS items (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Add some sample data
INSERT INTO items (name) VALUES ('Item 1'), ('Item 2'), ('Item 3');
Step 7: Create wait-for-it Script
Save the wait-for-it.sh script (you can download it from GitHub) as web/wait-for-it.sh.
This is a utility script that waits for a host:port to be available before executing a command.
Step 8: Create Docker Compose File
In docker-compose.yml:
version: '3'
services:
  web:
    build: ./web
    ports:
      - "5000:5000"
    volumes:
      - ./web:/app
    environment:
      - DB_HOST=db
      - DB_NAME=myapp
      - DB_USER=postgres
      - DB_PASSWORD=password
      - REDIS_HOST=redis
      - FLASK_ENV=development
    depends_on:
      - db
      - redis
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./database/init.sql:/docker-entrypoint-initdb.d/init.sql
    environment:
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=myapp
    ports:
      - "5432:5432" # Exposed for local development tools
  redis:
    image: redis:6-alpine
    ports:
      - "6379:6379" # Exposed for local development tools
volumes:
  postgres_data:
Step 9: Environment Variables
In .env:
# Development environment variables
COMPOSE_PROJECT_NAME=multi_container_app
# Compose substitutes these values only where docker-compose.yml references ${VAR},
# e.g. DB_PASSWORD=${DB_PASSWORD}; values hardcoded there are not overridden
# DB_PASSWORD=different_password
# REDIS_HOST=custom_redis_host
Step 10: Running the Application
# Start all services
docker-compose up
# Access the application
curl http://localhost:5000/items
# Add a new item
curl http://localhost:5000/add/NewItem
# Verify it's added
curl http://localhost:5000/items
This example demonstrates:
- Communication between containers (Flask → PostgreSQL, Flask → Redis)
- Data persistence with volumes (PostgreSQL data)
- Dependency handling (waiting for database before starting web app)
- Configuration through environment variables
- Service discovery using container names as hostnames
- Caching strategy between services
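The caching strategy used by the /items endpoint is often called cache-aside: read the cache, fall back to the database on a miss, then fill the cache with a time-to-live. The sketch below isolates that logic so it can run without Redis or PostgreSQL. TTLCache is a hypothetical in-memory stand-in for the Redis calls used in the example, and this sketch serializes values with json.dumps and json.loads.

```python
import json
import time


class TTLCache:
    """Tiny in-memory stand-in for Redis GET / SETEX / DELETE with string values."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so tests can control time
        self._data = {}

    def get(self, key):
        value, expires_at = self._data.get(key, (None, 0.0))
        if value is not None and expires_at <= self._clock():
            del self._data[key]  # entry has expired
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key):
        self._data.pop(key, None)


def get_items(cache, load_from_db, ttl_seconds=30):
    """Cache-aside read: try the cache, fall back to the database, then fill the cache."""
    cached = cache.get("items")
    if cached is not None:
        return "cache", json.loads(cached)
    items = load_from_db()
    cache.setex("items", ttl_seconds, json.dumps(items))
    return "database", items
```

Writes follow the other half of the pattern: after inserting a row, delete the cached key so the next read reloads fresh data instead of serving a stale list for up to 30 seconds.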
Scaling Multi-container Applications
One of the key benefits of multi-container applications is the ability to scale components independently.
Horizontal Scaling
Running multiple instances of the same service:
# Scale the web service to 3 instances
docker-compose up -d --scale web=3
Note: When scaling, you need to handle:
- Port conflicts (don't map a fixed host port such as "5000:5000"; only one instance can bind it)
- Load balancing between instances
- Session persistence or shared state
Load Balancing Options
- Nginx as reverse proxy: Configure upstream servers
- Traefik: Automatic service discovery and routing
- HAProxy: Advanced load balancing features
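The simplest strategy these tools offer is round-robin: requests are handed to each instance in turn. A minimal sketch of the selection logic follows; the class name and the backend addresses in the usage note are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backend addresses in rotation."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        """Return the backend that should receive the next request."""
        return next(self._cycle)
```

With RoundRobinBalancer(["web_1:5000", "web_2:5000", "web_3:5000"]), successive calls to next_backend() return the three addresses in order and then start over. Production load balancers add health checks and connection handling on top of this idea.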
Scaling Different Components
Multi-container apps allow targeted scaling:
- Scale web containers during high traffic
- Scale workers when processing backlogs
- Keep single database instance for consistency
- Scale read replicas for database read operations
Analogy: Department Store Staffing
Scaling components is like managing department store staff:
- During sales, you add more checkout staff (frontend)
- For inventory day, you add more stockroom workers (backend)
- You maintain the same number of managers (database)
- You bring in seasonal workers for specific tasks (worker processes)
Each department scales based on its specific demands.
Monitoring Multi-container Applications
With multiple containers, monitoring becomes more complex but even more important.
Key Metrics to Monitor
- Container health: Is each container running?
- Resource usage: CPU, memory, disk, network per container
- Application metrics: Response times, error rates, requests per second
- Database metrics: Query performance, connection count
- Cache metrics: Hit rates, memory usage
Monitoring Tools
- Docker stats: Basic built-in monitoring
- Prometheus: Metrics collection and storage
- Grafana: Metrics visualization
- cAdvisor: Container resource usage
- ELK Stack: Log aggregation and analysis
Basic Docker Monitoring Command
# View resource usage of all containers
docker stats
Logging in Multi-container Applications
Centralized logging is crucial:
# View logs from all containers
docker-compose logs
# Follow logs from specific service
docker-compose logs -f web
For production, consider centralized logging solutions like:
- ELK Stack (Elasticsearch, Logstash, Kibana)
- Fluentd with cloud storage
- Graylog
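Whichever collector is used, it helps if every service writes structured logs to stdout, since Docker captures the standard output of each container. The sketch below shows one way to emit JSON lines with the standard logging module; the field names and the helper make_logger are assumptions for illustration.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "service": self.service,  # lets a collector tell containers apart
            "message": record.getMessage(),
        })


def make_logger(service):
    """Return a logger that writes JSON lines to stdout."""
    logger = logging.getLogger(service)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(JsonFormatter(service))
        logger.addHandler(handler)
    logger.propagate = False
    return logger
```

A service would call make_logger("web").info("item added") and the line then shows up in docker-compose logs web, already in a form that log aggregators can parse.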
Debugging Multi-container Applications
Debugging across multiple containers requires a systematic approach.
Common Issues and Debugging Techniques
1. Container Communication Issues
- Symptom: "Connection refused" errors
- Debugging:
  - Check if containers are on the same network
  - Verify service names are used correctly
  - Test with simple ping or curl commands from inside containers
- Example:
# Enter a container
docker-compose exec web bash
# Test connection to database
ping db
# Test connection to specific port
nc -zv db 5432
2. Dependency Startup Issues
- Symptom: Services fail due to dependencies not being ready
- Debugging:
  - Implement proper wait scripts
  - Add retry logic in applications
  - Check logs for timing-related errors
3. Volume Mount Issues
- Symptom: Missing data or permission errors
- Debugging:
  - Verify volumes are correctly defined
  - Check path mappings
  - Inspect filesystem inside containers
- Example:
# List volumes
docker volume ls
# Inspect a specific volume
docker volume inspect multi_container_app_postgres_data
# Check permissions inside container
docker-compose exec db ls -la /var/lib/postgresql/data
4. Environment Variable Issues
- Symptom: Application configuration problems
- Debugging:
  - Print environment variables for verification
  - Check for typos in variable names
  - Verify .env file is being loaded
- Example:
# Check environment variables in a container
docker-compose exec web env
# Check specific variable
docker-compose exec web bash -c 'echo $DB_HOST'
Debugging Workflow for Multi-container Applications
- Start with logs: docker-compose logs
- Check container status: docker-compose ps
- Inspect network: docker network inspect multi_container_app_default
- Enter problematic container: docker-compose exec [service] bash
- Test connectivity: Using ping, curl, nc from inside containers
- Verify environment: Check environment variables, filesystem
- Restart services: docker-compose restart [service]
- Rebuild if needed: docker-compose up -d --build
Assignment: Create a Multi-container Application
Now it's time to apply what you've learned by creating your own multi-container application:
Requirements:
- Create a Docker Compose setup with at least three containers:
  - A Python service (Flask or Django)
  - A database (PostgreSQL, MySQL, or MongoDB)
  - A third service of your choice (Redis, Nginx, etc.)
- Implement communication between services
- Configure proper volume mounts for data persistence
- Handle service dependencies and startup order
- Document your application architecture and usage instructions
Project Structure:
assignment/
├── docker-compose.yml
├── README.md
├── service1/
│ └── [service1 files]
├── service2/
│ └── [service2 files]
├── service3/
│ └── [service3 files]
└── .env
Bonus Challenges:
- Implement a health check endpoint in your Python service
- Add proper error handling and retry logic for dependencies
- Create a frontend container with a simple UI
- Implement a worker queue for background processing
- Add basic monitoring using Prometheus and Grafana
Hints:
- Start simple and incrementally add complexity
- Test each container individually before connecting them
- Use the debugging techniques we discussed
- Document all environment variables and configuration options
- Consider security aspects (e.g., don't hardcode passwords)
Key Takeaways
- Multi-container applications provide separation of concerns, scalability, and technology diversity
- Services communicate through HTTP/REST, message queues, shared volumes, or databases
- Docker provides basic service discovery by resolving service and container names through its built-in DNS
- Dependencies can be managed through Docker Compose, wait scripts, or application-level retry logic
- Data persistence requires careful volume configuration
- Debugging multi-container applications requires a systematic approach
- Monitoring becomes even more important with multiple containers