Docker Architecture and Components

Week 1, Wednesday - Morning Session

Lecture Overview

In this session, we'll explore the architecture of Docker and its key components. Understanding how Docker is structured internally will give you a solid foundation for working with containers effectively. You'll learn how the different pieces of Docker fit together to create a coherent system for building, shipping, and running containerized applications.

Docker Architecture at a Glance

Docker uses a client-server architecture with several distinct components working together. Before diving into each component, let's get a high-level overview of how Docker is structured:

┌──────────────────────────────────────────────────────────────────┐
│                           Host Machine                           │
│                                                                  │
│  ┌──────────────┐       ┌─────────────────────────────────────┐  │
│  │              │       │        Docker Host (daemon)         │  │
│  │ Docker Client│<─────>│                                     │  │
│  │              │       │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │  │
│  └──────────────┘       │ │Container│ │Container│ │Container│ │  │
│                         │ └─────────┘ └─────────┘ └─────────┘ │  │
│                         └──────────────────┬──────────────────┘  │
└────────────────────────────────────────────┼─────────────────────┘
                                             │
                                             ▼
                                   ┌───────────────────┐
                                   │     Registry      │
                                   │ (e.g. Docker Hub) │
                                   └───────────────────┘

Think of Docker as a transportation system. The Docker Client is like a dispatching office that sends instructions. The Docker Daemon (Server) is like the transportation hub that manages all the vehicles (containers). The Registry is like a warehouse storing vehicle designs (images) that can be requested when needed.

Let's now explore each component in detail to understand what it does and how it interacts with the others.

Docker Client

The Docker Client is the primary way users interact with Docker. It's what you're using when you run commands starting with docker in your terminal.

Functions of the Docker Client

The client can communicate with a Docker daemon running on the same machine (the default configuration) or connect to a remote daemon running on another system.

Docker Client in action: When you run a command like docker run nginx, the client:

  1. Parses your command
  2. Connects to the Docker daemon
  3. Sends instructions to pull the nginx image (if needed) and create a container
  4. Returns output from the daemon back to your terminal

The Docker client is somewhat analogous to a remote control for your TV. It doesn't do the actual work of displaying content (that's the TV's job), but it sends instructions that control what happens. Similarly, the Docker client doesn't run containers itself but instructs the daemon to do so.

Common Client Commands

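The Docker client provides commands for the entire container lifecycle, from pulling images (docker pull) and starting containers (docker run) to stopping (docker stop) and removing them (docker rm).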

Each of these commands gets translated by the client into API calls to the Docker daemon.

Docker Daemon (dockerd)

The Docker daemon (often referred to as dockerd) is the heart of Docker. It's a background service that manages everything related to containers on a system.

Responsibilities of the Docker Daemon

Think of the Docker daemon as a factory manager. It receives blueprints (images), creates products (containers), manages resources, and oversees the entire production process.

Technical insight: The Docker daemon exposes a REST API that the client and other programs can use to interact with it. This API-driven design makes Docker highly automatable and integrable with other systems.
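
To make the client-to-daemon translation concrete, here is a minimal Python sketch of the kind of REST requests a client might issue for docker run. The function name api_calls_for_run and the "<id>" placeholder are hypothetical. The three endpoint paths follow the Docker Engine API, but the version prefix, request bodies, and the exact order the real client uses are omitted or simplified, and the tag split does not handle registry hosts with a port.

```python
from urllib.parse import urlencode


def api_calls_for_run(image: str, container_id: str = "<id>") -> list[tuple[str, str]]:
    """Return simplified (method, path) pairs for `docker run IMAGE`."""
    # Split "name:tag"; a missing tag falls back to "latest".
    name, _, tag = image.partition(":")
    pull_query = urlencode({"fromImage": name, "tag": tag or "latest"})
    return [
        ("POST", f"/images/create?{pull_query}"),       # pull the image if needed
        ("POST", "/containers/create"),                  # JSON body names the image
        ("POST", f"/containers/{container_id}/start"),   # start the new container
    ]


for method, path in api_calls_for_run("nginx"):
    print(method, path)
```

Because every command reduces to requests like these, any program that can send HTTP requests to the daemon can automate Docker in the same way the client does.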

Daemon Configuration

The daemon can be configured in various ways. In production environments, proper daemon configuration is crucial for security and performance.

Real-world context: In a development team, each developer runs their own Docker daemon on their local machine. In a production environment, you might have multiple servers each running a Docker daemon, potentially managed by an orchestration tool like Kubernetes.

Docker Images

Docker images are read-only templates used to create containers. They contain everything needed to run an application: code, runtime, libraries, environment variables, and configuration files.

Key Characteristics of Images

The layered nature of images is one of Docker's most powerful features. Let's explore it further.

Layered File System

Docker images are built using a layered approach where each layer represents a set of filesystem changes:

┌───────────────────────┐
│   Application Code    │  <-- Top layer
├───────────────────────┤
│   Application Deps    │
├───────────────────────┤
│   Runtime (e.g. Node) │
├───────────────────────┤
│  Base OS (e.g. Alpine)│  <-- Bottom layer
└───────────────────────┘

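Each layer stores only the differences from the previous layer. As a result, layers can be shared between images, reused from cache when an image is rebuilt, and downloaded only when they are missing locally.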

Analogy: Think of Docker images like a stack of transparent sheets. Each sheet has some content drawn on it, and when stacked together, they form a complete picture. If you want to create a similar image, you can reuse most of the stack and just replace or add sheets as needed, rather than drawing everything from scratch.
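
The storage benefit of shared layers can be sketched with a small calculation. All layer names and sizes below are made up for illustration; the point is only that two images built on the same base store the common layers once.

```python
# Hypothetical layer sizes in MB, keyed by a made-up layer name.
LAYER_SIZES = {
    "alpine-base": 6,
    "node-runtime": 40,
    "deps-a": 25,
    "code-a": 2,
    "deps-b": 30,
    "code-b": 3,
}

# Two hypothetical images that share their two bottom layers.
IMAGES = {
    "app-a": ["alpine-base", "node-runtime", "deps-a", "code-a"],
    "app-b": ["alpine-base", "node-runtime", "deps-b", "code-b"],
}


def naive_size(images):
    """Disk use if every image kept a private copy of each layer."""
    return sum(LAYER_SIZES[layer] for layers in images.values() for layer in layers)


def shared_size(images):
    """Disk use when identical layers are stored once and shared."""
    unique = {layer for layers in images.values() for layer in layers}
    return sum(LAYER_SIZES[layer] for layer in unique)


print(naive_size(IMAGES))   # 152
print(shared_size(IMAGES))  # 106
```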

Image IDs and Tags

Each Docker image has a unique identifier (a SHA256 hash) and can have multiple human-readable tags:

$ docker images
REPOSITORY    TAG       IMAGE ID       CREATED       SIZE
nginx         latest    ad4c705f24d3   2 weeks ago   133MB
python        3.9       a8bd5b274a97   3 weeks ago   915MB
python        3.10      98f52028b399   3 weeks ago   920MB

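In this example:

  • nginx and python are repositories
  • latest, 3.9, and 3.10 are tags
  • Each IMAGE ID is a shortened form of the image's SHA256 identifier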

Tags are crucial for version management. For example, python:3.9 and python:3.10 refer to different versions of Python, while both being part of the "python" repository.

Best practice: Never rely on the "latest" tag in production environments. Always specify exact version tags to ensure consistency and prevent unexpected changes when images are updated.
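
The idea behind hash-based image IDs can be illustrated with a short Python sketch. It assumes a simplified model in which the ID is the SHA256 digest of some serialized metadata; the config dictionaries are invented, and Docker's real metadata format and the exact data it hashes are not reproduced here.

```python
import hashlib
import json


def image_id(config: dict) -> str:
    """Return a content-derived ID: the SHA256 hex digest of the serialized config."""
    payload = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Made-up configuration data standing in for real image metadata.
config_a = {"cmd": ["nginx", "-g", "daemon off;"], "layers": ["base", "nginx"]}
config_b = {"cmd": ["nginx", "-g", "daemon off;"], "layers": ["base", "nginx", "extra"]}

full_id = image_id(config_a)
short_id = full_id[:12]  # the IMAGE ID column shows a 12-character prefix

print(short_id)
print(image_id(config_a) == full_id)  # True: same content, same ID
print(image_id(config_b) == full_id)  # False: different content, different ID
```

A tag such as latest is just a movable label, while a content-derived ID changes whenever the content changes, which is why pinning exact versions gives more predictable deployments.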

Containers

If images are the blueprints, containers are the running instances created from those blueprints. A container is a runnable instance of an image.

Container Characteristics

When you create a container from an image, Docker adds a writable layer on top of the immutable image layers. This allows the container to modify files while keeping the original image unchanged.

┌───────────────────────┐
│    Writable Layer     │  <-- Container-specific layer
├───────────────────────┤
│   Application Code    │  
├───────────────────────┤
│   Application Deps    │  <-- Image layers (read-only)
├───────────────────────┤
│   Runtime (e.g. Node) │
├───────────────────────┤
│  Base OS (e.g. Alpine)│
└───────────────────────┘

Analogy: Consider a container like a kitchen. The image provides all the appliances, utensils, and basic ingredients (like flour, sugar, etc.). When you start cooking (run the container), you might create new dishes and temporarily modify the kitchen state. Those changes survive a stop and restart, but once the container is removed, they are gone, and a new container from the same image starts with the kitchen in its original state. If you want to save your changes, you need to create a new "blueprint" (image) from your current state.
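
The writable-layer behavior can be modeled in a few lines of Python. This is only an analogy built on collections.ChainMap, not how Docker's storage drivers are implemented; the file paths and contents are made up.

```python
from collections import ChainMap

# Read-only image layers, bottom to top (file path -> contents); values are made up.
base_os = {"/etc/os-release": "Alpine"}
runtime = {"/usr/bin/node": "node binary"}
app_code = {"/app/server.js": "v1"}


def new_container(*image_layers):
    """Stack an empty writable layer on top of the image layers."""
    # ChainMap searches its maps left to right, and writes only touch the first map.
    return ChainMap({}, *reversed(image_layers))


container = new_container(base_os, runtime, app_code)
container["/app/server.js"] = "patched"   # lands in the writable layer only
container["/tmp/cache"] = "scratch data"

print(container["/app/server.js"])  # patched
print(app_code["/app/server.js"])   # v1 (the image layer is unchanged)

# A second container from the same image starts from the original state.
fresh = new_container(base_os, runtime, app_code)
print(fresh["/app/server.js"])      # v1
print("/tmp/cache" in fresh)        # False
```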

Container Lifecycle

Containers have a distinct lifecycle with several states:

  1. Created - Container is defined but not started
  2. Running - Container processes are executing
  3. Paused - Container processes are temporarily suspended
  4. Stopped - Container processes have terminated but the container still exists
  5. Removed - Container is deleted along with its writable layer

Container lifecycle commands:

# Create and run a container
$ docker run --name my-nginx -d nginx

# Pause a running container
$ docker pause my-nginx

# Unpause a container
$ docker unpause my-nginx

# Stop a container
$ docker stop my-nginx

# Start a stopped container
$ docker start my-nginx

# Remove a container
$ docker rm my-nginx

Understanding this lifecycle is crucial for managing containers effectively, especially in production environments where automatic restarts and health checks become important.
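
The lifecycle states and commands above can be summarized as a small state machine. The sketch below is a simplification written for this lecture: the state names follow the list above, and Docker permits some transitions that are not modeled here (for example, forcibly removing a running container).

```python
# Allowed transitions: (current state, command) -> next state.
TRANSITIONS = {
    ("created", "start"): "running",
    ("running", "pause"): "paused",
    ("paused", "unpause"): "running",
    ("running", "stop"): "stopped",
    ("stopped", "start"): "running",
    ("created", "rm"): "removed",
    ("stopped", "rm"): "removed",
}


def next_state(state, command):
    """Return the state after applying command, or raise if it is not allowed."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"cannot {command} a {state} container") from None


state = "created"
for command in ["start", "pause", "unpause", "stop", "start", "stop", "rm"]:
    state = next_state(state, command)
print(state)  # removed
```

In this model, asking for next_state("running", "rm") raises an error, mirroring the fact that a running container normally has to be stopped before it can be removed.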

Docker Registries

Docker registries are services that store and distribute Docker images. They play a crucial role in the "build once, run anywhere" philosophy of Docker.

Registry Types

Docker Hub is the default registry that Docker uses when you run commands like docker pull without specifying a registry.
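
How an image name maps to a registry can be sketched as a small parser. This is a simplified approximation of Docker's naming rules, not the real implementation: the function name parse_image_ref is hypothetical, and digest references (name@sha256:...) are not handled.

```python
def parse_image_ref(ref: str) -> dict:
    """Split an image reference into registry, repository, and tag (simplified)."""
    first, slash, rest = ref.partition("/")
    # The first path component names a registry only if it looks like a host.
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = "docker.io", ref
    repository, colon, tag = remainder.rpartition(":")
    if not colon:  # no tag given, so fall back to the default
        repository, tag = remainder, "latest"
    if registry == "docker.io" and "/" not in repository:
        repository = "library/" + repository  # official images live under library/
    return {"registry": registry, "repository": repository, "tag": tag}


print(parse_image_ref("nginx"))
print(parse_image_ref("my-registry.example.com/my-app:1.0"))
```

The first call resolves to Docker Hub with the latest tag, while the second keeps the explicit registry host and tag.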

Analogy: Docker registries are like libraries or bookstores. Docker Hub is like a public library with books (images) anyone can borrow. Private registries are like an organization's own library, holding books reserved for its members. When you need a book, you first check whether you already have a copy at home (the local image cache), and if not, you go to the library (registry) to get it.

Working with Registries

Common operations with registries include:

# Pull an image from Docker Hub
$ docker pull nginx:latest

# Tag an image for a specific registry
$ docker tag my-app:1.0 my-registry.example.com/my-app:1.0

# Push an image to a registry
$ docker push my-registry.example.com/my-app:1.0

# Pull from a specific registry
$ docker pull my-registry.example.com/my-app:1.0

In enterprise environments, organizations often maintain their own registries, for example to control who can access their images and to enforce policies on which images may be used.

Best practice: For production applications, always use a private registry with proper access controls. Scan images for vulnerabilities before pushing them to your registry, and implement policies about which external images can be pulled.

Docker Storage

Docker provides several options for managing data in containers. Understanding these is crucial because containers are ephemeral by design - when a container is removed, any data that was written to its writable layer is lost.

Storage Options

Docker Volumes

Volumes are the preferred mechanism for persisting data generated and used by Docker containers. A volume exists independently of any container's lifecycle and can be mounted into more than one container:

┌─────────────────────────────────────────┐
│               Host System                │
│                                          │
│  ┌──────────────┐    ┌───────────────┐  │
│  │  Container A │    │  Container B  │  │
│  │              │    │               │  │
│  │              │    │               │  │
│  └──────┬───────┘    └───────┬───────┘  │
│         │                    │          │
│         │                    │          │
│         ▼                    ▼          │
│  ┌──────────────────────────────────┐   │
│  │            Volume                │   │
│  │                                  │   │
│  └──────────────────────────────────┘   │
│                                          │
└─────────────────────────────────────────┘

Working with volumes:

# Create a volume
$ docker volume create my-data

# Run a container with a volume
$ docker run -v my-data:/app/data nginx

# List volumes
$ docker volume ls

# Inspect a volume
$ docker volume inspect my-data

# Remove a volume
$ docker volume rm my-data

# Clean up unused volumes
$ docker volume prune

Analogy: Think of volumes like external hard drives for your containers. The container itself might be temporary, but the external drive (volume) persists and can be connected to different containers over time. This separation of compute (container) from storage (volume) is a fundamental pattern in cloud-native architecture.

Bind Mounts

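Bind mounts have been around since the early days of Docker. They allow you to mount a file or directory on the host machine into a container. Unlike volumes, which Docker creates and manages, a bind mount refers to an exact host path, so it depends on the host's directory layout.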

Using bind mounts:

# Mount the current directory into a container
$ docker run -v "$(pwd)":/app nginx

tmpfs Mounts

tmpfs mounts are stored in the host system's memory only, never written to the host system's filesystem. This is useful for storing sensitive information that you don't want to persist.

Using tmpfs mounts:

# Create a container with a tmpfs mount
$ docker run --tmpfs /app/temp nginx

Best practice: For production applications, always use named volumes for persistent data and clearly document what data needs to persist. In development, bind mounts are often convenient for code changes, but volumes should still be used for databases and other stateful components.
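
The difference between the two -v forms shown earlier can be expressed as a tiny classifier. This sketch assumes the common rule that a source starting with / is a host path (a bind mount) and anything else is a volume name; the function name classify_mount is hypothetical, and anonymous volumes and Windows paths are not covered.

```python
def classify_mount(spec: str) -> dict:
    """Classify a `-v SOURCE:TARGET[:OPTIONS]` value as a bind mount or named volume."""
    parts = spec.split(":")
    if len(parts) < 2:
        raise ValueError(f"expected SOURCE:TARGET, got {spec!r}")
    source, target = parts[0], parts[1]
    options = parts[2] if len(parts) > 2 else ""
    # Assumption: an absolute host path means a bind mount, otherwise a named volume.
    kind = "bind" if source.startswith("/") else "volume"
    return {"type": kind, "source": source, "target": target, "options": options}


print(classify_mount("my-data:/app/data"))
print(classify_mount("/home/user/project:/app:ro"))
```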

Docker Networking

Docker's networking subsystem is pluggable, using drivers. Several drivers exist by default, and you can install third-party drivers as well. Each driver offers specific features and capabilities.

Network Drivers

Bridge Networks

The bridge driver creates a private network internal to the host. Containers on this network can communicate with each other, and the host can forward traffic to the external world.

┌─────────────────────────────────────────────────────┐
│                  Host System                         │
│                                                      │
│  ┌────────────┐    ┌────────────┐    ┌────────────┐ │
│  │Container A │    │Container B │    │Container C │ │
│  │  172.17.0.2│    │  172.17.0.3│    │  172.17.0.4│ │
│  └─────┬──────┘    └─────┬──────┘    └─────┬──────┘ │
│        │                 │                 │        │
│        └─────────┬───────┴─────────┬───────┘        │
│                  │                 │                │
│           ┌──────┴─────────────────┴──────┐         │
│           │      Bridge Network           │         │
│           │         172.17.0.0/16         │         │
│           └───────────────┬───────────────┘         │
│                           │                         │
│                     ┌─────┴─────┐                   │
│                     │  eth0     │                   │
└─────────────────────┼───────────┼─────────────────┘
                      │           │
                      │  Internet │
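
The addressing in the diagram can be checked with Python's standard ipaddress module. The subnet and container addresses are taken from the diagram above; the 192.168.1.10 address is a made-up example of a host outside the bridge network.

```python
import ipaddress

bridge = ipaddress.ip_network("172.17.0.0/16")
containers = {
    "Container A": "172.17.0.2",
    "Container B": "172.17.0.3",
    "Container C": "172.17.0.4",
}

# Every container address falls inside the bridge subnet.
for name, addr in containers.items():
    print(name, ipaddress.ip_address(addr) in bridge)

print(bridge.num_addresses)                            # 65536
print(ipaddress.ip_address("192.168.1.10") in bridge)  # False
```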

Working with networks:

# List networks
$ docker network ls

# Create a bridge network
$ docker network create my-network

# Run a container on a specific network
$ docker run --network=my-network --name=container1 nginx

# Connect a running container to a network
$ docker network connect my-network container2

# Inspect a network
$ docker network inspect my-network

# Disconnect a container from a network
$ docker network disconnect my-network container1

# Remove a network
$ docker network rm my-network

Real-world application: In a typical web application architecture, you might create a custom bridge network for your application. Your frontend container, backend API container, and database container would all connect to this network, allowing them to communicate with each other using their container names as hostnames, while isolating them from other containers on the system.

Container DNS

One important feature of Docker networking is automatic DNS resolution between containers. Containers on the same user-defined network can resolve each other by name.

Example: If you have two containers named web and db on the same network, the web container can connect to the db container simply by using the hostname db in its configuration.

# Create a network
$ docker network create app-network

# Start a database container
$ docker run -d --name db --network app-network postgres

# Start a web container that can connect to the database using hostname "db"
$ docker run -d --name web --network app-network -e DATABASE_URL=postgres://postgres:postgres@db:5432/postgres my-web-app

Best practice: Always create custom networks for your applications rather than using the default bridge network. This provides better isolation, automatic DNS resolution between containers, and more control over your network configuration.
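
Name resolution scoped to a network can be modeled with a dictionary lookup. This is a conceptual sketch only, not Docker's embedded DNS server; the second network, the cache container, and all IP addresses are invented for illustration.

```python
# network name -> {container name: IP address}; all addresses are made up.
NETWORKS = {
    "app-network": {"web": "172.18.0.3", "db": "172.18.0.2"},
    "other-network": {"cache": "172.19.0.2"},
}


def resolve(from_container, hostname):
    """Resolve hostname as seen from from_container, searching only shared networks."""
    for members in NETWORKS.values():
        if from_container in members and hostname in members:
            return members[hostname]
    return None  # no shared network, so the name does not resolve


print(resolve("web", "db"))    # 172.18.0.2
print(resolve("cache", "db"))  # None
```

The second lookup fails because cache and db share no network, which is the isolation property that custom networks provide.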

Docker Compose

While not strictly part of the core Docker architecture, Docker Compose is an essential tool that works with Docker to define and run multi-container applications.

What is Docker Compose?

Docker Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application's services, networks, and volumes. Then, with a single command, you create and start all the services from your configuration.

Sample docker-compose.yml file:

version: '3'

services:
  web:
    build: ./web
    ports:
      - "5000:5000"
    volumes:
      - ./web:/code
    depends_on:
      - db
      - redis

  db:
    image: postgres:12
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=password

  redis:
    image: redis:6

volumes:
  postgres_data:

This file defines a three-service application:

  1. A web service built from the Dockerfile in the ./web directory
  2. A db service using the postgres:12 image
  3. A redis service using the redis:6 image
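
The effect of depends_on on startup order can be sketched as a topological sort using Python's standard graphlib module (Python 3.9 or later). This illustrates the ordering idea only and is not how Compose is implemented. Note that depends_on in this short form controls start order, not whether a dependency is ready to accept connections.

```python
from graphlib import TopologicalSorter

# service -> services it depends on, taken from the depends_on entries in the sample file.
depends_on = {"web": {"db", "redis"}, "db": set(), "redis": set()}

# static_order() yields each service only after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
print(order)
```

The db and redis services come first in either order, and web always comes last.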

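It also configures:

  • A port mapping from host port 5000 to port 5000 in the web container
  • A bind mount of ./web into the web container at /code
  • A named volume (postgres_data) for the database files
  • The POSTGRES_PASSWORD environment variable for the db service
  • Startup dependencies of web on db and redis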

Basic Docker Compose commands:

# Start services
$ docker-compose up

# Start services in the background
$ docker-compose up -d

# Stop services
$ docker-compose down

# Stop services and remove volumes
$ docker-compose down -v

# View logs
$ docker-compose logs

# Run a command in a service
$ docker-compose exec web python manage.py migrate

Analogy: If Docker is like having individual appliances in your kitchen, Docker Compose is like having a single control panel that turns on all the appliances you need for a specific recipe, configured exactly as required. Instead of turning on the stove, then the mixer, then the blender individually, you just press one button labeled "Make Cake" and everything is set up correctly.

We'll explore Docker Compose in much more depth in tomorrow's session, but it's important to understand how it fits into the overall Docker architecture as a higher-level tool that works with the core components we've discussed.

How Components Work Together

Let's trace through a typical workflow to see how all these Docker components interact:

Example Workflow: Running a Container

  1. Client Instruction: You issue docker run nginx in your terminal
  2. Client Processing: The Docker client formats this as an API request to the daemon
  3. Daemon Image Check: The daemon checks if the nginx image exists locally
  4. Registry Interaction: If not found locally, the daemon pulls the image from Docker Hub
  5. Image Download: The registry sends the image layers to the daemon
  6. Container Creation: The daemon creates a new container based on the image
  7. Storage Setup: The daemon sets up any necessary storage (volumes or bind mounts)
  8. Network Configuration: The daemon connects the container to the appropriate network
  9. Container Start: The daemon starts the container processes
  10. Output Return: The daemon streams output back to the client

┌──────────┐         ┌──────────┐         ┌──────────┐
│  Client  │ ──────▶ │  Daemon  │ ◀─────▶ │ Registry │
└──────────┘         └──────────┘         └──────────┘
                          │
                          ▼
                     ┌──────────┐
                     │  Images  │
                     └──────────┘
                          │
                          ▼
                     ┌──────────┐
                     │Containers│
                     └──────────┘
                       ▲      ▲
                       │      │
                 ┌─────┘      └─────┐
                 │                  │
           ┌──────────┐      ┌──────────┐
           │ Volumes  │      │ Networks │
           └──────────┘      └──────────┘

This workflow demonstrates how the client, daemon, registry, images, containers, storage, and networking all work together to create a functioning containerized application.

Docker Architecture in Production Environments

While the architecture we've discussed applies to all Docker installations, production environments often add additional components and considerations:

Container Orchestration

In production, Docker is often managed by an orchestration platform such as Kubernetes, which schedules and manages containers across multiple hosts.

Security Considerations

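Production Docker deployments typically include security measures such as access-controlled private registries and vulnerability scanning of images.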

Monitoring and Logging

Comprehensive monitoring and logging are essential for Docker in production.

Production architecture example: A typical production setup might include:

  • Multiple host machines running Docker
  • Kubernetes managing containers across those hosts
  • A private Docker registry secured with authentication
  • CI/CD pipelines that build, test, and deploy Docker images
  • Prometheus and Grafana for monitoring
  • ELK Stack or Loki for logging

Docker Architecture Evolution

Docker's architecture has evolved significantly since its initial release.

Major Architectural Changes

Modern Docker architecture:

┌───────────────────────────────────────────────────┐
│                  Docker Engine                      │
│                                                     │
│  ┌─────────────┐      ┌──────────────┐             │
│  │   Docker    │      │   dockerd    │             │
│  │   Client    │◀────▶│   (daemon)   │             │
│  └─────────────┘      └──────┬───────┘             │
│                              │                      │
│                        ┌─────▼──────┐               │
│                        │ containerd │               │
│                        └─────┬──────┘               │
│                              │                      │
│                        ┌─────▼──────┐               │
│                        │ runc/runsc │               │
│                        └────────────┘               │
└───────────────────────────────────────────────────┘

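In this modern architecture:

  • The Docker client sends requests to dockerd
  • dockerd exposes the API and delegates container lifecycle management to containerd
  • containerd invokes a low-level runtime such as runc (or runsc, the gVisor runtime) to create and run containers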

This modular approach allows for more flexibility and enables other tools to leverage Docker's components. For example, Kubernetes can use containerd directly without the full Docker daemon.

Key Takeaways

Understanding these components and how they interact is fundamental to working effectively with Docker and designing containerized applications.

Looking Ahead

In our afternoon session, we'll begin hands-on work with Docker, where you'll see these architectural components in action.

By the end of the day, you'll have practical experience with the core Docker components we've discussed in this theoretical session.

Discussion Questions

  1. How does Docker's architecture compare to traditional virtualization solutions like VMware or VirtualBox?
  2. Why is the layered approach to images important for efficiency in Docker?
  3. What are the advantages and disadvantages of Docker's approach to container networking?
  4. In what scenarios might Docker volumes be preferred over bind mounts, and vice versa?
  5. How does understanding Docker's architecture help you design better containerized applications?
