Persisting Data in Docker

Why is Data Persistence needed for Containers?

Ephemeral Nature of Containers: Containers are designed to be short-lived. Data written to a container's writable layer survives a stop or restart, but it is lost when the container is removed
Stateful Applications Need Persistence: Applications like databases, file storage systems, and message brokers need to retain their state across restarts or crashes
Data Retention: Critical data—such as user sessions, configurations, and transactions—must survive beyond the lifecycle of individual containers
Backup and Recovery: Persistent storage enables backup strategies and ensures data can be recovered after failure
Example Use Cases:
- Running MySQL, PostgreSQL, or MongoDB in a container
- Web app that allows users to upload profile pictures or documents
- Shopping cart in an e-commerce containerized frontend
- Microservice logging activity data

What is a Volume in Docker?

Volumes:

Volumes provide persistent storage: Data survives container restarts and removals
Volumes enable Data sharing: Multiple containers can access the same volume, enabling shared data

Types of Volumes in Docker:

1) Named Volumes: Have a specific name assigned to them when created

Command: Created with docker volume create <volume-name>, or automatically when a container references a volume name that does not exist yet

docker volume create volume_name #OPTIONAL!
docker run -d -v volume_name:/path/in/container my_image:tag

Example:

docker volume create pg_data #OPTIONAL

# Run container
# Docker-managed named volumes (like pg_data) are stored on the host
# filesystem, in a location managed internally by Docker.
# A major version is pinned here because the on-disk data format can change
# between PostgreSQL major releases, which makes `latest` risky with a
# persistent volume.
docker run -d \
--name my_postgres \
-e POSTGRES_USER=myuser \
-e POSTGRES_PASSWORD=mypassword \
-e POSTGRES_DB=mydatabase \
-v pg_data:/var/lib/postgresql/data \
-p 5432:5432 \
postgres:16

# To check if the container is running
docker container ls

# To see the created volume
docker volume ls

# To inspect the volume
docker volume inspect pg_data
# `{"Name": "pg_data",
#     "Mountpoint": "/var/lib/docker/volumes/pg_data/_data",...}`
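
Persistence check: The claim that a named volume outlives its container can be tried with a short sketch like the one below. The function name volume_survives_removal, the default volume name demo_vol, and the file marker.txt are placeholders invented for this illustration; the alpine image is assumed to be pullable.

```shell
# Write a file into a named volume from one container, then read it back
# from a second, brand-new container.
# NOTE: the function name, `demo_vol`, and `marker.txt` are hypothetical.
volume_survives_removal() {
  vol="${1:-demo_vol}"
  # First container writes into the volume; --rm removes the container on
  # exit, but a named volume is left in place
  docker run --rm -v "$vol":/data alpine sh -c 'echo persisted > /data/marker.txt'
  # Second container mounts the same volume and reads the file
  docker run --rm -v "$vol":/data alpine cat /data/marker.txt
}
```

Calling volume_survives_removal on a host with Docker running should print the word persisted from the second container, even though the first container no longer exists.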

2) Anonymous Volumes: Created and managed by Docker and don't have a user-assigned name

Usage: Temporary or throwaway data during container runtime.
- Used when volume re-usability is NOT important.
Command: Automatically created when a container specifies only a destination path inside the container, with no volume name or host path in front of it

docker run -d -v /path/in/container my_image:tag

Example:


# Run container
docker run -d \
--name my_nginx \
-v /var/log/nginx \
-p 8080:80 \
nginx:latest

# To check if the container is running
docker ps

# To see the created anonymous volume
docker volume ls

# To inspect the volume
docker volume inspect VOLUME_NAME # AUTO GENERATED NAME

docker volume ls

DRIVER    VOLUME NAME
local     9f4b1e3e5a9c4f7d88d46bb5efde14a3
local     project_data_volume
local     812dcff9c2a83c3b8f3dd70d7c0a9b0d
local     mysql_db_data
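
Cleanup: Anonymous volumes tend to pile up because nothing refers to them by name. A minimal sketch of two helpers for keeping them in check follows; the function names are hypothetical, while the docker flags used (rm -f -v and volume ls --filter dangling=true) are standard CLI options.

```shell
# Remove a container together with the anonymous volumes attached to it.
# `docker rm -v` deletes anonymous volumes only; named volumes are kept.
# -f also stops the container if it is still running.
remove_with_anonymous_volumes() {
  docker rm -f -v "$1"
}

# List volumes that no container references any more, for example the
# leftovers of containers that were removed without -v.
list_dangling_volumes() {
  docker volume ls --filter dangling=true
}
```

For the example above, remove_with_anonymous_volumes my_nginx would remove the container and its /var/log/nginx volume in one step. Starting a container with docker run --rm has a similar effect, since its anonymous volumes are removed when it exits.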

Explain Commands used to manage Volumes

Create:

Create a Named Volume

docker volume create project_data_volume

Lists:

Lists all Docker volumes

docker volume ls

DRIVER    VOLUME NAME
local     9f4b1e3e5a9c4f7d88d46bb5efde14a3
local     project_data_volume
local     812dcff9c2a83c3b8f3dd70d7c0a9b0d
local     mysql_db_data

Inspection:

Returns detailed information about a Docker volume in JSON format

docker volume inspect <VOLUME_NAME>

[
    {
        "CreatedAt": "2025-12-01T14:23:45Z",
        "Driver": "local",
        "Labels": {
            "project": "sample-app",
            "env": "dev"
        },
        //Actual path where volume data resides on the host.
        "Mountpoint": "/var/lib/docker/volumes/project_data_volume/_data",
        //The volume's name (UUID for anonymous).
        "Name": "project_data_volume",
        "Options": {},
        "Scope": "local"
    }
]

Remove a Specific Volume:

Deletes a specified Docker volume

docker volume rm <VOLUME_NAME>

Remove All Unused Volumes:

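Removes volumes that are not currently used by any container. On Docker Engine 23.0 and later, only unused anonymous volumes are removed by default; add --all (-a) to include unused named volumes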

docker volume prune
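
Labels and filters: The inspect output above shows project and env labels. A sketch of how such labels could be set and then used for filtering is given below; the three function names are hypothetical helpers, and the label keys simply mirror the sample output.

```shell
# Create a volume carrying the two labels seen in the inspect output.
# Arguments: volume name, project label value, env label value.
create_labeled_volume() {
  docker volume create --label "project=$2" --label "env=$3" "$1"
}

# List only the volumes that belong to one project.
volumes_for_project() {
  docker volume ls --filter "label=project=$1"
}

# Print just the host path of a volume instead of the full JSON document.
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

For instance, create_labeled_volume project_data_volume sample-app dev followed by volumes_for_project sample-app would list that volume. The same label filter is also accepted by docker volume prune, which allows cleanup to be limited to one project or environment.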

Explain Bind Mounts with an example

Bind mounts: Allow us to mount a directory or file on the host machine into a container
Path linking: Links to a specific path on the host filesystem
Flexibility: Offers direct access to host files, ideal for development and local testing
Use case: Real-time data sharing between host and container
Command:

docker run -d -v /host/path:/container/path my_image:tag

Example:

Web Development: You are developing a website locally with Docker. The website files live on your host machine at /tmp/website, and any change made on the host should be instantly visible in the container running your web server

Command:

mkdir -p /tmp/website

# Run the Apache Container
docker run -d --name my_apache \
-v /tmp/website:/usr/local/apache2/htdocs \
-p 8080:80 \
httpd:latest

# Check if Apache is serving the page
curl http://localhost:8080

# Edit the index.html file on the host
echo "<h1>Apache Server Updated - Live Changes</h1>" > \
/tmp/website/index.html

# Verify Changes with curl
curl http://localhost:8080

# Verify Inside the Container: open a shell in it
docker exec -it my_apache /bin/sh

# Inside that shell, read the index.html file
cat /usr/local/apache2/htdocs/index.html
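
Read-only variant: When the container only needs to read the host files, the bind mount can be made read-only so the web server cannot modify the source tree. The sketch below uses the longer --mount syntax; the function name serve_read_only and its arguments are hypothetical, and the target path is the htdocs directory from the example above.

```shell
# Serve a host directory through Apache without letting the container
# write to it.
# Arguments: container name, absolute host directory, host port.
serve_read_only() {
  name="$1"; host_dir="$2"; host_port="$3"
  docker run -d --name "$name" \
    --mount "type=bind,source=$host_dir,target=/usr/local/apache2/htdocs,readonly" \
    -p "$host_port":80 \
    httpd:latest
}
```

A call such as serve_read_only my_apache_ro /tmp/website 8081 would run alongside the earlier container on a different port. One design difference worth knowing: with -v, a missing host directory is created automatically, whereas --mount reports an error, which helps catch typos in the path.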

Docker Volumes vs Bind Mounts

Feature / Use Case	Docker Volumes	Bind Mounts
Definition	Managed by Docker: Stored in Docker's internal storage (usually under `/var/lib/docker/volumes/`).	Directly map a file or directory from the host filesystem into the container.
Management	Managed by Docker	Not managed by Docker – Does NOT have `docker volume ls` or `inspect` visibility.
Portability & Isolation	Portable and isolated from host filesystem. Easy to backup, inspect, and share across containers.	Tightly coupled with host. Changes on host are reflected in container and vice versa.
Use Cases	Databases (e.g., MySQL, Postgres) storing data.	Mounting source code during development; live file editing workflows.

Is it possible to share data between Multiple Containers?

Yes: Data can be shared between multiple containers in Docker, in three ways.

1) Data Sharing Using Volumes:

Recommended method: Docker volumes are the preferred way to share data between containers
Named volumes: Create and mount a named volume into both containers

Shared data store: Both containers can read from and write to the same volume

# Create a named volume
docker volume create shared_data

# Container 1 mounts the volume
docker run -d --name container1 -v shared_data:/data my_image1

# Container 2 mounts the same volume
docker run -d --name container2 -v shared_data:/data my_image2

Example:

# Create a named volume
docker volume create shared_data

# Run the first container (Producer)
docker run -d --name container1 \
-v shared_data:/data alpine sh -c \
"echo 'Hello from container1' > /data/hello.txt && tail -f /dev/null"

# Run the second container (Consumer)
# The read happens inside the container; --rm removes it once cat exits
docker run --rm --name container2 \
-v shared_data:/data alpine cat /data/hello.txt

2) Data Sharing Using Bind Mount:

Bind mounts: Can be used to share data between containers by mounting the same host directory into both containers

# Container 1 mounts a host directory
docker run -d --name container1 -v /host/path:/data my_image1

# Container 2 mounts the same host directory
docker run -d --name container2 -v /host/path:/data my_image2

Example:

#  Create a Host Directory for Shared Data
mkdir -p /tmp/shared_logs && cd /tmp/shared_logs

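# Start the First Container (Producer)
# Single quotes keep $(date) from being expanded once by the host shell,
# so the timestamp is evaluated inside the container on every iteration
docker run -d --name container1 \
-v /tmp/shared_logs:/data alpine sh -c \
'while true; do echo "$(date)" >> /data/log.txt; sleep 1; done'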

# Start the Second Container (Consumer)
docker run --rm --name container2 \
-v /tmp/shared_logs:/data alpine sh -c "tail -f /data/log.txt"

3) Data Sharing Using Volumes from Containers:

Volume sharing: One container can access volumes from another container
- If the container has multiple volume mounts, --volumes-from inherits all of them

Data sharing: Useful for sharing data between running containers

# Start container 1 with a named volume
docker run -d --name container1 \
-v shared_data:/data my_image1

# Start container 2 and use the volume from container 1
docker run -d --name container2 \
--volumes-from container1 my_image2

Example:

# Start Container 1 (Uploader/Producer)
docker run -d --name container1 \
-v shared_data:/data alpine sh -c \
"echo 'Hello from container1' > /data/data.txt && tail -f /dev/null"

# Start Container 2 (Processor/Consumer)
docker run --rm --name container2 \
--volumes-from container1 alpine cat /data/data.txt
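
Read-only consumers: In a producer/consumer setup it is often safer to give the consumer read-only access, so that only one container writes to the shared data. A minimal sketch under that assumption follows; both function names are hypothetical, while the :ro suffix is a standard option for -v and --volumes-from.

```shell
# Read a file from a shared named volume without write access.
# Arguments: volume name, file name inside the volume.
read_shared_file() {
  docker run --rm -v "$1":/data:ro alpine cat "/data/$2"
}

# Inherit all volumes of another container, mounted read-only.
# Arguments: source container name, absolute file path inside the container.
read_from_container_volumes() {
  docker run --rm --volumes-from "$1":ro alpine cat "$2"
}
```

With the containers from the examples above, read_shared_file shared_data hello.txt and read_from_container_volumes container1 /data/data.txt would read the producers' files, and any attempt to write under /data from these consumers would fail.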

How can you recover data after accidentally deleting a Container?

Volume Storage: Data stored in a Docker volume remains intact even after container removal, allowing reuse with new containers.
docker run -v my_volume:/app/data my-image
Bind Mounts from Host: Data saved in a host directory (e.g., -v $(pwd)/data:/app/data) persists independently of the container.
docker run -v /your/local/path/data:/app/data my-image
Container Filesystem: Data stored inside the container's filesystem is lost once the container is deleted, and cannot be recovered unless backed up.
Recommendations:
- Always use volumes or bind mounts for important data to prevent data loss.
- Regularly back up data stored within containers if not using volumes or bind mounts.
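
Backup sketch: One common way to act on these recommendations is to archive a volume through a short-lived helper container. The functions below are an illustration, not an official tool; their names, the /source, /target and /backup mount points, and the archive naming scheme are all assumptions made for this sketch.

```shell
# Archive the contents of a named volume into <dest_dir>/<volume>.tar.gz.
# dest_dir must be an absolute host path, otherwise Docker would treat it
# as a volume name instead of a bind mount.
backup_volume() {
  vol="$1"; dest_dir="$2"
  docker run --rm -v "$vol":/source:ro -v "$dest_dir":/backup alpine \
    tar czf "/backup/$vol.tar.gz" -C /source .
}

# Unpack a previously created archive into a (possibly new) volume.
restore_volume() {
  vol="$1"; archive_dir="$2"
  docker run --rm -v "$vol":/target -v "$archive_dir":/backup:ro alpine \
    tar xzf "/backup/$vol.tar.gz" -C /target
}

# Copy data out of a container's own filesystem. This works for stopped
# containers too, but not once the container has been removed.
copy_from_container() {
  docker cp "$1":"$2" "$3"
}
```

For example, backup_volume my_volume /tmp/backups would write /tmp/backups/my_volume.tar.gz, and restore_volume my_volume /tmp/backups would unpack it again. The copy_from_container helper covers the case where data was never placed in a volume and the container still exists.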

What are Volume Drivers?

Default Storage: Docker uses the local filesystem to store data by default
Volume Drivers: Plugins that enable Docker to create and manage volumes with different backends
External Storage Management: Support for storing, managing, and accessing data outside the local system
- Data Sharing Across Hosts: Use of systems like NFS to share data between multiple Docker hosts
- Cloud Storage Integration: Compatibility with cloud services such as Amazon EBS or Azure File Share
Additional Capabilities: Depending on the driver, volumes can gain features beyond plain local storage
- Enhanced Redundancy: Improve data reliability and availability through specialized storage solutions
- Performance Optimization: Different drivers can optimize performance based on backend storage capabilities
- Security Features: Some volume drivers support encryption and access controls for sensitive data
- Cross-Platform Compatibility: Drivers can facilitate data access across various operating environments
Custom Volume Creation: Use a command such as docker volume create --driver DRIVER_NAME --opt key=value volume_name to specify the driver and its options
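
NFS sketch: As a concrete instance of the --driver and --opt pattern, the built-in local driver can mount an NFS export. The function name create_nfs_volume is hypothetical, and the server address and export path passed to it are placeholders; a reachable NFS server is assumed.

```shell
# Create a volume backed by an NFS export using the built-in local driver.
# Arguments: volume name, NFS server address, exported path on the server.
create_nfs_volume() {
  name="$1"; server="$2"; export_path="$3"
  docker volume create --driver local \
    --opt type=nfs \
    --opt "o=addr=$server,rw" \
    --opt "device=:$export_path" \
    "$name"
}
```

A call such as create_nfs_volume nfs_data 192.168.1.1 /path/to/dir only registers the volume; the NFS mount itself is attempted when a container first uses it, so connectivity problems tend to surface at docker run time rather than at creation.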
