Why is Data Persistence needed for Containers? #
- Ephemeral Nature of Containers: Containers are designed to be short-lived. Data written to a container's own filesystem survives a stop or restart of that container, but is lost once the container is removed or replaced by a new one
- Stateful Applications Need Persistence: Applications like databases, file storage systems, and message brokers need to retain their state across restarts or crashes
- Data Retention: Critical data—such as user sessions, configurations, and transactions—must survive beyond the lifecycle of individual containers
- Backup and Recovery: Persistent storage enables backup strategies and ensures data can be recovered after failure
- Example Use Cases:
- Running MySQL, PostgreSQL, or MongoDB in a container
- Web app that allows users to upload profile pictures or documents
- Shopping cart in an e-commerce containerized frontend
- Microservice logging activity data
What is a Volume in Docker? #
Volumes:
- Ephemeral Nature of Containers: Data written to a container's own filesystem is lost once the container is removed
- Volumes provide persistent storage: Data survives container restarts and removals
- Volumes enable Data sharing: Multiple containers can access the same volume, enabling shared data
Types of Volumes in Docker:
1) Named Volumes: Have a specific name assigned to them when created
- Command: Created explicitly using the command
docker volume create <volume-name>
or automatically when a container specifies a volume that Docker hasn't created yet
docker volume create volume_name # OPTIONAL!
docker run -d -v volume_name:/path/in/container my_image:tag
Example:
docker volume create pg_data # OPTIONAL
# Run container
# Docker-managed named volumes (like pg_data)
# are stored on the host filesystem
# but in a location managed internally by Docker
docker run -d \
  --name my_postgres \
  -e POSTGRES_USER=myuser \
  -e POSTGRES_PASSWORD=mypassword \
  -e POSTGRES_DB=mydatabase \
  -v pg_data:/var/lib/postgresql/data \
  -p 5432:5432 \
  postgres:latest
# To check if the container is running
docker container ls
# To see the created volume
docker volume ls
# To inspect the volume
docker volume inspect pg_data
# `{"Name": "pg_data",
# "Mountpoint": "/var/lib/docker/volumes/pg_data/_data",...}`
2) Anonymous Volumes: Created and managed by Docker and don't have a user-assigned name
- Usage: Temporary or throwaway data during container runtime.
- Used when volume re-usability is NOT important.
- Command: Automatically created when a container specifies a volume destination without explicitly creating a named volume
docker run -d -v /path/in/container my_image:tag
Example:
# Run container
docker run -d \
  --name my_nginx \
  -v /var/log/nginx \
  -p 8080:80 \
  nginx:latest
# To check if the container is running
docker ps
# To see the created anonymous volume
docker volume ls
# To inspect the volume
docker volume inspect VOLUME_NAME # AUTO GENERATED NAME
docker volume ls
DRIVER VOLUME NAME
local 9f4b1e3e5a9c4f7d88d46bb5efde14a3
local project_data_volume
local 812dcff9c2a83c3b8f3dd70d7c0a9b0d
local mysql_db_data
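Because anonymous volumes only get a generated name, they are easy to leave behind after their container is gone. The following is a minimal sketch of how they might be found and cleaned up, assuming the my_nginx container from the example above. The function names and the DOCKER override variable are illustrative helpers, not part of Docker itself.

```shell
#!/bin/sh
# Illustrative convenience: set DOCKER=echo to print commands instead of running them.
DOCKER="${DOCKER:-docker}"

# List volumes that are not referenced by any container ("dangling").
list_dangling_volumes() {
  $DOCKER volume ls --filter dangling=true
}

# Force-remove a container together with its anonymous volumes (-v).
# Named volumes attached to the container are left untouched.
remove_container_with_anonymous_volumes() {
  $DOCKER rm -f -v "$1"
}
```

For example, `remove_container_with_anonymous_volumes my_nginx` removes the container and the anonymous volume created for /var/log/nginx. Starting a container with `docker run --rm` has a similar effect: its anonymous volumes are removed when the container exits.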
Explain Commands used to manage Volumes #
Create:
- Create a Named Volume
docker volume create project_data_volume
List:
- Lists all Docker volumes
docker volume ls
DRIVER VOLUME NAME
local 9f4b1e3e5a9c4f7d88d46bb5efde14a3
local project_data_volume
local 812dcff9c2a83c3b8f3dd70d7c0a9b0d
local mysql_db_data
Inspection:
- Returns detailed information about a Docker volume in JSON format
docker volume inspect <VOLUME_NAME>
[
  {
    "CreatedAt": "2025-12-01T14:23:45Z",
    "Driver": "local",
    "Labels": {
      "project": "sample-app",
      "env": "dev"
    },
    // Annotation (not part of the real output): actual path where volume data resides on the host.
    "Mountpoint": "/var/lib/docker/volumes/project_data_volume/_data",
    // Annotation (not part of the real output): the volume's name (a randomly generated hex ID for anonymous volumes).
    "Name": "project_data_volume",
    "Options": {},
    "Scope": "local"
  }
]
Remove a Specific Volume:
- Deletes a specified Docker volume
docker volume rm <VOLUME_NAME>
Remove All Unused Volumes:
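- Removes volumes that are not currently used by any container. Since Docker Engine 23.0, only unused anonymous volumes are removed by default; add --all (-a) to also remove unused named volumes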
docker volume prune
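The management commands above can be combined in small scripts. The sketch below shows two hypothetical helpers: one prints only the host path of a volume using the `--format` Go-template option of `docker volume inspect`, the other removes a volume only if it exists. The function names and the DOCKER override variable are assumptions made for illustration.

```shell
#!/bin/sh
# Illustrative convenience: set DOCKER=echo to print commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Print only the Mountpoint field of a volume instead of the full JSON.
volume_mountpoint() {
  $DOCKER volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Remove a volume only if inspect succeeds, i.e. the volume exists.
# Removal still fails if a container is using the volume.
remove_volume_if_exists() {
  if $DOCKER volume inspect "$1" >/dev/null 2>&1; then
    $DOCKER volume rm "$1"
  else
    echo "volume $1 not found, nothing to remove"
  fi
}
```

Usage might look like `volume_mountpoint project_data_volume` followed by `remove_volume_if_exists project_data_volume`.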
Explain Bind Mounts with an example #
- Bind mounts: Allow us to mount a directory or file on the host machine into a container
- Path linking: Links to a specific path on the host filesystem
- Flexibility: Offers direct access to host files, ideal for development and local testing
- Use case: Real-time data sharing between host and container
- Command:
docker run -d -v /host/path:/container/path my_image:tag
Example:
- Web Development: You are developing a website locally with Docker. The website files are stored on the host machine at /tmp/website, and any change made on the host should be instantly visible in the container running the web server
- Command:
# Create the host directory (-p avoids an error if it already exists)
mkdir -p /tmp/website
# Run the Apache Container
docker run -d --name my_apache \
  -v /tmp/website:/usr/local/apache2/htdocs \
  -p 8080:80 \
  httpd:latest
# Check if Apache is serving the page
curl http://localhost:8080
# Edit the index.html file on the host
echo "<h1>Apache Server Updated - Live Changes</h1>" > \
  /tmp/website/index.html
# Verify Changes with curl
curl http://localhost:8080
# Verify Inside the Container
docker exec -it my_apache /bin/sh
# Inside the container shell, read the index.html file
cat /usr/local/apache2/htdocs/index.html
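When the container only needs to read the host files, the bind mount can be made read-only. The sketch below uses the more explicit `--mount` syntax with the `readonly` option. The function name, the container name my_apache_ro, host port 8081, and the DOCKER override variable are assumptions chosen for illustration.

```shell
#!/bin/sh
# Illustrative convenience: set DOCKER=echo to print commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Serve a host directory from Apache without letting the container modify it.
# $1 must be an absolute path to an existing host directory.
run_apache_readonly() {
  $DOCKER run -d --name my_apache_ro \
    --mount "type=bind,source=$1,target=/usr/local/apache2/htdocs,readonly" \
    -p 8081:80 \
    httpd:latest
}
```

Calling `run_apache_readonly /tmp/website` would serve the same files as the earlier example, but writes from inside the container to the htdocs directory would fail. The shorter equivalent with `-v` is `-v /tmp/website:/usr/local/apache2/htdocs:ro`.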
Docker Volumes vs Bind Mounts #
Feature / Use Case | Docker Volumes | Bind Mounts |
---|---|---|
Definition | Managed by Docker: Stored in Docker's internal storage (usually under /var/lib/docker/volumes/ ). | Directly map a file or directory from the host filesystem into the container. |
Management | Managed by Docker | Not managed by Docker: bind mounts do not appear in docker volume ls or docker volume inspect. |
Portability & Isolation | Portable and isolated from the host filesystem layout. Easy to back up, inspect, and share across containers. | Tightly coupled to the host. Changes on the host are reflected in the container and vice versa. |
Use Cases | Databases (e.g., MySQL, Postgres) storing data. | Mounting source code during development; live file editing workflows. |
Is it possible to share data between Multiple Containers? #
- Yes: Data can be shared between multiple containers in Docker.
1) Data Sharing Using Volumes:
- Recommended method: Docker volumes are the preferred way to share data between containers
- Named volumes: Create and mount a named volume into both containers
- Shared data store: Both containers can read from and write to the same volume
# Create a named volume
docker volume create shared_data
# Container 1 mounts the volume
docker run -d --name container1 -v shared_data:/data my_image1
# Container 2 mounts the same volume
docker run -d --name container2 -v shared_data:/data my_image2
- Example:
# Create a named volume
docker volume create shared_data
# Run the first container (Producer)
docker run -d --name container1 \
  -v shared_data:/data alpine sh -c \
  "echo 'Hello from container1' > /data/hello.txt && tail -f /dev/null"
# Run the second container (Consumer); it prints the file written by container1
docker run --name container2 \
  -v shared_data:/data alpine cat /data/hello.txt
2) Data Sharing Using Bind Mount:
- Bind mounts: Can be used to share data between containers by mounting the same host directory into both containers
# Container 1 mounts a host directory
docker run -d --name container1 -v /host/path:/data my_image1
# Container 2 mounts the same host directory
docker run -d --name container2 -v /host/path:/data my_image2
- Example:
# Create a Host Directory for Shared Data
mkdir -p /tmp/shared_logs && cd /tmp/shared_logs
# Start the First Container (Producer)
# Single quotes keep $(date) from being expanded once by the host shell,
# so each log line gets a fresh timestamp from inside the container
docker run -d --name container1 \
  -v /tmp/shared_logs:/data alpine sh -c \
  'while true; do echo "$(date)" >> /data/log.txt; sleep 1; done'
# Start the Second Container (Consumer)
docker run --rm --name container2 \
  -v /tmp/shared_logs:/data alpine sh -c "tail -f /data/log.txt"
3) Data Sharing Using Volumes from Containers:
- Volume sharing: One container can access volumes from another container
- If the container has multiple volume mounts, --volumes-from inherits all of them
- Data sharing: Useful for sharing data between running containers
# Start container 1 with a named volume
docker run -d --name container1 \
  -v shared_data:/data my_image1
# Start container 2 and use the volume from container 1
docker run -d --name container2 \
  --volumes-from container1 my_image2
- Example:
# Start Container 1 (Uploader/Producer)
docker run -d --name container1 \
  -v shared_data:/data alpine sh -c \
  "echo 'Hello from container1' > /data/data.txt && tail -f /dev/null"
# Start Container 2 (Processor/Consumer)
docker run --rm --name container2 \
  --volumes-from container1 alpine cat /data/data.txt
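If the consumer should only read the shared data, `--volumes-from` accepts a `:ro` suffix that mounts all inherited volumes read-only. A minimal sketch, assuming container1 from the example above is still present; the function name and the DOCKER override variable are illustrative.

```shell
#!/bin/sh
# Illustrative convenience: set DOCKER=echo to print commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Read a file from the volumes of another container ($1) without write access.
run_readonly_consumer() {
  $DOCKER run --rm --name container2 \
    --volumes-from "$1:ro" \
    alpine cat /data/data.txt
}
```

With this, `run_readonly_consumer container1` prints the file, while any attempt by the consumer to write under /data would be rejected.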
How can you recover data after accidentally deleting a Container? #
- Volume Storage: Data stored in a Docker volume remains intact even after container removal, allowing reuse with new containers.
docker run -v my_volume:/app/data my-image
- Bind Mounts from Host: Data saved in a host directory (e.g., -v $(pwd)/data:/app/data) persists independently of the container.
docker run -v /your/local/path/data:/app/data my-image
- Container Filesystem: Data stored inside the container's filesystem is lost once the container is deleted, and cannot be recovered unless backed up.
- Recommendations:
- Always use volumes or bind mounts for important data to prevent data loss.
- Regularly back up data stored within containers if not using volumes or bind mounts.
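One common way to back up a volume is to mount it into a throwaway container next to a host directory and archive its contents with tar. The sketch below follows that pattern; the function names, the archive naming scheme, and the DOCKER override variable are assumptions, and the host directory must be given as an absolute path.

```shell
#!/bin/sh
# Illustrative convenience: set DOCKER=echo to print commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Archive the contents of volume $1 into <host dir $2>/<volume>.tar.gz.
# The volume is mounted read-only so the backup cannot modify it.
backup_volume() {
  $DOCKER run --rm \
    -v "$1:/data:ro" \
    -v "$2:/backup" \
    alpine tar czf "/backup/$1.tar.gz" -C /data .
}

# Unpack <host dir $2>/<volume>.tar.gz into volume $1.
# Docker creates the volume automatically if it does not exist yet.
restore_volume() {
  $DOCKER run --rm \
    -v "$1:/data" \
    -v "$2:/backup:ro" \
    alpine tar xzf "/backup/$1.tar.gz" -C /data
}
```

For instance, `backup_volume my_volume /tmp/backups` followed later by `restore_volume my_volume /tmp/backups` would round-trip the data. For databases, stopping the container or using the database's own dump tool first is usually safer than archiving live files.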
What are Volume Drivers? #
- Default Storage: By default, Docker uses the built-in local driver, which stores volume data on the host's filesystem
- Volume Drivers: Plugins that enable Docker to create and manage volumes with different backends
- External Storage Management: Support for storing, managing, and accessing data outside the local system
- Data Sharing Across Hosts: Use of systems like NFS to share data between multiple Docker hosts
- Cloud Storage Integration: Compatibility with cloud services such as Amazon EBS or Azure File Share
- Additional Capabilities: Depending on the driver, these can include:
- Enhanced Redundancy: Improve data reliability and availability through specialized storage solutions
- Performance Optimization: Different drivers can optimize performance based on backend storage capabilities
- Security Features: Some volume drivers support encryption and access controls for sensitive data
- Cross-Platform Compatibility: Drivers can facilitate data access across various operating environments
- Custom Volume Creation: Use a command like docker volume create --driver DRIVER_NAME --opt key=value volume_name to specify the driver and its options
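As a concrete instance of the `--driver` and `--opt` form, the built-in local driver can mount an NFS export, which is one way to share data across hosts without installing a plugin. This is a sketch under stated assumptions: the function name, the DOCKER override variable, and the server address and export path in the usage example are placeholders, and the host needs NFS client support.

```shell
#!/bin/sh
# Illustrative convenience: set DOCKER=echo to print commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Create a volume backed by an NFS export.
# $1 = NFS server address, $2 = exported path, $3 = volume name
create_nfs_volume() {
  $DOCKER volume create --driver local \
    --opt type=nfs \
    --opt "o=addr=$1,rw" \
    --opt "device=:$2" \
    "$3"
}
```

A call such as `create_nfs_volume 192.168.1.1 /path/to/dir nfs_data` creates the volume definition; the NFS share itself is only mounted when a container first uses the volume, for example with `-v nfs_data:/data`.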