Why is data persistence necessary?


  • Data Retention: Many applications generate or consume data that needs to persist beyond the lifecycle of a single container instance
  • Stateful Applications: Containers are typically ephemeral, meaning they can be started, stopped, and destroyed at any time
    • For stateful applications like databases or file storage, persistence ensures that data remains intact between container restarts
  • Backup and Recovery: Persisting data facilitates easier backup procedures and ensures that critical data is recoverable in case of container failure or system downtime

Ways to persist data in Docker containers


  • There are two main ways to persist a container's data:
    • Volumes
    • Bind Mounts
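
Both options can also be written with the more explicit --mount flag. The sketch below is only an illustration: run_with_volume and run_with_bind are hypothetical helper names, and the volume name, paths, and image passed to them are placeholders.

```shell
# Hypothetical helpers (not part of Docker) wrapping the two persistence options.

# Named volume: Docker manages where the data lives on the host.
run_with_volume() {
  # $1 = volume name, $2 = path inside the container, $3 = image
  docker run -d --mount "type=volume,source=$1,target=$2" "$3"
}

# Bind mount: an existing host path is mapped into the container.
run_with_bind() {
  # $1 = absolute host path, $2 = path inside the container, $3 = image
  docker run -d --mount "type=bind,source=$1,target=$2" "$3"
}
```

For example, run_with_volume mydata /path/in/container my_image:tag starts a container backed by the volume mydata, while run_with_bind /host/path /container/path my_image:tag maps a host directory instead.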

What is a volume in Docker, and what are the types of volumes?


  • Dedicated storage: Storage managed by Docker that lives independently of any container’s lifecycle
  • Persistent storage: Persists data independently of any container, so the data survives container restarts and removals
  • Data sharing: Multiple containers can access the same volume, enabling shared data and reducing duplication across services
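  • Performance: On Docker Desktop (macOS and Windows), volumes are usually faster than bind mounts because they are stored inside Docker's Linux VM rather than shared from the host; on native Linux the two perform about the same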

Types of Volumes in Docker:

  • 1) Named Volumes: Have a specific name assigned to them when created

    • Usage: Manageable with specific names for easy identification
    • Command: Created with docker volume create <volume-name>, or automatically when a container references a volume name that does not exist yet
    docker run -d -v mydata:/path/in/container my_image:tag

    Example:

    
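    # Run container
    # The image tag is pinned to a major version rather than latest;
    # /var/lib/postgresql/data is the data directory used by postgres 16
    docker run -d \
    --name my_postgres \
    -e POSTGRES_USER=myuser \
    -e POSTGRES_PASSWORD=mypassword \
    -e POSTGRES_DB=mydatabase \
    -v pg_data:/var/lib/postgresql/data \
    -p 5432:5432 \
    postgres:16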
    
    # To check if the container is running
    docker ps
    
    # To see the created volume
    docker volume ls
    
    # To inspect the volume
    docker volume inspect pg_data
  • 2) Anonymous Volumes: Created and managed by Docker and don't have a user-assigned name

    • Usage: Ideal for temporary or disposable data
    • Command: Automatically created when a container specifies only a path inside the container, without a volume name or host path
    docker run -d -v /path/in/container my_image:tag

    Example:

    
    # Run container
    docker run -d \
    --name my_nginx \
    -v /var/log/nginx \
    -p 8080:80 \
    nginx:latest
    
    # To check if the container is running
    docker ps
    
    # To see the created anonymous volume
    docker volume ls
    
    # To inspect the volume
    # To inspect the volume, use the auto-generated name shown by docker volume ls
    docker volume inspect <VOLUME_NAME>
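
Because anonymous volumes get auto-generated names, it can help to look them up through the container that uses them. A minimal sketch, assuming two hypothetical helpers, container_volumes and remove_with_volumes, that wrap standard docker inspect and docker rm options:

```shell
# Hypothetical helper: print the names of the volumes mounted in a container.
container_volumes() {
  # $1 = container name or ID
  docker inspect \
    --format '{{ range .Mounts }}{{ if eq .Type "volume" }}{{ println .Name }}{{ end }}{{ end }}' \
    "$1"
}

# Hypothetical helper: remove a container together with its anonymous volumes.
remove_with_volumes() {
  # $1 = container name or ID
  docker rm --force --volumes "$1"
}
```

container_volumes my_nginx prints the name to pass to docker volume inspect. remove_with_volumes my_nginx removes the container and its anonymous volumes; named volumes such as pg_data are left in place.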

List some volume management commands


Lists:

  • Lists all Docker volumes
docker volume ls

Inspection:

  • Returns detailed information about a Docker volume in JSON format
docker volume inspect <VOLUME_NAME>

Remove a Specific Volume:

  • Deletes a specified Docker volume
docker volume rm <VOLUME_NAME>

Remove All Unused Volumes:

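  • Removes volumes that are not used by any container; since Docker 23.0 only anonymous volumes are removed by default, and the --all flag is needed to include unused named volumes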
docker volume prune
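
None of the commands above copies a volume's contents, so backups are commonly done with a throwaway container that archives the volume into a host directory. A sketch under stated assumptions: backup_volume and restore_volume are hypothetical helper names, the alpine image is an arbitrary choice that ships tar, and the archive is simply named after the volume.

```shell
# Hypothetical helper: archive the contents of a volume into a host directory.
backup_volume() {
  # $1 = volume name, $2 = absolute host directory that receives the archive
  docker run --rm \
    -v "$1":/source:ro \
    -v "$2":/backup \
    alpine tar czf "/backup/$1.tar.gz" -C /source .
}

# Hypothetical helper: unpack such an archive into a (possibly new) volume.
restore_volume() {
  # $1 = volume name, $2 = absolute host directory containing <volume>.tar.gz
  docker run --rm \
    -v "$1":/target \
    -v "$2":/backup:ro \
    alpine tar xzf "/backup/$1.tar.gz" -C /target
}
```

For example, backup_volume pg_data /tmp/backups writes /tmp/backups/pg_data.tar.gz. For a database volume, stopping the container first avoids archiving files that are being written.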

Explain bind mounts in Docker with example and provide a use case


  • Bind mounts: Allow us to mount a directory or file on the host machine into a container
  • Path linking: Links to a specific path on the host filesystem
  • Flexibility: Offers direct access to host files, ideal for development and local testing
  • Use case: Real-time data sharing between host and container
  • Command:
docker run -d -v /host/path:/container/path my_image:tag

Example:

  • Web Development: You develop a website locally with Docker, and the site files live on the host at /tmp/website. Any change made on the host should be instantly visible in the container running the web server
  • Command:
    mkdir /tmp/website
    
    # Run the Apache Container
    docker run -d --name my_apache \
    -v /tmp/website:/usr/local/apache2/htdocs \
    -p 8080:80 \
    httpd:latest
    
    # Check if Apache is serving the page
    curl http://localhost:8080
    
    # Edit the index.html file on the host
    echo "<h1>Apache Server Updated - Live Changes</h1>" > \
    /tmp/website/index.html
    
    # Verify Changes with curl
    curl http://localhost:8080
    
    # Verify inside the container by reading the index.html file
    docker exec my_apache cat /usr/local/apache2/htdocs/index.html
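
With the -v flag, a bare name such as website is treated as a volume name rather than a host path, so resolving the directory to an absolute path first avoids that ambiguity. A small sketch, assuming a hypothetical helper serve_dir built on the same httpd image as the example above:

```shell
# Hypothetical helper: serve a host directory with Apache through a bind mount.
serve_dir() {
  # $1 = host directory (relative or absolute), $2 = host port
  # Resolve to an absolute path; fail if the directory does not exist.
  host_dir=$(cd "$1" && pwd) || return 1
  docker run -d \
    -v "$host_dir":/usr/local/apache2/htdocs \
    -p "$2":80 \
    httpd:latest
}
```

Called as serve_dir /tmp/website 8080, this is equivalent to the docker run command in the example, minus the container name.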
    

Is it possible to share data between two containers?


  • Yes: It is possible to share data between two containers in Docker
  • Multiple Approaches:
    • 1) Data Sharing Using Volumes: Allows containers to share data by mounting a common volume
    • 2) Data Sharing Using Bind Mounts: Both containers mount the same host directory
    • 3) Data Sharing Using Volumes from Containers: One container shares its volume with others using the --volumes-from option

1) Data Sharing Using Volumes:

  • Recommended method: Docker volumes are the preferred way to share data between containers
  • Named volumes: Create and mount a named volume into both containers
  • Shared data store: Both containers can read from and write to the same volume
    # Create a named volume
    docker volume create shared_data
    # Container 1 mounts the volume
    docker run -d --name container1 -v shared_data:/data my_image1
    # Container 2 mounts the same volume
    docker run -d --name container2 -v shared_data:/data my_image2
  • Example:
    # Create a named volume
    docker volume create shared_data
    
    # Run the first container (Producer)
    docker run -d --name container1 \
    -v shared_data:/data alpine sh -c \
    "echo 'Hello from container1' > /data/hello.txt && tail -f /dev/null"
    
    # Run the second container (Consumer); --rm removes it once cat exits
    docker run --rm --name container2 \
    -v shared_data:/data alpine cat /data/hello.txt

2) Data Sharing Using Bind Mounts:

  • Bind mounts: Can be used to share data between containers by mounting the same host directory into both containers
    # Container 1 mounts a host directory
    docker run -d --name container1 -v /host/path:/data my_image1
    # Container 2 mounts the same host directory
    docker run -d --name container2 -v /host/path:/data my_image2
  • Example:
    #  Create a Host Directory for Shared Data
    mkdir -p /tmp/shared_logs && cd /tmp/shared_logs
    
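    # Start the First Container (Producer)
    # Single quotes keep the host shell from expanding the loop,
    # so date runs inside the container on every iteration
    docker run -d --name container1 \
    -v /tmp/shared_logs:/data alpine sh -c \
    'while true; do date >> /data/log.txt; sleep 1; done'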
    
    # Start the Second Container (Consumer)
    docker run --rm --name container2 \
    -v /tmp/shared_logs:/data alpine sh -c "tail -f /data/log.txt"
    
    

3) Data Sharing Using Volumes from Containers:

  • Volume sharing: One container can access volumes from another container

  • Data sharing: Useful for sharing data between running containers

    # Start container 1 with a named volume
    docker run -d --name container1 \
    -v shared_data:/data my_image1
    # Start container 2 and use the volume from container 1
    docker run -d --name container2 \
    --volumes-from container1 my_image2
  • Example:

    # Start Container 1 (Uploader/Producer)
    docker run -d --name container1 \
    -v shared_data:/data alpine sh -c \
    "echo 'Hello from container1' > /data/data.txt && tail -f /dev/null"
    
    # Start Container 2 (Processor/Consumer)
    docker run --rm --name container2 \
    --volumes-from container1 alpine cat /data/data.txt
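
When the second container only needs to read the shared data, --volumes-from accepts a :ro suffix. A minimal sketch with a hypothetical helper read_shared_file; the alpine image is reused from the example above:

```shell
# Hypothetical helper: read a file from another container's volumes, read-only.
read_shared_file() {
  # $1 = container whose volumes are reused, $2 = file path inside those volumes
  docker run --rm --volumes-from "$1:ro" alpine cat "$2"
}
```

For example, read_shared_file container1 /data/data.txt prints the file written by container1 without being able to modify it.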
    

What are the security considerations when using volumes?


  • Data Isolation: Ensure volumes do not expose sensitive host data to containers
  • Permissions: Set appropriate file system permissions and avoid running containers as root
  • Read-Only Mounts: Mount volumes as read-only when write access is not needed
    # Mounting a host directory as read-only (the :ro suffix works the same way for named volumes)
    docker run -v /host/data:/container/data:ro myimage
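
The permission and read-only points above can be combined in one invocation. A sketch, assuming a hypothetical helper run_restricted and an arbitrary non-root UID:GID of 1000:1000, which a given image may or may not map to a named user:

```shell
# Hypothetical helper: run a command as a non-root user with a read-only volume.
run_restricted() {
  # $1 = volume name, $2 = mount path in the container, $3 = image, rest = command
  volume=$1
  target=$2
  image=$3
  shift 3
  docker run --rm \
    --user 1000:1000 \
    --mount "type=volume,source=$volume,target=$target,readonly" \
    "$image" "$@"
}
```

With this helper, run_restricted shared_data /data alpine ls /data lists the volume's contents, while a write such as run_restricted shared_data /data alpine touch /data/test should fail because the mount is read-only.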