What is a Dockerfile, and how do you build a Docker image? #
Dockerfile: A script containing a series of instructions on how to build a Docker image
Example `Dockerfile`:

```dockerfile
# Use an official Python runtime as a parent image
FROM python:3.8-slim

# Set the working directory
WORKDIR /app

# Copy the current directory contents into the container
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Make port 80 available to the world outside this container
EXPOSE 80

# Define the command to run the application
CMD ["python", "app.py"]
```
- Python app `app.py`:

```python
from flask import Flask

app = Flask(__name__)

@app.route('/')
def home():
    return "Hello, Docker World!"

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=80)
```
- Python app dependency `requirements.txt`:

```
flask
```
- Build the image:

```shell
docker build -t my-python-app .
```

- Create a container from the image:

```shell
docker run -d -p 80:80 my-python-app

# Test
curl http://localhost
```
How do you tag a Docker image? Why is tagging important? #
- Tagging command: use `docker build -t imagename:tag .` to assign a tag
- Importance:
  - Version control: helps identify image versions and roll back if necessary
  - Organization: distinguishes between development, testing, and production images
  - Automation: facilitates automated deployments by referencing specific tags
Example:

```shell
docker build -t myapp:1.0.0 .
docker tag myapp:1.0.0 myapp:latest
```
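Tags also carry the registry and repository name when publishing an image. A minimal sketch, assuming a Docker Hub-style registry; the `myuser` account name is a hypothetical placeholder:

```shell
# Re-tag the local image with a registry account prefix
docker tag myapp:1.0.0 myuser/myapp:1.0.0

# Push the tagged image (requires a prior "docker login")
docker push myuser/myapp:1.0.0
```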
List out basic instructions used in a Dockerfile #
- `FROM`: Specifies the base image
- `WORKDIR`: Sets the working directory inside the container
- `COPY`: Copies files/directories from the host to the container
- `EXPOSE`: Exposes a port for external access
- `CMD`: Specifies the command to run within the container
- `ENTRYPOINT`: Sets the command and parameters that execute as the container starts
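A minimal sketch tying these instructions together (the `app.py` filename is an assumption, matching the earlier Flask example):

```dockerfile
# Base image
FROM python:3.8-slim

# Working directory inside the container
WORKDIR /app

# Copy application code from the host into the image
COPY . /app

# Document the port the application listens on
EXPOSE 80

# ENTRYPOINT fixes the executable; CMD supplies the default argument
ENTRYPOINT ["python"]
CMD ["app.py"]
```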
Difference between ADD and COPY in a Dockerfile #
- `COPY`: Simply copies files and directories from the host to the container
- `ADD`: Can copy files and directories, and also supports URL downloads and automatic extraction of compressed files
Example:

```dockerfile
# Copy a local file into the /app directory in the Docker image
COPY localfile.txt /app/

# Add a file from a URL and place it into the /app directory in the image
ADD http://example.com/file.tar.gz /app/
```
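Note that `ADD`'s automatic extraction applies only to local tar archives, not to URL downloads. A sketch (the `app.tar.gz` filename is hypothetical):

```dockerfile
# A local tar archive is extracted into /app/ automatically;
# the archive file itself is not kept in the image
ADD app.tar.gz /app/
```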
What is the difference between ENTRYPOINT and CMD in a Dockerfile? Explain with an example #
`CMD`:
- Default execution: Provides default arguments for the container run, which can be overridden with `docker run`
- Single effect: Only the last `CMD` in the Dockerfile takes effect

`ENTRYPOINT`:
- Main executable: Sets the container's primary command, not easily overridden
- Always run: Ensures the specified command executes consistently
`CMD` example:

```dockerfile
# Use an official Python runtime as a parent image
FROM python:3.8-slim

# Set the working directory
WORKDIR /app

# Copy the current directory contents into the container
COPY . /app

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Define the default command to run when the container starts
CMD ["python", "app.py"]
```
- The `CMD` instruction specifies that the container should run `python app.py` by default
- You can override this command by specifying a different command when you run the container:

```shell
docker run myimage python another_script.py
```
`ENTRYPOINT` example:

```dockerfile
# Use an official Python runtime as a parent image
FROM python:3.8-slim

# Set the working directory
WORKDIR /app

# Copy the current directory contents into the container
COPY . /app

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Set the entrypoint to run the Python interpreter with app.py
ENTRYPOINT ["python", "app.py"]
```
- The `ENTRYPOINT` instruction sets the entry point for the container as `python app.py`
- When the container starts, it runs `python app.py` by default; arguments passed to `docker run` are appended to the entrypoint rather than replacing it
- To override the `ENTRYPOINT`, you need to use the `--entrypoint` flag with `docker run`:

```shell
docker run --entrypoint /bin/bash myimage
```
`ENTRYPOINT` + `CMD`:

```dockerfile
# Use an official Python runtime as a parent image
FROM python:3.8-slim

# Set the working directory
WORKDIR /app

# Copy the current directory contents into the container
COPY . /app

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Set the entrypoint to run the Python interpreter
ENTRYPOINT ["python"]

# Define the default argument to run
CMD ["app.py"]
```
- This Dockerfile sets `ENTRYPOINT` to `python` and `CMD` to `app.py`
- The `ENTRYPOINT` will always be `python`, but you can change which script is run with it
- When the container starts, it will run `python app.py`
- You can override the `CMD` argument like this:

```shell
docker run myimage another_script.py
```
Explain the layered approach to building a Docker image. How can we write an optimized Dockerfile? #
- Layered builds: Docker images are created in layers, with each Dockerfile instruction forming a new layer
- Cached layers: Layers are cached for faster and more efficient builds
- Independent layers: Changes in one layer don't invalidate the cache for earlier layers
- Rebuild process: Changing one layer triggers a rebuild of that layer and all subsequent layers
- Optimization practice: Design Dockerfiles to maximize layer reuse
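The layers behind an image, and the size each instruction contributed, can be inspected with `docker history` (the image name below is a placeholder from the earlier example):

```shell
# List an image's layers, newest first, with the size each adds
docker history my-python-app

# --no-trunc shows the full instruction that created each layer
docker history --no-trunc my-python-app
```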
Non-optimized Dockerfile:

```dockerfile
# Use an official Node.js runtime as a parent image
FROM node:14

# Set the working directory
WORKDIR /usr/src/app

# Copy the entire application code to the working directory
COPY . .

# Install dependencies
RUN npm install

# Expose the application port
EXPOSE 3000

# Run the application
CMD ["node", "app.js"]
```
- In this Dockerfile, the `COPY . .` command copies the entire application code before running `npm install`
- This means any change in the application code (even a small one) invalidates the cache for the `COPY . .` instruction and all subsequent instructions, leading to frequent rebuilds of the `npm install` step
Optimized Dockerfile:

```dockerfile
# Use an official Node.js runtime as a parent image
FROM node:14

# Set the working directory
WORKDIR /usr/src/app

# Copy only package.json and package-lock.json first
COPY package*.json ./

# Install dependencies
RUN npm install

# Copy the rest of the application code
COPY . .

# Expose the application port
EXPOSE 3000

# Run the application
CMD ["node", "app.js"]
```
- In this optimized Dockerfile, the `package*.json` files are copied first
- The `RUN npm install` step reruns only if the dependency files change
- The `COPY . .` command copies the rest of the application code after installing dependencies
- This ensures changes to the application code do not affect the dependencies layer, allowing Docker to cache the `npm install` step unless `package.json` or `package-lock.json` change
What is immutable infrastructure in Docker? #
- Immutable infrastructure: A paradigm in which servers (or containers) are never modified after they are deployed
- Rebuild strategy: If a change is needed, a new server (or container) is built and deployed, and the old one is decommissioned
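The rebuild strategy can be sketched with containers; the image tags and container names below are hypothetical placeholders:

```shell
# Build a new image containing the required change, under a new tag
docker build -t myapp:2.0.0 .

# Decommission the old container instead of modifying it in place
docker stop myapp-v1 && docker rm myapp-v1

# Start a replacement container from the new image
docker run -d --name myapp-v2 -p 80:80 myapp:2.0.0
```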
Is it possible to make changes to an existing Docker image? #
- Read-only design: Docker images cannot be changed directly, as they are read-only
- Commit changes: However, modifications can be made to a running container and then committed to create a new image
- New image creation: Instead of updating the existing image, a new one is created, following immutable infrastructure principles
Example: commit changes from a container and create a new image

- Step 1: Run a container from the `nginx:latest` image

```shell
docker run --name test1 -d -p 80:80 nginx

# Test
curl http://localhost:80
```

- Step 2: Modify the running container

```shell
docker exec test1 sh -c \
  'echo "<h1 style=\"font-size:400px\">Testing</h1>" > /usr/share/nginx/html/index.html'

# Test
curl http://localhost:80
```

- Step 3: Create a new Docker image from the running container

```shell
docker commit test1 newnginx:modified
```

- Step 4: Run a container from the newly created `newnginx:modified` image

```shell
docker run --name test2 -d -p 81:80 newnginx:modified

# Test
curl http://localhost:81
```
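Before committing, `docker diff` can show exactly which files were added (A), changed (C), or deleted (D) in the container's filesystem relative to its image:

```shell
# Show filesystem changes in the test1 container relative to nginx:latest
docker diff test1
```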
What is a multi-stage build, and how can it be used to reduce the size of a container image? #
- Multi-stage Dockerfile: Enables creating Dockerfiles in multiple stages
- Multiple FROM statements: Each FROM starts a new build stage
- Artifact copying: Transfer artifacts between stages to keep the final image small
- Size optimization: Reduces final image size by excluding unnecessary tools and dependencies
Golang Example 1: Larger Image Size

`main.go`:

```go
package main

import "fmt"

func main() {
	fmt.Println("Hello, Multi-Stage Builds!")
}
```

`Dockerfile`:

```dockerfile
FROM golang:1.16
WORKDIR /app
COPY main.go .
RUN go build main.go
CMD ["./main"]
```

- Problem: the final image includes the full Go toolchain and other build tools, making it larger than needed

```shell
docker build -t large-image-size:latest .

# Test
docker images | grep -i "large-image-size"
```
Golang Example 2: Smaller Image Size

`main.go`: same as in Example 1

`Dockerfile`:

```dockerfile
# Stage 1: Build the Go application
FROM golang:1.16 AS builder
WORKDIR /app
COPY main.go .
# CGO_ENABLED=0 produces a statically linked binary that runs on Alpine
RUN CGO_ENABLED=0 go build main.go

# Stage 2: Create a minimal image with the built binary
FROM alpine:latest
WORKDIR /root/
COPY --from=builder /app/main .
CMD ["./main"]
```

- Solution: a separate build stage; only the compiled binary is copied into a minimal base image

```shell
docker build -t small-image-size:latest .

# Test
docker images | grep -i "small-image-size"
```
- Advantages: smaller image, no unnecessary dependencies, faster pulls and deployments
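The final stage can be shrunk even further with the empty `scratch` base image, provided the binary is statically linked — a sketch under that assumption:

```dockerfile
# Stage 1: build a statically linked binary (CGO disabled)
FROM golang:1.16 AS builder
WORKDIR /app
COPY main.go .
RUN CGO_ENABLED=0 go build -o main main.go

# Stage 2: scratch contains nothing except what is copied in
FROM scratch
COPY --from=builder /app/main /main
CMD ["/main"]
```

Note that `scratch` has no shell or package manager, so this only suits self-contained binaries.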
What is the purpose of the .dockerignore file? #
- Exclude files: `.dockerignore` prevents specific files and directories from being sent to the build context during `docker build`, keeping the image lightweight
- Pattern matching: Define file and directory patterns, similar to `.gitignore`, that Docker should skip during the build process
Example `.dockerignore` file:

```
# Ignore node_modules directory
node_modules

# Ignore log files
*.log

# Ignore development or editor-specific files
.DS_Store
.vscode/
.idea/
```