# Docker Layer Linking: Share Layers Between Images Without Rebuilding

Quick Reference:

```bash
# Build with cache from another image
docker build --cache-from=myapp:latest -t myapp:new .

# Link layers from existing image (BuildKit)
docker buildx build --cache-from=type=registry,ref=myrepo/myapp:cache .

# Export cache for CI
docker buildx build --cache-to=type=registry,ref=myrepo/myapp:cache .

# Check layer sharing
docker history myapp:v1
docker history myapp:v2
```

Every time you rebuild a Docker image, you're potentially duplicating gigabytes of data that already exists elsewhere. Docker's layer architecture was designed to prevent this, but most teams don't fully leverage it.
## How Docker Layers Work

Docker images are composed of read-only layers. Each instruction in a Dockerfile creates a layer:

```dockerfile
FROM ubuntu:22.04            # Layer 1: base image
RUN apt-get update           # Layer 2: package cache
RUN apt-get install -y curl  # Layer 3: curl package
COPY app.py /app/            # Layer 4: your code
```

When you pull or push images, Docker only transfers layers that don't already exist locally or remotely. This is the foundation of layer sharing.
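This skip-if-present behavior is essentially a set difference over layer digests. A minimal sketch of the idea (the digests and sizes below are made up for illustration):

```python
# Sketch: which layers a push/pull actually needs to transfer.
# Digests and sizes are illustrative, not real image data.

def layers_to_transfer(image_layers, remote_layers):
    """Return the layers missing from `remote_layers`, in order,
    plus the total bytes that must move."""
    missing = [(digest, size) for digest, size in image_layers
               if digest not in remote_layers]
    return missing, sum(size for _, size in missing)

# v2 shares its first two layers with v1, which the registry already has.
v1 = [("sha256:aaa", 80_000_000), ("sha256:bbb", 120_000_000)]
v2 = v1 + [("sha256:ccc", 5_000_000)]

registry = {digest for digest, _ in v1}
missing, total = layers_to_transfer(v2, registry)
print(missing)  # only the new 5 MB code layer needs uploading
print(total)
```

The same logic runs in reverse on pull: only digests absent from the local content store are downloaded.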
## The Problem: Wasted Cache

Consider this common scenario:

```dockerfile
# Dockerfile.api
FROM python:3.11
RUN pip install flask requests boto3
COPY api/ /app/
```

```dockerfile
# Dockerfile.worker
FROM python:3.11
RUN pip install celery requests boto3
COPY worker/ /app/
```

Both images install requests and boto3, but as different layers. Each build downloads and installs these packages independently. Multiply this across microservices, and you waste significant time and storage.
## Layer Linking Strategies

### 1. Shared Base Images

Create a common base image with shared dependencies:

```dockerfile
# Dockerfile.base
FROM python:3.11
RUN pip install requests boto3 prometheus-client
```

```dockerfile
# Dockerfile.api
FROM myrepo/base:latest
RUN pip install flask
COPY api/ /app/
```

```dockerfile
# Dockerfile.worker
FROM myrepo/base:latest
RUN pip install celery
COPY worker/ /app/
```

Now both images share the base layer containing requests, boto3, and prometheus-client. Any node that has one image already holds most of the other.
### 2. BuildKit Cache Mounts

Use BuildKit's cache mounts to share package caches across builds:

```dockerfile
# syntax=docker/dockerfile:1.4
FROM python:3.11
# Share pip cache between builds
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install flask requests boto3
COPY app/ /app/
```

Enable BuildKit:

```bash
DOCKER_BUILDKIT=1 docker build -t myapp .
```

The pip cache persists between builds, so repeated installs hit the cache instead of downloading.
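The same technique works for apt on Debian-based images. A sketch following BuildKit's documented cache-mount pattern (the base image and packages here are just examples; `docker-clean` is removed because the stock image deletes downloaded packages after install, which would defeat the cache):

```dockerfile
# syntax=docker/dockerfile:1.4
FROM ubuntu:22.04
# Keep apt's package downloads and index between builds;
# sharing=locked serializes concurrent builds using the same cache.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    rm -f /etc/apt/apt.conf.d/docker-clean && \
    apt-get update && apt-get install -y curl wget
```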
### 3. Registry Cache

Share build cache through your container registry:

```bash
# Build and push cache layers
docker buildx build \
  --cache-to=type=registry,ref=myrepo/myapp:buildcache,mode=max \
  --push \
  -t myrepo/myapp:v1 .

# Pull cache for next build (on any machine)
docker buildx build \
  --cache-from=type=registry,ref=myrepo/myapp:buildcache \
  -t myrepo/myapp:v2 .
```

`mode=max` exports layers from all build stages, not just those that end up in the final image. This maximizes cache hits for intermediate stages.
### 4. Multi-Stage Layer Reuse

Explicitly copy layers between stages:

```dockerfile
# Stage 1: Build dependencies
FROM python:3.11 AS deps
COPY requirements.txt .
RUN pip install --user -r requirements.txt

# Stage 2: Build application
FROM python:3.11 AS builder
WORKDIR /app
COPY --from=deps /root/.local /root/.local
COPY . .
RUN python setup.py build

# Stage 3: Runtime (minimal)
FROM python:3.11-slim
ENV PATH=/root/.local/bin:$PATH
COPY --from=deps /root/.local /root/.local
COPY --from=builder /app/dist /app
```

The deps stage layer is reused in both builder and the final image; PATH is extended so scripts installed with `pip install --user` are found at runtime.
## CI/CD Integration

### GitHub Actions

```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3

- name: Build with cache
  uses: docker/build-push-action@v5
  with:
    push: true
    tags: myrepo/myapp:${{ github.sha }}
    cache-from: type=registry,ref=myrepo/myapp:buildcache
    cache-to: type=registry,ref=myrepo/myapp:buildcache,mode=max
```

### GitLab CI

```yaml
build:
  script:
    - docker buildx build
        --cache-from=type=registry,ref=$CI_REGISTRY_IMAGE:cache
        --cache-to=type=registry,ref=$CI_REGISTRY_IMAGE:cache,mode=max
        --push
        -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
```

### Jenkins

```groovy
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh '''
                    docker buildx build \
                        --cache-from=type=registry,ref=myrepo/myapp:cache \
                        --cache-to=type=registry,ref=myrepo/myapp:cache,mode=max \
                        -t myrepo/myapp:${BUILD_NUMBER} .
                '''
            }
        }
    }
}
```

## Measuring Layer Sharing

Check which layers are shared:

```bash
# View layers for an image
docker history myapp:v1 --no-trunc

# Compare two images
diff <(docker history myapp:v1 --no-trunc -q) \
     <(docker history myapp:v2 --no-trunc -q)

# Check layer sizes
docker system df -v
```

Inspect layer digests:

```bash
# Get manifest with layer info
docker manifest inspect myrepo/myapp:v1

# Or with skopeo
skopeo inspect docker://myrepo/myapp:v1 | jq '.Layers'
```
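The comparison is easy to script: given two `Layers` arrays like the ones skopeo returns, the overlap is just a set intersection. A minimal sketch (digests are illustrative):

```python
# Sketch: quantify layer overlap between two images, given their
# layer-digest lists (e.g. parsed from `skopeo inspect ... | jq '.Layers'`).
# The digests here are made up for illustration.

def shared_layers(layers_a, layers_b):
    """Return (shared digests, fraction of image A's layers that B also has)."""
    shared = set(layers_a) & set(layers_b)
    return shared, len(shared) / len(layers_a)

api    = ["sha256:base", "sha256:deps", "sha256:api-code"]
worker = ["sha256:base", "sha256:deps", "sha256:worker-code"]

shared, ratio = shared_layers(api, worker)
print(sorted(shared))   # the base and dependency layers are common
print(f"{ratio:.0%}")
```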
## Best Practices

### 1. Order Dockerfile Instructions by Change Frequency

```dockerfile
# Rarely changes - at top
FROM python:3.11
RUN apt-get update && apt-get install -y libpq-dev

# Changes sometimes - middle
COPY requirements.txt .
RUN pip install -r requirements.txt

# Changes often - at bottom
COPY . .
```

This maximizes layer reuse. If only your code changes, Docker rebuilds only the last layer.
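The cache rule this exploits: the first changed instruction invalidates itself and every instruction after it. A toy model of that behavior (instruction labels are illustrative, not a real Dockerfile parser):

```python
# Sketch of Docker's build-cache rule: the first changed instruction
# invalidates itself and every instruction below it.

def rebuilt_instructions(instructions, changed):
    """Given ordered instructions and the set that changed,
    return the suffix Docker would rebuild."""
    for i, inst in enumerate(instructions):
        if inst in changed:
            return instructions[i:]
    return []

dockerfile = ["FROM", "RUN apt-get", "COPY requirements.txt",
              "RUN pip install", "COPY . ."]

# Only the code changed: just the last layer rebuilds.
print(rebuilt_instructions(dockerfile, {"COPY . ."}))
# requirements.txt changed: everything from that COPY down rebuilds.
print(rebuilt_instructions(dockerfile, {"COPY requirements.txt"}))
```

Putting volatile instructions last keeps the rebuilt suffix as short as possible.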
### 2. Pin Base Image Digests

```dockerfile
# Bad: tag can change, breaking cache
FROM python:3.11

# Good: digest is immutable
FROM python:3.11@sha256:abc123...
```

Pinned digests ensure consistent base layers across builds.
### 3. Combine RUN Commands Thoughtfully

```dockerfile
# Creates separate layers (more granular caching)
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y wget

# Creates single layer (smaller image, less cache granularity)
RUN apt-get update && \
    apt-get install -y curl wget && \
    rm -rf /var/lib/apt/lists/*
```

Single layers are smaller but rebuild entirely on any change. Choose based on your change patterns.
### 4. Use .dockerignore

Prevent unnecessary cache invalidation:

```
# .dockerignore
.git
*.md
tests/
__pycache__/
.env
```

Ignored files are excluded from the build context entirely, so changes to them can't invalidate COPY or ADD layers.
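Conceptually, the ignore file prunes the file list before it is sent to the daemon. A rough sketch of that pruning — real .dockerignore matching follows Go's `filepath.Match` semantics plus `**` and `!` negations, so `fnmatch` here is only an approximation, and the pattern list is simplified from the example above:

```python
import fnmatch

# Rough sketch: filter the build context by ignore patterns.
# Real .dockerignore matching is richer (Go filepath.Match,
# `**`, `!` negations); fnmatch is only an approximation.
IGNORE = [".git", "*.md", "tests", "__pycache__", ".env"]

def in_context(path):
    """True if no path component matches an ignore pattern."""
    return not any(
        fnmatch.fnmatch(part, pattern)
        for part in path.split("/") for pattern in IGNORE
    )

files = ["app.py", "README.md", "tests/test_app.py", ".git/config"]
print([f for f in files if in_context(f)])  # only app.py survives
```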
## Kubernetes Integration

Share layers across nodes by pre-pulling:

```yaml
# DaemonSet to pre-pull common base images
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: image-prepuller
spec:
  selector:
    matchLabels:
      app: prepuller
  template:
    metadata:
      labels:
        app: prepuller
    spec:
      initContainers:
      - name: prepull
        image: myrepo/base:latest
        command: ['echo', 'Image pulled']
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9
```

Or use a mutating webhook to inject common init containers.
## Storage Savings

Real-world impact of layer sharing:

| Scenario | Without Sharing | With Sharing | Savings |
|---|---|---|---|
| 10 Python microservices | 15 GB | 4 GB | 73% |
| Base + 5 variants | 8 GB | 2.5 GB | 69% |
| CI cache (100 builds) | 200 GB | 25 GB | 88% |
The savings compound across your registry, CI runners, and Kubernetes nodes.
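The arithmetic behind a row like the first one is straightforward: shared layers are stored once instead of per image. A sketch with illustrative sizes chosen to match that row's order of magnitude (roughly 1.5 GB per service, most of it a shared base):

```python
# Sketch: storage with vs. without layer sharing.
# Sizes in GB are illustrative, not measured data.

def savings(n_images, shared_gb, unique_gb):
    """Total storage without sharing, with sharing, and the saved fraction."""
    without = n_images * (shared_gb + unique_gb)   # every image stored whole
    with_sharing = shared_gb + n_images * unique_gb  # shared layers stored once
    return without, with_sharing, 1 - with_sharing / without

without, with_sharing, pct = savings(10, shared_gb=1.22, unique_gb=0.28)
print(f"{without:.0f} GB -> {with_sharing:.1f} GB ({pct:.0%} saved)")
```

The larger the shared fraction of each image, the closer the savings get to (n-1)/n of the shared bytes.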
## Troubleshooting

### Cache not being used

Check that BuildKit is enabled:

```bash
docker buildx version
# Should print the buildx version
```

Verify the cache exists:

```bash
docker buildx du
```

### Registry cache not working

Ensure the registry supports the cache manifest format:

```bash
# Test with inline cache (simpler, less efficient)
docker buildx build --cache-to=type=inline --push .
```

### Layers not sharing between images

Compare layer digests:

```bash
skopeo inspect docker://myrepo/app1 | jq '.Layers'
skopeo inspect docker://myrepo/app2 | jq '.Layers'
```

Different base image tags or build contexts can produce divergent layer digests even when the logical content is identical.
## Conclusion

Layer linking is one of Docker's most powerful but underutilized features. By structuring your Dockerfiles strategically and using BuildKit's cache features, you can:

- Cut build times by 50-80%
- Reduce registry storage by 60-90%
- Speed up container startup on Kubernetes nodes

The key is thinking of layers as shared resources rather than per-image artifacts.
Managing container infrastructure at scale? Akmatori AI agents automatically optimize your Docker builds and identify layer sharing opportunities.
