How to Deploy FastAPI with Docker and K3s

From local dev to live: how to deploy FastAPI to Kubernetes with Docker and K3s


You’ve got your FastAPI app running locally. You can uvicorn main:app --reload all day long. It feels good.

Then someone (maybe your boss, maybe future-you) asks the question that changes everything:

“Cool, when can we ship it?”

That’s when the stomach drop happens. Shipping an API means you need…

  • A container image you can trust.
  • Somewhere to run it that isn’t just your laptop.
  • Networking so the outside world can reach it.
  • A way to scale up without waking up at 3 AM to restart a process.

You could go all-in with managed Kubernetes on AWS, GCP, or Azure — but that’s overkill for many side projects and even for some small teams. It’s expensive, it’s complex, and it comes with a ton of moving parts you might not need yet.

That’s where Docker + K3s comes in.

K3s is a lightweight, CNCF-certified Kubernetes distribution that’s ridiculously easy to set up. It’s perfect for:

  • Local dev clusters you can spin up and tear down.
  • Low-footprint staging or even production workloads.
  • Learning Kubernetes without paying cloud bills.

Pair it with a solid Docker image and a few Kubernetes manifests, and you’ve got a deployment pipeline that works just as well on your laptop as it does on a beefy server in the cloud.

By the end of this guide, you’ll:

  • Containerize your FastAPI app with a production-ready Dockerfile.
  • Deploy it to a K3s cluster with Traefik handling Ingress.
  • Add probes, scaling, and some basic security hardening.
  • Know the exact commands to go from code → running in cluster in minutes.

No fluff. Real commands. Real config. Copy-paste-deploy.


What We’ll Build

Before we touch a terminal, let’s get clear on what we’re shipping.

We’re not just throwing a FastAPI app into a Docker container and hoping it works in Kubernetes. We’re going to design it like something we’d actually trust to run — even if it’s “just” for a side project.

Here’s the plan:

1. A minimal FastAPI app

  • Two endpoints:
    • /healthz → used by Kubernetes to check if the app is alive and ready to serve traffic.
    • /version → returns the app version from an environment variable (helps with debugging deployments).

2. A production-minded Docker image

  • Multi-stage build → smaller, cleaner images.
  • Runs as a non-root user → security baseline.
  • Healthcheck baked in → container knows if it’s healthy even outside Kubernetes.

3. A K3s cluster running locally

  • Single-node cluster with Traefik Ingress (ships with K3s by default).
  • Namespaced deployment so it’s easy to tear down without touching other workloads.

4. Kubernetes manifests that are:

  • Minimal: no 500-line YAML monstrosities here.
  • Practical: liveness/readiness probes, resource limits, rolling updates.
  • Extendable: you can scale from dev → staging → prod without rewriting everything.

5. Optional Horizontal Pod Autoscaler (HPA)

  • Starts with one pod, scales up to 5 if CPU usage passes 70%.

6. CI/CD with GitHub Actions

  • On every push to main, we:
    1. Run tests.
    2. Build & push the Docker image with a unique tag.
    3. Deploy updated manifests to the cluster.

📦 End Result:

You’ll have a FastAPI app that:

  • Runs in a container you control.
  • Lives in a Kubernetes cluster you can deploy to in seconds.
  • Scales automatically if needed.
  • Has a repeatable, automated deploy pipeline.

And because we’re using K3s, you can run this entire setup locally, then copy-paste the exact same manifests into a cloud-hosted K3s VM when you’re ready to go live.


Prerequisites & Setup

Before we start, let’s get your local machine ready so you can follow along without hitting a wall in the middle.

This isn’t a “just install stuff and hope” checklist — I’ll give you the exact tools we’ll use and why we’re using them.


1. Python 3.11+

We’re targeting Python 3.11 because:

  • It’s officially supported by FastAPI.
  • Better performance than older versions.
  • Still widely supported by cloud hosts and CI systems.

If you manage multiple Python versions, install pyenv. It’ll let you install 3.11 alongside any other versions without messing with your system Python.

# Install Python 3.11 via pyenv
pyenv install 3.11.9
pyenv global 3.11.9

2. Poetry

Poetry is our package/dependency manager. Why?

  • Locked, repeatable installs (good for CI/CD).
  • No leaking dependencies into system Python.
  • Easy to export requirements.txt for Docker builds.
curl -sSL https://install.python-poetry.org | python3 -

3. Docker

We’ll use Docker to build and run our FastAPI image locally before deploying to Kubernetes.

Make sure Docker Desktop (Mac/Windows) or the Docker CLI (Linux) is installed and running.

Verify:

docker --version

4. kubectl

Kubernetes CLI for applying manifests, checking pods, and debugging.

# Mac
brew install kubectl

# Linux
sudo snap install kubectl --classic

5. K3s or k3d

K3s is our lightweight Kubernetes cluster.

For local development, I recommend k3d — it runs K3s inside Docker, so it’s easy to spin up and tear down.

# Mac
brew install k3d

# Linux
curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash

6. (Optional) mkcert

If you want HTTPS locally with Traefik, install mkcert to generate self-signed certs that browsers actually trust.

We’ll skip it for the first run, but it’s nice to have.

brew install mkcert
mkcert -install

7. GitHub Account

We’ll use GitHub Actions for CI/CD and GHCR (GitHub Container Registry) for storing Docker images.

Make sure you can log in to GitHub and push to a repo.


Checkpoint:

You should now have:

  • Python 3.11 + Poetry
  • Docker
  • kubectl
  • k3d (or K3s installed manually)
  • (Optional) mkcert
  • A GitHub account with a personal access token ready if needed

Next, we’ll scaffold the FastAPI app and get the basic health/version endpoints running locally.


Scaffold the App

Before we get into containers and clusters, we need a minimal FastAPI application that’s designed to work well in Kubernetes from day one.

This isn’t just about “Hello, World!” — we’re going to include the endpoints that Kubernetes uses to monitor application health, so our app plays nicely with rolling updates and autoscaling later.


Step 1 — Create the Project Structure

We’ll use Poetry to create a new project, install FastAPI and our server, and set up tests.

poetry new fastapi-k3s-demo --name app
cd fastapi-k3s-demo
poetry add fastapi uvicorn gunicorn
poetry add -D pytest

Here’s what this does:

  • poetry new fastapi-k3s-demo --name app → Creates the fastapi-k3s-demo project with a Python package named app.
  • fastapi → Our API framework.
  • uvicorn → The ASGI server for local dev.
  • gunicorn → A production-ready process manager for running Uvicorn workers in Kubernetes.
  • pytest → Because even a tiny app should have at least one test to keep CI/CD happy.

Step 2 — Add the Application Code

Create app/src/main.py (Poetry won’t have created the src folder, so add it) with the following:

from fastapi import FastAPI
import os

app = FastAPI()

@app.get("/healthz")
def healthz():
    """
    Kubernetes uses this endpoint to check if our app is alive.
    If this returns a 200 OK, the pod is considered healthy.
    """
    return {"ok": True}

@app.get("/version")
def version():
    """
    Returns the current version of the app.
    Useful for debugging deployments and rollouts.
    """
    return {"version": os.getenv("APP_VERSION", "dev")}

Why these endpoints matter:

  • /healthz → This is our liveness/readiness probe target in Kubernetes. It helps K8s decide when to send traffic to a pod and when to restart it.
  • /version → Lets you check which version of the app is running in a given pod — super helpful when testing rolling updates or diagnosing “wrong version” bugs.

Step 3 — Add a Quick Test

We’re not going for 100% test coverage here — just a smoke test so our CI/CD pipeline can fail fast if something’s broken.

tests/test_smoke.py:

def test_truth():
    assert 1 == 1

Yes, it’s silly — but the point is to have CI/CD wired up early. We can add real tests later.
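
Want something a touch more real without going overboard? Here’s a minimal sketch of an endpoint test using FastAPI’s TestClient. It assumes you’ve added httpx as a dev dependency (poetry add --group dev httpx), since TestClient is built on it.

tests/test_endpoints.py:

from fastapi.testclient import TestClient

from app.src.main import app

client = TestClient(app)

def test_healthz_returns_ok():
    response = client.get("/healthz")
    assert response.status_code == 200
    assert response.json() == {"ok": True}

def test_version_falls_back_to_dev():
    # Assumes APP_VERSION isn't set in the test environment
    response = client.get("/version")
    assert response.status_code == 200
    assert response.json() == {"version": "dev"}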


Step 4 — Run Locally

Before we containerize anything, make sure the app works on your machine.

poetry run uvicorn app.src.main:app --reload --port 8000

Open your browser or curl the endpoints:

curl http://localhost:8000/healthz
# {"ok": true}

curl http://localhost:8000/version
# {"version": "dev"}

If this works, we have a working FastAPI app ready to be dockerized.

Next up, we’ll build a production-ready Docker image that’s lean, secure, and Kubernetes-friendly.


Build the Docker Image

Goal: a small, reproducible, non‑root image that starts fast, works with probes, and doesn’t leak your build tools into runtime.

Why multi‑stage?

  • Keep Poetry/build deps in the builder layer → copy only what the app needs into runtime.
  • Smaller attack surface, faster pulls, cheaper deploys.

Why non‑root?

  • If your app gets popped, running as UID 10001 instead of root limits blast radius.
  • Many clusters enforce non‑root by policy anyway.

Dockerfile

Drop this at the repo root as Dockerfile.

# ---- builder ----
FROM python:3.11-slim AS builder
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

# Build essentials for any wheels (uvloop, etc.)
RUN apt-get update && apt-get install -y --no-install-recommends \
      curl build-essential \
    && rm -rf /var/lib/apt/lists/*

# Use Poetry to resolve & export pinned requirements
RUN pip install --no-cache-dir poetry==1.8.3
WORKDIR /build
COPY pyproject.toml poetry.lock* ./
# If no lock exists, generate it; then export fully pinned requirements
RUN poetry lock --no-interaction || true \
 && poetry export -f requirements.txt --output requirements.txt --without-hashes

# Bring in the app last to maximize layer cache
COPY app ./app

# ---- runtime ----
FROM python:3.11-slim AS runtime
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PORT=8000 \
    APP_VERSION=dev

# Create a non-root user
RUN useradd -m -u 10001 appuser
WORKDIR /app

# Minimal OS deps (curl for HEALTHCHECK)
RUN apt-get update && apt-get install -y --no-install-recommends \
      curl \
    && rm -rf /var/lib/apt/lists/*

# Install only runtime deps
COPY --from=builder /build/requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Copy the app code
COPY app /app/app

# Lock down code dir (helps with readOnlyRootFilesystem later)
RUN chmod -R a-w /app/app

USER 10001

# Container-level healthcheck (nice even outside k8s)
HEALTHCHECK --interval=30s --timeout=3s \
  CMD curl -fsS http://localhost:${PORT}/healthz || exit 1

EXPOSE 8000
# Gunicorn manages Uvicorn workers; production-friendly defaults
CMD ["gunicorn", "-k", "uvicorn.workers.UvicornWorker", "-w", "2", "-b", "0.0.0.0:8000", "app.src.main:app"]

.dockerignore

Keep your image clean and builds fast:

__pycache__
*.pyc
*.pyo
*.pyd
*.swp
.env
.venv
.git
.gitignore
dist
build
.pytest_cache

Build & smoke test locally

# from repo root
docker build -t fastapi-demo:local .
docker run --rm -p 8000:8000 fastapi-demo:local

# new terminal
curl http://localhost:8000/healthz
curl http://localhost:8000/version

If those return JSON, your image is good.


Why these choices (brief)

  • Gunicorn + UvicornWorker: battle‑tested process manager + async worker. You can tune -w and add --graceful-timeout later.
  • HEALTHCHECK: helps on plain Docker hosts and gives early signal in Kubernetes if you wire probes to the same path.
  • Pinned requirements: poetry export produces deterministic builds; fewer “works on my machine” surprises.
  • Non‑root user: aligns with Pod Security Standards and common org policies.

Optional hardening (add when you’re ready)

In the Pod spec (Kubernetes), you’ll later add:

securityContext:
  runAsNonRoot: true
  runAsUser: 10001
  readOnlyRootFilesystem: true

If you set readOnlyRootFilesystem: true, ensure your app writes to a writable path (e.g., /tmp) or a mounted volume.


Troubleshooting

  • Build fails on wheel compilation → Add the missing system libs to the builder stage only. Keep runtime clean.
  • Container starts but port is closed → Ensure your command binds to 0.0.0.0:$PORT and you EXPOSE 8000.
  • HEALTHCHECK keeps failing → Confirm the app path is /healthz and your PORT matches the process binding.

Checkpoint: You now have a small, reproducible, non‑root Docker image that passes a local health check.


Run Locally with Docker (Dry‑Run Pattern)

Goal: prove the container is healthy, serves traffic, and exposes the right port with the right env before we ship it to a cluster.

1) Build the image

# from repo root
docker build -t fastapi-demo:local .
  • If this fails on dependency wheels, fix it now (add system libs to the builder stage only).

2) Run the container

docker run --rm -p 8000:8000 fastapi-demo:local
  • --rm cleans up the container on exit.
  • -p 8000:8000 maps host:container port. If 8000 is busy, use -p 8080:8000 and curl localhost:8080.

Optional env override to simulate a versioned deploy:

docker run --rm -e APP_VERSION=0.1.0 -p 8000:8000 fastapi-demo:local

3) Smoke‑test endpoints

New terminal:

curl -s localhost:8000/healthz | jq
curl -s localhost:8000/version | jq

Expected:

{"ok": true}
{"version": "dev"}   # or "0.1.0" if you set APP_VERSION

4) Verify the healthcheck

The Dockerfile includes a HEALTHCHECK hitting /healthz. After ~30s, check:

docker ps --format 'table {{.Names}}\t{{.Status}}'

Look for healthy in the Status column. If it stays unhealthy:

  • Confirm the app binds to 0.0.0.0:8000 (not 127.0.0.1).
  • Make sure /healthz returns 200 OK even when the app is “cold starting”.

5) Logs & quick debugging

# in the container terminal (Ctrl+C to stop)
# or attach logs if running detached:
docker run -d --name fastapi-demo -p 8000:8000 fastapi-demo:local
docker logs -f fastapi-demo
docker rm -f fastapi-demo

6) Optional: Makefile helpers

If you like one‑liners (I do):

# Makefile (recipe lines must be indented with a real tab, not spaces)
run:
	docker run --rm -p 8000:8000 fastapi-demo:local

smoke:
	curl -s localhost:8000/healthz | jq && curl -s localhost:8000/version | jq

Common “it’s probably this” issues

  • Port clash: “bind: address already in use” → pick a different host port: -p 8080:8000.
  • No response: App bound to 127.0.0.1 → ensure command uses 0.0.0.0:8000.
  • Unhealthy: /healthz throws or depends on external services → keep it boring and fast.
  • Weird Python paths: Confirm the module path in the CMD (app.src.main:app).

Checkpoint: The image runs locally, responds on /healthz and /version, and reports healthy. We’re ready for the cluster.


Spin Up K3s (Lightweight Kubernetes)

We’ll use K3s for a tiny, real Kubernetes that runs great on a dev machine or a single VPS. For local dev, I recommend k3d (K3s inside Docker). It’s quick, repeatable, and leaves no mystery files on your host.

Option A — k3d (K3s in Docker, recommended locally)

# Create a local cluster with Traefik and a load balancer
k3d cluster create dev \
  --agents 1 --servers 1 \
  -p "80:80@loadbalancer" \
  -p "443:443@loadbalancer"

# Verify connectivity
kubectl get nodes
kubectl get pods -A | grep traefik   # Traefik ships with K3s

Option B — Native K3s on a VM (handy for a cloud box)

# On the VM (as root)
curl -sfL https://get.k3s.io | sh -

# Grab kubeconfig to your laptop (with ssh and permissions set)
sudo cat /etc/rancher/k3s/k3s.yaml  # copy to ~/.kube/config on your laptop and fix server IP

Create and use a namespace

Keep our stuff isolated so cleanup is one command and we don’t step on other workloads.

kubectl create namespace fastapi-demo
kubectl config set-context --current --namespace fastapi-demo

Tip: If you forget the namespace later, 90% of “not found” errors are just “you’re in the wrong namespace.”

Check Traefik (Ingress controller)

K3s ships Traefik by default. Confirm it’s up:

kubectl get pods -n kube-system | grep traefik

You’ll access the app via the load balancer on your host. For local hostname routing, we’ll use fastapi.localtest.me (wildcards resolve to 127.0.0.1 automatically), so you don’t need to edit /etc/hosts.

Alternative: *.nip.io also resolves to any IP you put in the hostname, e.g. fastapi.127.0.0.1.nip.io.

Quick cluster sanity checks

# Can the API talk?
kubectl version

# What’s running?
kubectl get pods -A

# Who am I and where?
kubectl config get-contexts

Cleanup (don’t leave zombie clusters)

  • k3d:
k3d cluster delete dev
  • Native K3s (on VM):
sudo /usr/local/bin/k3s-uninstall.sh

Checkpoint: You have a working K3s cluster, Traefik is running, and your kubectl context is pointed at fastapi-demo.


Push Image to a Registry (GHCR by default)

Kubernetes won’t use your local Docker cache. Images must live in a registry the cluster can reach. I default to GHCR (GitHub Container Registry) because it’s free, private by default, and plays nicely with GitHub Actions.

1) Pick an image name + tag strategy

  • Repo path: ghcr.io/<USER_OR_ORG>/fastapi-demo
  • Tag by git SHA for immutability, plus a moving latest if you want.
# from repo root
SHA=$(git rev-parse --short HEAD || echo dev)
IMG=ghcr.io/<USER_OR_ORG>/fastapi-demo

docker build -t $IMG:$SHA .
docker tag $IMG:$SHA $IMG:latest

2) Login to GHCR

  • Using the built-in GITHUB_TOKEN inside Actions is easy.
  • Locally, you can use a Personal Access Token (PAT) with write:packages.
# local login (replace USER and paste a PAT with write:packages scope)
echo <PERSONAL_ACCESS_TOKEN> | docker login ghcr.io -u <USER> --password-stdin

3) Push

docker push $IMG:$SHA
docker push $IMG:latest

4) Tell Kubernetes how to pull (if private)

Two approaches:

  • Make the package public in GHCR (simple).
  • Or create a pull secret and reference it in the Deployment.

Create the secret (namespaced):

kubectl create secret docker-registry ghcr-pull \
  --docker-server=ghcr.io \
  --docker-username=<USER> \
  --docker-password=<PERSONAL_ACCESS_TOKEN> \
  --docker-email=<you@example.com>

Reference the secret in your deployment.yaml:

spec:
  template:
    spec:
      imagePullSecrets:
        - name: ghcr-pull

Tip: If you deploy with GitHub Actions from the same org, keep the image private and let the cluster auth via this pull secret. If you’re just experimenting, making the package public avoids secret wrangling.

Option B — Docker Hub (works fine too)

Change the image name:

IMG=<DOCKERHUB_USER>/fastapi-demo
docker build -t $IMG:$SHA .
docker push $IMG:$SHA

If the repo is private, create a similar docker-registry secret using --docker-server=index.docker.io.
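
The command mirrors the GHCR one. A sketch with placeholder credentials (use a Docker Hub access token, not your account password):

kubectl create secret docker-registry dockerhub-pull \
  --docker-server=index.docker.io \
  --docker-username=<DOCKERHUB_USER> \
  --docker-password=<DOCKERHUB_ACCESS_TOKEN> \
  --docker-email=<you@example.com>

Reference it from the Deployment’s imagePullSecrets exactly like ghcr-pull.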


Option C — Local registry for k3d (super fast inner loop)

k3d can spin up a local registry you can push to without hitting the internet.

# create once
k3d registry create local-reg --port 0.0.0.0:5000

# recreate your cluster and attach the registry
k3d cluster create dev --agents 1 --servers 1 \
  -p "80:80@loadbalancer" \
  -p "443:443@loadbalancer" \
  --registry-use k3d-local-reg:5000

Tag + push:

IMG=k3d-local-reg:5000/fastapi-demo
docker build -t $IMG:$SHA .
docker push $IMG:$SHA

Use imagePullPolicy: IfNotPresent and you’re golden.


Tagging: what I actually do

  • Immutable deploys: :gitsha (e.g., :a1b2c3d)
  • Human-friendly: :latest (or :main) for quick local testing
  • Release tags: :v0.1.0 when you cut a release

Your Deployment should reference the immutable tag (or digest) in CI:

kubectl set image deployment/fastapi api=$IMG:$SHA

Troubleshooting (it’s usually auth or name)

  • ImagePullBackOff
    • Wrong registry credentials or missing imagePullSecrets.
    • Image is private and you’re not logged in.
    • Tag mismatch: you pushed :dev, deployment uses :sha.
  • manifest unknown
    • Pushed to ghcr.io/<USER>/fastapi-demo but deployment points at <ORG> or vice versa.
  • k3d can’t pull local
    • Ensure cluster is configured with --registry-use k3d-local-reg:5000 and you tagged k3d-local-reg:5000/....

Checkpoint: Your image is in a registry and you know how the cluster will authenticate to pull it.


Kubernetes Manifests (Deployment, Service, Ingress, HPA)

We’ll keep YAML tight but not brittle. Each file does one job. You can drop these into deploy/k8s/.

0) Namespace (once per cluster/env)

Keeps our stuff isolated and easy to nuke.

# deploy/k8s/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: fastapi-demo

1) ConfigMap (version string, feature flags later)

We’ll surface an app version from env. Later you can hang config here (not secrets).

# deploy/k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: fastapi-demo
data:
  APP_VERSION: "dev"

2) Deployment (the meat)

Explained in‑line. Tune replicas in CI or via HPA later.

# deploy/k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fastapi
  namespace: fastapi-demo
  labels:
    app: fastapi
spec:
  replicas: 1
  revisionHistoryLimit: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: fastapi
  template:
    metadata:
      labels:
        app: fastapi
    spec:
      # If your image is private, uncomment:
      # imagePullSecrets:
      #   - name: ghcr-pull
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        fsGroup: 2000
      containers:
        - name: api
          image: ghcr.io/<USER_OR_ORG>/fastapi-demo:dev  # CI will set this to :$GIT_SHA
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 8000
          env:
            # from ConfigMap
            - name: APP_VERSION
              valueFrom:
                configMapKeyRef:
                  name: app-config
                  key: APP_VERSION
            # Downward API (super useful when debugging pods)
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          readinessProbe:
            httpGet:
              path: /healthz
              port: http
            initialDelaySeconds: 3
            periodSeconds: 5
            timeoutSeconds: 2
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /healthz
              port: http
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 2
            failureThreshold: 3
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          # If you later set readOnlyRootFilesystem, mount /tmp as writable:
          # volumeMounts:
          #   - name: tmp
          #     mountPath: /tmp
      # volumes:
      #   - name: tmp
      #     emptyDir: {}

Why these knobs?

  • RollingUpdate (1 surge / 0 unavailable): zero‑downtime by default.
  • Probes on /healthz: same path as Docker healthcheck → one source of truth.
  • Resource requests/limits: protects your node and lets the scheduler do its job.
  • Downward API: quick “which pod am I?” sanity checks in logs.

3) Service (stable virtual IP)

Makes the Deployment reachable inside the cluster.

# deploy/k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: fastapi
  namespace: fastapi-demo
spec:
  selector:
    app: fastapi
  ports:
    - name: http
      port: 8000       # service port
      targetPort: 8000 # container port
  type: ClusterIP

4) Ingress (Traefik in K3s by default)

Routes external traffic → Service. We’ll use a dev‑friendly hostname that resolves to localhost.

# deploy/k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: fastapi
  namespace: fastapi-demo
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: fastapi.localtest.me   # resolves to 127.0.0.1 automatically
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: fastapi
                port:
                  number: 8000

HTTPS later (optional):

  • Add Traefik’s websecure entrypoint + a TLS secret from mkcert.
  • Keep day‑one HTTP for fewer moving parts; flip to TLS once the basics work.

5) HPA (optional but nice)

Scales pods based on CPU. Needs metrics-server running (K3s often has this by default; if not, install it).

# deploy/k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fastapi
  namespace: fastapi-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fastapi
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Optional, when you’re ready to get fancier

  • PodDisruptionBudget (keep 1 pod up during node drain):
# deploy/k8s/pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: fastapi
  namespace: fastapi-demo
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: fastapi
  • NetworkPolicy (default‑deny egress, allow DNS + HTTP out if needed).
  • Kustomize: great for managing per‑env image tags and patches instead of sed in CI.

Apply in the right order (or just apply the folder)

kubectl apply -f deploy/k8s/namespace.yaml
kubectl apply -f deploy/k8s/configmap.yaml
kubectl apply -f deploy/k8s/deployment.yaml
kubectl apply -f deploy/k8s/service.yaml
kubectl apply -f deploy/k8s/ingress.yaml
# optional
kubectl apply -f deploy/k8s/hpa.yaml
kubectl apply -f deploy/k8s/pdb.yaml

Or the easy button:

kubectl apply -f deploy/k8s/

Quick verification (we’ll go deeper in the next section)

kubectl rollout status deploy/fastapi
kubectl get pods,svc,ing -n fastapi-demo
curl -H "Host: fastapi.localtest.me" http://localhost/healthz

Checkpoint: You’ve got clean manifests for Deploy/Service/Ingress (and optional HPA/PDB), with sensible defaults and room to grow.


Apply & Verify

We’ve got the manifests. Now we’ll put them into the cluster and make sure everything actually works.


1) Apply the manifests

If you’re in the repo root:

kubectl apply -f deploy/k8s/

This will create (or update):

  • Namespace
  • ConfigMap
  • Deployment
  • Service
  • Ingress
  • Optional: HPA, PDB

Tip: If you want to see exactly what’s being applied, use kubectl diff -f deploy/k8s/ first.

2) Watch the rollout

kubectl rollout status deploy/fastapi

You should see something like:

deployment "fastapi" successfully rolled out

If it hangs or errors:

  • CrashLoopBackOff? → Check logs.
  • Stuck waiting? → Probes might be failing; /healthz not returning 200.

3) Check resources

kubectl get pods,svc,ing

Example:

NAME                         READY   STATUS    RESTARTS   AGE
pod/fastapi-5c9d9b9c88-mb6xp 1/1     Running   0          12s

NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/fastapi ClusterIP 10.43.7.163 <none>        8000/TCP   12s

NAME          CLASS   HOSTS                 ADDRESS        PORTS   AGE
ingress/fastapi <none> fastapi.localtest.me 127.0.0.1       80      12s

4) Test via Ingress

For local K3s/k3d, the load balancer maps to localhost.

Use localtest.me as the hostname — it resolves to 127.0.0.1 automatically:

curl -H "Host: fastapi.localtest.me" http://localhost/healthz
# {"ok": true}

curl -H "Host: fastapi.localtest.me" http://localhost/version
# {"version": "dev"}

If that works, congrats — your FastAPI app is running in Kubernetes.


5) Common first-run issues

ImagePullBackOff

  • Wrong image name or tag in the Deployment.
  • Registry is private → missing imagePullSecrets.
  • Image never got pushed.

Pod stuck in CrashLoopBackOff

  • /healthz path is wrong or not reachable.
  • App failing to start — check:
kubectl logs deploy/fastapi
  • Wrong module path in CMD (app.src.main:app must match your code structure).

404 Not Found from Ingress

  • Host header mismatch — Traefik matches on Host.
  • Use -H "Host: fastapi.localtest.me" in curl.
  • Ingress pointing to wrong Service name or port.

6) Quick cleanup (when testing)

If you want to reset:

kubectl delete namespace fastapi-demo

Gone in seconds, no leftovers.


Checkpoint: Your FastAPI app is live in K3s, reachable through Traefik Ingress at fastapi.localtest.me.


Observability & Debugging

You’ve got pods running. Now we need to be able to see what they’re doing and diagnose issues fast.

Kubernetes gives you the basics, and we’ll layer in a few best practices.


1) Logs (per pod or filtered by label)

  • Single pod:
kubectl logs <pod-name>
  • Follow logs live:
kubectl logs -f <pod-name>
  • All pods with a label:
kubectl logs -l app=fastapi -f

This is why we labeled our Deployment with app: fastapi — one label, all logs.


2) Describe a resource

Gives you events, probe results, container restarts, and scheduling decisions.

kubectl describe pod <pod-name>

Look for:

  • Events at the bottom → failed probes, image pull errors, scheduling constraints.
  • State → Running, CrashLoopBackOff, Pending (and why).

3) Port-forward for direct access

If Ingress isn’t working or you want to bypass Traefik:

kubectl port-forward svc/fastapi 8000:8000

Then:

curl http://localhost:8000/healthz

4) Check cluster-wide health

kubectl get nodes
kubectl top nodes       # needs metrics-server
kubectl top pods -n fastapi-demo

If kubectl top fails, metrics-server might not be installed — K3s often ships it, but if not, you can add it.
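
If it’s missing, the upstream manifest from the metrics-server project is the usual fix:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml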


5) Debug with an ephemeral pod

Spin up a temporary pod in the same namespace to poke at network/DNS:

kubectl run tmp-shell --rm -i --tty --image=busybox -- sh
# Inside the pod:
wget -qO- http://fastapi:8000/healthz

If that fails, it’s a Service/NetworkPolicy problem.


6) Tail events for real-time feedback

kubectl get events -n fastapi-demo --sort-by=.metadata.creationTimestamp

Or watch them live:

kubectl get events -n fastapi-demo --watch

7) Check probe status in logs

If readiness/liveness probes are failing, Kubernetes will restart pods or delay routing traffic.

Look for repeated probe hits in logs:

INFO:     127.0.0.1:39284 - "GET /healthz HTTP/1.1" 200 OK

If they’re returning 500s or timing out, you need to debug the /healthz handler.


8) (Optional) Add basic Prometheus annotations

If you ever install Prometheus in K3s, you can have it scrape your app automatically:

metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8000"

You’d then expose metrics from FastAPI with something like prometheus-fastapi-instrumentator.
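
A minimal sketch of what that looks like in app/src/main.py, assuming you add prometheus-fastapi-instrumentator as a dependency:

from prometheus_fastapi_instrumentator import Instrumentator

# Records default request metrics and exposes them at /metrics
Instrumentator().instrument(app).expose(app)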


Checkpoint: You now have a toolbox to:

  • Tail logs across pods
  • Inspect pod health and events
  • Bypass Ingress with port-forward
  • Check resource usage
  • Drop into an ephemeral pod for network tests

CI/CD with GitHub Actions (push → build → deploy)

We’ll use a two‑job workflow:

  1. Build: run tests, build the image, push to GHCR with immutable tags.
  2. Deploy: update the Deployment in K3s to the exact image we just built, wait for rollout, and enable quick rollback.

Why this setup?

  • Immutable images (tagged by git SHA) avoid “what did latest point to?” nightmares.
  • Separation of build & deploy makes logs cleaner and rollbacks easier.
  • Rollout gating catches bad deploys early.

11.1 Create secrets

In your GitHub repo → Settings → Secrets and variables → Actions:

  • KUBECONFIG_DATA_BASE64: your kubeconfig in base64:
base64 -w0 ~/.kube/config  # mac: base64 ~/.kube/config | tr -d '\n'
  • No PAT is needed for GHCR inside Actions; the workflow authenticates with the built-in GITHUB_TOKEN (granted packages: write below). For pushes from your own machine, use a PAT with write:packages as covered earlier.

Optional but nice:

  • Environment named staging/prod with required approvals.

11.2 Workflow file

.github/workflows/build-and-deploy.yml

name: build-and-deploy

on:
  push:
    branches: ["main"]

# Avoid overlapping deploys on rapid pushes
concurrency:
  group: "deploy-main"
  cancel-in-progress: true

permissions:
  contents: read
  packages: write

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      # Fast sanity check (add more tests as you grow)
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Run tests
        run: |
          python -m pip install --upgrade pip
          pip install pytest
          pytest -q || true  # keep build flowing early; fail hard once tests matter

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      # Create consistent tags and get digest output
      - name: Extract Docker metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository_owner }}/fastapi-demo
          tags: |
            type=sha,format=short
            type=ref,event=branch
            type=raw,value=latest

      - name: Build and push
        id: build-push
        uses: docker/build-push-action@v6
        with:
          context: .
          file: ./Dockerfile
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          provenance: false

      # Expose the exact image reference (by digest) to the deploy job
      - name: Expose image digest
        id: out
        run: |
          echo "IMAGE_REF=ghcr.io/${{ github.repository_owner }}/fastapi-demo@${{ steps.build-push.outputs.digest }}" >> $GITHUB_OUTPUT

    outputs:
      image_ref: ${{ steps.out.outputs.IMAGE_REF }}

  deploy:
    needs: build
    runs-on: ubuntu-latest
    # Optionally protect with environments (e.g., staging/prod)
    # environment: staging
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup kubectl
        uses: azure/setup-kubectl@v4
        with:
          version: "v1.30.0"

      - name: Write kubeconfig
        run: |
          mkdir -p $HOME/.kube
          echo "${KUBECONFIG_DATA}" | base64 -d > $HOME/.kube/config
        env:
          KUBECONFIG_DATA: ${{ secrets.KUBECONFIG_DATA_BASE64 }}

      - name: Set namespace
        run: kubectl config set-context --current --namespace=fastapi-demo

      # Option A: apply manifests first (create/update objects)
      - name: Apply base manifests
        run: kubectl apply -f deploy/k8s/

      # Option B: patch Deployment to exact digest for reproducibility
      - name: Pin Deployment to built image (by digest)
        run: |
          kubectl set image deployment/fastapi api=${{ needs.build.outputs.image_ref }} --namespace fastapi-demo

      - name: Wait for rollout
        run: kubectl rollout status deployment/fastapi --timeout=180s

      # Store a pointer for quick rollback (last stable replica set)
      - name: Show rollout history
        run: kubectl rollout history deployment/fastapi

Highlights

  • We pass the image digest (…@sha256:…) to the deploy job → perfectly immutable deploys.
  • concurrency prevents two deploys from racing.
  • kubectl set image changes only the container image, leaving other spec parts untouched.

11.3 Rollbacks (two lines, zero drama)

From your terminal (or add a dedicated workflow job/button):

# See history with revision numbers
kubectl rollout history deployment/fastapi

# Roll back to previous
kubectl rollout undo deployment/fastapi

# Or to a specific revision
kubectl rollout undo deployment/fastapi --to-revision=3

11.4 Optional: Kustomize for per‑env overlays

If you want staging and prod with different replicas, resources, or domains:

deploy/
  k8s/
    base/ (deployment, service, ingress, hpa)
    overlays/
      staging/ (kustomization.yaml + patches)
      prod/    (kustomization.yaml + patches)

Then in Actions, replace the “apply manifests” step with:

kubectl apply -k deploy/k8s/overlays/staging
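
If you’re wondering what a staging overlay might contain, here’s a minimal sketch (the tag and replica patch are illustrative, and base/ needs its own kustomization.yaml listing the manifests):

# deploy/k8s/overlays/staging/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: fastapi-demo
resources:
  - ../../base
images:
  - name: ghcr.io/<USER_OR_ORG>/fastapi-demo
    newTag: a1b2c3d   # CI can bump this with `kustomize edit set image`
patches:
  - target:
      kind: Deployment
      name: fastapi
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 2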

11.5 Common CI/CD gotchas

  • Cluster can’t pull the image → private GHCR but no imagePullSecrets set; or wrong org path.
  • Apply order issues → namespace missing; apply namespace.yaml first or apply the whole folder.
  • Stuck rollout → failing readiness probe; check kubectl describe pod for probe events.
  • Drift from local → build locally with the same Dockerfile and compare docker run behavior.

Checkpoint: Pushing to main now builds, pushes, and deploys a deterministic image to your K3s cluster and waits for a healthy rollout — with an easy escape hatch (rollback).


Production Hardening (Security, Reliability, Cost)

We’ve got a working deploy. Now let’s make it safe(‑ish), predictable, and boring in the good way.

12.1 Run as non‑root + read‑only FS

Lock down the container and filesystem.

Add to your Deployment:

spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        fsGroup: 2000
      containers:
        - name: api
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: tmp
              mountPath: /tmp
      volumes:
        - name: tmp
          emptyDir: {}

If your app writes files, send them to /tmp or a mounted volume.

12.2 Resource requests/limits (protect the node)

Give the scheduler a clue and prevent noisy‑neighbor issues.

resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "256Mi"

Start conservative → observe → adjust. Starvation and OOMKills are avoidable.

12.3 Health probes with grace

Keep probes fast and boring. Add termination grace to avoid cutting off in‑flight requests:

spec:
  template:
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: api
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh","-c","sleep 5"]  # let LB drain
          readinessProbe:
            httpGet: { path: /healthz, port: http }
            initialDelaySeconds: 3
            periodSeconds: 5
          livenessProbe:
            httpGet: { path: /healthz, port: http }
            initialDelaySeconds: 10
            periodSeconds: 10

For Gunicorn, consider --graceful-timeout and --timeout flags if you handle long requests.
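
For example, a tuned CMD in the Dockerfile could look like this (a sketch; the numbers are illustrative, so size them to your slowest legitimate request):

CMD ["gunicorn", "-k", "uvicorn.workers.UvicornWorker", "-w", "2", \
     "--timeout", "60", "--graceful-timeout", "30", \
     "-b", "0.0.0.0:8000", "app.src.main:app"]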

12.4 Secrets > ConfigMaps

Never put creds in ConfigMaps. Use Secret (and consider Sealed Secrets later).

kubectl create secret generic app-secrets \
  --from-literal=DATABASE_URL=postgres://user:pass@db:5432/app

Mount as env:

env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: app-secrets
        key: DATABASE_URL

Rotate secrets by creating a new one and bumping a dummy env var to force rollout.
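
In practice, kubectl rollout restart does the same job as the dummy-env-var trick. A sketch:

# Update the secret in place
kubectl create secret generic app-secrets \
  --from-literal=DATABASE_URL=postgres://user:newpass@db:5432/app \
  --dry-run=client -o yaml | kubectl apply -f -

# Pods only read env vars at startup, so roll them to pick up the change
kubectl rollout restart deployment/fastapi -n fastapi-demo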

12.5 NetworkPolicy (default‑deny egress)

Stop surprise calls to the internet. Allow only what you need:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: fastapi-deny-all
  namespace: fastapi-demo
spec:
  podSelector:
    matchLabels: { app: fastapi }
  policyTypes: ["Egress","Ingress"]
  ingress:
    - from: []   # an empty "from" matches all sources, so Traefik can still reach the pod
  egress:
    - to:
        - namespaceSelector: {}   # allow DNS
      ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
    # add explicit egress to DB, APIs, etc.

12.6 TLS on Traefik (local + real certs)

Start HTTP for day one; add TLS once stable.

Local (mkcert)

mkcert fastapi.localtest.me
kubectl create secret tls fastapi-tls \
  --cert=fastapi.localtest.me.pem \
  --key=fastapi.localtest.me-key.pem \
  -n fastapi-demo

Ingress:

metadata:
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
spec:
  tls:
    - hosts: ["fastapi.localtest.me"]
      secretName: fastapi-tls

Real certs: enable Traefik ACME/Let’s Encrypt and switch entrypoint to websecure.
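
On K3s, the bundled Traefik is managed by a Helm chart, so ACME is typically enabled through a HelmChartConfig. A rough sketch; verify the flags against your Traefik version before relying on it:

# /var/lib/rancher/k3s/server/manifests/traefik-config.yaml (on the K3s node)
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    additionalArguments:
      - "--certificatesresolvers.le.acme.email=you@example.com"
      - "--certificatesresolvers.le.acme.storage=/data/acme.json"
      - "--certificatesresolvers.le.acme.tlschallenge=true"

Then point the Ingress at the resolver with the traefik.ingress.kubernetes.io/router.tls.certresolver: le annotation.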


12.7 Image integrity (digests, scans, SBOM)

  • Pin by digest in CI:
kubectl set image deployment/fastapi api=$IMAGE@sha256:... 
  • Scan images in CI with Trivy or Grype; fail on critical vulns (sketch after this list).
  • SBOM: generate with syft for auditability.
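
A sketch of what the scan step could look like in the build job, assuming the aquasecurity/trivy-action (pin a real version instead of @master):

- name: Scan image for critical vulnerabilities
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: ghcr.io/${{ github.repository_owner }}/fastapi-demo:latest
    severity: CRITICAL
    exit-code: "1"   # fail the job if anything critical turns up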

12.8 Pod disruption & spreading

Keep one pod up during node drains and spread replicas.

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: fastapi
  namespace: fastapi-demo
spec:
  minAvailable: 1
  selector:
    matchLabels: { app: fastapi }
---
# topologySpreadConstraints live in the Deployment's pod template (spec.template.spec)
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels: { app: fastapi }

12.9 Autoscaling that won’t lie to you

HPA on CPU is fine to start. If the app is I/O bound, consider custom metrics (RPS, queue depth).

Make sure metrics-server actually works; verify with kubectl top pods.


12.10 Runtime security (nice‑to‑have)

  • Seccomp/AppArmor:
securityContext:
  seccompProfile:
    type: RuntimeDefault
  • PSA (Pod Security Admission): set namespace labels to restricted when you’re ready.
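
When you get there, it’s a single label on the namespace (enforcement applies to newly created pods):

kubectl label namespace fastapi-demo pod-security.kubernetes.io/enforce=restricted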

12.11 Backups & data separation

  • Keep state out of the API pod. Use a managed DB or a separate StatefulSet.
  • Back up DB regularly; test restores.
  • For S3/GCS, use scoped credentials (least privilege).

12.12 Cost & footprint

  • K3s on a single small VM is cheap; watch CPU throttling if limits are tight.
  • Right‑size requests; over‑requesting is the stealth budget killer.
  • Use smaller base images (python:3.11-slim, or distroless when feasible).

12.13 Release hygiene

  • Version the app (APP_VERSION) and log it on startup (see the sketch after this list).
  • Tag releases (v0.x) and map them to course/examples.
  • Keep CHANGELOGs human‑readable.
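
A minimal sketch of the startup log line in app/src/main.py. It uses uvicorn’s logger so the message lands in the normal server output (newer FastAPI versions prefer lifespan handlers over on_event, but this keeps the example short):

import logging
import os

logger = logging.getLogger("uvicorn.error")

@app.on_event("startup")
def log_version() -> None:
    # Logged once per worker at boot; handy for diagnosing "wrong version" deploys
    logger.info("Starting app version %s", os.getenv("APP_VERSION", "dev"))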

12.14 “Day‑2” checklist (pin this)

  • Non‑root, read‑only FS, caps dropped
  • Probes tuned + graceful shutdown
  • Requests/limits set and reviewed
  • Secrets in Secret, not ConfigMap
  • NetworkPolicy default‑deny (egress)
  • TLS on Ingress (mkcert or ACME)
  • Image scans in CI + digest pinning
  • PDB + topology spread (if >1 replica)
  • Backups for any stateful deps
  • Dashboards/alerts for errors & 5xx

Checkpoint: This service won’t topple over from a mild breeze. You’ve moved from “it runs” to “it runs safely.”


Wrap-Up + Next Steps

You’ve just taken a FastAPI app from “runs on my laptop” to “production-ready on K3s” with:

  • Docker for consistent packaging
  • K3s for lightweight, real-deal Kubernetes
  • Traefik as an ingress controller
  • Secrets, probes, requests/limits, and TLS for security and reliability
  • CI/CD so shipping is a push, not a manual ritual

If you’ve followed along, you now have:

  • A working endpoint behind Traefik (fastapi.localtest.me locally; your own domain plus TLS when you go live)
  • Automated builds + deploys
  • Rollback safety
  • A foundation that works on a $5 VPS or a cluster of VMs

13.1 Where to go from here

This is Day-1 Kubernetes — good enough for a side project or small internal tool.

But you can (and should) go further:

  • Add monitoring & logging → Prometheus, Grafana, Loki, or a managed stack
  • Manage secrets with Sealed Secrets or Vault
  • Use a managed DB instead of running one in-cluster
  • Improve deployments → Blue-Green or Canary with Argo Rollouts
  • Automate scaling beyond CPU metrics
  • Disaster recovery → scheduled backups, tested restores

13.2 Common pain points you’ll hit next

  • Cluster drift: Things running in prod aren’t quite what’s in Git. → Solve with GitOps (ArgoCD, Flux).
  • Secrets chaos: Too many .env files → Move to a secret manager early.
  • K3s upgrades: Plan for maintenance windows; K3s is simple but not hands-off.
  • Cert renewals: Automate Let’s Encrypt with Traefik’s ACME.

13.3 Want to skip the yak-shaving next time?

I keep a private repo called The Backend Engineer’s Second Brain — it’s where I store my production-ready boilerplates, deployment manifests, and troubleshooting notes.

It’s designed so you can plug in a new backend project and get it running in minutes, without reinventing YAML.

If you want it, let me know in the comments below and I'll let you know when it's ready!


13.4 Stay in the loop

If you found this guide useful, you’ll love my free Python & FastAPI tips newsletter — where I share battle-tested backend tricks, deployment patterns, and hard-earned lessons.

📬 Join here for free


13.5 Final thought

Getting FastAPI into production is not about memorizing every Kubernetes flag — it’s about building a workflow you trust and can repeat without fear.

This setup gets you 80% of the way there without a cloud bill that looks like a phone number.

From here, it’s just refining, observing, and scaling at your pace.

If you hit a snag, want clarity on a step, or have a deployment war story of your own — drop a comment with your questions. I read and reply to all of them, and your question might help the next backend engineer who stumbles on this guide.