FastAPI – Deployment

Introduction

Deploying a FastAPI application to production requires more than just running uvicorn main:app. A production deployment involves configuring ASGI servers for performance, containerizing your application with Docker, setting up reverse proxies, implementing CI/CD pipelines, managing database migrations, and ensuring security and monitoring are in place.

This comprehensive guide covers everything you need to deploy FastAPI applications reliably, from single-server setups to scalable cloud architectures. Whether you’re deploying to AWS, Heroku, DigitalOcean, or your own infrastructure, you’ll find practical, production-tested configurations here.

What You’ll Learn

  • Configure production ASGI servers (Uvicorn, Gunicorn, Hypercorn)
  • Containerize FastAPI with Docker and docker-compose
  • Set up Nginx as a reverse proxy with SSL
  • Deploy to AWS (EC2, ECS/Fargate, Lambda), Heroku, and DigitalOcean
  • Implement CI/CD pipelines with GitHub Actions
  • Manage database migrations with Alembic in production
  • Add monitoring, structured logging, and health checks
  • Optimize performance with async patterns, caching, and profiling
  • Scale horizontally with load balancing and rate limiting
  • Harden security with HTTPS, headers, CORS, and secrets management

Prerequisites

  • Working knowledge of FastAPI (routes, dependencies, Pydantic models)
  • Basic familiarity with Docker and Linux command line
  • A FastAPI application ready for deployment

1. Production-Ready FastAPI Configuration

Before deploying, your FastAPI application needs proper configuration management, structured logging, and environment-specific settings. The pydantic-settings library provides type-safe configuration that reads from environment variables and .env files.

1.1 Configuration with pydantic-settings

Install the required package:

pip install pydantic-settings python-dotenv

Create a centralized settings module that all parts of your application can import:

# app/config.py
from pydantic_settings import BaseSettings, SettingsConfigDict
from pydantic import Field
from functools import lru_cache
from typing import Optional


class Settings(BaseSettings):
    """Application settings loaded from environment variables."""

    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
    )

    # Application
    app_name: str = "FastAPI App"
    app_version: str = "1.0.0"
    debug: bool = False
    environment: str = "production"  # development, staging, production

    # Server
    host: str = "0.0.0.0"
    port: int = 8000
    workers: int = 4
    reload: bool = False

    # Database
    database_url: str = "postgresql+asyncpg://user:pass@localhost:5432/mydb"
    db_pool_size: int = 20
    db_max_overflow: int = 10
    db_pool_timeout: int = 30

    # Redis
    redis_url: str = "redis://localhost:6379/0"

    # Security
    secret_key: str = Field(default="change-me-in-production")
    allowed_hosts: list[str] = ["*"]
    cors_origins: list[str] = ["http://localhost:3000"]

    # JWT
    jwt_secret: str = Field(default="jwt-secret-change-me")
    jwt_algorithm: str = "HS256"
    jwt_expiration_minutes: int = 30

    # Logging
    log_level: str = "INFO"
    log_format: str = "json"  # json or text

    # External Services
    smtp_host: Optional[str] = None
    smtp_port: int = 587
    sentry_dsn: Optional[str] = None


@lru_cache()
def get_settings() -> Settings:
    """Cached settings instance."""
    return Settings()

Create a .env file for local development:

# .env
APP_NAME=MyFastAPIApp
DEBUG=true
ENVIRONMENT=development
DATABASE_URL=postgresql+asyncpg://postgres:password@localhost:5432/mydb
REDIS_URL=redis://localhost:6379/0
SECRET_KEY=dev-secret-key-not-for-production
JWT_SECRET=dev-jwt-secret
LOG_LEVEL=DEBUG
LOG_FORMAT=text
CORS_ORIGINS=["http://localhost:3000","http://localhost:8080"]

Use settings throughout your application:

# app/main.py
from fastapi import FastAPI, Depends
from app.config import Settings, get_settings

app = FastAPI()


@app.get("/info")
async def app_info(settings: Settings = Depends(get_settings)):
    return {
        "app_name": settings.app_name,
        "version": settings.app_version,
        "environment": settings.environment,
        "debug": settings.debug,
    }

1.2 Structured Logging

Production applications need structured logging (JSON format) for log aggregation tools like ELK Stack, Datadog, or CloudWatch. Use structlog for structured, contextualized logging:

pip install structlog

# app/logging_config.py
import logging
import sys
import structlog
from app.config import get_settings


def setup_logging():
    """Configure structured logging for the application."""
    settings = get_settings()

    # Choose processors based on environment
    if settings.log_format == "json":
        renderer = structlog.processors.JSONRenderer()
    else:
        renderer = structlog.dev.ConsoleRenderer(colors=True)

    structlog.configure(
        processors=[
            structlog.contextvars.merge_contextvars,
            structlog.processors.add_log_level,
            structlog.processors.StackInfoRenderer(),
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.format_exc_info,
            renderer,
        ],
        wrapper_class=structlog.make_filtering_bound_logger(
            getattr(logging, settings.log_level.upper(), logging.INFO)
        ),
        context_class=dict,
        logger_factory=structlog.PrintLoggerFactory(file=sys.stdout),
        cache_logger_on_first_use=True,
    )


def get_logger(name: str = __name__):
    """Get a structured logger instance."""
    return structlog.get_logger(name)

Add request logging middleware to track every request:

# app/middleware.py
import time
import uuid

import structlog
from fastapi import Request
from starlette.middleware.base import BaseHTTPMiddleware
from app.logging_config import get_logger

logger = get_logger(__name__)


class RequestLoggingMiddleware(BaseHTTPMiddleware):
    """Log every request with timing and correlation ID."""

    async def dispatch(self, request: Request, call_next):
        request_id = str(uuid.uuid4())[:8]
        start_time = time.perf_counter()

        # Add request ID to structlog context
        structlog.contextvars.clear_contextvars()
        structlog.contextvars.bind_contextvars(request_id=request_id)

        logger.info(
            "request_started",
            method=request.method,
            path=request.url.path,
            client_ip=request.client.host if request.client else "unknown",
        )

        response = await call_next(request)
        duration = time.perf_counter() - start_time

        logger.info(
            "request_completed",
            method=request.method,
            path=request.url.path,
            status_code=response.status_code,
            duration_ms=round(duration * 1000, 2),
        )

        response.headers["X-Request-ID"] = request_id
        return response

1.3 Application Factory Pattern

Use a factory function to create your FastAPI application with all middleware and configuration applied:

# app/main.py
from contextlib import asynccontextmanager
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from app.config import get_settings
from app.logging_config import setup_logging
from app.middleware import RequestLoggingMiddleware


@asynccontextmanager
async def lifespan(app: FastAPI):
    """Manage application startup and shutdown."""
    # Startup
    setup_logging()
    from app.logging_config import get_logger
    logger = get_logger("lifespan")
    logger.info("application_starting", environment=get_settings().environment)

    # Initialize database, Redis, etc.
    # await init_db()
    # await init_redis()

    yield  # Application runs here

    # Shutdown
    logger.info("application_shutting_down")
    # await close_db()
    # await close_redis()


def create_app() -> FastAPI:
    """Application factory."""
    settings = get_settings()

    app = FastAPI(
        title=settings.app_name,
        version=settings.app_version,
        debug=settings.debug,
        lifespan=lifespan,
        docs_url="/docs" if settings.debug else None,
        redoc_url="/redoc" if settings.debug else None,
    )

    # CORS
    app.add_middleware(
        CORSMiddleware,
        allow_origins=settings.cors_origins,
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )

    # Request logging
    app.add_middleware(RequestLoggingMiddleware)

    # Include routers
    from app.routers import api_router
    app.include_router(api_router, prefix="/api/v1")

    return app


app = create_app()

2. ASGI Servers for Production

FastAPI runs on ASGI (Asynchronous Server Gateway Interface) servers. While Uvicorn is great for development, production deployments need proper process management, graceful shutdowns, and multiple worker processes.
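
For context, an ASGI application is simply an async callable that receives a connection scope plus receive/send channels; a FastAPI instance implements exactly this interface, which is what the servers below invoke. A bare-bones sketch:

```python
# Minimal raw ASGI app -- the same callable contract a FastAPI
# instance exposes to Uvicorn, Gunicorn workers, or Hypercorn.
async def app(scope, receive, send):
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"hello"})
```

You could serve this file with uvicorn module:app exactly as you would a FastAPI app; FastAPI's value is everything it layers on top of this contract (routing, validation, dependency injection).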

2.1 Uvicorn in Production

Uvicorn can run with multiple workers for production use:

# Basic production run
uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4

# With all production options
uvicorn app.main:app \
    --host 0.0.0.0 \
    --port 8000 \
    --workers 4 \
    --loop uvloop \
    --http httptools \
    --log-level warning \
    --access-log \
    --proxy-headers \
    --forwarded-allow-ips="*"

The number of workers should typically be set to (2 * CPU_CORES) + 1. You can also configure Uvicorn programmatically:

# run.py
import uvicorn
from app.config import get_settings

if __name__ == "__main__":
    settings = get_settings()
    uvicorn.run(
        "app.main:app",
        host=settings.host,
        port=settings.port,
        workers=settings.workers,
        reload=settings.reload,
        log_level=settings.log_level.lower(),
        proxy_headers=True,
        forwarded_allow_ips="*",
    )

2.2 Gunicorn with Uvicorn Workers (Recommended)

Gunicorn provides battle-tested process management. Combined with Uvicorn workers, it gives you the best of both worlds — Gunicorn’s process management with Uvicorn’s ASGI performance. Note that Uvicorn 0.30+ deprecates the built-in uvicorn.workers.UvicornWorker in favor of the separate uvicorn-worker package; the built-in class still works but logs a deprecation warning:

# Install both
pip install gunicorn "uvicorn[standard]"

# Run with Uvicorn workers
gunicorn app.main:app \
    --worker-class uvicorn.workers.UvicornWorker \
    --workers 4 \
    --bind 0.0.0.0:8000 \
    --timeout 120 \
    --graceful-timeout 30 \
    --keep-alive 5 \
    --access-logfile - \
    --error-logfile -

Create a Gunicorn configuration file for more control:

# gunicorn.conf.py
import multiprocessing
import os

# Server socket
bind = f"0.0.0.0:{os.getenv('PORT', '8000')}"
backlog = 2048

# Worker processes
workers = int(os.getenv("WEB_CONCURRENCY", multiprocessing.cpu_count() * 2 + 1))
worker_class = "uvicorn.workers.UvicornWorker"
worker_connections = 1000
timeout = 120
graceful_timeout = 30
keepalive = 5

# Restart workers after this many requests (prevents memory leaks)
max_requests = 1000
max_requests_jitter = 50

# Logging
accesslog = "-"
errorlog = "-"
loglevel = os.getenv("LOG_LEVEL", "info").lower()

# Process naming
proc_name = "fastapi-app"

# Server hooks
def on_starting(server):
    """Called just before the master process is initialized."""
    pass


def post_worker_init(worker):
    """Called just after a worker has been initialized."""
    worker.log.info(f"Worker {worker.pid} initialized")


def worker_exit(server, worker):
    """Called when a worker exits."""
    worker.log.info(f"Worker {worker.pid} exiting")

# Run with config file
gunicorn app.main:app -c gunicorn.conf.py

2.3 Hypercorn (HTTP/2 Support)

Hypercorn supports HTTP/2 and HTTP/3, which can be useful for applications that benefit from multiplexed connections:

pip install hypercorn

# Basic run
hypercorn app.main:app --bind 0.0.0.0:8000 --workers 4

# With HTTP/2
hypercorn app.main:app \
    --bind 0.0.0.0:8000 \
    --workers 4 \
    --certfile cert.pem \
    --keyfile key.pem

ASGI Server Comparison

Feature                 | Uvicorn    | Gunicorn + Uvicorn     | Hypercorn
Process Management      | Basic      | Advanced (preforking)  | Basic
Graceful Restart        | Limited    | Full (SIGHUP)          | Limited
HTTP/2                  | No         | No                     | Yes
Worker Recovery         | Manual     | Automatic              | Manual
Memory Leak Protection  | No         | max_requests           | No
Production Ready        | With care  | Yes (recommended)      | With care

3. Docker Containerization

Docker provides consistent, reproducible environments across development, staging, and production. A well-crafted Dockerfile ensures your FastAPI application runs the same way everywhere.

3.1 Basic Dockerfile

# Dockerfile
FROM python:3.12-slim

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1

# Create non-root user
RUN groupadd -r appuser && useradd -r -g appuser -d /app -s /sbin/nologin appuser

WORKDIR /app

# Install system dependencies
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        curl \
        build-essential \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Change ownership to non-root user
RUN chown -R appuser:appuser /app
USER appuser

# Expose port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Run the application
CMD ["gunicorn", "app.main:app", \
    "--worker-class", "uvicorn.workers.UvicornWorker", \
    "--workers", "4", \
    "--bind", "0.0.0.0:8000", \
    "--timeout", "120", \
    "--access-logfile", "-"]

3.2 Multi-Stage Build (Optimized)

Multi-stage builds produce smaller images by separating build dependencies from the runtime environment:

# Dockerfile.multistage
# ---- Build Stage ----
FROM python:3.12-slim AS builder

ENV PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=1

WORKDIR /build

# Install build dependencies
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --prefix=/install --no-cache-dir -r requirements.txt

# ---- Runtime Stage ----
FROM python:3.12-slim AS runtime

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

# Create non-root user
RUN groupadd -r appuser && useradd -r -g appuser -d /app -s /sbin/nologin appuser

# Install runtime-only system dependencies
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Copy Python packages from builder
COPY --from=builder /install /usr/local

WORKDIR /app
COPY --chown=appuser:appuser . .

USER appuser
EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

CMD ["gunicorn", "app.main:app", \
    "--worker-class", "uvicorn.workers.UvicornWorker", \
    "--workers", "4", \
    "--bind", "0.0.0.0:8000"]

3.3 .dockerignore

Exclude unnecessary files from the build context:

# .dockerignore
__pycache__
*.pyc
*.pyo
.git
.gitignore
.env
.env.*
.venv
venv
*.md
docs/
tests/
.pytest_cache
.coverage
htmlcov/
.mypy_cache
.ruff_cache
docker-compose*.yml
Dockerfile*
.dockerignore

3.4 Docker Compose for Development

# docker-compose.yml
version: "3.9"

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql+asyncpg://postgres:password@db:5432/fastapi_db
      - REDIS_URL=redis://redis:6379/0
      - ENVIRONMENT=development
      - DEBUG=true
      - LOG_LEVEL=DEBUG
    volumes:
      - .:/app  # Hot reload in development
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
      POSTGRES_DB: fastapi_db
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:
  redis_data:

# Build and start all services
docker compose up --build -d

# View logs
docker compose logs -f app

# Run database migrations
docker compose exec app alembic upgrade head

# Stop all services
docker compose down

# Stop and remove volumes (clean slate)
docker compose down -v

4. Nginx Reverse Proxy

Nginx sits in front of your ASGI server to handle SSL termination, static file serving, load balancing, request buffering, and rate limiting. It is the standard production setup for Python web applications.

4.1 Basic Nginx Configuration

# nginx/nginx.conf
upstream fastapi_backend {
    server app:8000;
}

server {
    listen 80;
    server_name yourdomain.com www.yourdomain.com;

    # Redirect HTTP to HTTPS
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    server_name yourdomain.com www.yourdomain.com;

    # SSL certificates (Let's Encrypt)
    ssl_certificate /etc/letsencrypt/live/yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/yourdomain.com/privkey.pem;

    # SSL settings
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers off;
    ssl_session_timeout 1d;
    ssl_session_cache shared:SSL:10m;
    ssl_session_tickets off;

    # Security headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains" always;

    # Request size limit
    client_max_body_size 10M;

    # Gzip compression
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_types text/plain text/css application/json application/javascript text/xml;

    # Static files
    location /static/ {
        alias /app/static/;
        expires 30d;
        add_header Cache-Control "public, immutable";
    }

    # API proxy
    location / {
        proxy_pass http://fastapi_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;

        # Buffering
        proxy_buffering on;
        proxy_buffer_size 4k;
        proxy_buffers 8 4k;
    }

    # Health check endpoint (no logging)
    location /health {
        proxy_pass http://fastapi_backend/health;
        access_log off;
    }
}

4.2 WebSocket Proxying

FastAPI supports WebSockets, which require special Nginx configuration:

# Add to the server block
location /ws/ {
    proxy_pass http://fastapi_backend;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    # WebSocket timeout (keep alive)
    proxy_read_timeout 86400s;
    proxy_send_timeout 86400s;
}

4.3 Load Balancing Multiple Workers

If you run multiple FastAPI instances, Nginx can load balance between them:

upstream fastapi_backend {
    least_conn;  # Send to the server with fewest connections

    server app1:8000 weight=3;  # Higher weight = more traffic
    server app2:8000 weight=2;
    server app3:8000 weight=1;

    # Active health checks (Nginx Plus only; open-source Nginx relies on
    # passive failure detection or external checks)
    # health_check interval=10s fails=3 passes=2;
}

4.4 Rate Limiting with Nginx

# Add to http block (before server blocks)
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=login:10m rate=1r/s;

server {
    # ...

    # General API rate limiting
    location /api/ {
        limit_req zone=api burst=20 nodelay;
        proxy_pass http://fastapi_backend;
        # ... proxy headers
    }

    # Strict rate limiting for auth endpoints
    location /api/auth/ {
        limit_req zone=login burst=5 nodelay;
        proxy_pass http://fastapi_backend;
        # ... proxy headers
    }
}

5. AWS Deployment

AWS offers multiple ways to deploy FastAPI, from virtual servers (EC2) to managed containers (ECS/Fargate) to serverless (Lambda). Each approach has different trade-offs in cost, complexity, and scalability.

5.1 EC2 Deployment

EC2 gives you full control over the server environment. This is a good starting point for teams familiar with server administration.

Server Setup Script

#!/bin/bash
# ec2-setup.sh - Run on a fresh Ubuntu 22.04 EC2 instance

# Update system
sudo apt-get update && sudo apt-get upgrade -y

# Install Python 3.12 (via the deadsnakes PPA)
sudo apt-get install -y software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa -y
sudo apt-get install -y python3.12 python3.12-venv python3.12-dev

# Install Nginx
sudo apt-get install -y nginx certbot python3-certbot-nginx

# Install supervisor for process management
sudo apt-get install -y supervisor

# Create application directory
sudo mkdir -p /opt/fastapi-app
sudo chown $USER:$USER /opt/fastapi-app

# Clone your application
cd /opt/fastapi-app
git clone https://github.com/youruser/yourapp.git .

# Create virtual environment
python3.12 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Copy environment file
cp .env.production .env

# Create the log directory used by the Supervisor/Gunicorn config
sudo mkdir -p /var/log/fastapi
sudo chown www-data:www-data /var/log/fastapi

Supervisor Configuration

# /etc/supervisor/conf.d/fastapi.conf
[program:fastapi]
command=/opt/fastapi-app/venv/bin/gunicorn app.main:app
    --worker-class uvicorn.workers.UvicornWorker
    --workers 4
    --bind unix:/tmp/fastapi.sock
    --timeout 120
    --access-logfile /var/log/fastapi/access.log
    --error-logfile /var/log/fastapi/error.log
directory=/opt/fastapi-app
user=www-data
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/fastapi/supervisor.log
environment=
    ENVIRONMENT="production",
    DATABASE_URL="postgresql+asyncpg://user:pass@rds-endpoint:5432/mydb"

# Start the application
sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl start fastapi

# Check status
sudo supervisorctl status fastapi

5.2 ECS with Fargate (Serverless Containers)

ECS Fargate runs your Docker containers without managing servers. You define a task (container specs) and a service (how many to run).

ECS Task Definition

# ecs-task-definition.json
{
    "family": "fastapi-app",
    "networkMode": "awsvpc",
    "requiresCompatibilities": ["FARGATE"],
    "cpu": "512",
    "memory": "1024",
    "executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole",
    "containerDefinitions": [
        {
            "name": "fastapi",
            "image": "ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/fastapi-app:latest",
            "portMappings": [
                {
                    "containerPort": 8000,
                    "protocol": "tcp"
                }
            ],
            "environment": [
                {"name": "ENVIRONMENT", "value": "production"},
                {"name": "WORKERS", "value": "2"}
            ],
            "secrets": [
                {
                    "name": "DATABASE_URL",
                    "valueFrom": "arn:aws:ssm:us-east-1:ACCOUNT:parameter/fastapi/database_url"
                },
                {
                    "name": "SECRET_KEY",
                    "valueFrom": "arn:aws:ssm:us-east-1:ACCOUNT:parameter/fastapi/secret_key"
                }
            ],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/fastapi-app",
                    "awslogs-region": "us-east-1",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "healthCheck": {
                "command": ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"],
                "interval": 30,
                "timeout": 5,
                "retries": 3,
                "startPeriod": 10
            }
        }
    ]
}

Deploy with AWS CLI

# Build and push Docker image to ECR
aws ecr get-login-password --region us-east-1 | \
    docker login --username AWS --password-stdin ACCOUNT.dkr.ecr.us-east-1.amazonaws.com

docker build -t fastapi-app .
docker tag fastapi-app:latest ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/fastapi-app:latest
docker push ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/fastapi-app:latest

# Register task definition
aws ecs register-task-definition --cli-input-json file://ecs-task-definition.json

# Create or update service
aws ecs update-service \
    --cluster fastapi-cluster \
    --service fastapi-service \
    --task-definition fastapi-app \
    --desired-count 2 \
    --force-new-deployment

5.3 AWS Lambda with Mangum

Mangum is an adapter that lets you run FastAPI on AWS Lambda behind API Gateway. This is ideal for low-traffic APIs or APIs with bursty traffic patterns.

pip install mangum

# lambda_handler.py
from mangum import Mangum
from app.main import app

# Create the Lambda handler
handler = Mangum(app, lifespan="off")

SAM Template for Lambda Deployment

# template.yaml (AWS SAM)
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Timeout: 30
    MemorySize: 512
    Runtime: python3.12

Resources:
  FastAPIFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: lambda_handler.handler
      CodeUri: .
      Events:
        ApiEvent:
          Type: HttpApi
          Properties:
            Path: /{proxy+}
            Method: ANY
        RootEvent:
          Type: HttpApi
          Properties:
            Path: /
            Method: ANY
      Environment:
        Variables:
          ENVIRONMENT: production
          DATABASE_URL: !Ref DatabaseUrl
      Policies:
        - AmazonSSMReadOnlyAccess

Parameters:
  DatabaseUrl:
    Type: AWS::SSM::Parameter::Value<String>
    Default: /fastapi/database_url

Outputs:
  ApiUrl:
    Description: API Gateway endpoint URL
    Value: !Sub "https://${ServerlessHttpApi}.execute-api.${AWS::Region}.amazonaws.com"

# Deploy with SAM
sam build
sam deploy --guided

AWS Deployment Comparison

Feature            | EC2           | ECS Fargate          | Lambda
Server Management  | You manage    | AWS manages          | Fully serverless
Scaling            | Manual / ASG  | Auto-scaling         | Automatic
Cost Model         | Per hour      | Per vCPU/memory/sec  | Per request
Cold Start         | None          | Minimal              | Yes (seconds)
WebSockets         | Yes           | Yes                  | Via API Gateway
Best For           | Full control  | Containers at scale  | Low/bursty traffic

6. Heroku Deployment

Heroku is one of the simplest platforms for deploying FastAPI. It handles infrastructure, SSL, and scaling with minimal configuration.

6.1 Heroku Configuration Files

Create the required files in your project root:

# Procfile
web: gunicorn app.main:app --worker-class uvicorn.workers.UvicornWorker --workers 2 --bind 0.0.0.0:$PORT --timeout 120

# runtime.txt
python-3.12.3

# requirements.txt
fastapi==0.115.0
uvicorn[standard]==0.30.0
gunicorn==22.0.0
pydantic-settings==2.5.0
sqlalchemy[asyncio]==2.0.35
asyncpg==0.29.0
alembic==1.13.0
python-dotenv==1.0.1
httpx==0.27.0

6.2 Deployment Steps

# Login to Heroku
heroku login

# Create a new app
heroku create my-fastapi-app

# Add PostgreSQL addon
heroku addons:create heroku-postgresql:essential-0

# Add Redis addon
heroku addons:create heroku-redis:mini

# Set environment variables
heroku config:set \
    ENVIRONMENT=production \
    SECRET_KEY=$(python -c "import secrets; print(secrets.token_urlsafe(32))") \
    JWT_SECRET=$(python -c "import secrets; print(secrets.token_urlsafe(32))") \
    LOG_LEVEL=INFO \
    LOG_FORMAT=json

# Deploy
git push heroku main

# Run migrations
heroku run alembic upgrade head

# View logs
heroku logs --tail

# Scale dynos
heroku ps:scale web=2
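
One Heroku-specific gotcha: the Postgres addon injects DATABASE_URL with the legacy postgres:// scheme, which SQLAlchemy no longer accepts and which lacks the +asyncpg driver marker. A small normalizer in your settings layer avoids surprises (a sketch; where you call it depends on how your config code builds the engine URL):

```python
def normalize_database_url(url: str) -> str:
    """Rewrite Heroku-style Postgres URLs for SQLAlchemy with asyncpg."""
    if url.startswith("postgres://"):
        return url.replace("postgres://", "postgresql+asyncpg://", 1)
    if url.startswith("postgresql://"):
        return url.replace("postgresql://", "postgresql+asyncpg://", 1)
    return url


print(normalize_database_url("postgres://user:pass@host:5432/db"))
# -> postgresql+asyncpg://user:pass@host:5432/db
```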

6.3 Heroku Release Phase (Auto-Migrations)

Add a release command to automatically run migrations on each deploy:

# Procfile (updated)
web: gunicorn app.main:app --worker-class uvicorn.workers.UvicornWorker --workers 2 --bind 0.0.0.0:$PORT
release: alembic upgrade head

7. DigitalOcean Deployment

DigitalOcean offers two main options: App Platform (managed PaaS, similar to Heroku) and Droplets (virtual servers, similar to EC2).

7.1 App Platform

Create an app specification file:

# .do/app.yaml
name: fastapi-app
region: nyc

services:
  - name: api
    github:
      repo: youruser/fastapi-app
      branch: main
      deploy_on_push: true
    build_command: pip install -r requirements.txt
    run_command: gunicorn app.main:app --worker-class uvicorn.workers.UvicornWorker --workers 2 --bind 0.0.0.0:$PORT
    envs:
      - key: ENVIRONMENT
        value: production
      - key: SECRET_KEY
        type: SECRET
        value: your-secret-key
      - key: DATABASE_URL
        scope: RUN_TIME
        value: ${db.DATABASE_URL}
    instance_count: 2
    instance_size_slug: professional-xs
    http_port: 8000
    health_check:
      http_path: /health

databases:
  - engine: PG
    name: db
    num_nodes: 1
    size: db-s-dev-database
    version: "16"

# Deploy using doctl CLI
doctl apps create --spec .do/app.yaml

# List apps
doctl apps list

# View logs
doctl apps logs APP_ID --type run

7.2 Droplet Deployment

For a Droplet (virtual server), the setup is similar to EC2. Create a setup script:

#!/bin/bash
# droplet-setup.sh - For Ubuntu 22.04 Droplet

# Update system
apt-get update && apt-get upgrade -y

# Install dependencies (Python 3.12 is not in Ubuntu 22.04's default repos)
apt-get install -y software-properties-common
add-apt-repository ppa:deadsnakes/ppa -y
apt-get update
apt-get install -y python3.12 python3.12-venv python3-pip nginx certbot python3-certbot-nginx

# Setup application
mkdir -p /opt/fastapi-app
cd /opt/fastapi-app
git clone https://github.com/youruser/yourapp.git .
python3.12 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Create systemd service
cat > /etc/systemd/system/fastapi.service << 'UNIT'
[Unit]
Description=FastAPI Application
After=network.target

[Service]
User=www-data
Group=www-data
WorkingDirectory=/opt/fastapi-app
Environment="PATH=/opt/fastapi-app/venv/bin"
EnvironmentFile=/opt/fastapi-app/.env
ExecStart=/opt/fastapi-app/venv/bin/gunicorn app.main:app \
    --worker-class uvicorn.workers.UvicornWorker \
    --workers 4 \
    --bind unix:/tmp/fastapi.sock \
    --timeout 120
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
UNIT

# Enable and start
systemctl daemon-reload
systemctl enable fastapi
systemctl start fastapi

# Setup Nginx
cat > /etc/nginx/sites-available/fastapi << 'NGINX'
server {
    listen 80;
    server_name yourdomain.com;

    location / {
        proxy_pass http://unix:/tmp/fastapi.sock;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
NGINX

ln -s /etc/nginx/sites-available/fastapi /etc/nginx/sites-enabled/
nginx -t && systemctl restart nginx

# Setup SSL with Let's Encrypt
certbot --nginx -d yourdomain.com --non-interactive --agree-tos -m you@email.com

8. CI/CD Pipeline with GitHub Actions

Automate testing, building, and deployment with GitHub Actions. A proper CI/CD pipeline ensures every change is tested before it reaches production.

8.1 Complete CI/CD Workflow

# .github/workflows/ci-cd.yml
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  PYTHON_VERSION: "3.12"
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  # ---- Lint & Type Check ----
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ env.PYTHON_VERSION }}

      - name: Install dependencies
        run: |
          pip install ruff mypy
          pip install -r requirements.txt

      - name: Run Ruff linter
        run: ruff check .

      - name: Run Ruff formatter check
        run: ruff format --check .

      - name: Run MyPy type checker
        run: mypy app/ --ignore-missing-imports

  # ---- Unit & Integration Tests ----
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: password
          POSTGRES_DB: test_db
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
      redis:
        image: redis:7-alpine
        ports:
          - 6379:6379
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ env.PYTHON_VERSION }}
          cache: pip

      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install -r requirements-dev.txt

      - name: Run tests with coverage
        env:
          DATABASE_URL: postgresql+asyncpg://postgres:password@localhost:5432/test_db
          REDIS_URL: redis://localhost:6379/0
          ENVIRONMENT: testing
          SECRET_KEY: test-secret-key
        run: |
          pytest tests/ -v --cov=app --cov-report=xml --cov-report=term

      - name: Upload coverage report
        uses: codecov/codecov-action@v4
        with:
          file: coverage.xml
          fail_ci_if_error: false

  # ---- Build Docker Image ----
  build:
    needs: [lint, test]
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'

    permissions:
      contents: read
      packages: write

    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  # ---- Deploy to Production ----
  deploy:
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    environment: production

    steps:
      - name: Deploy to server
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.SERVER_HOST }}
          username: ${{ secrets.SERVER_USER }}
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: |
            cd /opt/fastapi-app
            docker compose pull
            docker compose up -d --remove-orphans
            docker compose exec -T app alembic upgrade head
            docker system prune -f

8.2 Environment-Specific Deployments

Add separate deployment jobs for staging and production:

  # ---- Deploy to Staging ----
  deploy-staging:
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/develop'
    environment: staging
    steps:
      - name: Deploy to staging
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.STAGING_HOST }}
          username: ${{ secrets.SERVER_USER }}
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: |
            cd /opt/fastapi-staging
            docker compose -f docker-compose.staging.yml pull
            docker compose -f docker-compose.staging.yml up -d

  # ---- Deploy to Production (manual approval) ----
  deploy-production:
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    environment:
      name: production
      url: https://api.yourdomain.com
    steps:
      - name: Deploy to production
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.PROD_HOST }}
          username: ${{ secrets.SERVER_USER }}
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: |
            cd /opt/fastapi-prod
            docker compose pull
            docker compose up -d --no-deps app
            docker compose exec -T app alembic upgrade head
            # Verify health
            sleep 5
            curl -f http://localhost:8000/health || exit 1

9. Database Migrations in Production

Alembic is the standard migration tool for SQLAlchemy. Managing migrations in production requires careful coordination with your deployment process to avoid downtime and data loss.

9.1 Alembic Setup

# Install Alembic
pip install alembic

# Initialize Alembic
alembic init alembic

Configure Alembic to use your application’s database URL:

# alembic/env.py
from logging.config import fileConfig
from sqlalchemy import engine_from_config, pool
from alembic import context
import os
import sys

# Add project root to path
sys.path.insert(0, os.path.dirname(os.path.dirname(__file__)))

from app.database import Base  # Your SQLAlchemy Base
from app.models import *  # Import all models

config = context.config

# Override sqlalchemy.url from environment
database_url = os.getenv("DATABASE_URL", "")
# Handle Heroku-style postgres:// URLs
if database_url.startswith("postgres://"):
    database_url = database_url.replace("postgres://", "postgresql://", 1)
# This env.py runs migrations synchronously, so strip the async driver suffix
database_url = database_url.replace("postgresql+asyncpg://", "postgresql://", 1)
config.set_main_option("sqlalchemy.url", database_url)

if config.config_file_name is not None:
    fileConfig(config.config_file_name)

target_metadata = Base.metadata


def run_migrations_offline():
    """Run migrations in 'offline' mode (generates SQL script)."""
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        dialect_opts={"paramstyle": "named"},
    )
    with context.begin_transaction():
        context.run_migrations()


def run_migrations_online():
    """Run migrations in 'online' mode (directly against database)."""
    connectable = engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )
    with connectable.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=target_metadata,
        )
        with context.begin_transaction():
            context.run_migrations()


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()

9.2 Creating and Running Migrations

# Generate a migration from model changes
alembic revision --autogenerate -m "add_users_table"

# Review the generated migration file before applying!
# Then apply
alembic upgrade head

# Rollback one step
alembic downgrade -1

# View migration history
alembic history --verbose

# Show current revision
alembic current

9.3 Migrations in Docker

Create an entrypoint script that runs migrations before starting the application:

#!/bin/bash
# docker-entrypoint.sh
set -e

echo "Running database migrations..."
alembic upgrade head

echo "Starting application..."
exec "$@"

# Dockerfile (updated)
# ... (previous build steps)
COPY docker-entrypoint.sh /docker-entrypoint.sh
RUN chmod +x /docker-entrypoint.sh

ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["gunicorn", "app.main:app", "--worker-class", "uvicorn.workers.UvicornWorker", "--workers", "4", "--bind", "0.0.0.0:8000"]

9.4 Zero-Downtime Migration Strategy

For zero-downtime deployments, follow the expand-contract pattern:

  1. Expand: Add new columns/tables as nullable or with defaults (backward compatible)
  2. Migrate: Deploy new code that writes to both old and new schema
  3. Backfill: Populate new columns with data from old columns
  4. Contract: Remove old columns once all code uses the new schema

# Example: Renaming a column (email -> email_address)
# Migration 1: Add new column (expand)
def upgrade():
    op.add_column("users", sa.Column("email_address", sa.String(255), nullable=True))
    # Backfill
    op.execute("UPDATE users SET email_address = email WHERE email_address IS NULL")

def downgrade():
    op.drop_column("users", "email_address")


# Migration 2: Make new column required and drop old (contract)
# Deploy AFTER all code uses email_address
def upgrade():
    op.alter_column("users", "email_address", nullable=False)
    op.drop_column("users", "email")

def downgrade():
    op.add_column("users", sa.Column("email", sa.String(255), nullable=True))
    op.execute("UPDATE users SET email = email_address")

10. Monitoring and Logging

Production applications need comprehensive monitoring to detect issues before users do. This includes health checks, metrics collection, structured logging, and alerting.

10.1 Health Check Endpoints

# app/routers/health.py
from fastapi import APIRouter, Depends
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import text
import redis.asyncio as redis
from datetime import datetime
from app.database import get_db
from app.config import get_settings

router = APIRouter(tags=["health"])


@router.get("/health")
async def health_check():
    """Basic health check for load balancers."""
    return {"status": "healthy", "timestamp": datetime.utcnow().isoformat()}


@router.get("/health/ready")
async def readiness_check(db: AsyncSession = Depends(get_db)):
    """Readiness check - verifies all dependencies are available."""
    checks = {}

    # Database check
    try:
        result = await db.execute(text("SELECT 1"))
        checks["database"] = {"status": "healthy"}
    except Exception as e:
        checks["database"] = {"status": "unhealthy", "error": str(e)}

    # Redis check
    try:
        settings = get_settings()
        r = redis.from_url(settings.redis_url)
        await r.ping()
        checks["redis"] = {"status": "healthy"}
        await r.close()
    except Exception as e:
        checks["redis"] = {"status": "unhealthy", "error": str(e)}

    overall = "healthy" if all(
        c["status"] == "healthy" for c in checks.values()
    ) else "unhealthy"

    return {
        "status": overall,
        "checks": checks,
        "timestamp": datetime.utcnow().isoformat(),
    }

10.2 Prometheus Metrics

Expose application metrics for Prometheus to scrape:

pip install prometheus-fastapi-instrumentator

# app/metrics.py
from prometheus_fastapi_instrumentator import Instrumentator
from prometheus_client import Counter, Histogram, Gauge

# Custom metrics
REQUEST_COUNT = Counter(
    "app_requests_total",
    "Total number of requests",
    ["method", "endpoint", "status"]
)

REQUEST_DURATION = Histogram(
    "app_request_duration_seconds",
    "Request duration in seconds",
    ["method", "endpoint"],
    buckets=[0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]
)

ACTIVE_CONNECTIONS = Gauge(
    "app_active_connections",
    "Number of active connections"
)

DB_POOL_SIZE = Gauge(
    "app_db_pool_size",
    "Database connection pool size"
)


def setup_metrics(app):
    """Initialize Prometheus instrumentation."""
    Instrumentator(
        should_group_status_codes=False,
        should_ignore_untemplated=True,
        should_respect_env_var=False,
        excluded_handlers=["/health", "/metrics"],
        env_var_name="ENABLE_METRICS",
    ).instrument(app).expose(app, endpoint="/metrics")

Add metrics to your application factory:

# In app/main.py create_app()
from app.metrics import setup_metrics

def create_app() -> FastAPI:
    # ... previous setup ...
    setup_metrics(app)
    return app

10.3 Error Tracking with Sentry

pip install sentry-sdk[fastapi]

# app/sentry.py
import sentry_sdk
from sentry_sdk.integrations.fastapi import FastApiIntegration
from sentry_sdk.integrations.sqlalchemy import SqlalchemyIntegration
from app.config import get_settings


def setup_sentry():
    """Initialize Sentry error tracking."""
    settings = get_settings()

    if settings.sentry_dsn:
        sentry_sdk.init(
            dsn=settings.sentry_dsn,
            environment=settings.environment,
            release=settings.app_version,
            integrations=[
                FastApiIntegration(transaction_style="endpoint"),
                SqlalchemyIntegration(),
            ],
            traces_sample_rate=0.1 if settings.environment == "production" else 1.0,
            profiles_sample_rate=0.1,
            send_default_pii=False,  # Don't send user PII
        )

10.4 Grafana Dashboard Setup

With Prometheus metrics exposed, you can create Grafana dashboards to visualize:

  • Request rate (requests per second by endpoint)
  • Response time percentiles (p50, p95, p99)
  • Error rate (4xx and 5xx responses)
  • Active connections
  • Database connection pool utilization
  • System metrics (CPU, memory, disk)
# docker-compose monitoring stack (add these under the top-level services: key)
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=15d'

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin

# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "fastapi"
    static_configs:
      - targets: ["app:8000"]
    metrics_path: /metrics

11. Performance Optimization

FastAPI is already one of the fastest Python frameworks, but production applications can benefit from caching, async optimization, connection pooling, and profiling.

11.1 Async Best Practices

# GOOD: Use async for I/O-bound operations
import httpx

async def fetch_external_data(url: str) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.json()


# GOOD: Run CPU-bound tasks in a thread pool
from fastapi.concurrency import run_in_threadpool
import hashlib

async def hash_password(password: str) -> str:
    return await run_in_threadpool(
        hashlib.pbkdf2_hmac, "sha256", password.encode(), b"salt", 100000
    )


# GOOD: Parallel async operations
import asyncio

async def get_dashboard_data(user_id: int):
    """Fetch multiple pieces of data concurrently."""
    orders, notifications, recommendations = await asyncio.gather(
        get_user_orders(user_id),
        get_notifications(user_id),
        get_recommendations(user_id),
    )
    return {
        "orders": orders,
        "notifications": notifications,
        "recommendations": recommendations,
    }


# BAD: Sequential async calls (slower)
async def get_dashboard_data_slow(user_id: int):
    orders = await get_user_orders(user_id)           # Wait...
    notifications = await get_notifications(user_id)   # Wait...
    recommendations = await get_recommendations(user_id)  # Wait...
    return {
        "orders": orders,
        "notifications": notifications,
        "recommendations": recommendations,
    }
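
The speedup from asyncio.gather is easy to verify in isolation. The sketch below uses stand-in coroutines (fake_fetch is illustrative, not one of the real data loaders) and times both approaches:

```python
import asyncio
import time


async def fake_fetch(name: str, delay: float = 0.05) -> str:
    """Stand-in for an I/O-bound call such as a database query."""
    await asyncio.sleep(delay)
    return name


async def main() -> tuple[float, float]:
    # Sequential: total time is roughly the sum of the delays
    start = time.perf_counter()
    await fake_fetch("orders")
    await fake_fetch("notifications")
    await fake_fetch("recommendations")
    sequential = time.perf_counter() - start

    # Concurrent: total time is roughly the single slowest delay
    start = time.perf_counter()
    await asyncio.gather(
        fake_fetch("orders"),
        fake_fetch("notifications"),
        fake_fetch("recommendations"),
    )
    concurrent = time.perf_counter() - start
    return sequential, concurrent


sequential, concurrent = asyncio.run(main())
print(f"sequential={sequential:.3f}s concurrent={concurrent:.3f}s")
```

With three 50 ms awaits, the sequential version takes at least 150 ms while the gathered version finishes in roughly one delay.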

11.2 Redis Caching

pip install redis

# app/cache.py
import json
import hashlib
from functools import wraps
from typing import Optional, Callable
import redis.asyncio as redis
from app.config import get_settings

_redis_client: Optional[redis.Redis] = None


async def get_redis() -> redis.Redis:
    """Get or create Redis client."""
    global _redis_client
    if _redis_client is None:
        settings = get_settings()
        _redis_client = redis.from_url(
            settings.redis_url,
            encoding="utf-8",
            decode_responses=True,
        )
    return _redis_client


async def cache_get(key: str) -> Optional[dict]:
    """Get a value from cache."""
    r = await get_redis()
    data = await r.get(key)
    if data:
        return json.loads(data)
    return None


async def cache_set(key: str, value: dict, ttl: int = 300):
    """Set a value in cache with TTL (default 5 minutes)."""
    r = await get_redis()
    await r.setex(key, ttl, json.dumps(value))


async def cache_delete(key: str):
    """Delete a key from cache."""
    r = await get_redis()
    await r.delete(key)


async def cache_delete_pattern(pattern: str):
    """Delete all keys matching a pattern."""
    r = await get_redis()
    async for key in r.scan_iter(match=pattern):
        await r.delete(key)


def cached(ttl: int = 300, prefix: str = ""):
    """Decorator for caching endpoint responses."""
    def decorator(func: Callable):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            # Build cache key from function name and arguments
            key_data = f"{prefix}:{func.__name__}:{str(args)}:{str(sorted(kwargs.items()))}"
            cache_key = hashlib.md5(key_data.encode()).hexdigest()

            # Check cache
            cached_result = await cache_get(cache_key)
            if cached_result is not None:
                return cached_result

            # Execute function
            result = await func(*args, **kwargs)

            # Store in cache
            if isinstance(result, dict):
                await cache_set(cache_key, result, ttl)
            elif hasattr(result, "model_dump"):
                await cache_set(cache_key, result.model_dump(), ttl)

            return result
        return wrapper
    return decorator

Use the caching decorator on your endpoints:

from app.cache import cached, cache_delete_pattern

@router.get("/products/{product_id}")
@cached(ttl=600, prefix="product")
async def get_product(product_id: int, db: AsyncSession = Depends(get_db)):
    """Get product with 10-minute cache."""
    product = await db.get(Product, product_id)
    if not product:
        raise HTTPException(status_code=404, detail="Product not found")
    return ProductResponse.model_validate(product).model_dump()


@router.put("/products/{product_id}")
async def update_product(product_id: int, data: ProductUpdate, db: AsyncSession = Depends(get_db)):
    """Update product and invalidate cache."""
    product = await db.get(Product, product_id)
    # ... update logic ...
    await cache_delete_pattern("product:*")
    return ProductResponse.model_validate(product)
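
The key scheme used by the @cached decorator (an MD5 over prefix, function name, and arguments) can be checked in isolation. This stdlib-only sketch mirrors the same construction; the make_cache_key helper is illustrative, not part of app/cache.py:

```python
import hashlib


def make_cache_key(prefix: str, func_name: str, args: tuple, kwargs: dict) -> str:
    """Mirror the decorator's key: prefix, function name, args, sorted kwargs."""
    key_data = f"{prefix}:{func_name}:{args}:{sorted(kwargs.items())}"
    return hashlib.md5(key_data.encode()).hexdigest()


# Same call -> same key, so repeated requests hit the cache
k1 = make_cache_key("product", "get_product", (42,), {"locale": "en"})
k2 = make_cache_key("product", "get_product", (42,), {"locale": "en"})

# Sorting kwargs makes keyword-argument order irrelevant
k3 = make_cache_key("product", "get_product", (42,), {"locale": "en", "full": True})
k4 = make_cache_key("product", "get_product", (42,), {"full": True, "locale": "en"})

# Different arguments -> different key
k5 = make_cache_key("product", "get_product", (43,), {"locale": "en"})
```

Sorting the kwargs is what makes the key stable: `get_product(42, locale="en", full=True)` and `get_product(42, full=True, locale="en")` map to the same cache entry.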

11.3 Response Compression

# Add GZip middleware for large responses
from fastapi.middleware.gzip import GZipMiddleware

app.add_middleware(GZipMiddleware, minimum_size=1000)  # Compress responses > 1KB

11.4 Profiling

Use profiling to find bottlenecks in your application:

# app/profiling.py - Development only
import cProfile
import pstats
import io
from fastapi import Request
from starlette.middleware.base import BaseHTTPMiddleware


class ProfilingMiddleware(BaseHTTPMiddleware):
    """Profile requests and log slow endpoints. DEV ONLY."""

    async def dispatch(self, request: Request, call_next):
        profiler = cProfile.Profile()
        profiler.enable()

        response = await call_next(request)

        profiler.disable()

        # Log a profile when the request takes more than 100ms
        stream = io.StringIO()
        stats = pstats.Stats(profiler, stream=stream)
        stats.sort_stats("cumulative")

        # pstats exposes the total profiled time as total_tt; summing
        # cumulative times per function would double-count nested calls
        if stats.total_tt > 0.1:  # 100ms threshold
            stats.print_stats(20)
            print(f"SLOW REQUEST: {request.method} {request.url.path}")
            print(stream.getvalue())

        return response

12. Scaling Strategies

As your application grows, you need strategies to handle increased traffic. Scaling involves horizontal scaling (more instances), load balancing, caching layers, and rate limiting.

12.1 Horizontal Scaling with Docker Compose

# Scale to multiple instances
docker compose up -d --scale app=4

# Nginx automatically load balances across all instances

# docker-compose.prod.yml - Production scaling
version: "3.9"

services:
  app:
    build: .
    deploy:
      replicas: 4
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
        reservations:
          cpus: "0.25"
          memory: 128M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
    environment:
      - DATABASE_URL=postgresql+asyncpg://user:pass@db:5432/mydb
      - REDIS_URL=redis://redis:6379/0
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf
      - ./nginx/certs:/etc/nginx/certs
    depends_on:
      - app

12.2 Application-Level Rate Limiting

pip install slowapi

# app/rate_limit.py
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded
from slowapi.middleware import SlowAPIMiddleware

limiter = Limiter(
    key_func=get_remote_address,
    default_limits=["100/minute"],
    storage_uri="redis://localhost:6379/1",  # Point at your production Redis instance
    strategy="fixed-window-elastic-expiry",
)


def setup_rate_limiting(app):
    """Configure rate limiting for the application."""
    app.state.limiter = limiter
    app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
    app.add_middleware(SlowAPIMiddleware)

Apply rate limits to specific endpoints:

from app.rate_limit import limiter

@router.post("/auth/login")
@limiter.limit("5/minute")
async def login(request: Request, credentials: LoginRequest):
    """Login with strict rate limiting."""
    # ... authentication logic
    pass


@router.get("/api/search")
@limiter.limit("30/minute")
async def search(request: Request, q: str):
    """Search with moderate rate limiting."""
    # ... search logic
    pass

12.3 Background Task Processing

For long-running tasks, use a task queue to process work asynchronously:

pip install celery[redis]

# app/tasks.py
from celery import Celery
from app.config import get_settings

settings = get_settings()

celery_app = Celery(
    "fastapi_tasks",
    broker=settings.redis_url,
    backend=settings.redis_url,
)

celery_app.conf.update(
    task_serializer="json",
    result_serializer="json",
    accept_content=["json"],
    timezone="UTC",
    task_track_started=True,
    task_time_limit=300,  # 5 minute hard limit
    task_soft_time_limit=240,  # 4 minute soft limit
    worker_max_tasks_per_child=100,  # Restart workers after 100 tasks
)


@celery_app.task(bind=True, max_retries=3)
def send_email_task(self, to_email: str, subject: str, body: str):
    """Send email asynchronously."""
    try:
        # ... send email logic
        pass
    except Exception as exc:
        raise self.retry(exc=exc, countdown=60)  # Retry after 60 seconds


@celery_app.task
def generate_report_task(user_id: int, report_type: str):
    """Generate report in background."""
    # ... heavy computation
    pass

# Use in FastAPI endpoints
from app.tasks import send_email_task, generate_report_task

@router.post("/reports/generate")
async def generate_report(user_id: int, report_type: str):
    task = generate_report_task.delay(user_id, report_type)
    return {"task_id": task.id, "status": "processing"}


@router.get("/tasks/{task_id}")
async def get_task_status(task_id: str):
    from celery.result import AsyncResult
    result = AsyncResult(task_id)
    return {
        "task_id": task_id,
        "status": result.status,
        "result": result.result if result.ready() else None,
    }

Scaling Strategy Summary

Strategy                             When to Use                        Complexity
Vertical scaling (bigger server)     Quick fix, small apps              Low
Horizontal scaling (more instances)  High traffic, stateless apps       Medium
Caching (Redis)                      Repeated reads, expensive queries  Medium
Background tasks (Celery)            Long operations, email, reports    Medium
Database read replicas               Read-heavy workloads               High
CDN for static assets                Global users, static content       Low
Microservices                        Large teams, complex domains       Very High

13. Security Hardening

Security is not optional in production. FastAPI provides several built-in security features, but you need to configure additional layers for a properly hardened deployment.

13.1 HTTPS Configuration

Always enforce HTTPS in production. When Nginx terminates TLS (as in the setups above), handle the redirect at the proxy and make sure forwarded headers are trusted (for example Uvicorn's --proxy-headers); otherwise the middleware sees every request as plain HTTP and redirects in a loop. If the app serves TLS directly, use the HTTPS redirect middleware:

from fastapi.middleware.httpsredirect import HTTPSRedirectMiddleware

if settings.environment == "production":
    app.add_middleware(HTTPSRedirectMiddleware)

13.2 Security Headers Middleware

# app/security.py
from fastapi import Request
from starlette.middleware.base import BaseHTTPMiddleware


class SecurityHeadersMiddleware(BaseHTTPMiddleware):
    """Add security headers to all responses."""

    async def dispatch(self, request: Request, call_next):
        response = await call_next(request)

        response.headers["X-Content-Type-Options"] = "nosniff"
        response.headers["X-Frame-Options"] = "DENY"
        response.headers["X-XSS-Protection"] = "1; mode=block"
        response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
        response.headers["Permissions-Policy"] = (
            "camera=(), microphone=(), geolocation=(), payment=()"
        )

        if request.url.scheme == "https":
            response.headers["Strict-Transport-Security"] = (
                "max-age=63072000; includeSubDomains; preload"
            )

        return response
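
The header policy can be factored into a plain function so it is testable without starting the app. This sketch (the security_headers helper is illustrative, not part of app/security.py) returns the same values the middleware sets:

```python
def security_headers(scheme: str) -> dict[str, str]:
    """Headers applied to every response; HSTS is added only over HTTPS."""
    headers = {
        "X-Content-Type-Options": "nosniff",
        "X-Frame-Options": "DENY",
        "X-XSS-Protection": "1; mode=block",
        "Referrer-Policy": "strict-origin-when-cross-origin",
        "Permissions-Policy": "camera=(), microphone=(), geolocation=(), payment=()",
    }
    if scheme == "https":
        # HSTS must never be sent over plain HTTP
        headers["Strict-Transport-Security"] = (
            "max-age=63072000; includeSubDomains; preload"
        )
    return headers


https_headers = security_headers("https")
http_headers = security_headers("http")
```

Keeping the policy in one function also makes it easy to assert in CI that HSTS is present on HTTPS responses and absent on HTTP ones.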

13.3 CORS Configuration

from fastapi.middleware.cors import CORSMiddleware

# NEVER use allow_origins=["*"] in production
app.add_middleware(
    CORSMiddleware,
    allow_origins=[
        "https://yourdomain.com",
        "https://www.yourdomain.com",
        "https://admin.yourdomain.com",
    ],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "DELETE", "PATCH"],
    allow_headers=["Authorization", "Content-Type", "X-Request-ID"],
    expose_headers=["X-Request-ID"],
    max_age=3600,  # Cache preflight for 1 hour
)

13.4 Secrets Management

Never hardcode secrets. Use environment variables and secrets management services:

# app/secrets.py
import boto3
import json
from functools import lru_cache


@lru_cache()
def get_aws_secret(secret_name: str, region: str = "us-east-1") -> dict:
    """Retrieve secrets from AWS Secrets Manager."""
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])


# Usage in settings
class Settings(BaseSettings):
    @classmethod
    def _load_aws_secrets(cls):
        """Load secrets from AWS Secrets Manager at startup."""
        try:
            secrets = get_aws_secret("fastapi/production")
            return secrets
        except Exception:
            return {}

    def __init__(self, **kwargs):
        aws_secrets = self._load_aws_secrets()
        # AWS secrets override env vars
        for key, value in aws_secrets.items():
            if key.lower() not in kwargs:
                kwargs[key.lower()] = value
        super().__init__(**kwargs)

13.5 Input Validation and Sanitization

from pydantic import BaseModel, Field, field_validator
import bleach
import re


class UserInput(BaseModel):
    """User input with validation and sanitization."""
    username: str = Field(min_length=3, max_length=50, pattern=r"^[a-zA-Z0-9_-]+$")
    email: str = Field(max_length=255)
    bio: str = Field(max_length=1000, default="")

    @field_validator("bio")
    @classmethod
    def sanitize_bio(cls, v: str) -> str:
        """Remove HTML tags from bio."""
        return bleach.clean(v, tags=[], strip=True)

    @field_validator("email")
    @classmethod
    def validate_email(cls, v: str) -> str:
        """Validate email format."""
        email_regex = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
        if not re.match(email_regex, v):
            raise ValueError("Invalid email format")
        return v.lower()
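
bleach handles malformed markup robustly; if you cannot take the dependency, a rough stdlib-only approximation for plain-text fields is to drop anything tag-shaped with a regex and then decode entities. This is a naive sketch — acceptable for a bio field, not a general HTML sanitizer:

```python
import html
import re

# Matches anything that looks like an HTML tag
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(value: str) -> str:
    """Remove tag-like sequences, then decode HTML entities."""
    return html.unescape(TAG_RE.sub("", value))


cleaned = strip_tags("<script>alert('x')</script>Hello <b>world</b> &amp; friends")
```

Note that, like bleach.clean with strip=True, this keeps the text content between tags; it only removes the markup itself.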

Security Checklist

Category      Item                                      Status
Transport     HTTPS enforced everywhere                 Required
Transport     HSTS header enabled                       Required
Auth          Passwords hashed with bcrypt/argon2       Required
Auth          JWT tokens with short expiry              Required
Auth          Rate limiting on login endpoints          Required
Headers       Security headers on all responses         Required
CORS          Specific origins (no wildcards)           Required
Input         Pydantic validation on all inputs         Required
Secrets       No secrets in code or git                 Required
Secrets       Use secrets manager (AWS SM, Vault)       Recommended
Dependencies  Regular dependency updates                Required
Docs          Disable /docs and /redoc in production    Recommended

14. Complete Production Stack

Here is a complete production-ready docker-compose setup with FastAPI, PostgreSQL, Redis, Nginx, Celery, and monitoring — everything you need to deploy a real-world application.

14.1 Project Structure

fastapi-production/
├── app/
│   ├── __init__.py
│   ├── main.py              # Application factory
│   ├── config.py             # Pydantic settings
│   ├── database.py           # Database setup
│   ├── models/               # SQLAlchemy models
│   ├── schemas/              # Pydantic schemas
│   ├── routers/              # API routes
│   ├── services/             # Business logic
│   ├── middleware.py          # Custom middleware
│   ├── cache.py              # Redis caching
│   ├── tasks.py              # Celery tasks
│   └── logging_config.py     # Structured logging
├── alembic/                   # Database migrations
│   ├── versions/
│   └── env.py
├── nginx/
│   ├── nginx.conf
│   └── certs/
├── tests/
│   ├── conftest.py
│   ├── test_routes/
│   └── test_services/
├── .github/
│   └── workflows/
│       └── ci-cd.yml
├── Dockerfile
├── docker-compose.yml         # Development
├── docker-compose.prod.yml    # Production
├── docker-entrypoint.sh
├── gunicorn.conf.py
├── requirements.txt
├── requirements-dev.txt
├── alembic.ini
├── .env.example
├── .dockerignore
└── .gitignore

14.2 Production Docker Compose

# docker-compose.prod.yml
version: "3.9"

services:
  # ---- FastAPI Application ----
  app:
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      - ENVIRONMENT=production
      - DATABASE_URL=postgresql+asyncpg://fastapi:${DB_PASSWORD}@db:5432/fastapi_prod
      - REDIS_URL=redis://redis:6379/0
      - SECRET_KEY=${SECRET_KEY}
      - JWT_SECRET=${JWT_SECRET}
      - LOG_LEVEL=INFO
      - LOG_FORMAT=json
      - WORKERS=4
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: always
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    networks:
      - backend
      - frontend

  # ---- Celery Worker ----
  celery-worker:
    build: .
    command: celery -A app.tasks worker --loglevel=info --concurrency=4
    environment:
      - DATABASE_URL=postgresql+asyncpg://fastapi:${DB_PASSWORD}@db:5432/fastapi_prod
      - REDIS_URL=redis://redis:6379/0
    depends_on:
      - db
      - redis
    restart: always
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: "0.5"
          memory: 256M
    networks:
      - backend

  # ---- Celery Beat (Scheduler) ----
  celery-beat:
    build: .
    command: celery -A app.tasks beat --loglevel=info
    environment:
      - REDIS_URL=redis://redis:6379/0
    depends_on:
      - redis
    restart: always
    networks:
      - backend

  # ---- PostgreSQL ----
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: fastapi
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: fastapi_prod
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U fastapi -d fastapi_prod"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: always
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 1G
    networks:
      - backend

  # ---- Redis ----
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: always
    networks:
      - backend

  # ---- Nginx Reverse Proxy ----
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/certs:/etc/nginx/certs:ro
      - static_files:/app/static:ro
    depends_on:
      - app
    restart: always
    networks:
      - frontend

  # ---- Prometheus (Monitoring) ----
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"
    restart: always
    networks:
      - backend

  # ---- Grafana (Dashboards) ----
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    restart: always
    networks:
      - backend

volumes:
  postgres_data:
  redis_data:
  static_files:
  prometheus_data:
  grafana_data:

networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge

14.3 Production Nginx Configuration

# nginx/nginx.conf (production)
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
    use epoll;
    multi_accept on;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # Logging format
    log_format json_combined escape=json
        '{"time":"$time_iso8601",'
        '"remote_addr":"$remote_addr",'
        '"request":"$request",'
        '"status":$status,'
        '"body_bytes_sent":$body_bytes_sent,'
        '"request_time":$request_time,'
        '"upstream_response_time":"$upstream_response_time"}';

    access_log /var/log/nginx/access.log json_combined;

    # Performance
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;

    # Gzip
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_min_length 1024;
    gzip_types text/plain text/css application/json application/javascript text/xml;

    # Rate limiting
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=auth:10m rate=1r/s;

    # Upstream (load balancing across app replicas)
    upstream app {
        least_conn;
        # Note: nginx resolves "app" once at startup. After scaling the app
        # service, reload nginx (or use a resolver directive) so it sees
        # Docker DNS entries for the new replicas.
        server app:8000;
    }

    # HTTP -> HTTPS redirect
    server {
        listen 80;
        server_name _;
        return 301 https://$host$request_uri;
    }

    # HTTPS server
    server {
        listen 443 ssl;
        http2 on;  # separate directive since nginx 1.25.1; "listen ... http2" is deprecated
        server_name yourdomain.com;

        ssl_certificate /etc/nginx/certs/fullchain.pem;
        ssl_certificate_key /etc/nginx/certs/privkey.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
        ssl_prefer_server_ciphers off;

        # Security headers
        add_header X-Frame-Options "DENY" always;
        add_header X-Content-Type-Options "nosniff" always;
        add_header Strict-Transport-Security "max-age=63072000" always;

        client_max_body_size 10M;

        # API endpoints
        location /api/ {
            limit_req zone=api burst=20 nodelay;
            proxy_pass http://app;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }

        # Auth endpoints (strict rate limiting)
        location /api/auth/ {
            limit_req zone=auth burst=5 nodelay;
            proxy_pass http://app;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }

        # WebSocket
        location /ws/ {
            proxy_pass http://app;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_read_timeout 86400s;
        }

        # Health check (no logging, no rate limit)
        location /health {
            access_log off;
            proxy_pass http://app;
        }

        # Static files
        location /static/ {
            alias /app/static/;
            expires 30d;
            add_header Cache-Control "public, immutable";
        }
    }
}

14.4 Deployment Commands

# Create .env file for production secrets
cat > .env << 'EOF'
DB_PASSWORD=your-strong-password-here
SECRET_KEY=your-secret-key-here
JWT_SECRET=your-jwt-secret-here
GRAFANA_PASSWORD=admin-password-here
EOF

# Start the full production stack
docker compose -f docker-compose.prod.yml up -d --build

# Check all services are healthy
docker compose -f docker-compose.prod.yml ps

# View application logs
docker compose -f docker-compose.prod.yml logs -f app

# Run database migrations
docker compose -f docker-compose.prod.yml exec app alembic upgrade head

# Scale application horizontally
docker compose -f docker-compose.prod.yml up -d --scale app=4

# Redeploy the app with a new image (compose recreates containers, so expect a
# brief blip; with multiple replicas behind Nginx the interruption is minimal)
docker compose -f docker-compose.prod.yml build app
docker compose -f docker-compose.prod.yml up -d --no-deps app

# Backup database (-T disables TTY allocation so the redirected dump stays clean)
docker compose -f docker-compose.prod.yml exec -T db pg_dump -U fastapi fastapi_prod > backup.sql

15. Key Takeaways

1. Configuration: Use pydantic-settings for type-safe configuration from environment variables. Never hardcode secrets.
2. ASGI Servers: Use Gunicorn with Uvicorn workers for production. Set workers to (2 * CPU) + 1. Enable max_requests to prevent memory leaks.
3. Docker: Use multi-stage builds for smaller images. Run as non-root user. Include health checks. Use .dockerignore to reduce context size.
4. Nginx: Always use Nginx as a reverse proxy. Handle SSL termination, static files, rate limiting, and WebSocket proxying at the Nginx layer.
5. AWS: EC2 for full control, ECS/Fargate for managed containers, Lambda with Mangum for serverless. Use SSM Parameter Store or Secrets Manager for secrets.
6. Heroku: Simplest deployment path. Use Procfile with Gunicorn + Uvicorn workers. Add release phase for auto-migrations.
7. DigitalOcean: App Platform for managed PaaS or Droplets with systemd for full control. Both work well for FastAPI.
8. CI/CD: GitHub Actions pipeline: lint, test with services (Postgres, Redis), build Docker image, deploy. Use environments for staging/production separation.
9. Migrations: Use Alembic for database migrations. Run migrations in Docker entrypoint or release phase. Follow expand-contract pattern for zero-downtime changes.
10. Monitoring: Health check endpoints for load balancers. Prometheus metrics with Grafana dashboards. Sentry for error tracking. Structured JSON logging.
11. Performance: Use asyncio.gather for parallel I/O. Cache with Redis. Enable GZip compression. Profile slow endpoints to find bottlenecks.
12. Scaling: Start with vertical scaling, then horizontal. Use Celery for background tasks. Rate limit with slowapi. Consider read replicas for DB-heavy workloads.
13. Security: Enforce HTTPS, add security headers, configure CORS properly, validate all inputs with Pydantic, use secrets management, disable docs in production.
14. Full Stack: Production stack: FastAPI + PostgreSQL + Redis + Nginx + Celery + Prometheus + Grafana. Use docker-compose for orchestration with health checks, resource limits, and network isolation.
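
The worker heuristic from takeaway 2 translates directly into a gunicorn.conf.py. A minimal sketch, assuming the WORKERS environment variable from docker-compose.prod.yml above:

```python
# gunicorn.conf.py sketch: (2 * CPU) + 1 workers by default, overridable
# via the WORKERS env var, with max_requests recycling to bound memory growth.
import multiprocessing
import os


def default_workers(cpus: int) -> int:
    """The common (2 * CPU) + 1 heuristic for worker count."""
    return 2 * cpus + 1


bind = "0.0.0.0:8000"
workers = int(os.getenv("WORKERS", default_workers(multiprocessing.cpu_count())))
worker_class = "uvicorn.workers.UvicornWorker"
timeout = 60
graceful_timeout = 30
keepalive = 5
# Recycle each worker after ~1000 requests (with jitter so they don't
# all restart at once) to mitigate slow memory leaks.
max_requests = 1000
max_requests_jitter = 100
```

In a container you usually pin WORKERS explicitly (as the compose file does) rather than relying on cpu_count(), since the container's CPU limit may be lower than the host's core count.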

Production Deployment Checklist

  1. Configuration: Environment variables via pydantic-settings, no secrets in code
  2. ASGI Server: Gunicorn + Uvicorn workers with proper timeouts and worker count
  3. Containerization: Multi-stage Docker build, non-root user, health checks
  4. Reverse Proxy: Nginx with SSL, rate limiting, security headers
  5. Database: Connection pooling, Alembic migrations, backups
  6. Caching: Redis for session data and response caching
  7. Monitoring: Health endpoints, Prometheus metrics, Sentry errors
  8. Logging: Structured JSON logging with request correlation IDs
  9. Security: HTTPS, CORS, input validation, secrets management
  10. CI/CD: Automated testing, building, and deployment pipeline
  11. Scaling: Horizontal scaling ready, rate limiting, background tasks
  12. Backups: Database backup strategy, disaster recovery plan

With these configurations and practices in place, your FastAPI application is ready for production traffic. Start simple — you don’t need every component from day one. Begin with Docker + Nginx + Gunicorn, add monitoring as you grow, and scale horizontally when needed.



