Deploying a FastAPI application to production requires more than just running uvicorn main:app. A production deployment involves configuring ASGI servers for performance, containerizing your application with Docker, setting up reverse proxies, implementing CI/CD pipelines, managing database migrations, and ensuring security and monitoring are in place.
This comprehensive guide covers everything you need to deploy FastAPI applications reliably, from single-server setups to scalable cloud architectures. Whether you’re deploying to AWS, Heroku, DigitalOcean, or your own infrastructure, you’ll find practical, production-tested configurations here.
Before deploying, your FastAPI application needs proper configuration management, structured logging, and environment-specific settings. The pydantic-settings library provides type-safe configuration that reads from environment variables and .env files.
Install the required package:
pip install pydantic-settings python-dotenv
Create a centralized settings module that all parts of your application can import:
# app/config.py
from pydantic_settings import BaseSettings, SettingsConfigDict
from pydantic import Field
from functools import lru_cache
from typing import Optional
class Settings(BaseSettings):
"""Application settings loaded from environment variables."""
model_config = SettingsConfigDict(
env_file=".env",
env_file_encoding="utf-8",
case_sensitive=False,
)
# Application
app_name: str = "FastAPI App"
app_version: str = "1.0.0"
debug: bool = False
environment: str = "production" # development, staging, production
# Server
host: str = "0.0.0.0"
port: int = 8000
workers: int = 4
reload: bool = False
# Database
database_url: str = "postgresql+asyncpg://user:pass@localhost:5432/mydb"
db_pool_size: int = 20
db_max_overflow: int = 10
db_pool_timeout: int = 30
# Redis
redis_url: str = "redis://localhost:6379/0"
# Security
secret_key: str = Field(default="change-me-in-production")
allowed_hosts: list[str] = ["*"]
cors_origins: list[str] = ["http://localhost:3000"]
# JWT
jwt_secret: str = Field(default="jwt-secret-change-me")
jwt_algorithm: str = "HS256"
jwt_expiration_minutes: int = 30
# Logging
log_level: str = "INFO"
log_format: str = "json" # json or text
# External Services
smtp_host: Optional[str] = None
smtp_port: int = 587
sentry_dsn: Optional[str] = None
@lru_cache()
def get_settings() -> Settings:
"""Cached settings instance."""
return Settings()
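Because of the @lru_cache decorator, Settings() is constructed once per process; environment changes after the first call are invisible until the cache is cleared. Here is a stdlib-only sketch of that behavior (get_fake_settings is an illustrative stand-in, not part of the app):

```python
# Stand-in demonstrating lru_cache semantics: the environment is read
# once, and later changes are not seen until the cache is cleared.
import os
from functools import lru_cache

@lru_cache()
def get_fake_settings() -> dict:
    return {"debug": os.getenv("DEBUG", "false")}

os.environ["DEBUG"] = "true"
first = get_fake_settings()
os.environ["DEBUG"] = "false"
second = get_fake_settings()     # cached: the change to DEBUG is not seen
get_fake_settings.cache_clear()  # force a re-read, e.g. between tests
third = get_fake_settings()
```

In tests, call get_settings.cache_clear() (or override the dependency) before changing environment variables, or the cached instance will win.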
Create a .env file for local development:
# .env
APP_NAME=MyFastAPIApp
DEBUG=true
ENVIRONMENT=development
DATABASE_URL=postgresql+asyncpg://postgres:password@localhost:5432/mydb
REDIS_URL=redis://localhost:6379/0
SECRET_KEY=dev-secret-key-not-for-production
JWT_SECRET=dev-jwt-secret
LOG_LEVEL=DEBUG
LOG_FORMAT=text
CORS_ORIGINS=["http://localhost:3000","http://localhost:8080"]
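Under the hood, a .env file is just KEY=VALUE lines; pydantic-settings layers type coercion, JSON parsing for list fields like CORS_ORIGINS, and validation on top. A minimal stdlib sketch of the parsing step (parse_dotenv is illustrative, not the library's API):

```python
# Illustrative KEY=VALUE parser; pydantic-settings does this plus
# type coercion and validation for you.
def parse_dotenv(text: str) -> dict:
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        values[key.strip().lower()] = value.strip()
    return values
```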
Use settings throughout your application:
# app/main.py
from fastapi import FastAPI, Depends
from app.config import Settings, get_settings
app = FastAPI()
@app.get("/info")
async def app_info(settings: Settings = Depends(get_settings)):
return {
"app_name": settings.app_name,
"version": settings.app_version,
"environment": settings.environment,
"debug": settings.debug,
}
Production applications need structured logging (JSON format) for log aggregation tools like ELK Stack, Datadog, or CloudWatch. Use structlog for structured, contextualized logging:
pip install structlog
# app/logging_config.py
import logging
import sys
import structlog
from app.config import get_settings
def setup_logging():
"""Configure structured logging for the application."""
settings = get_settings()
# Choose processors based on environment
if settings.log_format == "json":
renderer = structlog.processors.JSONRenderer()
else:
renderer = structlog.dev.ConsoleRenderer(colors=True)
structlog.configure(
processors=[
structlog.contextvars.merge_contextvars,
structlog.processors.add_log_level,
structlog.processors.StackInfoRenderer(),
structlog.processors.TimeStamper(fmt="iso"),
structlog.processors.format_exc_info,
renderer,
],
wrapper_class=structlog.make_filtering_bound_logger(
getattr(logging, settings.log_level.upper(), logging.INFO)
),
context_class=dict,
logger_factory=structlog.PrintLoggerFactory(file=sys.stdout),
cache_logger_on_first_use=True,
)
def get_logger(name: str = __name__):
"""Get a structured logger instance."""
return structlog.get_logger(name)
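If you would rather not add the structlog dependency, a minimal JSON formatter for the stdlib logging module covers the basics. The field names below are illustrative, chosen to mirror the structlog output above:

```python
# A stdlib-only JSON formatter, as a lighter-weight alternative to structlog.
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname.lower(),
            "logger": record.name,
            "event": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)
```

Attach it with handler.setFormatter(JsonFormatter()); log aggregators will ingest the one-JSON-object-per-line output the same way they ingest structlog's.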
Add request logging middleware to track every request:
# app/middleware.py
import time
import uuid
import structlog
from fastapi import Request
from starlette.middleware.base import BaseHTTPMiddleware
from app.logging_config import get_logger
logger = get_logger(__name__)
class RequestLoggingMiddleware(BaseHTTPMiddleware):
"""Log every request with timing and correlation ID."""
async def dispatch(self, request: Request, call_next):
request_id = str(uuid.uuid4())[:8]
start_time = time.perf_counter()
# Add request ID to structlog context
structlog.contextvars.clear_contextvars()
structlog.contextvars.bind_contextvars(request_id=request_id)
logger.info(
"request_started",
method=request.method,
path=request.url.path,
client_ip=request.client.host if request.client else "unknown",
)
response = await call_next(request)
duration = time.perf_counter() - start_time
logger.info(
"request_completed",
method=request.method,
path=request.url.path,
status_code=response.status_code,
duration_ms=round(duration * 1000, 2),
)
response.headers["X-Request-ID"] = request_id
return response
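structlog.contextvars is built on the stdlib contextvars module, which is what makes per-request binding safe under concurrency: each asyncio task gets its own copy of the context. A self-contained sketch:

```python
# Each asyncio task sees its own ContextVar value, so request IDs from
# concurrent requests never bleed into each other's log lines.
import asyncio
import contextvars

request_id: contextvars.ContextVar[str] = contextvars.ContextVar("request_id")

async def handle(rid: str) -> str:
    request_id.set(rid)
    await asyncio.sleep(0)   # yield control: the other handlers run here
    return request_id.get()  # still this task's own value

async def main() -> list[str]:
    return await asyncio.gather(*(handle(f"req-{i}") for i in range(3)))

results = asyncio.run(main())
```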
Use a factory function to create your FastAPI application with all middleware and configuration applied:
# app/main.py
from contextlib import asynccontextmanager
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from app.config import get_settings
from app.logging_config import setup_logging
from app.middleware import RequestLoggingMiddleware
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Manage application startup and shutdown."""
# Startup
setup_logging()
from app.logging_config import get_logger
logger = get_logger("lifespan")
logger.info("application_starting", environment=get_settings().environment)
# Initialize database, Redis, etc.
# await init_db()
# await init_redis()
yield # Application runs here
# Shutdown
logger.info("application_shutting_down")
# await close_db()
# await close_redis()
def create_app() -> FastAPI:
"""Application factory."""
settings = get_settings()
app = FastAPI(
title=settings.app_name,
version=settings.app_version,
debug=settings.debug,
lifespan=lifespan,
docs_url="/docs" if settings.debug else None,
redoc_url="/redoc" if settings.debug else None,
)
# CORS
app.add_middleware(
CORSMiddleware,
allow_origins=settings.cors_origins,
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Request logging
app.add_middleware(RequestLoggingMiddleware)
# Include routers
from app.routers import api_router
app.include_router(api_router, prefix="/api/v1")
return app
app = create_app()
FastAPI runs on ASGI (Asynchronous Server Gateway Interface) servers. While Uvicorn is great for development, production deployments need proper process management, graceful shutdowns, and multiple worker processes.
Uvicorn can run with multiple workers for production use:
# Basic production run
uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4
# With all production options
uvicorn app.main:app \
--host 0.0.0.0 \
--port 8000 \
--workers 4 \
--loop uvloop \
--http httptools \
--log-level warning \
--access-log \
--proxy-headers \
--forwarded-allow-ips="*"
A common starting point is (2 * CPU_CORES) + 1 workers; async applications often need fewer, so benchmark under realistic load before settling on a number. You can also configure Uvicorn programmatically:
# run.py
import uvicorn
from app.config import get_settings
if __name__ == "__main__":
settings = get_settings()
uvicorn.run(
"app.main:app",
host=settings.host,
port=settings.port,
workers=settings.workers,
reload=settings.reload,
log_level=settings.log_level.lower(),
proxy_headers=True,
forwarded_allow_ips="*",
)
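The (2 * CPU_CORES) + 1 heuristic can be computed at startup rather than hard-coded. This sketch assumes a WEB_CONCURRENCY override variable (the name Gunicorn and Heroku use); treat the result as a starting point to validate under load:

```python
# Derive a worker count from CPU cores, with an env-var override.
import os

def default_workers() -> int:
    cores = os.cpu_count() or 1  # cpu_count() can return None
    return int(os.getenv("WEB_CONCURRENCY", 2 * cores + 1))
```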
Gunicorn provides battle-tested process management. Combined with Uvicorn workers, it gives you the best of both worlds — Gunicorn’s process management with Uvicorn’s ASGI performance:
# Install both
pip install gunicorn uvicorn[standard]
# Run with Uvicorn workers
gunicorn app.main:app \
--worker-class uvicorn.workers.UvicornWorker \
--workers 4 \
--bind 0.0.0.0:8000 \
--timeout 120 \
--graceful-timeout 30 \
--keep-alive 5 \
--access-logfile - \
--error-logfile -
Create a Gunicorn configuration file for more control:
# gunicorn.conf.py
import multiprocessing
import os
# Server socket
bind = f"0.0.0.0:{os.getenv('PORT', '8000')}"
backlog = 2048
# Worker processes
workers = int(os.getenv("WEB_CONCURRENCY", multiprocessing.cpu_count() * 2 + 1))
worker_class = "uvicorn.workers.UvicornWorker"
worker_connections = 1000
timeout = 120
graceful_timeout = 30
keepalive = 5
# Restart workers after this many requests (prevents memory leaks)
max_requests = 1000
max_requests_jitter = 50
# Logging
accesslog = "-"
errorlog = "-"
loglevel = os.getenv("LOG_LEVEL", "info").lower()
# Process naming
proc_name = "fastapi-app"
# Server hooks
def on_starting(server):
"""Called just before the master process is initialized."""
pass
def post_worker_init(worker):
"""Called just after a worker has been initialized."""
worker.log.info(f"Worker {worker.pid} initialized")
def worker_exit(server, worker):
"""Called when a worker exits."""
worker.log.info(f"Worker {worker.pid} exiting")
# Run with config file
gunicorn app.main:app -c gunicorn.conf.py
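The jitter setting exists because workers that start together would otherwise all hit max_requests at the same moment and restart in unison. Gunicorn adds a random offset per worker, roughly like this sketch (a simplification, not Gunicorn's actual code):

```python
# Each worker gets its own restart threshold in [max_requests,
# max_requests + jitter], spreading restarts out over time.
import random

def restart_thresholds(workers: int, max_requests: int = 1000, jitter: int = 50) -> list[int]:
    return [max_requests + random.randint(0, jitter) for _ in range(workers)]

thresholds = restart_thresholds(4)
```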
Hypercorn supports HTTP/2 and HTTP/3, which can be useful for applications that benefit from multiplexed connections:
pip install hypercorn
# Basic run
hypercorn app.main:app --bind 0.0.0.0:8000 --workers 4
# With HTTP/2
hypercorn app.main:app \
--bind 0.0.0.0:8000 \
--workers 4 \
--certfile cert.pem \
--keyfile key.pem
| Feature | Uvicorn | Gunicorn + Uvicorn | Hypercorn |
|---|---|---|---|
| Process Management | Basic | Advanced (preforking) | Basic |
| Graceful Restart | Limited | Full (SIGHUP) | Limited |
| HTTP/2 | No | No | Yes |
| Worker Recovery | Manual | Automatic | Manual |
| Memory Leak Protection | No | max_requests | No |
| Production Ready | With care | Yes (recommended) | With care |
Docker provides consistent, reproducible environments across development, staging, and production. A well-crafted Dockerfile ensures your FastAPI application runs the same way everywhere.
# Dockerfile
FROM python:3.12-slim
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
PIP_NO_CACHE_DIR=1 \
PIP_DISABLE_PIP_VERSION_CHECK=1
# Create non-root user
RUN groupadd -r appuser && useradd -r -g appuser -d /app -s /sbin/nologin appuser
WORKDIR /app
# Install system dependencies
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
curl \
build-essential \
&& rm -rf /var/lib/apt/lists/*
# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Change ownership to non-root user
RUN chown -R appuser:appuser /app
USER appuser
# Expose port
EXPOSE 8000
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8000/health || exit 1
# Run the application
CMD ["gunicorn", "app.main:app", \
"--worker-class", "uvicorn.workers.UvicornWorker", \
"--workers", "4", \
"--bind", "0.0.0.0:8000", \
"--timeout", "120", \
"--access-logfile", "-"]
Multi-stage builds produce smaller images by separating build dependencies from the runtime environment:
# Dockerfile.multistage
# ---- Build Stage ----
FROM python:3.12-slim AS builder
ENV PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=1
WORKDIR /build
# Install build dependencies
RUN apt-get update \
&& apt-get install -y --no-install-recommends build-essential \
&& rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --prefix=/install --no-cache-dir -r requirements.txt
# ---- Runtime Stage ----
FROM python:3.12-slim AS runtime
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1
# Create non-root user
RUN groupadd -r appuser && useradd -r -g appuser -d /app -s /sbin/nologin appuser
# Install runtime-only system dependencies
RUN apt-get update \
&& apt-get install -y --no-install-recommends curl \
&& rm -rf /var/lib/apt/lists/*
# Copy Python packages from builder
COPY --from=builder /install /usr/local
WORKDIR /app
COPY --chown=appuser:appuser . .
USER appuser
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8000/health || exit 1
CMD ["gunicorn", "app.main:app", \
"--worker-class", "uvicorn.workers.UvicornWorker", \
"--workers", "4", \
"--bind", "0.0.0.0:8000"]
Exclude unnecessary files from the build context:
# .dockerignore
__pycache__
*.pyc
*.pyo
.git
.gitignore
.env
.env.*
.venv
venv
*.md
docs/
tests/
.pytest_cache
.coverage
htmlcov/
.mypy_cache
.ruff_cache
docker-compose*.yml
Dockerfile*
.dockerignore
# docker-compose.yml
version: "3.9"
services:
app:
build:
context: .
dockerfile: Dockerfile
ports:
- "8000:8000"
environment:
- DATABASE_URL=postgresql+asyncpg://postgres:password@db:5432/fastapi_db
- REDIS_URL=redis://redis:6379/0
- ENVIRONMENT=development
- DEBUG=true
- LOG_LEVEL=DEBUG
volumes:
- .:/app # Hot reload in development
depends_on:
db:
condition: service_healthy
redis:
condition: service_healthy
command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
db:
image: postgres:16-alpine
environment:
POSTGRES_USER: postgres
POSTGRES_PASSWORD: password
POSTGRES_DB: fastapi_db
ports:
- "5432:5432"
volumes:
- postgres_data:/var/lib/postgresql/data
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 5s
timeout: 5s
retries: 5
redis:
image: redis:7-alpine
ports:
- "6379:6379"
volumes:
- redis_data:/data
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
timeout: 5s
retries: 5
volumes:
postgres_data:
redis_data:
# Build and start all services
docker compose up --build -d
# View logs
docker compose logs -f app
# Run database migrations
docker compose exec app alembic upgrade head
# Stop all services
docker compose down
# Stop and remove volumes (clean slate)
docker compose down -v
Nginx sits in front of your ASGI server to handle SSL termination, static file serving, load balancing, request buffering, and rate limiting. It is the standard production setup for Python web applications.
# nginx/nginx.conf
upstream fastapi_backend {
server app:8000;
}
server {
listen 80;
server_name yourdomain.com www.yourdomain.com;
# Redirect HTTP to HTTPS
return 301 https://$host$request_uri;
}
server {
listen 443 ssl http2;
server_name yourdomain.com www.yourdomain.com;
# SSL certificates (Let's Encrypt)
ssl_certificate /etc/letsencrypt/live/yourdomain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/yourdomain.com/privkey.pem;
# SSL settings
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
ssl_prefer_server_ciphers off;
ssl_session_timeout 1d;
ssl_session_cache shared:SSL:10m;
ssl_session_tickets off;
# Security headers
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains" always;
# Request size limit
client_max_body_size 10M;
# Gzip compression
gzip on;
gzip_vary on;
gzip_min_length 1024;
gzip_types text/plain text/css application/json application/javascript text/xml;
# Static files
location /static/ {
alias /app/static/;
expires 30d;
add_header Cache-Control "public, immutable";
}
# API proxy
location / {
proxy_pass http://fastapi_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Timeouts
proxy_connect_timeout 60s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
# Buffering
proxy_buffering on;
proxy_buffer_size 4k;
proxy_buffers 8 4k;
}
# Health check endpoint (no logging)
location /health {
proxy_pass http://fastapi_backend/health;
access_log off;
}
}
FastAPI supports WebSockets, which require special Nginx configuration:
# Add to the server block
location /ws/ {
proxy_pass http://fastapi_backend;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# WebSocket timeout (keep alive)
proxy_read_timeout 86400s;
proxy_send_timeout 86400s;
}
If you run multiple FastAPI instances, Nginx can load balance between them:
upstream fastapi_backend {
least_conn; # Send to the server with fewest connections
server app1:8000 weight=3; # Higher weight = more traffic
server app2:8000 weight=2;
server app3:8000 weight=1;
# Health checks (Nginx Plus only, use external for OSS)
# health_check interval=10s fails=3 passes=2;
}
# Add to http block (before server blocks)
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=login:10m rate=1r/s;
server {
# ...
# General API rate limiting
location /api/ {
limit_req zone=api burst=20 nodelay;
proxy_pass http://fastapi_backend;
# ... proxy headers
}
# Strict rate limiting for auth endpoints
location /api/auth/ {
limit_req zone=login burst=5 nodelay;
proxy_pass http://fastapi_backend;
# ... proxy headers
}
}
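limit_req implements a leaky bucket: excess requests drain at the configured rate, and with burst=20 nodelay up to 20 queued "excess" requests are served immediately before Nginx starts rejecting. This simplified simulation (not Nginx's actual implementation) illustrates the arithmetic:

```python
# Rough simulation of limit_req (rate=10r/s, burst=20, nodelay):
# the excess counter drains at `rate` per second; a request is rejected
# once excess exceeds the burst allowance.
def make_limiter(rate: float = 10.0, burst: int = 20):
    state = {"excess": 0.0, "last": 0.0}

    def allow(now: float) -> bool:
        # Drain the bucket for the time elapsed since the last request
        state["excess"] = max(state["excess"] - (now - state["last"]) * rate, 0.0)
        state["last"] = now
        if state["excess"] >= burst + 1:
            return False  # over the burst allowance: Nginx answers 503
        state["excess"] += 1.0
        return True

    return allow
```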
AWS offers multiple ways to deploy FastAPI, from virtual servers (EC2) to managed containers (ECS/Fargate) to serverless (Lambda). Each approach has different trade-offs in cost, complexity, and scalability.
EC2 gives you full control over the server environment. This is a good starting point for teams familiar with server administration.
#!/bin/bash
# ec2-setup.sh - Run on a fresh Ubuntu 22.04 EC2 instance
# Update system
sudo apt-get update && sudo apt-get upgrade -y
# Install Python 3.12 (software-properties-common provides add-apt-repository)
sudo apt-get install -y software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa -y
sudo apt-get install -y python3.12 python3.12-venv python3.12-dev
# Install Nginx
sudo apt-get install -y nginx certbot python3-certbot-nginx
# Install supervisor for process management
sudo apt-get install -y supervisor
# Create application directory
sudo mkdir -p /opt/fastapi-app
sudo chown $USER:$USER /opt/fastapi-app
# Clone your application
cd /opt/fastapi-app
git clone https://github.com/youruser/yourapp.git .
# Create virtual environment
python3.12 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Copy environment file
cp .env.production .env
# /etc/supervisor/conf.d/fastapi.conf
[program:fastapi]
command=/opt/fastapi-app/venv/bin/gunicorn app.main:app
--worker-class uvicorn.workers.UvicornWorker
--workers 4
--bind unix:/tmp/fastapi.sock
--timeout 120
--access-logfile /var/log/fastapi/access.log
--error-logfile /var/log/fastapi/error.log
directory=/opt/fastapi-app
user=www-data
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/fastapi/supervisor.log
environment=
ENVIRONMENT="production",
DATABASE_URL="postgresql+asyncpg://user:pass@rds-endpoint:5432/mydb"
# Start the application
sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl start fastapi
# Check status
sudo supervisorctl status fastapi
ECS Fargate runs your Docker containers without managing servers. You define a task (container specs) and a service (how many to run).
# ecs-task-definition.json
{
"family": "fastapi-app",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "512",
"memory": "1024",
"executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole",
"containerDefinitions": [
{
"name": "fastapi",
"image": "ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/fastapi-app:latest",
"portMappings": [
{
"containerPort": 8000,
"protocol": "tcp"
}
],
"environment": [
{"name": "ENVIRONMENT", "value": "production"},
{"name": "WORKERS", "value": "2"}
],
"secrets": [
{
"name": "DATABASE_URL",
"valueFrom": "arn:aws:ssm:us-east-1:ACCOUNT:parameter/fastapi/database_url"
},
{
"name": "SECRET_KEY",
"valueFrom": "arn:aws:ssm:us-east-1:ACCOUNT:parameter/fastapi/secret_key"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/fastapi-app",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
},
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3,
"startPeriod": 10
}
}
]
}
# Build and push Docker image to ECR
aws ecr get-login-password --region us-east-1 | \
docker login --username AWS --password-stdin ACCOUNT.dkr.ecr.us-east-1.amazonaws.com
docker build -t fastapi-app .
docker tag fastapi-app:latest ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/fastapi-app:latest
docker push ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/fastapi-app:latest
# Register task definition
aws ecs register-task-definition --cli-input-json file://ecs-task-definition.json
# Create or update service
aws ecs update-service \
--cluster fastapi-cluster \
--service fastapi-service \
--task-definition fastapi-app \
--desired-count 2 \
--force-new-deployment
Mangum is an adapter that lets you run FastAPI on AWS Lambda behind API Gateway. This is ideal for low-traffic APIs or APIs with bursty traffic patterns.
pip install mangum
# lambda_handler.py
from mangum import Mangum
from app.main import app
# Create the Lambda handler
handler = Mangum(app, lifespan="off")
# template.yaml (AWS SAM)
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Globals:
Function:
Timeout: 30
MemorySize: 512
Runtime: python3.12
Resources:
FastAPIFunction:
Type: AWS::Serverless::Function
Properties:
Handler: lambda_handler.handler
CodeUri: .
Events:
ApiEvent:
Type: HttpApi
Properties:
Path: /{proxy+}
Method: ANY
RootEvent:
Type: HttpApi
Properties:
Path: /
Method: ANY
Environment:
Variables:
ENVIRONMENT: production
DATABASE_URL: !Ref DatabaseUrl
Policies:
- AmazonSSMReadOnlyAccess
Parameters:
DatabaseUrl:
Type: AWS::SSM::Parameter::Value<String>
Default: /fastapi/database_url
Outputs:
ApiUrl:
Description: API Gateway endpoint URL
Value: !Sub "https://${ServerlessHttpApi}.execute-api.${AWS::Region}.amazonaws.com"
# Deploy with SAM
sam build
sam deploy --guided
| Feature | EC2 | ECS Fargate | Lambda |
|---|---|---|---|
| Server Management | You manage | AWS manages | Fully serverless |
| Scaling | Manual / ASG | Auto-scaling | Automatic |
| Cost Model | Per hour | Per vCPU/memory/sec | Per request |
| Cold Start | None | Minimal | Yes (seconds) |
| WebSockets | Yes | Yes | Via API Gateway |
| Best For | Full control | Containers at scale | Low/bursty traffic |
Heroku is one of the simplest platforms for deploying FastAPI. It handles infrastructure, SSL, and scaling with minimal configuration.
Create the required files in your project root:
# Procfile
web: gunicorn app.main:app --worker-class uvicorn.workers.UvicornWorker --workers 2 --bind 0.0.0.0:$PORT --timeout 120
# runtime.txt
python-3.12.3
# requirements.txt
fastapi==0.115.0
uvicorn[standard]==0.30.0
gunicorn==22.0.0
pydantic-settings==2.5.0
sqlalchemy[asyncio]==2.0.35
asyncpg==0.29.0
alembic==1.13.0
python-dotenv==1.0.1
httpx==0.27.0
# Login to Heroku
heroku login
# Create a new app
heroku create my-fastapi-app
# Add PostgreSQL addon
heroku addons:create heroku-postgresql:essential-0
# Add Redis addon
heroku addons:create heroku-redis:mini
# Set environment variables
heroku config:set \
ENVIRONMENT=production \
SECRET_KEY=$(python -c "import secrets; print(secrets.token_urlsafe(32))") \
JWT_SECRET=$(python -c "import secrets; print(secrets.token_urlsafe(32))") \
LOG_LEVEL=INFO \
LOG_FORMAT=json
# Deploy
git push heroku main
# Run migrations
heroku run alembic upgrade head
# View logs
heroku logs --tail
# Scale dynos
heroku ps:scale web=2
Add a release command to automatically run migrations on each deploy:
# Procfile (updated)
web: gunicorn app.main:app --worker-class uvicorn.workers.UvicornWorker --workers 2 --bind 0.0.0.0:$PORT
release: alembic upgrade head
DigitalOcean offers two main options: App Platform (managed PaaS, similar to Heroku) and Droplets (virtual servers, similar to EC2).
Create an app specification file:
# .do/app.yaml
name: fastapi-app
region: nyc
services:
- name: api
github:
repo: youruser/fastapi-app
branch: main
deploy_on_push: true
build_command: pip install -r requirements.txt
run_command: gunicorn app.main:app --worker-class uvicorn.workers.UvicornWorker --workers 2 --bind 0.0.0.0:$PORT
envs:
- key: ENVIRONMENT
value: production
- key: SECRET_KEY
type: SECRET
value: your-secret-key
- key: DATABASE_URL
scope: RUN_TIME
value: ${db.DATABASE_URL}
instance_count: 2
instance_size_slug: professional-xs
http_port: 8000
health_check:
http_path: /health
databases:
- engine: PG
name: db
num_nodes: 1
size: db-s-dev-database
version: "16"
# Deploy using doctl CLI
doctl apps create --spec .do/app.yaml
# List apps
doctl apps list
# View logs
doctl apps logs APP_ID --type run
For a Droplet (virtual server), the setup is similar to EC2. Create a setup script:
#!/bin/bash
# droplet-setup.sh - For Ubuntu 22.04 Droplet
# Update system
apt-get update && apt-get upgrade -y
# Install dependencies
apt-get install -y python3.12 python3.12-venv python3-pip nginx certbot python3-certbot-nginx
# Setup application
mkdir -p /opt/fastapi-app
cd /opt/fastapi-app
git clone https://github.com/youruser/yourapp.git .
python3.12 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Create systemd service
cat > /etc/systemd/system/fastapi.service << 'UNIT'
[Unit]
Description=FastAPI Application
After=network.target
[Service]
User=www-data
Group=www-data
WorkingDirectory=/opt/fastapi-app
Environment="PATH=/opt/fastapi-app/venv/bin"
EnvironmentFile=/opt/fastapi-app/.env
ExecStart=/opt/fastapi-app/venv/bin/gunicorn app.main:app \
--worker-class uvicorn.workers.UvicornWorker \
--workers 4 \
--bind unix:/tmp/fastapi.sock \
--timeout 120
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
UNIT
# Enable and start
systemctl daemon-reload
systemctl enable fastapi
systemctl start fastapi
# Setup Nginx
cat > /etc/nginx/sites-available/fastapi << 'NGINX'
server {
listen 80;
server_name yourdomain.com;
location / {
proxy_pass http://unix:/tmp/fastapi.sock;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
NGINX
ln -s /etc/nginx/sites-available/fastapi /etc/nginx/sites-enabled/
nginx -t && systemctl restart nginx
# Setup SSL with Let's Encrypt
certbot --nginx -d yourdomain.com --non-interactive --agree-tos -m you@email.com
Automate testing, building, and deployment with GitHub Actions. A proper CI/CD pipeline ensures every change is tested before it reaches production.
# .github/workflows/ci-cd.yml
name: CI/CD Pipeline
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
env:
PYTHON_VERSION: "3.12"
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
jobs:
# ---- Lint & Type Check ----
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: ${{ env.PYTHON_VERSION }}
- name: Install dependencies
run: |
pip install ruff mypy
pip install -r requirements.txt
- name: Run Ruff linter
run: ruff check .
- name: Run Ruff formatter check
run: ruff format --check .
- name: Run MyPy type checker
run: mypy app/ --ignore-missing-imports
# ---- Unit & Integration Tests ----
test:
runs-on: ubuntu-latest
services:
postgres:
image: postgres:16-alpine
env:
POSTGRES_USER: postgres
POSTGRES_PASSWORD: password
POSTGRES_DB: test_db
ports:
- 5432:5432
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
redis:
image: redis:7-alpine
ports:
- 6379:6379
options: >-
--health-cmd "redis-cli ping"
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: ${{ env.PYTHON_VERSION }}
cache: pip
- name: Install dependencies
run: |
pip install -r requirements.txt
pip install -r requirements-dev.txt
- name: Run tests with coverage
env:
DATABASE_URL: postgresql+asyncpg://postgres:password@localhost:5432/test_db
REDIS_URL: redis://localhost:6379/0
ENVIRONMENT: testing
SECRET_KEY: test-secret-key
run: |
pytest tests/ -v --cov=app --cov-report=xml --cov-report=term
- name: Upload coverage report
uses: codecov/codecov-action@v4
with:
file: coverage.xml
fail_ci_if_error: false
# ---- Build Docker Image ----
build:
needs: [lint, test]
runs-on: ubuntu-latest
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
permissions:
contents: read
packages: write
steps:
- uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Log in to Container Registry
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push
uses: docker/build-push-action@v5
with:
context: .
push: true
tags: |
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
# ---- Deploy to Production ----
deploy:
needs: build
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/main'
environment: production
steps:
- name: Deploy to server
uses: appleboy/ssh-action@v1
with:
host: ${{ secrets.SERVER_HOST }}
username: ${{ secrets.SERVER_USER }}
key: ${{ secrets.SSH_PRIVATE_KEY }}
script: |
cd /opt/fastapi-app
docker compose pull
docker compose up -d --remove-orphans
docker compose exec -T app alembic upgrade head
docker system prune -f
Add separate deployment jobs for staging and production:
# ---- Deploy to Staging ----
deploy-staging:
needs: build
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/develop'
environment: staging
steps:
- name: Deploy to staging
uses: appleboy/ssh-action@v1
with:
host: ${{ secrets.STAGING_HOST }}
username: ${{ secrets.SERVER_USER }}
key: ${{ secrets.SSH_PRIVATE_KEY }}
script: |
cd /opt/fastapi-staging
docker compose -f docker-compose.staging.yml pull
docker compose -f docker-compose.staging.yml up -d
# ---- Deploy to Production (manual approval) ----
deploy-production:
needs: build
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/main'
environment:
name: production
url: https://api.yourdomain.com
steps:
- name: Deploy to production
uses: appleboy/ssh-action@v1
with:
host: ${{ secrets.PROD_HOST }}
username: ${{ secrets.SERVER_USER }}
key: ${{ secrets.SSH_PRIVATE_KEY }}
script: |
cd /opt/fastapi-prod
docker compose pull
docker compose up -d --no-deps app
docker compose exec -T app alembic upgrade head
# Verify health
sleep 5
curl -f http://localhost:8000/health || exit 1
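The single sleep-then-curl check above is fragile when the app takes longer than five seconds to boot. A retry loop is more robust; this sketch takes any zero-argument probe callable (for example, a wrapper around an HTTP GET to /health):

```python
# Poll a health probe with retries instead of a single fixed sleep.
# `probe` returns True once the service is healthy.
import time

def wait_until_healthy(probe, attempts: int = 10, delay: float = 0.5) -> bool:
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```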
Alembic is the standard migration tool for SQLAlchemy. Managing migrations in production requires careful coordination with your deployment process to avoid downtime and data loss.
# Install Alembic
pip install alembic
# Initialize Alembic
alembic init alembic
Configure Alembic to use your application’s database URL:
# alembic/env.py
from logging.config import fileConfig
from sqlalchemy import engine_from_config, pool
from alembic import context
import os
import sys

# Add project root to path
sys.path.insert(0, os.path.dirname(os.path.dirname(__file__)))

from app.database import Base  # Your SQLAlchemy Base
from app.models import *       # Import all models so autogenerate sees them

config = context.config

# Override sqlalchemy.url from environment
database_url = os.getenv("DATABASE_URL", "")

# Handle Heroku-style postgres:// URLs
if database_url.startswith("postgres://"):
    database_url = database_url.replace("postgres://", "postgresql://", 1)

config.set_main_option("sqlalchemy.url", database_url)

if config.config_file_name is not None:
    fileConfig(config.config_file_name)

target_metadata = Base.metadata

def run_migrations_offline():
    """Run migrations in 'offline' mode (generates SQL script)."""
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        dialect_opts={"paramstyle": "named"},
    )
    with context.begin_transaction():
        context.run_migrations()

def run_migrations_online():
    """Run migrations in 'online' mode (directly against database)."""
    connectable = engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )
    with connectable.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=target_metadata,
        )
        with context.begin_transaction():
            context.run_migrations()

if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()
# Generate a migration from model changes
alembic revision --autogenerate -m "add_users_table"

# Review the generated migration file before applying!
# Then apply
alembic upgrade head

# Rollback one step
alembic downgrade -1

# View migration history
alembic history --verbose

# Show current revision
alembic current
Create an entrypoint script that runs migrations before starting the application:
#!/bin/bash
# docker-entrypoint.sh
set -e

echo "Running database migrations..."
alembic upgrade head

echo "Starting application..."
exec "$@"
# Dockerfile (updated)
# ... (previous build steps)

COPY docker-entrypoint.sh /docker-entrypoint.sh
RUN chmod +x /docker-entrypoint.sh

ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["gunicorn", "app.main:app", "--worker-class", "uvicorn.workers.UvicornWorker", "--workers", "4", "--bind", "0.0.0.0:8000"]
For zero-downtime deployments, follow the expand-contract pattern:
# Example: Renaming a column (email -> email_address)

# Migration 1: Add new column (expand)
def upgrade():
    op.add_column("users", sa.Column("email_address", sa.String(255), nullable=True))
    # Backfill
    op.execute("UPDATE users SET email_address = email WHERE email_address IS NULL")

def downgrade():
    op.drop_column("users", "email_address")

# Migration 2: Make new column required and drop old (contract)
# Deploy AFTER all code uses email_address
def upgrade():
    op.alter_column("users", "email_address", nullable=False)
    op.drop_column("users", "email")

def downgrade():
    op.add_column("users", sa.Column("email", sa.String(255), nullable=True))
    op.execute("UPDATE users SET email = email_address")
Production applications need comprehensive monitoring to detect issues before users do. This includes health checks, metrics collection, structured logging, and alerting.
# app/routers/health.py
from fastapi import APIRouter, Depends
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import text
import redis.asyncio as redis
from datetime import datetime, timezone

from app.database import get_db
from app.config import get_settings

router = APIRouter(tags=["health"])

@router.get("/health")
async def health_check():
    """Basic health check for load balancers."""
    # datetime.now(timezone.utc) instead of the deprecated datetime.utcnow()
    return {"status": "healthy", "timestamp": datetime.now(timezone.utc).isoformat()}

@router.get("/health/ready")
async def readiness_check(db: AsyncSession = Depends(get_db)):
    """Readiness check - verifies all dependencies are available."""
    checks = {}

    # Database check
    try:
        await db.execute(text("SELECT 1"))
        checks["database"] = {"status": "healthy"}
    except Exception as e:
        checks["database"] = {"status": "unhealthy", "error": str(e)}

    # Redis check
    try:
        settings = get_settings()
        r = redis.from_url(settings.redis_url)
        await r.ping()
        checks["redis"] = {"status": "healthy"}
        await r.close()
    except Exception as e:
        checks["redis"] = {"status": "unhealthy", "error": str(e)}

    overall = "healthy" if all(
        c["status"] == "healthy" for c in checks.values()
    ) else "unhealthy"

    return {
        "status": overall,
        "checks": checks,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
Expose application metrics for Prometheus to scrape:
pip install prometheus-fastapi-instrumentator
# app/metrics.py
from prometheus_fastapi_instrumentator import Instrumentator
from prometheus_client import Counter, Histogram, Gauge

# Custom metrics
REQUEST_COUNT = Counter(
    "app_requests_total",
    "Total number of requests",
    ["method", "endpoint", "status"]
)

REQUEST_DURATION = Histogram(
    "app_request_duration_seconds",
    "Request duration in seconds",
    ["method", "endpoint"],
    buckets=[0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]
)

ACTIVE_CONNECTIONS = Gauge(
    "app_active_connections",
    "Number of active connections"
)

DB_POOL_SIZE = Gauge(
    "app_db_pool_size",
    "Database connection pool size"
)

def setup_metrics(app):
    """Initialize Prometheus instrumentation."""
    Instrumentator(
        should_group_status_codes=False,
        should_ignore_untemplated=True,
        should_respect_env_var=False,
        excluded_handlers=["/health", "/metrics"],
        env_var_name="ENABLE_METRICS",
    ).instrument(app).expose(app, endpoint="/metrics")
Add metrics to your application factory:
# In app/main.py create_app()
from app.metrics import setup_metrics

def create_app() -> FastAPI:
    # ... previous setup ...
    setup_metrics(app)
    return app
pip install sentry-sdk[fastapi]
# app/sentry.py
import sentry_sdk
from sentry_sdk.integrations.fastapi import FastApiIntegration
from sentry_sdk.integrations.sqlalchemy import SqlalchemyIntegration

from app.config import get_settings

def setup_sentry():
    """Initialize Sentry error tracking."""
    settings = get_settings()
    if settings.sentry_dsn:
        sentry_sdk.init(
            dsn=settings.sentry_dsn,
            environment=settings.environment,
            release=settings.app_version,
            integrations=[
                FastApiIntegration(transaction_style="endpoint"),
                SqlalchemyIntegration(),
            ],
            traces_sample_rate=0.1 if settings.environment == "production" else 1.0,
            profiles_sample_rate=0.1,
            send_default_pii=False,  # Don't send user PII
        )
With Prometheus metrics exposed, you can build Grafana dashboards to visualize request rates, latency percentiles, error rates, and resource usage. Run the monitoring stack alongside the application:
# docker-compose monitoring stack
prometheus:
  image: prom/prometheus:latest
  volumes:
    - ./prometheus.yml:/etc/prometheus/prometheus.yml
    - prometheus_data:/prometheus
  ports:
    - "9090:9090"
  command:
    - '--config.file=/etc/prometheus/prometheus.yml'
    - '--storage.tsdb.retention.time=15d'

grafana:
  image: grafana/grafana:latest
  ports:
    - "3000:3000"
  volumes:
    - grafana_data:/var/lib/grafana
  environment:
    - GF_SECURITY_ADMIN_PASSWORD=admin
# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "fastapi"
    static_configs:
      - targets: ["app:8000"]
    metrics_path: /metrics
FastAPI is already one of the fastest Python frameworks, but production applications can benefit from caching, async optimization, connection pooling, and profiling.
# GOOD: Use async for I/O-bound operations
import httpx

async def fetch_external_data(url: str) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.json()

# GOOD: Run CPU-bound tasks in a thread pool
from fastapi.concurrency import run_in_threadpool
import hashlib

async def hash_password(password: str) -> str:
    # pbkdf2_hmac returns bytes; convert to hex so the annotation holds
    hashed = await run_in_threadpool(
        hashlib.pbkdf2_hmac, "sha256", password.encode(), b"salt", 100000
    )
    return hashed.hex()

# GOOD: Parallel async operations
import asyncio

async def get_dashboard_data(user_id: int):
    """Fetch multiple pieces of data concurrently."""
    orders, notifications, recommendations = await asyncio.gather(
        get_user_orders(user_id),
        get_notifications(user_id),
        get_recommendations(user_id),
    )
    return {
        "orders": orders,
        "notifications": notifications,
        "recommendations": recommendations,
    }

# BAD: Sequential async calls (slower)
async def get_dashboard_data_slow(user_id: int):
    orders = await get_user_orders(user_id)           # Wait...
    notifications = await get_notifications(user_id)  # Wait...
    recommendations = await get_recommendations(user_id)  # Wait...
    return {
        "orders": orders,
        "notifications": notifications,
        "recommendations": recommendations,
    }
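The difference is easy to verify with simulated I/O waits. This self-contained sketch uses a 50 ms `asyncio.sleep` to stand in for a database or HTTP call; three concurrent waits finish in roughly one wait's time, while three sequential waits take roughly three:

```python
import asyncio
import time

async def fake_io(label: str) -> str:
    await asyncio.sleep(0.05)  # simulate a 50 ms database or HTTP call
    return label

async def main():
    # Concurrent: all three waits overlap
    start = time.perf_counter()
    results = await asyncio.gather(
        fake_io("orders"), fake_io("notifications"), fake_io("recommendations")
    )
    parallel = time.perf_counter() - start

    # Sequential: each wait starts only after the previous one finishes
    start = time.perf_counter()
    _ = [await fake_io(x) for x in ("orders", "notifications", "recommendations")]
    sequential = time.perf_counter() - start

    return results, parallel, sequential

results, parallel, sequential = asyncio.run(main())
```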
pip install redis
# app/cache.py
import json
import hashlib
from functools import wraps
from typing import Optional, Callable
import redis.asyncio as redis

from app.config import get_settings

_redis_client: Optional[redis.Redis] = None

async def get_redis() -> redis.Redis:
    """Get or create Redis client."""
    global _redis_client
    if _redis_client is None:
        settings = get_settings()
        _redis_client = redis.from_url(
            settings.redis_url,
            encoding="utf-8",
            decode_responses=True,
        )
    return _redis_client

async def cache_get(key: str) -> Optional[dict]:
    """Get a value from cache."""
    r = await get_redis()
    data = await r.get(key)
    if data:
        return json.loads(data)
    return None

async def cache_set(key: str, value: dict, ttl: int = 300):
    """Set a value in cache with TTL (default 5 minutes)."""
    r = await get_redis()
    await r.setex(key, ttl, json.dumps(value))

async def cache_delete(key: str):
    """Delete a key from cache."""
    r = await get_redis()
    await r.delete(key)

async def cache_delete_pattern(pattern: str):
    """Delete all keys matching a pattern."""
    r = await get_redis()
    async for key in r.scan_iter(match=pattern):
        await r.delete(key)

def cached(ttl: int = 300, prefix: str = ""):
    """Decorator for caching endpoint responses."""
    def decorator(func: Callable):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            # Build cache key from function name and arguments.
            # Keep the prefix OUTSIDE the hash so that
            # cache_delete_pattern("prefix:*") can match these keys.
            key_data = f"{func.__name__}:{str(args)}:{str(sorted(kwargs.items()))}"
            cache_key = f"{prefix}:{hashlib.md5(key_data.encode()).hexdigest()}"

            # Check cache
            cached_result = await cache_get(cache_key)
            if cached_result is not None:
                return cached_result

            # Execute function
            result = await func(*args, **kwargs)

            # Store in cache
            if isinstance(result, dict):
                await cache_set(cache_key, result, ttl)
            elif hasattr(result, "model_dump"):
                await cache_set(cache_key, result.model_dump(), ttl)

            return result
        return wrapper
    return decorator
Use the caching decorator on your endpoints:
from app.cache import cached, cache_delete_pattern

@router.get("/products/{product_id}")
@cached(ttl=600, prefix="product")
async def get_product(product_id: int, db: AsyncSession = Depends(get_db)):
    """Get product with 10-minute cache."""
    product = await db.get(Product, product_id)
    if not product:
        raise HTTPException(status_code=404, detail="Product not found")
    return ProductResponse.model_validate(product).model_dump()

@router.put("/products/{product_id}")
async def update_product(product_id: int, data: ProductUpdate, db: AsyncSession = Depends(get_db)):
    """Update product and invalidate cache."""
    product = await db.get(Product, product_id)
    # ... update logic ...
    await cache_delete_pattern("product:*")
    return ProductResponse.model_validate(product)
# Add GZip middleware for large responses
from fastapi.middleware.gzip import GZipMiddleware

app.add_middleware(GZipMiddleware, minimum_size=1000)  # Compress responses > 1KB
Use profiling to find bottlenecks in your application:
# app/profiling.py - Development only
import cProfile
import pstats
import io
import time
from fastapi import Request
from starlette.middleware.base import BaseHTTPMiddleware

class ProfilingMiddleware(BaseHTTPMiddleware):
    """Profile requests and log slow endpoints. DEV ONLY."""

    async def dispatch(self, request: Request, call_next):
        profiler = cProfile.Profile()
        start = time.perf_counter()
        profiler.enable()
        response = await call_next(request)
        profiler.disable()
        elapsed = time.perf_counter() - start

        # Use wall-clock time for the threshold; summing cProfile's
        # cumulative times would double-count nested calls
        if elapsed > 0.1:  # 100ms threshold
            stream = io.StringIO()
            stats = pstats.Stats(profiler, stream=stream)
            stats.sort_stats("cumulative")
            stats.print_stats(20)
            print(f"SLOW REQUEST: {request.method} {request.url.path} ({elapsed:.3f}s)")
            print(stream.getvalue())

        return response
As your application grows, you need strategies to handle increased traffic. Scaling involves horizontal scaling (more instances), load balancing, caching layers, and rate limiting.
# Scale to multiple instances
docker compose up -d --scale app=4

# Nginx automatically load balances across all instances
# docker-compose.prod.yml - Production scaling
version: "3.9"

services:
  app:
    build: .
    deploy:
      replicas: 4
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
        reservations:
          cpus: "0.25"
          memory: 128M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
    environment:
      - DATABASE_URL=postgresql+asyncpg://user:pass@db:5432/mydb
      - REDIS_URL=redis://redis:6379/0
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf
      - ./nginx/certs:/etc/nginx/certs
    depends_on:
      - app
pip install slowapi
# app/rate_limit.py
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded
from slowapi.middleware import SlowAPIMiddleware

limiter = Limiter(
    key_func=get_remote_address,
    default_limits=["100/minute"],
    storage_uri="redis://localhost:6379/1",
    strategy="fixed-window-elastic-expiry",
)

def setup_rate_limiting(app):
    """Configure rate limiting for the application."""
    app.state.limiter = limiter
    app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
    app.add_middleware(SlowAPIMiddleware)
Apply rate limits to specific endpoints:
from app.rate_limit import limiter

@router.post("/auth/login")
@limiter.limit("5/minute")
async def login(request: Request, credentials: LoginRequest):
    """Login with strict rate limiting."""
    # ... authentication logic
    pass

@router.get("/api/search")
@limiter.limit("30/minute")
async def search(request: Request, q: str):
    """Search with moderate rate limiting."""
    # ... search logic
    pass
For long-running tasks, use a task queue to process work asynchronously:
pip install celery[redis]
# app/tasks.py
from celery import Celery

from app.config import get_settings

settings = get_settings()

celery_app = Celery(
    "fastapi_tasks",
    broker=settings.redis_url,
    backend=settings.redis_url,
)

celery_app.conf.update(
    task_serializer="json",
    result_serializer="json",
    accept_content=["json"],
    timezone="UTC",
    task_track_started=True,
    task_time_limit=300,             # 5 minute hard limit
    task_soft_time_limit=240,        # 4 minute soft limit
    worker_max_tasks_per_child=100,  # Restart workers after 100 tasks
)

@celery_app.task(bind=True, max_retries=3)
def send_email_task(self, to_email: str, subject: str, body: str):
    """Send email asynchronously."""
    try:
        # ... send email logic
        pass
    except Exception as exc:
        self.retry(exc=exc, countdown=60)  # Retry after 60 seconds

@celery_app.task
def generate_report_task(user_id: int, report_type: str):
    """Generate report in background."""
    # ... heavy computation
    pass
# Use in FastAPI endpoints
from app.tasks import send_email_task, generate_report_task, celery_app

@router.post("/reports/generate")
async def generate_report(user_id: int, report_type: str):
    task = generate_report_task.delay(user_id, report_type)
    return {"task_id": task.id, "status": "processing"}

@router.get("/tasks/{task_id}")
async def get_task_status(task_id: str):
    from celery.result import AsyncResult
    # Bind the result to our app so it uses the configured backend
    result = AsyncResult(task_id, app=celery_app)
    return {
        "task_id": task_id,
        "status": result.status,
        "result": result.result if result.ready() else None,
    }
| Strategy | When to Use | Complexity |
|---|---|---|
| Vertical scaling (bigger server) | Quick fix, small apps | Low |
| Horizontal scaling (more instances) | High traffic, stateless apps | Medium |
| Caching (Redis) | Repeated reads, expensive queries | Medium |
| Background tasks (Celery) | Long operations, email, reports | Medium |
| Database read replicas | Read-heavy workloads | High |
| CDN for static assets | Global users, static content | Low |
| Microservices | Large teams, complex domains | Very High |
Security is not optional in production. FastAPI provides several built-in security features, but you need to configure additional layers for a properly hardened deployment.
Always enforce HTTPS in production. Use the HTTPS redirect middleware:
from fastapi.middleware.httpsredirect import HTTPSRedirectMiddleware

if settings.environment == "production":
    app.add_middleware(HTTPSRedirectMiddleware)
# app/security.py
from fastapi import Request
from starlette.middleware.base import BaseHTTPMiddleware

class SecurityHeadersMiddleware(BaseHTTPMiddleware):
    """Add security headers to all responses."""

    async def dispatch(self, request: Request, call_next):
        response = await call_next(request)
        response.headers["X-Content-Type-Options"] = "nosniff"
        response.headers["X-Frame-Options"] = "DENY"
        response.headers["X-XSS-Protection"] = "1; mode=block"
        response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
        response.headers["Permissions-Policy"] = (
            "camera=(), microphone=(), geolocation=(), payment=()"
        )
        if request.url.scheme == "https":
            response.headers["Strict-Transport-Security"] = (
                "max-age=63072000; includeSubDomains; preload"
            )
        return response
from fastapi.middleware.cors import CORSMiddleware

# NEVER use allow_origins=["*"] in production
app.add_middleware(
    CORSMiddleware,
    allow_origins=[
        "https://yourdomain.com",
        "https://www.yourdomain.com",
        "https://admin.yourdomain.com",
    ],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "DELETE", "PATCH"],
    allow_headers=["Authorization", "Content-Type", "X-Request-ID"],
    expose_headers=["X-Request-ID"],
    max_age=3600,  # Cache preflight for 1 hour
)
Never hardcode secrets. Use environment variables and secrets management services:
# app/secrets.py
import boto3
import json
from functools import lru_cache

@lru_cache()
def get_aws_secret(secret_name: str, region: str = "us-east-1") -> dict:
    """Retrieve secrets from AWS Secrets Manager."""
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])

# Usage in settings
class Settings(BaseSettings):
    @classmethod
    def _load_aws_secrets(cls):
        """Load secrets from AWS Secrets Manager at startup."""
        try:
            return get_aws_secret("fastapi/production")
        except Exception:
            return {}

    def __init__(self, **kwargs):
        aws_secrets = self._load_aws_secrets()
        # AWS secrets override env vars
        for key, value in aws_secrets.items():
            if key.lower() not in kwargs:
                kwargs[key.lower()] = value
        super().__init__(**kwargs)
from pydantic import BaseModel, Field, field_validator
import bleach
import re

class UserInput(BaseModel):
    """User input with validation and sanitization."""

    username: str = Field(min_length=3, max_length=50, pattern=r"^[a-zA-Z0-9_-]+$")
    email: str = Field(max_length=255)
    bio: str = Field(max_length=1000, default="")

    @field_validator("bio")
    @classmethod
    def sanitize_bio(cls, v: str) -> str:
        """Remove HTML tags from bio."""
        return bleach.clean(v, tags=[], strip=True)

    @field_validator("email")
    @classmethod
    def validate_email(cls, v: str) -> str:
        """Validate email format."""
        email_regex = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
        if not re.match(email_regex, v):
            raise ValueError("Invalid email format")
        return v.lower()
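In use, invalid payloads raise `ValidationError` before any handler logic runs. Here is a trimmed, self-contained version of the pattern without the `bleach` dependency (the `SignupForm` name and sample values are illustrative):

```python
import re
from pydantic import BaseModel, Field, ValidationError, field_validator

class SignupForm(BaseModel):
    # Reduced version of UserInput above
    username: str = Field(min_length=3, max_length=50, pattern=r"^[a-zA-Z0-9_-]+$")
    email: str = Field(max_length=255)

    @field_validator("email")
    @classmethod
    def validate_email(cls, v: str) -> str:
        if not re.match(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$", v):
            raise ValueError("Invalid email format")
        return v.lower()

# Valid input is normalized (email lowercased by the validator)
ok = SignupForm(username="alice_1", email="Alice@Example.COM")

# Invalid input raises ValidationError with one entry per failing field
errors = []
try:
    SignupForm(username="a!", email="not-an-email")
except ValidationError as exc:
    errors = exc.errors()
```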
| Category | Item | Status |
|---|---|---|
| Transport | HTTPS enforced everywhere | Required |
| Transport | HSTS header enabled | Required |
| Auth | Passwords hashed with bcrypt/argon2 | Required |
| Auth | JWT tokens with short expiry | Required |
| Auth | Rate limiting on login endpoints | Required |
| Headers | Security headers on all responses | Required |
| CORS | Specific origins (no wildcards) | Required |
| Input | Pydantic validation on all inputs | Required |
| Secrets | No secrets in code or git | Required |
| Secrets | Use secrets manager (AWS SM, Vault) | Recommended |
| Dependencies | Regular dependency updates | Required |
| Docs | Disable /docs and /redoc in production | Recommended |
Here is a complete production-ready docker-compose setup with FastAPI, PostgreSQL, Redis, Nginx, Celery, and monitoring — everything you need to deploy a real-world application.
fastapi-production/
├── app/
│   ├── __init__.py
│   ├── main.py              # Application factory
│   ├── config.py            # Pydantic settings
│   ├── database.py          # Database setup
│   ├── models/              # SQLAlchemy models
│   ├── schemas/             # Pydantic schemas
│   ├── routers/             # API routes
│   ├── services/            # Business logic
│   ├── middleware.py        # Custom middleware
│   ├── cache.py             # Redis caching
│   ├── tasks.py             # Celery tasks
│   └── logging_config.py    # Structured logging
├── alembic/                 # Database migrations
│   ├── versions/
│   └── env.py
├── nginx/
│   ├── nginx.conf
│   └── certs/
├── tests/
│   ├── conftest.py
│   ├── test_routes/
│   └── test_services/
├── .github/
│   └── workflows/
│       └── ci-cd.yml
├── Dockerfile
├── docker-compose.yml       # Development
├── docker-compose.prod.yml  # Production
├── docker-entrypoint.sh
├── gunicorn.conf.py
├── requirements.txt
├── requirements-dev.txt
├── alembic.ini
├── .env.example
├── .dockerignore
└── .gitignore
# docker-compose.prod.yml
version: "3.9"

services:
  # ---- FastAPI Application ----
  app:
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      - ENVIRONMENT=production
      - DATABASE_URL=postgresql+asyncpg://fastapi:${DB_PASSWORD}@db:5432/fastapi_prod
      - REDIS_URL=redis://redis:6379/0
      - SECRET_KEY=${SECRET_KEY}
      - JWT_SECRET=${JWT_SECRET}
      - LOG_LEVEL=INFO
      - LOG_FORMAT=json
      - WORKERS=4
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: always
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    networks:
      - backend
      - frontend

  # ---- Celery Worker ----
  celery-worker:
    build: .
    command: celery -A app.tasks worker --loglevel=info --concurrency=4
    environment:
      - DATABASE_URL=postgresql+asyncpg://fastapi:${DB_PASSWORD}@db:5432/fastapi_prod
      - REDIS_URL=redis://redis:6379/0
    depends_on:
      - db
      - redis
    restart: always
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: "0.5"
          memory: 256M
    networks:
      - backend

  # ---- Celery Beat (Scheduler) ----
  celery-beat:
    build: .
    command: celery -A app.tasks beat --loglevel=info
    environment:
      - REDIS_URL=redis://redis:6379/0
    depends_on:
      - redis
    restart: always
    networks:
      - backend

  # ---- PostgreSQL ----
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: fastapi
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: fastapi_prod
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U fastapi -d fastapi_prod"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: always
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 1G
    networks:
      - backend

  # ---- Redis ----
  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: always
    networks:
      - backend

  # ---- Nginx Reverse Proxy ----
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/certs:/etc/nginx/certs:ro
      - static_files:/app/static:ro
    depends_on:
      - app
    restart: always
    networks:
      - frontend

  # ---- Prometheus (Monitoring) ----
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"
    restart: always
    networks:
      - backend

  # ---- Grafana (Dashboards) ----
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    restart: always
    networks:
      - backend

volumes:
  postgres_data:
  redis_data:
  static_files:
  prometheus_data:
  grafana_data:

networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
# nginx/nginx.conf (production)
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
    use epoll;
    multi_accept on;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # Logging format
    log_format json_combined escape=json
        '{"time":"$time_iso8601",'
        '"remote_addr":"$remote_addr",'
        '"request":"$request",'
        '"status":$status,'
        '"body_bytes_sent":$body_bytes_sent,'
        '"request_time":$request_time,'
        '"upstream_response_time":"$upstream_response_time"}';

    access_log /var/log/nginx/access.log json_combined;

    # Performance
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;

    # Gzip
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_min_length 1024;
    gzip_types text/plain text/css application/json application/javascript text/xml;

    # Rate limiting
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=auth:10m rate=1r/s;

    # Upstream (load balancing across app replicas)
    upstream app {
        least_conn;
        server app:8000;
    }

    # HTTP -> HTTPS redirect
    server {
        listen 80;
        server_name _;
        return 301 https://$host$request_uri;
    }

    # HTTPS server
    server {
        listen 443 ssl http2;
        server_name yourdomain.com;

        ssl_certificate /etc/nginx/certs/fullchain.pem;
        ssl_certificate_key /etc/nginx/certs/privkey.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
        ssl_prefer_server_ciphers off;

        # Security headers
        add_header X-Frame-Options "DENY" always;
        add_header X-Content-Type-Options "nosniff" always;
        add_header Strict-Transport-Security "max-age=63072000" always;

        client_max_body_size 10M;

        # API endpoints
        location /api/ {
            limit_req zone=api burst=20 nodelay;
            proxy_pass http://app;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }

        # Auth endpoints (strict rate limiting)
        location /api/auth/ {
            limit_req zone=auth burst=5 nodelay;
            proxy_pass http://app;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }

        # WebSocket
        location /ws/ {
            proxy_pass http://app;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_read_timeout 86400s;
        }

        # Health check (no logging, no rate limit)
        location /health {
            access_log off;
            proxy_pass http://app;
        }

        # Static files
        location /static/ {
            alias /app/static/;
            expires 30d;
            add_header Cache-Control "public, immutable";
        }
    }
}
# Create .env file for production secrets
cat > .env << 'EOF'
DB_PASSWORD=your-strong-password-here
SECRET_KEY=your-secret-key-here
JWT_SECRET=your-jwt-secret-here
GRAFANA_PASSWORD=admin-password-here
EOF

# Start the full production stack
docker compose -f docker-compose.prod.yml up -d --build

# Check all services are healthy
docker compose -f docker-compose.prod.yml ps

# View application logs
docker compose -f docker-compose.prod.yml logs -f app

# Run database migrations
docker compose -f docker-compose.prod.yml exec app alembic upgrade head

# Scale application horizontally
docker compose -f docker-compose.prod.yml up -d --scale app=4

# Rolling update (zero downtime)
docker compose -f docker-compose.prod.yml build app
docker compose -f docker-compose.prod.yml up -d --no-deps app

# Backup database
docker compose -f docker-compose.prod.yml exec db pg_dump -U fastapi fastapi_prod > backup.sql
| # | Topic | Key Points |
|---|---|---|
| 1 | Configuration | Use pydantic-settings for type-safe configuration from environment variables. Never hardcode secrets. |
| 2 | ASGI Servers | Use Gunicorn with Uvicorn workers for production. Set workers to (2 * CPU) + 1. Enable max_requests to prevent memory leaks. |
| 3 | Docker | Use multi-stage builds for smaller images. Run as non-root user. Include health checks. Use .dockerignore to reduce context size. |
| 4 | Nginx | Always use Nginx as a reverse proxy. Handle SSL termination, static files, rate limiting, and WebSocket proxying at the Nginx layer. |
| 5 | AWS | EC2 for full control, ECS/Fargate for managed containers, Lambda with Mangum for serverless. Use SSM Parameter Store or Secrets Manager for secrets. |
| 6 | Heroku | Simplest deployment path. Use Procfile with Gunicorn + Uvicorn workers. Add release phase for auto-migrations. |
| 7 | DigitalOcean | App Platform for managed PaaS or Droplets with systemd for full control. Both work well for FastAPI. |
| 8 | CI/CD | GitHub Actions pipeline: lint, test with services (Postgres, Redis), build Docker image, deploy. Use environments for staging/production separation. |
| 9 | Migrations | Use Alembic for database migrations. Run migrations in Docker entrypoint or release phase. Follow expand-contract pattern for zero-downtime changes. |
| 10 | Monitoring | Health check endpoints for load balancers. Prometheus metrics with Grafana dashboards. Sentry for error tracking. Structured JSON logging. |
| 11 | Performance | Use asyncio.gather for parallel I/O. Cache with Redis. Enable GZip compression. Profile slow endpoints to find bottlenecks. |
| 12 | Scaling | Start with vertical scaling, then horizontal. Use Celery for background tasks. Rate limit with slowapi. Consider read replicas for DB-heavy workloads. |
| 13 | Security | Enforce HTTPS, add security headers, configure CORS properly, validate all inputs with Pydantic, use secrets management, disable docs in production. |
| 14 | Full Stack | Production stack: FastAPI + PostgreSQL + Redis + Nginx + Celery + Prometheus + Grafana. Use docker-compose for orchestration with health checks, resource limits, and network isolation. |
With these configurations and practices in place, your FastAPI application is ready for production traffic. Start simple — you don’t need every component from day one. Begin with Docker + Nginx + Gunicorn, add monitoring as you grow, and scale horizontally when needed.