Flask – Deployment

Introduction

You have built a Flask application. It handles routes, talks to a database, renders templates, and works perfectly on your laptop. Now comes the part that separates hobby projects from production software: deployment.

Deployment is the process of taking your application from a development environment — where you run flask run and hit localhost:5000 — to a production environment where real users access it over the internet, 24 hours a day, under unpredictable load, with zero tolerance for data loss or security breaches.

The gap between development and production is enormous. In development, you have one user (yourself), debug mode is on, the database is local, and if the server crashes you just restart it. In production, you might have thousands of concurrent users, secrets must be locked down, the database needs connection pooling and backups, the server must survive crashes and restart automatically, and every request must be served over HTTPS.

This tutorial covers every aspect of deploying a Flask application to production. We will work through the deployment stack from the inside out: preparing your application code, configuring a production WSGI server, setting up a reverse proxy, containerizing with Docker, deploying to cloud platforms, and building CI/CD pipelines. By the end, you will have a complete, repeatable deployment workflow that you can use for any Flask project.

The Deployment Checklist

Before we dive into specifics, here is the high-level checklist every Flask deployment must address:

  • Debug mode OFF — Never run with debug mode enabled in production
  • WSGI server — Replace the Flask development server with Gunicorn or uWSGI
  • Reverse proxy — Put Nginx in front of your WSGI server
  • HTTPS — Encrypt all traffic with TLS certificates
  • Environment variables — No hardcoded secrets in source code
  • Database — Connection pooling, migrations, automated backups
  • Logging — Structured logging to files or external services
  • Monitoring — Health checks, error tracking, performance metrics
  • CI/CD — Automated testing, building, and deployment
  • Scaling — Horizontal scaling strategy for when traffic grows

Let us work through each of these systematically.


1. Preparing for Production

Production readiness starts in your application code. Before you think about servers, containers, or cloud platforms, your Flask app itself must be configured correctly.

Debug Mode OFF

This is the single most critical deployment rule. Flask’s debug mode enables the interactive debugger, which allows anyone who can trigger an error to execute arbitrary Python code on your server. It also enables the reloader, which watches your files for changes and restarts the process — unnecessary overhead in production.

# NEVER do this in production
app.run(debug=True)  # Interactive debugger exposed to the internet

# Correct: debug off, or better yet, don't use app.run() at all
app.run(debug=False)

In production, you will not call app.run() at all. A WSGI server like Gunicorn imports your application object directly. But if your code has debug=True anywhere, make sure it is controlled by an environment variable.
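A minimal sketch of that environment-variable pattern using only the standard library (FLASK_DEBUG is the variable the Flask CLI itself recognizes; adapt the name to your project's conventions):

```python
import os

# Debug is on only when FLASK_DEBUG=1 is explicitly set in the environment;
# any other value (or no value at all) leaves it off, so production
# defaults to the safe setting.
DEBUG = os.environ.get("FLASK_DEBUG", "0") == "1"

print(DEBUG)
```

In development you would export FLASK_DEBUG=1 and pass app.run(debug=DEBUG); in production the variable is simply never set.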

Configuration Classes

Professional Flask applications use configuration classes to separate development, testing, and production settings. This pattern keeps sensitive production values out of your code and makes it easy to switch environments.

# config.py
import os


class Config:
    """Base configuration shared across all environments."""
    SECRET_KEY = os.environ.get("SECRET_KEY", "fallback-dev-key-change-me")
    SQLALCHEMY_TRACK_MODIFICATIONS = False
    MAX_CONTENT_LENGTH = 16 * 1024 * 1024  # 16 MB upload limit


class DevelopmentConfig(Config):
    """Local development settings."""
    DEBUG = True
    SQLALCHEMY_DATABASE_URI = os.environ.get(
        "DATABASE_URL",
        "sqlite:///dev.db"
    )


class TestingConfig(Config):
    """Test suite settings."""
    TESTING = True
    SQLALCHEMY_DATABASE_URI = "sqlite:///:memory:"
    WTF_CSRF_ENABLED = False


class ProductionConfig(Config):
    """Production settings - all secrets from environment variables."""
    DEBUG = False
    TESTING = False
    SQLALCHEMY_DATABASE_URI = os.environ["DATABASE_URL"]  # No fallback; crash if missing
    SECRET_KEY = os.environ["SECRET_KEY"]  # No fallback; crash if missing
    SESSION_COOKIE_SECURE = True
    SESSION_COOKIE_HTTPONLY = True
    SESSION_COOKIE_SAMESITE = "Lax"
    PREFERRED_URL_SCHEME = "https"


config_by_name = {
    "development": DevelopmentConfig,
    "testing": TestingConfig,
    "production": ProductionConfig,
}

Notice that ProductionConfig uses os.environ["DATABASE_URL"] without a fallback. This is intentional. If the environment variable is not set, the application crashes immediately with a clear KeyError instead of silently connecting to the wrong database or running with a default secret key. One caveat: class bodies execute at import time, so importing config.py requires these variables to be set even when you select DevelopmentConfig. Either load a .env file with placeholder values locally, or move the lookups into a function that runs only when the production config is chosen.
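The fail-fast behavior is easy to see in isolation. With the variable unset, the bare lookup raises KeyError before the app could serve a single request:

```python
import os

# Ensure the variable is absent, then attempt the fail-fast lookup.
os.environ.pop("DATABASE_URL", None)
try:
    url = os.environ["DATABASE_URL"]
except KeyError:
    print("refusing to start: DATABASE_URL is not set")
```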

Application Factory

Load the correct configuration in your application factory:

# app/__init__.py
import os
from flask import Flask
from config import config_by_name


def create_app(config_name=None):
    if config_name is None:
        config_name = os.environ.get("FLASK_ENV", "development")

    app = Flask(__name__)
    app.config.from_object(config_by_name[config_name])

    # Initialize extensions
    from app.extensions import db, migrate, ma
    db.init_app(app)
    migrate.init_app(app, db)
    ma.init_app(app)

    # Register blueprints
    from app.routes import api_bp
    app.register_blueprint(api_bp, url_prefix="/api")

    return app

Pinning Dependencies

Your requirements.txt must pin every dependency to an exact version. Without pinning, a new install might pull a different version of a library that introduces breaking changes or security vulnerabilities.

# Generate pinned requirements from your current environment
pip freeze > requirements.txt

A pinned requirements.txt looks like this:

# requirements.txt
Flask==3.1.0
Flask-SQLAlchemy==3.1.1
Flask-Migrate==4.0.7
gunicorn==23.0.0
psycopg2-binary==2.9.10
python-dotenv==1.0.1
marshmallow==3.23.1
redis==5.2.1

For more robust dependency management, consider using pip-tools. You write a requirements.in with your direct dependencies, and pip-compile generates a fully pinned requirements.txt with all transitive dependencies and hash verification.
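The pip-tools workflow looks roughly like this (the pip-compile and pip-sync steps are shown as comments since they require pip-tools installed and network access; the package names are illustrative):

```shell
# requirements.in holds only your direct dependencies
cat > requirements.in <<'EOF'
flask
gunicorn
psycopg2-binary
EOF

# Then, with pip-tools installed in your environment:
#   pip-compile --generate-hashes requirements.in   # writes a fully pinned requirements.txt
#   pip-sync requirements.txt                       # installs exactly that set, nothing more
cat requirements.in
```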

Static File Handling

In development, Flask serves static files from the static/ directory. In production, this is inefficient — Flask is a Python application server, not a file server. Nginx (or a CDN) should serve static files directly, bypassing your Python process entirely. We will configure this in the Nginx section.

For now, make sure your static files are organized:

myapp/
├── app/
│   ├── static/
│   │   ├── css/
│   │   ├── js/
│   │   └── images/
│   ├── templates/
│   └── ...

2. WSGI Servers

Flask’s built-in development server runs in a single process, is not optimized for performance, and has no process management: no worker supervision, no graceful restarts, no automatic recovery. It is designed for one thing: local development convenience. Running it in production is like driving a go-kart on the highway — it technically moves forward, but it is not built for the conditions.

A production WSGI server handles multiple concurrent requests using worker processes or threads, manages worker lifecycle (restarting crashed workers), and is tuned for throughput and reliability.

Gunicorn

Gunicorn (Green Unicorn) is the most popular WSGI server for Python applications. It uses a pre-fork worker model: a master process spawns multiple worker processes, each handling requests independently. If a worker crashes, the master spawns a replacement.

# Install Gunicorn
pip install gunicorn

Basic Usage

# Run with default settings (1 worker)
gunicorn "app:create_app()"

# Specify host and port
gunicorn --bind 0.0.0.0:8000 "app:create_app()"

# Multiple workers
gunicorn --workers 4 --bind 0.0.0.0:8000 "app:create_app()"

The string "app:create_app()" tells Gunicorn to import the app module and call create_app() to get the WSGI application object. If your app object is a module-level variable, use "app:app" or "wsgi:app".

Worker Configuration

The number of workers determines how many concurrent requests your server can handle. The general formula is:

workers = (2 * CPU_CORES) + 1

On a 4-core machine, that is 9 workers. Each worker is a separate OS process with its own memory space, so more workers means more memory usage. Monitor your server’s memory and adjust accordingly.

Gunicorn Configuration File

For production, use a configuration file instead of command-line arguments:

# gunicorn.conf.py
import multiprocessing

# Server socket
bind = "0.0.0.0:8000"

# Worker processes
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "sync"
worker_connections = 1000
timeout = 30
keepalive = 2

# Logging
accesslog = "-"  # stdout
errorlog = "-"   # stderr
loglevel = "info"

# Process naming
proc_name = "myapp"

# Server mechanics
daemon = False
pidfile = None
umask = 0
tmp_upload_dir = None

# Restart workers after this many requests (prevents memory leaks)
max_requests = 1000
max_requests_jitter = 50

# Preload application code before forking workers
preload_app = True

# Run with config file
gunicorn -c gunicorn.conf.py "app:create_app()"

Worker Types

Gunicorn supports different worker types for different workloads:

  • sync (default) — Each worker handles one request at a time. Simple and reliable. Good for CPU-bound applications.
  • gthread — Each worker uses multiple threads. Good when your application does I/O (database queries, API calls) and you want concurrency without the memory cost of more processes.
  • gevent — Uses greenlets for cooperative multitasking. A single worker can handle hundreds of concurrent connections. Excellent for I/O-bound applications.
  • uvicorn.workers.UvicornWorker — ASGI worker. Note that Flask is a WSGI framework: async def views (supported since Flask 2.0) still run under the standard sync or gthread workers, so this worker is only useful if you wrap the app in an ASGI adapter or move to an ASGI framework such as Quart.

# Threaded workers (4 threads per worker)
gunicorn --workers 4 --threads 4 --bind 0.0.0.0:8000 "app:create_app()"

# Gevent workers
pip install gevent
gunicorn --workers 4 --worker-class gevent --bind 0.0.0.0:8000 "app:create_app()"

# Uvicorn workers (the target must be an ASGI app; a plain Flask app is
# WSGI-only, so wrap it first, e.g. with asgiref's WsgiToAsgi)
pip install uvicorn
gunicorn --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000 "asgi:app"  # "asgi" is your module exposing the wrapped app

uWSGI

uWSGI is an alternative WSGI server with more features and more complexity. It supports the same pre-fork model but adds protocol-level optimizations, built-in caching, and its own process management.

# Install uWSGI
pip install uwsgi

# Run Flask app
uwsgi --http 0.0.0.0:8000 --wsgi-file wsgi.py --callable app --processes 4 --threads 2
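For anything beyond a quick test, uWSGI options usually live in an ini file rather than on the command line. A minimal sketch mirroring the command above, using standard uWSGI option names:

```ini
; uwsgi.ini, mirroring the command-line invocation above
[uwsgi]
http = 0.0.0.0:8000
wsgi-file = wsgi.py
callable = app
processes = 4
threads = 2
; supervise workers from a master process and shut down cleanly on SIGTERM
master = true
die-on-term = true
```

Run it with uwsgi --ini uwsgi.ini.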

uWSGI is powerful but has a steeper learning curve. For most Flask deployments, Gunicorn is the simpler and more common choice. Choose uWSGI if you need its specific features (e.g., built-in caching, spooler for background tasks, or the uwsgi protocol for Nginx communication).


3. Nginx as Reverse Proxy

In production, you do not expose Gunicorn directly to the internet. Instead, you put Nginx in front of it as a reverse proxy. Nginx handles several responsibilities that Gunicorn should not:

  • TLS termination — Nginx handles HTTPS, decrypts the request, and forwards plain HTTP to Gunicorn internally
  • Static file serving — Nginx serves CSS, JS, and images directly from disk, far faster than Python
  • Request buffering — Nginx buffers slow client uploads so Gunicorn workers are not tied up waiting for data
  • Load balancing — Nginx can distribute requests across multiple Gunicorn instances
  • Rate limiting — Protect your application from abuse
  • Connection handling — Nginx efficiently handles thousands of concurrent connections with minimal resources

Nginx Configuration for Flask

# /etc/nginx/sites-available/myapp
server {
    listen 80;
    server_name myapp.example.com;

    # Redirect all HTTP to HTTPS
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name myapp.example.com;

    # SSL certificates (managed by Certbot)
    ssl_certificate /etc/letsencrypt/live/myapp.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/myapp.example.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers on;

    # Security headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

    # Serve static files directly
    location /static/ {
        alias /var/www/myapp/app/static/;
        expires 30d;
        add_header Cache-Control "public, immutable";
    }

    # Proxy all other requests to Gunicorn
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_redirect off;

        # Timeouts
        proxy_connect_timeout 60s;
        proxy_read_timeout 60s;
        proxy_send_timeout 60s;
    }

    # Client upload size limit
    client_max_body_size 16M;

    # Logging
    access_log /var/log/nginx/myapp_access.log;
    error_log /var/log/nginx/myapp_error.log;
}

# Enable the site
sudo ln -s /etc/nginx/sites-available/myapp /etc/nginx/sites-enabled/
sudo nginx -t  # Test configuration
sudo systemctl reload nginx

Telling Flask About the Proxy

When Nginx forwards requests to Gunicorn, Flask sees the request as coming from 127.0.0.1 instead of the actual client. The X-Forwarded-For and X-Forwarded-Proto headers carry the original client information. Tell Flask to trust these headers:

from werkzeug.middleware.proxy_fix import ProxyFix

app = create_app()
app.wsgi_app = ProxyFix(app.wsgi_app, x_for=1, x_proto=1, x_host=1, x_prefix=1)

SSL/TLS with Let’s Encrypt

Let’s Encrypt provides free TLS certificates. Certbot automates the entire process:

# Install Certbot
sudo apt install certbot python3-certbot-nginx

# Obtain and install certificate (auto-configures Nginx)
sudo certbot --nginx -d myapp.example.com

# Certbot sets up auto-renewal. Verify it:
sudo certbot renew --dry-run

Certbot modifies your Nginx configuration to add SSL directives and installs a systemd timer that renews the certificate automatically before it expires (Let’s Encrypt certificates are valid for 90 days).


4. Docker Deployment

Docker packages your application, its dependencies, and its runtime environment into a single, portable image. This eliminates the “works on my machine” problem — if it runs in the Docker container locally, it runs the same way in production.

Dockerfile for Flask

# Dockerfile
FROM python:3.12-slim

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

# Create a non-root user
RUN groupadd -r appuser && useradd -r -g appuser -d /app -s /sbin/nologin appuser

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc libpq-dev && \
    rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Switch to non-root user
USER appuser

# Expose port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

# Run with Gunicorn
CMD ["gunicorn", "-c", "gunicorn.conf.py", "app:create_app()"]

Key decisions in this Dockerfile:

  • python:3.12-slim — The slim variant is much smaller than the full image (150 MB vs 1 GB) while still including essential system libraries
  • PYTHONDONTWRITEBYTECODE=1 — Prevents Python from creating .pyc files in the container
  • PYTHONUNBUFFERED=1 — Ensures print statements and log messages appear immediately in Docker logs
  • Non-root user — Security best practice; if the application is compromised, the attacker has limited permissions
  • COPY requirements first — Docker caches each layer. By copying and installing requirements before copying the application code, you only re-install dependencies when requirements.txt changes, not on every code change

Multi-Stage Builds

Multi-stage builds produce smaller production images by separating the build environment from the runtime environment:

# Dockerfile.multistage

# Stage 1: Build
FROM python:3.12-slim AS builder

WORKDIR /app

RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc libpq-dev && \
    rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Stage 2: Production
FROM python:3.12-slim

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

RUN groupadd -r appuser && useradd -r -g appuser -d /app -s /sbin/nologin appuser

# Install only runtime dependencies (no gcc, no build tools)
RUN apt-get update && \
    apt-get install -y --no-install-recommends libpq5 && \
    rm -rf /var/lib/apt/lists/*

# Copy installed packages from builder
COPY --from=builder /install /usr/local

WORKDIR /app
COPY . .

USER appuser
EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

CMD ["gunicorn", "-c", "gunicorn.conf.py", "app:create_app()"]

The builder stage installs GCC and builds any C extensions (like psycopg2). The production stage only copies the compiled packages, leaving the build tools behind. This can reduce your image size by 200+ MB.

.dockerignore

Always include a .dockerignore to keep unnecessary files out of the image:

# .dockerignore
__pycache__
*.pyc
*.pyo
.git
.gitignore
.env
.env.*
*.md
.pytest_cache
.mypy_cache
.coverage
htmlcov/
venv/
.venv/
docker-compose*.yml
Dockerfile*
.dockerignore
tests/
docs/
*.log

Docker Compose

Docker Compose orchestrates multiple containers. A typical Flask production stack includes the application, a database, and a cache:

# docker-compose.yml
version: "3.9"

services:
  web:
    build: .
    ports:
      - "8000:8000"
    environment:
      - FLASK_ENV=production
      - DATABASE_URL=postgresql://myapp:secretpassword@db:5432/myapp
      - SECRET_KEY=${SECRET_KEY}
      - REDIS_URL=redis://redis:6379/0
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: unless-stopped
    volumes:
      - app-static:/app/app/static
    networks:
      - backend

  db:
    image: postgres:16-alpine
    environment:
      - POSTGRES_DB=myapp
      - POSTGRES_USER=myapp
      - POSTGRES_PASSWORD=secretpassword
    volumes:
      - postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U myapp"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped
    networks:
      - backend

  redis:
    image: redis:7-alpine
    command: redis-server --maxmemory 128mb --maxmemory-policy allkeys-lru
    volumes:
      - redis-data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped
    networks:
      - backend

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/conf.d/default.conf:ro
      - app-static:/var/www/static:ro
      - ./certbot/conf:/etc/letsencrypt:ro
      - ./certbot/www:/var/www/certbot:ro
    depends_on:
      - web
    restart: unless-stopped
    networks:
      - backend

volumes:
  postgres-data:
  redis-data:
  app-static:

networks:
  backend:
    driver: bridge

# Build and start all services
docker compose up -d --build

# View logs
docker compose logs -f web

# Run database migrations
docker compose exec web flask db upgrade

# Stop all services
docker compose down

# Stop and remove all data (careful!)
docker compose down -v

Health Check Endpoint

Your Flask application needs a health check endpoint that Docker, load balancers, and monitoring tools can hit:

# app/routes/health.py
from flask import Blueprint, jsonify
from app.extensions import db

health_bp = Blueprint("health", __name__)


@health_bp.route("/health")
def health_check():
    """Basic health check - is the app running?"""
    return jsonify({"status": "healthy"}), 200


@health_bp.route("/health/ready")
def readiness_check():
    """Readiness check - can the app handle requests?
    Checks database connectivity and other dependencies.
    """
    checks = {}

    # Check database
    try:
        db.session.execute(db.text("SELECT 1"))
        checks["database"] = "connected"
    except Exception as e:
        checks["database"] = f"error: {str(e)}"
        return jsonify({"status": "unhealthy", "checks": checks}), 503

    return jsonify({"status": "healthy", "checks": checks}), 200

5. Cloud Deployment

With your application containerized, you have multiple options for where to run it. Each cloud platform offers different tradeoffs between control, simplicity, and cost.

AWS (Amazon Web Services)

AWS offers several services for Flask deployment, ranging from fully managed to bare metal:

Elastic Beanstalk

The simplest AWS option. Elastic Beanstalk handles provisioning, load balancing, auto-scaling, and monitoring. You deploy your code, and AWS manages the infrastructure.

# Install EB CLI
pip install awsebcli

# Initialize Elastic Beanstalk in your project
eb init -p python-3.12 myapp --region us-east-1

# Create an environment and deploy
eb create production

# Deploy updates
eb deploy

# Open in browser
eb open

Elastic Beanstalk looks for an application.py file or a Procfile to know how to run your app:

# Procfile (for Elastic Beanstalk)
web: gunicorn -c gunicorn.conf.py "app:create_app()"

ECS (Elastic Container Service)

For Docker-based deployments with more control. You push your Docker image to ECR (Elastic Container Registry) and define how ECS runs it. ECS handles scaling, networking, and load balancing. More configuration than Elastic Beanstalk, but more flexibility.

# Build and push to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com
docker build -t myapp .
docker tag myapp:latest 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:latest
docker push 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:latest

EC2 (Elastic Compute Cloud)

Full control. You provision a virtual server, SSH in, install everything yourself, and manage updates. This is the most work but gives you complete control over the environment. Use this when you have specific requirements that managed services cannot accommodate.

Heroku

Heroku is the fastest path from code to production. It is a Platform-as-a-Service (PaaS) that abstracts away all infrastructure concerns.

# Procfile (required by Heroku)
web: gunicorn "app:create_app()"

# runtime.txt (specify Python version)
python-3.12.8

# Deploy to Heroku
heroku create myapp-production
heroku addons:create heroku-postgresql:essential-0
heroku config:set SECRET_KEY=your-production-secret-key
heroku config:set FLASK_ENV=production

git push heroku main

# Run migrations
heroku run flask db upgrade

# View logs
heroku logs --tail

Heroku automatically detects Python applications, installs dependencies from requirements.txt, and runs the command specified in Procfile. It handles HTTPS, load balancing, and zero-downtime deploys.
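One common gotcha: Heroku provisions DATABASE_URL with the legacy postgres:// scheme, which SQLAlchemy 1.4+ no longer accepts. A small normalization before building the config fixes it (the URL below is a placeholder for illustration):

```python
import os

# Heroku's DATABASE_URL uses the legacy "postgres://" scheme, which
# SQLAlchemy 1.4+ rejects. Normalize it once, before the config is loaded.
url = os.environ.get("DATABASE_URL", "postgres://user:pass@host:5432/mydb")
if url.startswith("postgres://"):
    url = "postgresql://" + url[len("postgres://"):]
print(url)
```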

DigitalOcean App Platform

DigitalOcean App Platform sits between Heroku’s simplicity and AWS’s flexibility. It supports both Dockerfile-based and buildpack-based deployments, connects directly to your GitHub repository, and auto-deploys on push.

# .do/app.yaml
name: myapp
services:
  - name: web
    github:
      repo: yourusername/myapp
      branch: main
    build_command: pip install -r requirements.txt
    run_command: gunicorn "app:create_app()"
    environment_slug: python
    instance_count: 2
    instance_size_slug: professional-xs
    envs:
      - key: FLASK_ENV
        value: production
      - key: SECRET_KEY
        type: SECRET
        value: your-secret-key
      - key: DATABASE_URL
        scope: RUN_TIME
        value: ${db.DATABASE_URL}
databases:
  - name: db
    engine: PG
    version: "16"

Platform Comparison

Factor              Heroku      AWS EB            AWS ECS           DigitalOcean
Setup complexity    Low         Medium            High              Low
Control             Limited     Medium            High              Medium
Cost (small app)    $5-25/mo    $15-50/mo         $20-60/mo         $5-25/mo
Auto-scaling        Yes         Yes               Yes               Yes
Docker support      Yes         Yes               Native            Yes
Free tier           No          Yes (12 months)   Yes (12 months)   No

6. Database in Production

Your development SQLite database will not work in production. Production databases need concurrent access, connection pooling, automated backups, and replication. PostgreSQL is the standard choice for Flask applications.
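For reference, a production DATABASE_URL packs the driver, credentials, host, port, and database name into one string. A quick sketch of its anatomy, with hypothetical credentials and host:

```python
from urllib.parse import urlsplit

# Anatomy of a production DATABASE_URL (hypothetical values):
url = "postgresql://myapp:s3cret@db.internal:5432/myapp"

parts = urlsplit(url)
print(parts.scheme)                 # dialect: postgresql
print(parts.username, parts.hostname, parts.port)
print(parts.path.lstrip("/"))       # database name: myapp
```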

Connection Pooling

Every database query requires a connection. Opening and closing connections for each request is expensive. Connection pooling maintains a pool of reusable connections.

SQLAlchemy (which Flask-SQLAlchemy wraps) includes a built-in connection pool. Configure it for production:

# config.py - ProductionConfig
class ProductionConfig(Config):
    SQLALCHEMY_DATABASE_URI = os.environ["DATABASE_URL"]

    # Connection pool settings
    SQLALCHEMY_ENGINE_OPTIONS = {
        "pool_size": 20,          # Maximum number of persistent connections
        "max_overflow": 10,       # Extra connections allowed beyond pool_size
        "pool_timeout": 30,       # Seconds to wait for a connection from the pool
        "pool_recycle": 1800,     # Recycle connections after 30 minutes
        "pool_pre_ping": True,    # Test connections before using them
    }

pool_pre_ping=True is especially important. It tests each connection before handing it to your application. If the connection has gone stale (e.g., the database restarted), SQLAlchemy transparently creates a new one instead of giving you a broken connection that causes an error on your user’s request.

Database Migrations

Flask-Migrate (powered by Alembic) tracks database schema changes as versioned migration scripts. This is essential in production because you cannot drop and recreate tables — you have real data.

# Generate a migration after changing models
flask db migrate -m "add user email column"

# Review the generated migration in migrations/versions/
# Then apply it
flask db upgrade

# Rollback if something goes wrong
flask db downgrade

Always review generated migrations before applying them. Alembic does its best to detect changes, but it can miss things (especially column renames, which it detects as a drop + create). Treat migrations as code that deserves code review.

Backups

Automate PostgreSQL backups with a cron job:

#!/bin/bash
# backup.sh
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups/postgres"
DB_NAME="myapp"

mkdir -p "$BACKUP_DIR"

pg_dump -U myapp -h localhost "$DB_NAME" | gzip > "$BACKUP_DIR/${DB_NAME}_${TIMESTAMP}.sql.gz"

# Keep only last 30 days of backups
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +30 -delete

echo "Backup completed: ${DB_NAME}_${TIMESTAMP}.sql.gz"

# Add to crontab (daily at 2 AM)
0 2 * * * /opt/scripts/backup.sh >> /var/log/backup.log 2>&1

If you are using a managed database (AWS RDS, DigitalOcean Managed Databases), automated backups are built in. Configure the retention period and test your restore procedure regularly.


7. Logging and Monitoring

In production, print() statements are not logging. You need structured, configurable logging that writes to files or external services, includes severity levels, and gives you enough context to debug problems at 3 AM without SSH access to the server.

Python Logging Configuration

# app/logging_config.py
import logging
import logging.handlers
import os


def configure_logging(app):
    """Configure application logging for production."""

    # Remove default Flask handler
    app.logger.handlers.clear()

    # Set log level from environment
    log_level = os.environ.get("LOG_LEVEL", "INFO").upper()
    app.logger.setLevel(getattr(logging, log_level))

    # Console handler (for Docker/container logs)
    console_handler = logging.StreamHandler()
    console_handler.setLevel(logging.DEBUG)

    # Format: timestamp - logger name - level - message
    formatter = logging.Formatter(
        "[%(asctime)s] %(name)s %(levelname)s in %(module)s: %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S"
    )
    console_handler.setFormatter(formatter)
    app.logger.addHandler(console_handler)

    # File handler with rotation (for VM deployments)
    if os.environ.get("LOG_TO_FILE"):
        file_handler = logging.handlers.RotatingFileHandler(
            "logs/app.log",
            maxBytes=10_000_000,  # 10 MB
            backupCount=10
        )
        file_handler.setLevel(logging.INFO)
        file_handler.setFormatter(formatter)
        app.logger.addHandler(file_handler)

    # Suppress noisy loggers
    logging.getLogger("werkzeug").setLevel(logging.WARNING)
    logging.getLogger("sqlalchemy.engine").setLevel(logging.WARNING)

    app.logger.info("Logging configured at %s level", log_level)

Structured Logging (JSON)

For production systems that feed logs into aggregation services (ELK stack, Datadog, CloudWatch), JSON-formatted logs are easier to parse and query:

# app/logging_config.py (JSON variant)
import json
import logging
from datetime import datetime, timezone


class JSONFormatter(logging.Formatter):
    """Format log records as JSON for log aggregation services."""

    def format(self, record):
        log_entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
            "message": record.getMessage(),
        }

        if record.exc_info:
            log_entry["exception"] = self.formatException(record.exc_info)

        # Include extra fields if present
        if hasattr(record, "request_id"):
            log_entry["request_id"] = record.request_id
        if hasattr(record, "user_id"):
            log_entry["user_id"] = record.user_id

        return json.dumps(log_entry)
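Wiring the formatter into a handler is the same as with any logging.Formatter. This self-contained sketch uses a trimmed-down stand-in for the class above so the JSON output can be inspected without a running app:

```python
import io
import json
import logging

# Trimmed-down stand-in for the JSONFormatter above.
class DemoJSONFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

# Attach it to a handler; here a StringIO stream stands in for stdout.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(DemoJSONFormatter())

logger = logging.getLogger("myapp.demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info("user %s logged in", "alice")

entry = json.loads(stream.getvalue())
print(entry["level"], entry["message"])
```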

Request Logging Middleware

# app/middleware.py
import time
import uuid
from flask import g, request, current_app


def register_request_hooks(app):
    """Register before/after request hooks for logging."""

    @app.before_request
    def before_request():
        g.request_id = str(uuid.uuid4())[:8]
        g.start_time = time.time()

    @app.after_request
    def after_request(response):
        duration = time.time() - g.start_time
        current_app.logger.info(
            "request_completed",
            extra={
                "request_id": g.request_id,
                "method": request.method,
                "path": request.path,
                "status": response.status_code,
                "duration_ms": round(duration * 1000, 2),
                "ip": request.remote_addr,
            }
        )
        response.headers["X-Request-ID"] = g.request_id
        return response

Error Tracking with Sentry

Sentry captures exceptions in real time, groups them, tracks their frequency, and provides full stack traces with local variable values. It is the industry standard for production error tracking.

pip install sentry-sdk[flask]

# app/__init__.py
import os
import sentry_sdk
from sentry_sdk.integrations.flask import FlaskIntegration


def create_app(config_name=None):
    # Initialize Sentry before creating the app
    if os.environ.get("SENTRY_DSN"):
        sentry_sdk.init(
            dsn=os.environ["SENTRY_DSN"],
            integrations=[FlaskIntegration()],
            traces_sample_rate=0.1,  # 10% of requests for performance monitoring
            environment=os.environ.get("FLASK_ENV", "production"),
        )

    app = Flask(__name__)
    # ... rest of factory

8. Environment Management

The twelve-factor app methodology (12factor.net) establishes that configuration should be stored in the environment, not in code. This principle is fundamental to modern deployment.

.env Files and python-dotenv

In development, environment variables are managed with .env files. The python-dotenv package loads these into the environment automatically.

pip install python-dotenv
# .env (NEVER commit this file)
FLASK_ENV=development
SECRET_KEY=dev-secret-key-not-for-production
DATABASE_URL=postgresql://localhost:5432/myapp_dev
REDIS_URL=redis://localhost:6379/0
SENTRY_DSN=
LOG_LEVEL=DEBUG
# wsgi.py (entry point)
from dotenv import load_dotenv

load_dotenv()  # Load .env file before anything else

from app import create_app

app = create_app()

Critical rule: Never commit .env files to version control. Add them to .gitignore. Provide a .env.example with placeholder values so developers know which variables are needed.

# .env.example (commit this file)
FLASK_ENV=development
SECRET_KEY=change-me-to-a-random-string
DATABASE_URL=postgresql://localhost:5432/myapp_dev
REDIS_URL=redis://localhost:6379/0
SENTRY_DSN=
LOG_LEVEL=DEBUG
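For production, generate the SECRET_KEY from a cryptographically secure source rather than inventing one by hand:

```shell
# 64 hex characters (~256 bits of entropy) from the OS CSPRNG
python -c "import secrets; print(secrets.token_hex(32))"
```

Paste the output into your production environment (or secrets manager), never into the repository.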

12-Factor App Principles for Flask

The twelve factors most relevant to Flask deployment:

  1. Codebase — One codebase tracked in version control, many deploys (dev, staging, production)
  2. Dependencies — Explicitly declare and isolate dependencies (requirements.txt, virtual environments)
  3. Config — Store config in the environment, not in code
  4. Backing services — Treat databases, caches, and message queues as attached resources (swap via environment variable)
  5. Build, release, run — Strictly separate build (Docker image) from run (container) stages
  6. Processes — Execute the app as stateless processes (no in-memory sessions; use Redis or database)
  7. Port binding — Export services via port binding (Gunicorn binds to a port)
  8. Concurrency — Scale out via the process model (more Gunicorn workers, more containers)
  9. Disposability — Fast startup, graceful shutdown (Gunicorn handles SIGTERM)
  10. Dev/prod parity — Keep development, staging, and production as similar as possible (Docker helps here)
  11. Logs — Treat logs as event streams (write to stdout, let the platform aggregate)
  12. Admin processes — Run admin tasks as one-off processes (flask db upgrade, management commands)

9. CI/CD Pipeline

Continuous Integration and Continuous Deployment automates testing and deployment. Every push to your repository triggers a pipeline that tests your code, builds a Docker image, and deploys it to production. No manual steps, no “I forgot to run the tests” moments.

GitHub Actions Workflow

# .github/workflows/deploy.yml
name: Test, Build, Deploy

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_DB: myapp_test
          POSTGRES_USER: myapp
          POSTGRES_PASSWORD: testpassword
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: "pip"

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install pytest pytest-cov

      - name: Run tests
        env:
          DATABASE_URL: postgresql://myapp:testpassword@localhost:5432/myapp_test
          SECRET_KEY: test-secret-key
          FLASK_ENV: testing
        run: |
          pytest --cov=app --cov-report=xml -v

      - name: Upload coverage
        uses: codecov/codecov-action@v4
        with:
          files: ./coverage.xml

  build:
    needs: test
    runs-on: ubuntu-latest
    if: github.event_name == 'push'

    permissions:
      contents: read
      packages: write

    steps:
      - uses: actions/checkout@v4

      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}

  deploy:
    needs: build
    runs-on: ubuntu-latest
    if: github.event_name == 'push'

    steps:
      - name: Deploy to production server
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.DEPLOY_HOST }}
          username: ${{ secrets.DEPLOY_USER }}
          key: ${{ secrets.DEPLOY_SSH_KEY }}
          script: |
            cd /opt/myapp
            docker compose pull web
            docker compose up -d --no-deps web
            docker compose exec -T web flask db upgrade
            docker image prune -f

This pipeline has three stages:

  1. Test — Runs on every push and pull request. Spins up a PostgreSQL service container, installs dependencies, runs pytest with coverage.
  2. Build — Only runs on pushes to main (not PRs). Builds the Docker image and pushes it to GitHub Container Registry.
  3. Deploy — SSHes into the production server, pulls the new image, restarts the web container, runs migrations, and cleans up old images.

10. Practical Example: Full Deployment Stack

Let us put everything together into a complete, production-ready deployment. This is the full stack you would use for a real Flask application.

Project Structure

myapp/
├── app/
│   ├── __init__.py          # Application factory
│   ├── extensions.py        # SQLAlchemy, Migrate, etc.
│   ├── models/
│   ├── routes/
│   │   ├── api.py
│   │   └── health.py
│   ├── static/
│   └── templates/
├── migrations/              # Flask-Migrate / Alembic
├── tests/
├── nginx/
│   └── nginx.conf
├── .env.example
├── .dockerignore
├── .github/
│   └── workflows/
│       └── deploy.yml
├── config.py
├── docker-compose.yml
├── docker-compose.prod.yml
├── Dockerfile
├── gunicorn.conf.py
├── requirements.txt
└── wsgi.py

wsgi.py — The Entry Point

# wsgi.py
import os
from dotenv import load_dotenv

load_dotenv()

from app import create_app

app = create_app(os.environ.get("FLASK_ENV", "production"))

Production Docker Compose

# docker-compose.prod.yml

services:
  web:
    build:
      context: .
      dockerfile: Dockerfile
    expose:
      - "8000"
    environment:
      - FLASK_ENV=production
      - DATABASE_URL=postgresql://myapp:${DB_PASSWORD}@db:5432/myapp
      - SECRET_KEY=${SECRET_KEY}
      - REDIS_URL=redis://redis:6379/0
      - SENTRY_DSN=${SENTRY_DSN}
      - LOG_LEVEL=INFO
    volumes:
      - static-files:/app/app/static  # populate the volume nginx serves (path assumes WORKDIR /app)
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: unless-stopped
    networks:
      - internal

  db:
    image: postgres:16-alpine
    environment:
      - POSTGRES_DB=myapp
      - POSTGRES_USER=myapp
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./backups:/backups
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U myapp"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped
    networks:
      - internal

  redis:
    image: redis:7-alpine
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis-data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped
    networks:
      - internal

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/conf.d/default.conf:ro
      - static-files:/var/www/static:ro
      - ./certbot/conf:/etc/letsencrypt:ro
      - ./certbot/www:/var/www/certbot:ro
    depends_on:
      - web
    restart: unless-stopped
    networks:
      - internal

volumes:
  postgres-data:
  redis-data:
  static-files:

networks:
  internal:
    driver: bridge

Production Nginx Configuration

# nginx/nginx.conf
upstream flask_app {
    server web:8000;
}

# Rate limiting zone
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

server {
    listen 80;
    server_name myapp.example.com;

    # Allow Let's Encrypt challenge
    location /.well-known/acme-challenge/ {
        root /var/www/certbot;
    }

    # Redirect everything else to HTTPS
    location / {
        return 301 https://$server_name$request_uri;
    }
}

server {
    listen 443 ssl;
    http2 on;  # the "listen ... http2" form is deprecated since nginx 1.25.1
    server_name myapp.example.com;

    ssl_certificate /etc/letsencrypt/live/myapp.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/myapp.example.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers on;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;

    # Security headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    add_header Content-Security-Policy "default-src 'self'" always;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;

    # Gzip compression
    gzip on;
    gzip_types text/plain text/css application/json application/javascript text/xml;
    gzip_min_length 1000;

    # Static files served directly by Nginx
    location /static/ {
        alias /var/www/static/;
        expires 30d;
        add_header Cache-Control "public, immutable";
        access_log off;
    }

    # API routes with rate limiting
    location /api/ {
        limit_req zone=api burst=20 nodelay;
        proxy_pass http://flask_app;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # All other routes
    location / {
        proxy_pass http://flask_app;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    client_max_body_size 16M;
}

Gunicorn Production Configuration

# gunicorn.conf.py
import multiprocessing
import os

# Server socket
bind = "0.0.0.0:8000"

# Workers
workers = int(os.environ.get("GUNICORN_WORKERS", multiprocessing.cpu_count() * 2 + 1))
worker_class = os.environ.get("GUNICORN_WORKER_CLASS", "sync")
worker_connections = 1000
timeout = 120
keepalive = 5

# Logging
accesslog = "-"
errorlog = "-"
loglevel = os.environ.get("LOG_LEVEL", "info").lower()
access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s" %(D)s'

# Process management
max_requests = 1000
max_requests_jitter = 50
preload_app = True
graceful_timeout = 30

# Hook: log when workers start and stop
def on_starting(server):
    server.log.info("Gunicorn master starting")

def post_fork(server, worker):
    server.log.info("Worker spawned (pid: %s)", worker.pid)

def worker_exit(server, worker):
    server.log.info("Worker exited (pid: %s)", worker.pid)

Production Checklist Script

# scripts/check_production.py
"""Pre-deployment production readiness checker."""
import os
import sys


def check_production_readiness():
    checks = []
    errors = []

    # 1. Check required environment variables
    required_vars = ["SECRET_KEY", "DATABASE_URL", "FLASK_ENV"]
    for var in required_vars:
        if os.environ.get(var):
            checks.append(f"[PASS] {var} is set")
        else:
            errors.append(f"[FAIL] {var} is not set")

    # 2. Check debug mode
    flask_env = os.environ.get("FLASK_ENV", "")
    if flask_env == "production":
        checks.append("[PASS] FLASK_ENV is 'production'")
    else:
        errors.append(f"[FAIL] FLASK_ENV is '{flask_env}', expected 'production'")

    # 3. Check SECRET_KEY is not a default
    secret = os.environ.get("SECRET_KEY", "")
    weak_secrets = ["dev", "secret", "change-me", "default", "password"]
    if any(weak in secret.lower() for weak in weak_secrets):
        errors.append("[FAIL] SECRET_KEY appears to be a default/weak value")
    elif len(secret) < 32:
        errors.append(f"[FAIL] SECRET_KEY is too short ({len(secret)} chars, need 32+)")
    else:
        checks.append("[PASS] SECRET_KEY looks strong")

    # 4. Check database URL is not SQLite
    db_url = os.environ.get("DATABASE_URL", "")
    if "sqlite" in db_url:
        errors.append("[FAIL] DATABASE_URL uses SQLite (not suitable for production)")
    else:
        checks.append("[PASS] DATABASE_URL is not SQLite")

    # Print results
    print("\n=== Production Readiness Check ===\n")
    for check in checks:
        print(f"  {check}")
    for error in errors:
        print(f"  {error}")

    print(f"\n  Passed: {len(checks)}, Failed: {len(errors)}\n")

    if errors:
        print("  RESULT: NOT READY FOR PRODUCTION\n")
        sys.exit(1)
    else:
        print("  RESULT: READY FOR PRODUCTION\n")
        sys.exit(0)


if __name__ == "__main__":
    check_production_readiness()

11. Scaling

Scaling is the art of handling more traffic without degrading performance. There are two approaches, and you will eventually use both.

Vertical Scaling

Vertical scaling means giving your server more resources — more CPU, more RAM, faster disks. It is the simplest approach: upgrade your VM from 2 cores to 8 cores, and Gunicorn spawns more workers. But vertical scaling has a ceiling. A single machine can only get so big, and it is still a single point of failure.

Horizontal Scaling

Horizontal scaling means running multiple instances of your application behind a load balancer. This is the standard approach for production systems.

                    ┌──────────────┐
                    │   Internet   │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │ Load Balancer│
                    │   (Nginx)    │
                    └──────┬───────┘
            ┌──────────────┼──────────────┐
            │              │              │
      ┌─────▼────┐   ┌─────▼────┐   ┌─────▼────┐
      │ Flask #1 │   │ Flask #2 │   │ Flask #3 │
      │ Gunicorn │   │ Gunicorn │   │ Gunicorn │
      └─────┬────┘   └─────┬────┘   └─────┬────┘
            │              │              │
      ┌─────▼──────────────▼──────────────▼────┐
      │           PostgreSQL + Redis           │
      └────────────────────────────────────────┘

Horizontal scaling requires your application to be stateless. That means:

  • No in-memory sessions — Use Redis or database-backed sessions. If a user’s second request hits a different instance, their session must still be there.
  • No local file storage — Uploaded files must go to shared storage (S3, NFS), not the local filesystem.
  • No in-memory caching — Use Redis. Every instance needs access to the same cache.

Redis for Caching

# app/cache.py
import hashlib
import json
import os
from functools import wraps

import redis
from flask import current_app

redis_client = redis.from_url(os.environ.get("REDIS_URL", "redis://localhost:6379/0"))


def cache_response(timeout=300, key_prefix="view"):
    """Decorator to cache Flask view responses in Redis."""
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            # Use a stable digest for the key: the built-in hash() is
            # randomized per process, so it would give each Gunicorn
            # worker a different key for the same request.
            raw = f"{args!r}:{kwargs!r}"
            digest = hashlib.sha256(raw.encode()).hexdigest()[:16]
            cache_key = f"{key_prefix}:{f.__name__}:{digest}"

            # Try to get from cache
            cached = redis_client.get(cache_key)
            if cached:
                current_app.logger.debug("Cache hit: %s", cache_key)
                return json.loads(cached)

            # Execute function and cache the JSON-serializable result
            result = f(*args, **kwargs)
            redis_client.setex(cache_key, timeout, json.dumps(result))
            current_app.logger.debug("Cache miss, stored: %s", cache_key)
            return result
        return wrapper
    return decorator


# Usage in a route
@api_bp.route("/products")
@cache_response(timeout=60)
def get_products():
    products = Product.query.all()
    return [p.to_dict() for p in products]

Server-Side Sessions with Redis

# Store sessions in Redis instead of signed cookies
pip install Flask-Session
# config.py
import os

import redis

class ProductionConfig(Config):
    SESSION_TYPE = "redis"
    SESSION_REDIS = redis.from_url(os.environ["REDIS_URL"])
    SESSION_PERMANENT = False
    SESSION_USE_SIGNER = True

# In the app factory, initialize the extension:
#   from flask_session import Session
#   Session(app)

CDN for Static Assets

A Content Delivery Network serves your static files from edge servers around the world, reducing latency for users far from your origin server. Popular options include CloudFront (AWS), Cloudflare, and Fastly.

# config.py
class ProductionConfig(Config):
    CDN_DOMAIN = os.environ.get("CDN_DOMAIN", "")

# In templates, use the CDN domain for static assets
# app/__init__.py
@app.context_processor
def inject_cdn():
    return {"cdn_domain": app.config.get("CDN_DOMAIN", "")}
<!-- In Jinja2 templates -->
{% if cdn_domain %}
    <link rel="stylesheet" href="https://{{ cdn_domain }}/static/css/style.css">
{% else %}
    <link rel="stylesheet" href="{{ url_for('static', filename='css/style.css') }}">
{% endif %}

12. Common Pitfalls

These are the mistakes I see most often in Flask deployments. Every one of them has caused production outages.

1. Debug Mode in Production

Running with debug=True exposes the Werkzeug interactive debugger. Anyone who can trigger an exception can execute arbitrary Python code on your server. This is not a theoretical risk — it is a trivially exploitable remote code execution vulnerability.

# NEVER in production
app.run(debug=True)

# Always check
assert not app.debug, "Debug mode must be off in production"

2. Hardcoded Secrets

# BAD: Secret in source code, visible in Git history forever
app.config["SECRET_KEY"] = "my-super-secret-key-2024"

# GOOD: Secret from environment
app.config["SECRET_KEY"] = os.environ["SECRET_KEY"]

Even if you delete the hardcoded secret in a later commit, it remains in your Git history. Anyone with repository access can find it. If this has already happened, rotate the secret immediately.
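You can check whether a value ever appeared anywhere in history with git's pickaxe search (the search string below is a placeholder):

```shell
# List every commit that added or removed the given string, across all branches
git log -S "my-super-secret-key" --all --oneline
```

If this turns anything up, rotate the secret; rewriting history (e.g. with git-filter-repo) cleans your repository but cannot un-leak copies already cloned elsewhere.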

3. No Health Checks

Without health checks, your load balancer and container orchestrator have no way to know if your application is actually working. A process can be running but unable to handle requests (e.g., database connection lost). Health checks let the infrastructure detect and replace unhealthy instances automatically.
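A minimal health endpoint can be as small as this sketch (the route path and payload are illustrative; a real check would also ping the database and Redis):

```python
from flask import Blueprint, Flask

health_bp = Blueprint("health", __name__)


@health_bp.route("/health")
def health():
    # Extend with real dependency checks (SELECT 1, redis PING);
    # return a 503 if any dependency is unreachable
    return {"status": "ok"}, 200


app = Flask(__name__)
app.register_blueprint(health_bp)
```

Point your Docker healthcheck or load balancer at it, e.g. `curl -f http://localhost:8000/health`.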

4. Not Using HTTPS

All traffic must be encrypted. No exceptions. Credentials, session tokens, and user data are all visible in plain HTTP. Let’s Encrypt makes this free. There is no excuse.

5. Using SQLite in Production

SQLite does not support concurrent writes. When two Gunicorn workers try to write simultaneously, one gets a “database is locked” error. Use PostgreSQL or MySQL.

6. No Connection Pooling

Without connection pooling, every request opens a new database connection and closes it when done. Under load, you exhaust the database’s connection limit. SQLAlchemy’s pool is configured by default, but you should tune pool_size and max_overflow for your workload.
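With Flask-SQLAlchemy, pool tuning goes through the SQLALCHEMY_ENGINE_OPTIONS config key — a sketch with illustrative starting values:

```python
# config.py — values are starting points, tune for your workload
SQLALCHEMY_ENGINE_OPTIONS = {
    "pool_size": 10,        # persistent connections kept per worker process
    "max_overflow": 20,     # extra connections allowed under burst load
    "pool_pre_ping": True,  # test connections before use; drops stale ones
    "pool_recycle": 1800,   # recycle connections older than 30 minutes
}
```

Remember the pool is per process: 4 Gunicorn workers with these settings can open up to 4 × (10 + 20) = 120 connections, which must fit under the database's connection limit.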

7. Logging to Files Without Rotation

If you log to a file without rotation, the file grows until it fills the disk. Use RotatingFileHandler or, better yet, log to stdout and let Docker/systemd handle it.
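If you must log to files, RotatingFileHandler caps disk usage — a sketch with illustrative sizes (the temp-dir path keeps the snippet self-contained; in production you would use a fixed path like /var/log/myapp/app.log):

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

# Temp dir so the snippet is self-contained; use a real path in production
log_path = os.path.join(tempfile.mkdtemp(), "app.log")

# Rotate at ~10 MB, keep 5 old files: total disk usage is capped at ~60 MB
handler = RotatingFileHandler(log_path, maxBytes=10 * 1024 * 1024, backupCount=5)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("myapp")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info("rotation-capped logging")
```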

8. No Graceful Shutdown

When deploying a new version, the old process must finish handling in-flight requests before shutting down. Gunicorn handles this correctly with SIGTERM by default, but make sure your deployment process sends the right signal and waits for the graceful timeout.


13. Best Practices Summary

12-Factor App

Follow the twelve-factor methodology. It was written by engineers at Heroku who deployed millions of applications. The principles are battle-tested and apply to every Flask deployment.

Infrastructure as Code

Every aspect of your infrastructure should be defined in version-controlled files:

  • Dockerfile — Application container
  • docker-compose.yml — Service orchestration
  • nginx.conf — Reverse proxy configuration
  • gunicorn.conf.py — WSGI server configuration
  • .github/workflows/deploy.yml — CI/CD pipeline

If your production server dies, you should be able to recreate the entire environment from these files. No manual server configuration. No tribal knowledge. Everything is documented in code.

Zero-Downtime Deployments

Users should never see an error page during a deployment. Strategies:

  • Rolling updates — Replace instances one at a time, keeping old ones running until new ones pass health checks
  • Blue-green deployment — Run two identical environments. Deploy to the inactive one, test it, then switch traffic
  • Canary deployment — Route a small percentage of traffic to the new version. If metrics look good, gradually increase

Security Hardening

  • Keep all dependencies updated (pip-audit for vulnerability scanning)
  • Set security headers (HSTS, X-Frame-Options, CSP)
  • Use HTTPS everywhere
  • Run containers as non-root users
  • Scan Docker images for vulnerabilities (docker scout, trivy)
  • Limit container resources (CPU, memory) to prevent runaway processes

# Scan for known vulnerabilities in your dependencies
pip install pip-audit
pip-audit

# Scan Docker image
docker scout cves myapp:latest

14. Key Takeaways

  1. Never use the Flask development server in production. Use Gunicorn or uWSGI behind Nginx.
  2. Configuration belongs in the environment. Use environment variables for secrets, database URLs, and anything that changes between environments. Never hardcode credentials.
  3. Docker is the standard deployment unit. Containerize your application for consistent, reproducible deployments across every environment.
  4. Automate everything. CI/CD pipelines eliminate human error. Tests run on every push. Builds are automatic. Deployments are a button click or a git push.
  5. Monitor and log everything. You cannot fix what you cannot see. Structured logging, health checks, and error tracking (Sentry) are not optional.
  6. Design for failure. Servers crash, databases go down, networks partition. Health checks, connection pooling with pre-ping, graceful shutdown, and automated restarts keep your application available.
  7. Scale horizontally. Build stateless applications from the start. Use Redis for sessions and caching, S3 for file storage. When traffic grows, add more instances behind a load balancer.
  8. Security is not optional. HTTPS everywhere, debug mode off, secrets in environment variables, non-root containers, dependency vulnerability scanning. These are baseline requirements, not nice-to-haves.
  9. Start simple, add complexity as needed. A single VPS running Docker Compose is a perfectly valid production setup for most applications. You do not need Kubernetes on day one.
  10. Infrastructure as code. Every configuration file is version-controlled. If your server disappears, you can recreate it from your repository. No manual steps. No documentation drift.

Deployment is not a one-time event. It is an ongoing practice. Your deployment infrastructure evolves with your application. Start with the basics — Gunicorn, Nginx, Docker, CI/CD — and add sophistication as your needs grow. The patterns in this tutorial will serve you from your first production deployment to your thousandth.



