Nginx Cheat Sheet - The Configuration Guide That Actually Works
Essential Nginx configurations learned from volunteering at drinkpani, setting up systems at Tathyakar, and deploying projects at Ensemble Matrix. From basic setup to production-ready configurations that don't break at 3 AM.
After configuring Nginx while volunteering at drinkpani, setting up systems at Tathyakar, and deploying projects at Ensemble Matrix, I learned that a reliable configuration is less about copying blocks and more about understanding the intent behind each directive. This article shares why I used specific options, what problems they solved, and the guardrails that kept production calm.
When I say production, I mean the evenings where a slow header stalled an API for a city of users, or a websocket dropped in the middle of the only demo that mattered. The goal here is to explain not only what to write, but when it helps, and just as important, when to leave it out.
We will walk through the core building blocks first, then move into reverse proxying, load balancing, websockets, transport security, caching, rate limits, observability, and performance. Each section starts with the problem and the tradeoffs. Code follows as a record, not as the point.
Basic Setup & Structure
Production configuration starts with intent. The events block decides how workers accept connections. The http block shapes everything riding on top: logging, compression, limits, and includes. I rely on worker_processes auto so the same file scales from a tiny VM to a larger host without hand tuning. I define a rich access log early because most midnight questions are answered by good logs. Reasonable body and header limits protect upstreams from surprise payloads. Compression reduces bandwidth cost without harming latency when applied to the right types. Finally, I keep app specific servers outside the main file and pull them in with includes, which keeps deploys focused and rollbacks simple.
Here is the base structure I reuse across projects:
# /etc/nginx/nginx.conf - Main configuration
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
events {
worker_connections 1024;
use epoll;
multi_accept on;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
# Logging format
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'rt=$request_time uct="$upstream_connect_time" '
'uht="$upstream_header_time" urt="$upstream_response_time"';
access_log /var/log/nginx/access.log main;
# Performance optimizations
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
client_max_body_size 50M;
# Gzip compression
gzip on;
gzip_vary on;
gzip_min_length 1024;
gzip_types text/plain text/css text/xml text/javascript
application/javascript application/xml+rss
application/json application/xml;
# Include server configurations
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}
Why these choices
worker_processes auto lets Nginx match CPU cores without me guessing. The events settings favor scalable accept behavior. In http, a custom log_format captures request time and upstream timings so slow paths stand out. client_max_body_size prevents accidental large uploads from overwhelming backends. Gzip is limited to text and common web types so binaries are not compressed inefficiently. The pair of include lines separates platform concerns from app concerns, which makes it safer to change one without touching the other.
Pro Tip: Worker Processes
Use worker_processes auto to let Nginx automatically set the number of worker processes based on available CPU cores. I learned this after manually setting it wrong and wondering why performance was terrible.
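A quick sanity check I run after changing it, assuming a standard Linux host where Nginx is already running:
# CPU cores the host exposes
nproc
# Worker processes Nginx actually started; the two numbers should match
ps -o pid,cmd -C nginx | grep -c "worker process"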
Reverse Proxy Configurations
A reverse proxy is the front door. It terminates connections, sets security headers, and decides which upstream should answer. Forwarding everything is tempting; it works until it does not. The pattern below keeps headers explicit, upgrades clean for websockets, and timeouts realistic for normal web workloads. For a basic service, this is what I ended up using after a few production incidents.
Basic Node.js App (Like Our Web Services)
Story from the field
During a quiet evening deploy for a small service, a missing X-Forwarded-Proto header caused redirect loops that only appeared behind the proxy. Making every header explicit solved it. Later, websocket upgrades failed because the connection header kept closing; moving to proxy_http_version 1.1 and setting the upgrade headers fixed intermittent drops. The timeouts below were tuned to match typical request latency profiles so the proxy stays patient without masking hung upstreams.
# /etc/nginx/sites-available/web-api
server {
listen 80;
server_name api.yourapp.com;
# Security headers
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-XSS-Protection "1; mode=block" always;
add_header X-Content-Type-Options "nosniff" always;
add_header Referrer-Policy "no-referrer-when-downgrade" always;
add_header Content-Security-Policy "default-src 'self' http: https: data: blob: 'unsafe-inline'" always;
# API routes
location /api/ {
proxy_pass http://127.0.0.1:3000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_cache_bypass $http_upgrade;
# Timeouts for web service processing
proxy_connect_timeout 60s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
}
# Health check endpoint
location /health {
proxy_pass http://127.0.0.1:3000/health;
access_log off;
}
}
Notes
Security headers at the edge raise the baseline for every app behind Nginx. Keep them as defaults and override per service only when needed. Health endpoints route through the same path as real traffic, which validates upstream wiring without adding side channels. Keep the health location unlogged or you will drown in noise.
Load Balancing Multiple Instances
Traffic rarely arrives evenly. One user uploads a large file while another browses a light page. The balancing method should reflect that mix. I use least connections when instance performance varies under load because it prevents slower nodes from falling behind. Weights let a stronger node carry more, and keepalive reduces overhead between Nginx and upstreams. At Ensemble Matrix, this approach kept API latency steady during noisy bursts.
# Upstream configuration
upstream web_backend {
least_conn; # Use least connections algorithm
server 127.0.0.1:3001 weight=3 max_fails=3 fail_timeout=30s;
server 127.0.0.1:3002 weight=3 max_fails=3 fail_timeout=30s;
server 127.0.0.1:3003 weight=2 max_fails=3 fail_timeout=30s; # Lower weight: weaker instance
# Keep idle connections to upstreams open
keepalive 32;
}
server {
listen 80;
server_name api.ensemblematrix.com;
location / {
proxy_pass http://web_backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Enable connection pooling
proxy_buffering on;
proxy_buffer_size 4k;
proxy_buffers 8 4k;
}
}
Why this works
Least connections smooths traffic when request durations vary. Weights let stronger instances pull more load. Keepalive reduces connection churn between Nginx and upstreams, which shows up as lower tail latency. Connection pooling and small buffers in the location block balance memory usage with throughput for typical JSON APIs.
Load Balancing Methods
Round robin sends traffic evenly and works best when instances are identical and requests are short.
Least connections directs new requests to instances with fewer active connections, which smooths out heavy or uneven workloads.
IP hash keeps a given client on the same instance, useful when session state cannot be shared.
Hash uses a custom key such as a user or tenant id to keep related traffic together; a sketch of the last two methods follows below.
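For reference, a minimal sketch of those two methods; the addresses are placeholders and the tenant query parameter is an assumption about how related traffic might be keyed:
# Sticky by client IP - session state stays on one instance
upstream sticky_backend {
    ip_hash;
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
}
# Consistent hash on a custom key (here a hypothetical ?tenant= parameter)
upstream tenant_backend {
    hash $arg_tenant consistent;
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
}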
WebSocket Configuration
The biggest surprises with websockets came from details that looked small. Missing upgrade headers break connections in ways that are hard to spot. Buffering must be off or messages feel sticky. Read timeouts should be long enough for quiet but healthy connections. At drinkpani, these choices kept streams stable during long sessions.
# WebSocket proxy configuration
map $http_upgrade $connection_upgrade {
default upgrade;
'' close;
}
upstream app_websocket {
server 127.0.0.1:8080;
server 127.0.0.1:8081;
server 127.0.0.1:8082;
}
server {
listen 80;
server_name ws.drinkpani.com;
location /socket.io/ {
proxy_pass http://app_websocket;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# WebSocket specific timeouts
proxy_read_timeout 86400; # 24 hours
proxy_send_timeout 86400;
proxy_connect_timeout 60s;
# Disable buffering for real-time communication
proxy_buffering off;
}
}
Checklist I keep for websockets
Upgrade headers present. Connection set to upgrade. Buffering off. Read timeout long enough for quiet periods. Upstream supports keepalive. If any of these are missing, intermittent drops and invisible latency spikes appear.
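The last item is the one I forget most often. A sketch of the upstream with a connection pool, where the pool size is a guess to tune:
upstream app_websocket {
    server 127.0.0.1:8080;
    server 127.0.0.1:8081;
    server 127.0.0.1:8082;
    # Idle connections kept open to the upstreams; only plain (non-upgrade)
    # requests such as long-polling fallbacks benefit from the pool
    keepalive 64;
}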
SSL/TLS Configuration
Certificates rotate, ciphers age, and defaults change. I favor modern protocols with a short list of ciphers maintained by trusted sources. HSTS belongs only when you are sure every subdomain will stay on HTTPS. OCSP stapling reduces handshake latency. Test on a staging host before moving the same settings into production.
# SSL configuration
server {
listen 443 ssl http2;
server_name api.yourapp.com;
# SSL certificates
ssl_certificate /etc/letsencrypt/live/api.yourapp.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/api.yourapp.com/privkey.pem;
# SSL configuration
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA384;
ssl_prefer_server_ciphers off;
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 10m;
ssl_session_tickets off;
# OCSP stapling
ssl_stapling on;
ssl_stapling_verify on;
ssl_trusted_certificate /etc/letsencrypt/live/api.yourapp.com/chain.pem;
resolver 8.8.8.8 8.8.4.4 valid=300s;
resolver_timeout 5s;
# Security headers
add_header Strict-Transport-Security "max-age=63072000" always;
add_header X-Frame-Options DENY always;
add_header X-Content-Type-Options nosniff always;
add_header X-XSS-Protection "1; mode=block" always;
# Your application
location / {
proxy_pass http://127.0.0.1:3000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
# Redirect HTTP to HTTPS
server {
listen 80;
server_name api.yourapp.com;
return 301 https://$server_name$request_uri;
}
Operational notes
Rotate certificates with automation and alert on expiry far in advance. Test HSTS on a subdomain before enabling it broadly. Keep resolvers explicit to avoid surprises in containerized environments. Prefer a minimal cipher suite that passes current guidance rather than a long list you do not maintain.
Static File Serving & Caching
Single page apps thrive when assets stay cacheable for a long time and HTML stays fresh. Fingerprinted assets can be immutable for a year. HTML should expire quickly to allow releases to appear without a hard refresh. The fallback with try_files keeps client routing working when requests land directly on deep links. For the web dashboard at Tathyakar, this balance kept updates quick and bandwidth low.
# Static file serving with caching
server {
listen 80;
server_name dashboard.tathyakar.com;
root /var/www/dashboard/build;
index index.html;
# Cache static assets
location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff|woff2|ttf|eot)$ {
expires 1y;
add_header Cache-Control "public, immutable";
add_header X-Cache-Status "STATIC";
# Enable compression
gzip_static on;
# Security headers for static files
add_header X-Frame-Options DENY;
add_header X-Content-Type-Options nosniff;
}
# Cache HTML files for shorter period
location ~* \.(html)$ {
expires 1h;
add_header Cache-Control "public";
add_header X-Cache-Status "HTML";
}
# API routes
location /api/ {
proxy_pass http://127.0.0.1:3000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Disable caching for API responses
add_header Cache-Control "no-cache, no-store, must-revalidate";
add_header Pragma "no-cache";
add_header Expires "0";
}
# React Router fallback
location / {
try_files $uri $uri/ /index.html;
}
}
Why this matters
Immutable asset caching cuts repeat bandwidth dramatically. Short HTML caching makes releases visible quickly. The try_files fallback preserves client routing without special cases in upstream apps. Keep the root pointing at built artifacts and serve APIs through explicit locations so static and dynamic concerns do not mix.
Rate Limiting & Security
Rate limits are safety rails, not walls. The right numbers come from watching real traffic. Separate zones for login, general API, and everything else let you be strict where it matters and generous where it does not. When you run behind a load balancer, trust the client IP only if you control the full proxy chain and validate forwarded headers.
# Rate limiting configuration
http {
# Define rate limiting zones
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=login:10m rate=1r/s;
limit_req_zone $binary_remote_addr zone=general:10m rate=100r/s;
# Connection limiting
limit_conn_zone $binary_remote_addr zone=conn_limit_per_ip:10m;
}
server {
listen 80;
server_name api.yourapp.com;
# General rate limiting
limit_req zone=general burst=20 nodelay;
limit_conn conn_limit_per_ip 20;
# Strict rate limiting for login
location /api/auth/login {
limit_req zone=login burst=5 nodelay;
proxy_pass http://127.0.0.1:3000;
# ... other proxy settings
}
# API rate limiting
location /api/ {
limit_req zone=api burst=20 nodelay;
proxy_pass http://127.0.0.1:3000;
# ... other proxy settings
}
# Block common attack patterns
location ~* \.(php|asp|aspx|jsp)$ {
return 444; # Close connection without response
}
# Block access to sensitive files
location ~ /\. {
deny all;
access_log off;
log_not_found off;
}
}
Rate Limiting Gotcha
Be careful with rate limiting behind load balancers. If you’re using $binary_remote_addr, all requests might appear to come from the load balancer IP. Use $http_x_forwarded_for or $http_x_real_ip instead, but validate these headers first!
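When you do control every hop, I prefer letting the realip module rewrite the client address so $binary_remote_addr stays meaningful; a sketch, with the balancer subnet as an assumption:
# Trust the load balancer to report the real client address
# (10.0.0.0/8 is a placeholder for your balancer's subnet)
set_real_ip_from 10.0.0.0/8;
real_ip_header X-Forwarded-For;
real_ip_recursive on;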
What to watch
Start permissive and tighten after you see real traffic. Login endpoints deserve the strictest limits and detailed logs. General API limits should be high enough not to punish normal usage but low enough to blunt scraping and bursts. Behind a load balancer, validate and trust X-Forwarded-For only if you control each hop.
Monitoring & Logging
Metrics catch trends. Logs explain incidents. Separate log formats for APIs and general traffic make it faster to answer narrow questions. I keep a lightweight status page for quick health checks and turn off logging for health probes to avoid noise.
# Custom log formats for different needs
http {
# Detailed API logging
log_format api_log '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent" '
'rt=$request_time uct="$upstream_connect_time" '
'uht="$upstream_header_time" urt="$upstream_response_time" '
'cs=$upstream_cache_status';
# JSON format for log aggregation
log_format json_log escape=json '{"timestamp":"$time_iso8601",'
'"remote_addr":"$remote_addr",'
'"method":"$request_method",'
'"uri":"$request_uri",'
'"status":$status,'
'"body_bytes_sent":$body_bytes_sent,'
'"request_time":$request_time,'
'"upstream_response_time":"$upstream_response_time",'
'"user_agent":"$http_user_agent"}';
}
server {
listen 80;
server_name api.yourapp.com;
# Separate logs for different endpoints
access_log /var/log/nginx/api.access.log api_log;
error_log /var/log/nginx/api.error.log warn;
location /api/v1/ {
access_log /var/log/nginx/api-v1.access.log json_log;
proxy_pass http://127.0.0.1:3000;
# ... proxy settings
}
# Don't log health checks
location /health {
access_log off;
proxy_pass http://127.0.0.1:3000/health;
}
}
# Enable Nginx status page for monitoring
server {
listen 8080;
server_name localhost;
location /nginx_status {
stub_status on;
access_log off;
allow 127.0.0.1;
deny all;
}
}
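A quick way to read the status page from the host itself; the counters it reports (active connections, accepts, handled, requests, reading, writing, waiting) are what I graph over time:
# Only 127.0.0.1 is allowed by the config above
curl http://127.0.0.1:8080/nginx_status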
Reading the tea leaves
Separate formats make it easier to answer focused questions. JSON logs feed aggregators cleanly. Path specific access logs pay for themselves the first time you chase a latency regression. Disable logging for health probes to keep your signal strong.
Performance Tuning
Tune after measuring. File descriptor limits remove a common ceiling on concurrency. Reasonable buffer sizes keep memory spikes in check. Keepalive reduces handshake overhead between Nginx and upstreams. These values are a starting point. Measure, adjust, and keep notes so future you knows why a number changed.
# Performance tuning in nginx.conf
worker_processes auto;
worker_rlimit_nofile 65535;
events {
worker_connections 4096;
use epoll;
multi_accept on;
accept_mutex off;
}
http {
# Connection keepalive
keepalive_timeout 65;
keepalive_requests 1000;
# Buffer sizes
client_body_buffer_size 128k;
client_max_body_size 50m;
client_header_buffer_size 1k;
large_client_header_buffers 4 4k;
output_buffers 1 32k;
postpone_output 1460;
# Timeouts
client_header_timeout 3m;
client_body_timeout 3m;
send_timeout 3m;
# TCP optimizations
tcp_nopush on;
tcp_nodelay on;
sendfile on;
sendfile_max_chunk 512k;
# Compression
gzip on;
gzip_vary on;
gzip_min_length 1024;
gzip_comp_level 6;
gzip_types
text/plain
text/css
text/xml
text/javascript
application/javascript
application/xml+rss
application/json;
# Open file cache
open_file_cache max=200000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
open_file_cache_errors on;
}
Tuning workflow
Profile first. Change one variable at a time. Keep a record of the old and new values and the reason. Recheck after deploys and after traffic changes. Most performance wins come from removing unnecessary work rather than pushing limits to the edge.
Common Nginx Commands
These are the commands I use daily for managing Nginx:
# Test configuration before reloading
sudo nginx -t
# Reload configuration without downtime
sudo nginx -s reload
# Stop Nginx gracefully
sudo nginx -s quit
# Stop Nginx immediately
sudo nginx -s stop
# Check Nginx status
sudo systemctl status nginx
# View error logs in real-time
sudo tail -f /var/log/nginx/error.log
# View access logs with filtering
sudo tail -f /var/log/nginx/access.log | grep "POST"
# Check which process is using port 80
sudo netstat -tlnp | grep :80
# Dump the full configuration Nginx actually loads (first 20 lines shown)
sudo nginx -T | head -20
# Check Nginx version and modules
nginx -V
Troubleshooting Common Issues
502 Bad Gateway
Usually means your upstream server is down or unreachable.
# Check if your app is running
curl http://127.0.0.1:3000/health
# Check Nginx error logs
sudo tail -f /var/log/nginx/error.log
413 Request Entity Too Large
File upload too large. Increase client_max_body_size.
# In server block or http block
client_max_body_size 50M;
504 Gateway Timeout
Upstream server taking too long to respond.
# Increase proxy timeouts
proxy_connect_timeout 60s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
Production Deployment Checklist
Pre-deployment Checklist
- Test the configuration with nginx -t
- Obtain and validate SSL certificates
- Set security headers you can maintain long term
- Add rate limits for sensitive endpoints
- Rotate and ship logs to a safe place
- Enable monitoring and alerts you will actually read
- Load test with at least two upstream instances (see the sketch after this list)
- Set cache headers for assets and HTML separately
- Back up working configuration files
- Record what changed and why
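For the load test item, a minimal sketch using wrk; the endpoint, thread count, and duration are assumptions to adjust for your own traffic profile:
# 4 threads, 100 open connections, 30 second run against a cheap endpoint
wrk -t4 -c100 -d30s https://api.yourapp.com/health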
Conclusion
Nginx is powerful, and like most powerful tools, it rewards care. The choices here were shaped by production incidents and patient debugging across drinkpani, Tathyakar, and Ensemble Matrix. Start simple, observe real users, change one thing at a time, and keep your notes close.
References
- Official Nginx Documentation - Comprehensive official docs
- Nginx Wiki - Community-driven wiki
- HTML5 Boilerplate Nginx Config - Best practices
- Mozilla SSL Configuration Generator - SSL config tool