
Scaling

Controller (Django)

The Controller is stateless -- all state lives in Postgres, Redis, and MinIO.

Strategy

  • Run multiple Controller instances behind a load balancer
  • Session affinity not required (sessions stored in Redis/DB)
  • Each instance runs its own gunicorn workers

Configuration

# Production: 4 workers per instance, 2-4 instances
gunicorn config.wsgi:application \
  --bind 0.0.0.0:8000 \
  --workers 4 \
  --threads 2 \
  --timeout 120 \
  --max-requests 1000 \
  --max-requests-jitter 50

Bottlenecks

  • Database connections -- Each worker holds a connection. Use PgBouncer for connection pooling at scale (see the sketch after this list).
  • Brief assembly -- CPU-bound (JSON serialization + hashing). Scales linearly with workers.
  • Celery workers -- Scale independently. Add more workers for brief lifecycle, health checks, failure reports.
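
A minimal sketch of pointing Django at PgBouncer instead of Postgres directly; the host, port, and database name below are illustrative assumptions, and transaction pooling requires server-side cursors to be disabled.

# Django settings sketch: route connections through PgBouncer (transaction pooling mode)
# host/port/database name are illustrative, not the project's actual values
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "kohakku",                        # illustrative database name
        "HOST": "pgbouncer",                      # PgBouncer, not Postgres directly
        "PORT": 6432,                             # PgBouncer's default port
        "CONN_MAX_AGE": 0,                        # let PgBouncer own connection reuse
        "DISABLE_SERVER_SIDE_CURSORS": True,      # required with transaction pooling
    }
}

With 4 gunicorn workers x 2 threads per instance, pooling through PgBouncer caps the number of real Postgres connections no matter how many Controller instances run.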

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: controller
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "1Gi"
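
If autoscaling is preferred over a fixed replica count, a HorizontalPodAutoscaler can target the Deployment above; the replica bounds and CPU threshold here are illustrative.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: controller
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: controller
  minReplicas: 2            # illustrative bounds
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above ~70% of the CPU request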

Dispatcher (Go)

The Dispatcher manages container lifecycle -- scaling depends on the queue source.

Internal Queue (Redis List)

  • Only one Dispatcher instance can consume from a Redis list (BRPOP is exclusive)
  • Scale by increasing NUM_CONSUMERS (parallel goroutines within one instance)
  • For multi-instance: switch to redis-stream queue source

Redis Stream Queue

  • Multiple Dispatchers can consume from the same stream via consumer groups
  • Each instance joins the dispatchers consumer group
  • Messages are distributed across instances automatically (membership can be verified as sketched after this list)
  • Set QUEUE_SOURCE=redis-stream on all instances
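
To confirm that multiple Dispatchers are sharing the stream, the consumer group can be inspected with redis-cli (stream and group names taken from the configuration below):

# list consumer groups on the task stream
redis-cli XINFO GROUPS kohakku:tasks

# list the consumers currently registered in the dispatchers group
redis-cli XINFO CONSUMERS kohakku:tasks dispatchers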

Configuration

# Internal queue (single instance)
NUM_CONSUMERS=10
MAX_CONCURRENT_PULLS=4

# Redis stream queue (multiple instances)
QUEUE_SOURCE=redis-stream
QUEUE_STREAM_NAME=kohakku:tasks
QUEUE_CONSUMER_GROUP=dispatchers
NUM_CONSUMERS=5

Bottlenecks

  • Image pulls -- Gated by MAX_CONCURRENT_PULLS. Cold pulls dominate latency.
  • Docker socket -- Local backend shares one Docker daemon. For higher throughput, use K8s/ECS backends.
  • Redis -- Single Redis handles queue + state. Use separate Redis instances for queue vs state at high scale (see the sketch after this list).
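
A docker-compose sketch of that split; the service names are illustrative assumptions, and the Controller/Dispatcher still need to be pointed at the right instance through their existing Redis connection settings.

# docker-compose sketch: separate Redis instances (service names are illustrative)
services:
  redis-queue:                      # task queue / streams
    image: redis:7
    command: ["redis-server", "--appendonly", "yes"]
  redis-state:                      # dispatcher state
    image: redis:7
    command: ["redis-server", "--appendonly", "yes"]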

Temporal Worker

Stateless -- run multiple workers on the same task queue. Temporal server distributes workflow executions across workers.

# Run 3 worker instances
for i in 1 2 3; do
  python temporal_worker.py &
done
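
A minimal sketch of what each temporal_worker.py process does, assuming the temporalio Python SDK; the workflow class, activity, and task-queue name are illustrative placeholders, not the project's actual definitions.

# temporal_worker.py sketch -- every instance registers on the same task queue
# workflow/activity imports and the queue name are illustrative placeholders
import asyncio

from temporalio.client import Client
from temporalio.worker import Worker

from workflows import BriefLifecycleWorkflow   # hypothetical workflow
from activities import dispatch_brief          # hypothetical activity


async def main() -> None:
    client = await Client.connect("localhost:7233")
    worker = Worker(
        client,
        task_queue="kohakku-tasks",   # shared queue; Temporal distributes executions across workers
        workflows=[BriefLifecycleWorkflow],
        activities=[dispatch_brief],
    )
    await worker.run()


if __name__ == "__main__":
    asyncio.run(main())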

Celery Workers

Scale independently from the Controller. Separate queues for different task types if needed.

# High-priority queue for dispatch
celery -A config worker -l info -Q celery,dispatch -c 4

# Background queue for cleanup
celery -A config worker -l info -Q cleanup -c 2
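
For the queue split above to take effect, tasks have to be routed to those queues. A Django settings sketch, assuming the Celery app reads settings with the CELERY namespace; the task module paths are hypothetical placeholders.

# Django settings sketch: route task types to the queues above
# task module paths are hypothetical placeholders
CELERY_TASK_ROUTES = {
    "briefs.tasks.dispatch_brief": {"queue": "dispatch"},    # latency-sensitive
    "briefs.tasks.cleanup_expired": {"queue": "cleanup"},    # background work
}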

Database

Strategy             When
Read replicas        Read-heavy loads. Django supports database routers (see the sketch below).
Connection pooling   High worker count. PgBouncer in transaction mode.
Index audit          Slow queries. EXPLAIN ANALYZE on hot paths.
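
A minimal database-router sketch for the read-replica strategy; the "default" and "replica" aliases, and the module path in the settings comment, are assumptions about how DATABASES would be laid out.

# Django database router sketch -- assumes "default" (primary) and "replica" aliases
class ReadReplicaRouter:
    def db_for_read(self, model, **hints):
        return "replica"            # send reads to the replica

    def db_for_write(self, model, **hints):
        return "default"            # writes always hit the primary

    def allow_relation(self, obj1, obj2, **hints):
        return True                 # both aliases see the same data

# settings.py (module path illustrative)
# DATABASE_ROUTERS = ["config.routers.ReadReplicaRouter"]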

Redis

Setting              Recommendation
Persistence          AOF enabled by default in docker-compose (appendonly yes)
Maxmemory            Set to prevent OOM. LRU eviction for cache, noeviction for queue (see the sketch below)
Separate instances   One for cache/sessions, one for task queue, one for Celery broker
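
A redis.conf sketch for the cache/session instance; the size is illustrative, and queue/broker instances should keep the default noeviction policy so pending tasks are never dropped.

# redis.conf sketch for the cache/session instance (size is illustrative)
maxmemory 2gb
maxmemory-policy allkeys-lru

# queue / Celery broker instances: keep the default so tasks are never evicted
# maxmemory-policy noeviction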