
Scaling

Controller (Django)

The Controller is stateless -- all state lives in Postgres, Redis, and MinIO.

Strategy

  • Run multiple Controller instances behind a load balancer
  • Session affinity not required (sessions stored in Redis/DB)
  • Each instance runs its own gunicorn workers

Configuration

# Production: 4 workers per instance, 2-4 instances
gunicorn config.wsgi:application \
  --bind 0.0.0.0:8000 \
  --workers 4 \
  --threads 2 \
  --timeout 120 \
  --max-requests 1000 \
  --max-requests-jitter 50

Bottlenecks

  • Database connections -- Each worker holds a connection. Use PgBouncer for connection pooling at scale (see the sketch after this list).
  • Brief assembly -- CPU-bound (JSON serialization + hashing). Scales linearly with workers.
  • Celery workers -- Scale independently. Add more workers for brief lifecycle, health checks, failure reports.
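
A minimal sketch of pointing Django at PgBouncer instead of Postgres directly; the host, port, and database name below are illustrative assumptions, and transaction pooling requires server-side cursors to be disabled.

# Django settings sketch: route connections through PgBouncer (transaction pooling mode)
# host/port/database name are illustrative, not the project's actual values
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "kohakku",                        # illustrative database name
        "HOST": "pgbouncer",                      # PgBouncer, not Postgres directly
        "PORT": 6432,                             # PgBouncer's default port
        "CONN_MAX_AGE": 0,                        # let PgBouncer own connection reuse
        "DISABLE_SERVER_SIDE_CURSORS": True,      # required with transaction pooling
    }
}

With 4 gunicorn workers x 2 threads per instance, pooling through PgBouncer caps the number of real Postgres connections no matter how many Controller instances run.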

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: controller
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "1Gi"
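
If autoscaling is preferred over a fixed replica count, a HorizontalPodAutoscaler can target the Deployment above; the replica bounds and CPU threshold here are illustrative.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: controller
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: controller
  minReplicas: 2            # illustrative bounds
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above ~70% of the CPU request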

Dispatcher (Go)

The Dispatcher manages container lifecycle -- scaling depends on the queue source.

Internal Queue (Redis List)

  • Only one Dispatcher instance can consume from a Redis list (BRPOP is exclusive)
  • Scale by increasing NUM_CONSUMERS (parallel goroutines within one instance)
  • For multi-instance: switch to redis-stream queue source

Redis Stream Queue

  • Multiple Dispatchers can consume from the same stream via consumer groups
  • Each instance joins the dispatchers consumer group
  • Messages are distributed across instances automatically (membership can be verified as sketched after this list)
  • Set QUEUE_SOURCE=redis-stream on all instances
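
To confirm that multiple Dispatchers are sharing the stream, the consumer group can be inspected with redis-cli (stream and group names taken from the configuration below):

# list consumer groups on the task stream
redis-cli XINFO GROUPS kohakku:tasks

# list the consumers currently registered in the dispatchers group
redis-cli XINFO CONSUMERS kohakku:tasks dispatchers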

Configuration

# Internal queue (single instance)
NUM_CONSUMERS=10
MAX_CONCURRENT_PULLS=4

# Redis stream queue (multiple instances)
QUEUE_SOURCE=redis-stream
QUEUE_STREAM_NAME=kohakku:tasks
QUEUE_CONSUMER_GROUP=dispatchers
NUM_CONSUMERS=5

Bottlenecks

  • Image pulls -- Gated by MAX_CONCURRENT_PULLS. Cold pulls dominate latency.
  • Docker socket -- Local backend shares one Docker daemon. For higher throughput, use K8s/ECS backends.
  • Redis -- Single Redis handles queue + state. Use separate Redis instances for queue vs state at high scale (see the sketch after this list).
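
A docker-compose sketch of that split; the service names are illustrative assumptions, and the Controller/Dispatcher still need to be pointed at the right instance through their existing Redis connection settings.

# docker-compose sketch: separate Redis instances (service names are illustrative)
services:
  redis-queue:                      # task queue / streams
    image: redis:7
    command: ["redis-server", "--appendonly", "yes"]
  redis-state:                      # dispatcher state
    image: redis:7
    command: ["redis-server", "--appendonly", "yes"]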

Temporal Worker

Stateless -- run multiple workers on the same task queue. Temporal server distributes workflow executions across workers.

# Run 3 worker instances
for i in 1 2 3; do
  python temporal_worker.py &
done
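
A minimal sketch of what each temporal_worker.py process does, assuming the temporalio Python SDK; the workflow class, activity, and task-queue name are illustrative placeholders, not the project's actual definitions.

# temporal_worker.py sketch -- every instance registers on the same task queue
# workflow/activity imports and the queue name are illustrative placeholders
import asyncio

from temporalio.client import Client
from temporalio.worker import Worker

from workflows import BriefLifecycleWorkflow   # hypothetical workflow
from activities import dispatch_brief          # hypothetical activity


async def main() -> None:
    client = await Client.connect("localhost:7233")
    worker = Worker(
        client,
        task_queue="kohakku-tasks",   # shared queue; Temporal distributes executions across workers
        workflows=[BriefLifecycleWorkflow],
        activities=[dispatch_brief],
    )
    await worker.run()


if __name__ == "__main__":
    asyncio.run(main())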

Celery Workers

Scale independently from the Controller. Separate queues for different task types if needed.

# High-priority queue for dispatch
celery -A config worker -l info -Q celery,dispatch -c 4

# Background queue for cleanup
celery -A config worker -l info -Q cleanup -c 2
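
For the queue split above to take effect, tasks have to be routed to those queues. A Django settings sketch, assuming the Celery app reads settings with the CELERY namespace; the task module paths are hypothetical placeholders.

# Django settings sketch: route task types to the queues above
# task module paths are hypothetical placeholders
CELERY_TASK_ROUTES = {
    "briefs.tasks.dispatch_brief": {"queue": "dispatch"},    # latency-sensitive
    "briefs.tasks.cleanup_expired": {"queue": "cleanup"},    # background work
}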

Database

Strategy             When
Read replicas        Read-heavy loads. Django supports database routers (see the sketch below).
Connection pooling   High worker count. PgBouncer in transaction mode.
Index audit          Slow queries. EXPLAIN ANALYZE on hot paths.
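
A minimal database-router sketch for the read-replica strategy; the "default" and "replica" aliases, and the module path in the settings comment, are assumptions about how DATABASES would be laid out.

# Django database router sketch -- assumes "default" (primary) and "replica" aliases
class ReadReplicaRouter:
    def db_for_read(self, model, **hints):
        return "replica"            # send reads to the replica

    def db_for_write(self, model, **hints):
        return "default"            # writes always hit the primary

    def allow_relation(self, obj1, obj2, **hints):
        return True                 # both aliases see the same data

# settings.py (module path illustrative)
# DATABASE_ROUTERS = ["config.routers.ReadReplicaRouter"]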

Redis

Setting              Recommendation
Persistence          AOF enabled by default in docker-compose (appendonly yes)
Maxmemory            Set to prevent OOM. LRU eviction for cache, noeviction for queue (see the sketch below)
Separate instances   One for cache/sessions, one for task queue, one for Celery broker
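
A redis.conf sketch for the cache/session instance; the size is illustrative, and queue/broker instances should keep the default noeviction policy so pending tasks are never dropped.

# redis.conf sketch for the cache/session instance (size is illustrative)
maxmemory 2gb
maxmemory-policy allkeys-lru

# queue / Celery broker instances: keep the default so tasks are never evicted
# maxmemory-policy noeviction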