Operations and Deployment Guide

This guide covers everything an operator needs to run EvilTwin: bootstrapping the environment, performing database migrations, running in production with TLS, and handling day-2 operational tasks like backups and scaling.


Understanding the Service Topology

EvilTwin is a multi-service application orchestrated with Docker Compose. Here is how the services relate to each other:

[Service topology diagram]
Important: Network Isolation

In production, the Deception Network (honeypots) must be on a separate network segment from the Management Network. This prevents a compromised honeypot from communicating directly with the backend database or SDN controller.
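
Once the stack is running, a quick way to confirm this isolation is to inspect the Docker networks directly; the network and container names below are illustrative and may differ in your compose file.

# List the compose-managed networks
docker network ls --filter "name=eviltwin"

# Show which containers are attached to the management network;
# honeypot containers (e.g. cowrie) should NOT appear here
docker network inspect eviltwin_management --format '{{range .Containers}}{{.Name}} {{end}}'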


Bootstrap Checklist (Day 0)

Run these steps in order the first time you deploy EvilTwin.

Step 1: Prepare Environment Variables

# Copy the example and edit with your values
cp .env.example .env

Required values to set (the deployment will not work safely until these are changed):

Variable                 What to set it to
POSTGRES_PASSWORD        A strong random password — openssl rand -hex 20
SECRET_KEY               A random 32-byte hex string — openssl rand -hex 32
CANARY_WEBHOOK_SECRET    A random 32-byte hex string — openssl rand -hex 32
Generating secure secrets
openssl rand -hex 32   # Use this output for SECRET_KEY and CANARY_WEBHOOK_SECRET
openssl rand -hex 20   # Use this output for POSTGRES_PASSWORD
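
If you prefer to generate and write the values in one step, something along these lines works, assuming the variable names above already exist as placeholders in the copied .env:

# Generate fresh secrets and substitute them into .env in place
sed -i "s|^POSTGRES_PASSWORD=.*|POSTGRES_PASSWORD=$(openssl rand -hex 20)|" .env
sed -i "s|^SECRET_KEY=.*|SECRET_KEY=$(openssl rand -hex 32)|" .env
sed -i "s|^CANARY_WEBHOOK_SECRET=.*|CANARY_WEBHOOK_SECRET=$(openssl rand -hex 32)|" .env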

Step 2: Check for Port Conflicts

EvilTwin uses several ports. Verify they are free before starting:

# Check which ports are in use
ss -tlnp | grep -E ':(22|23|80|445|5432|6633|8000|8080)\b'

If any port is in use by another service, edit docker-compose.yml to change the host-side binding.
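
For example, to move the backend off port 8000, only the host side of the mapping changes (the service and port shown here are illustrative):

# In docker-compose.yml, under the affected service, edit the "ports" entry:
#   ports:
#     - "8000:8000"   # before
#     - "8001:8000"   # after: host port 8001, container port still 8000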

Step 3: Build and Start Services

# Build all container images and start in the background
docker compose up --build -d

# Watch the startup progress
docker compose logs -f --tail=50

You should see each service reach a healthy or running state within 60 seconds.

Step 4: Run Database Migrations

EvilTwin uses Alembic to manage the database schema. Think of Alembic as "git for your database structure" — each migration is a versioned change to the schema.

# Apply all pending migrations to create tables and indexes
docker compose exec backend alembic upgrade head

You should see output like:

INFO  [alembic.runtime.migration] Running upgrade  -> abc123, Initial schema
INFO  [alembic.runtime.migration] Running upgrade abc123 -> def456, Add threat fields
What is "head"?

head means "apply all migrations up to the latest version." If you re-run this command later, it will only apply new migrations — it won't touch existing data.
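
To see where the schema currently stands, Alembic's inspection commands can be run the same way as the upgrade:

# Show the revision the database is currently at
docker compose exec backend alembic current

# Show the full migration history
docker compose exec backend alembic history --verbose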

Step 5: Verify Health

# Should return {"status": "healthy", "database": true, ...}
curl -s http://localhost:8000/health | python3 -m json.tool

# Should return HTTP 200 with version info
curl -I http://localhost:8000/

Step 6: Create Your First User Account

# Register an admin user
curl -X POST http://localhost:8000/auth/register \
-H "Content-Type: application/json" \
-d '{"username": "admin", "password": "StrongPass123!", "role": "admin"}'

# Log in to receive a JWT access token
curl -X POST http://localhost:8000/auth/login \
-H "Content-Type: application/json" \
-d '{"username": "admin", "password": "StrongPass123!"}'
# Save the "access_token" from the response — you will need it for API calls

Step 7: Validate Ingestion and Scoring

Simulate a honeypot event to verify the full pipeline works:

TOKEN="your_access_token_here"

# Send a test event
curl -X POST http://localhost:8000/log \
-H "Content-Type: application/json" \
-d '{
"src_ip": "203.0.113.99",
"dst_ip": "10.0.1.10",
"protocol": "ssh",
"sensor_id": "cowrie-test",
"timestamp": "2024-01-15T10:00:00Z",
"payload": {"event": "login_attempt", "username": "root", "password": "admin"}
}'

# Check that the session was created
curl -s http://localhost:8000/sessions \
-H "Authorization: Bearer $TOKEN" | python3 -m json.tool

Production Deployment

Using docker-compose.prod.yml

The docker-compose.prod.yml file adds production-specific configuration:

  • No exposed ports from Postgres (database not accessible externally)
  • Frontend served by nginx
  • TLS termination via nginx

# Start with production configuration
docker compose -f docker-compose.prod.yml up --build -d
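
After starting with the production file, it is worth confirming that Postgres really has no host-side binding (assuming it stays on the default port 5432):

# No listener on 5432 should appear on the host
ss -tlnp | grep 5432 || echo "Postgres is not exposed on the host"

# The postgres service should show no published ports
docker compose -f docker-compose.prod.yml ps postgres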

TLS Certificate Setup

EvilTwin uses nginx to terminate HTTPS in production. You need certificates before starting.

Option A: Let's Encrypt (public domain)

# Install certbot
dnf install certbot # Fedora/RHEL

# Obtain a certificate
certbot certonly --standalone -d yourdomain.com

# Certificates will be at:
# /etc/letsencrypt/live/yourdomain.com/fullchain.pem
# /etc/letsencrypt/live/yourdomain.com/privkey.pem
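
Let's Encrypt certificates expire after 90 days, so schedule renewal. A dry run verifies the setup; how nginx picks up renewed files depends on your volume mounts, so the reload step below is an assumption to adapt.

# Verify renewal works without actually renewing
certbot renew --dry-run

# Example cron entry: attempt renewal twice daily, then reload nginx
# 0 3,15 * * * certbot renew --quiet && docker compose -f docker-compose.prod.yml exec nginx nginx -s reload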

Option B: Self-signed (internal/testing)

mkdir -p infra/nginx/certs
openssl req -x509 -newkey rsa:4096 -keyout infra/nginx/certs/privkey.pem \
-out infra/nginx/certs/fullchain.pem -days 365 -nodes \
-subj "/CN=eviltwin.internal"

After generating certificates, set in .env:

NGINX_CERT_PATH=/etc/letsencrypt/live/yourdomain.com/fullchain.pem
NGINX_KEY_PATH=/etc/letsencrypt/live/yourdomain.com/privkey.pem

Nginx Configuration

The nginx reverse proxy handles:

  • HTTPS termination (port 443)
  • HTTP → HTTPS redirect (port 80)
  • WebSocket upgrade pass-through for /ws/alerts
  • Rate limiting on auth endpoints

Check infra/nginx/nginx.conf for the full configuration.
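
To confirm the running proxy actually has these pieces in place, the configuration can be syntax-checked and spot-checked from the host; the nginx service name and the exact directive names assume a standard nginx WebSocket and rate-limit setup.

# Syntax-check the configuration inside the nginx container
docker compose -f docker-compose.prod.yml exec nginx nginx -t

# Spot-check the WebSocket upgrade and rate-limit directives
grep -nE 'proxy_set_header (Upgrade|Connection)|limit_req' infra/nginx/nginx.conf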


Day-2 Operations

Checking Service Status

# See all container states and health checks
docker compose ps

# Check logs for a specific service
docker compose logs backend --tail=100
docker compose logs cowrie --tail=50
docker compose logs ryu --tail=50

# Follow logs in real time
docker compose logs -f backend

Updating the Application

# Pull latest code
git pull

# Rebuild and restart (zero-downtime not supported — brief interruption expected)
docker compose up --build -d

# Always run migrations after code updates
docker compose exec backend alembic upgrade head

Database Backup

# Create a compressed backup with a timestamp (-T avoids TTY output corruption;
# POSTGRES_USER and POSTGRES_DB must be set in your shell, e.g. exported from .env)
docker compose exec -T postgres pg_dump -U "${POSTGRES_USER}" "${POSTGRES_DB}" \
| gzip > "backup_$(date +%Y%m%d_%H%M%S).sql.gz"

# List existing backups
ls -lh backup_*.sql.gz
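
For unattended backups, a nightly cron entry along these lines works; the install path, database user, and database name are assumptions to adapt (note that % must be escaped as \% in crontab).

# Example crontab entry: nightly backup at 02:00
# 0 2 * * * cd /opt/eviltwin && docker compose exec -T postgres pg_dump -U eviltwin eviltwin | gzip > backups/backup_$(date +\%Y\%m\%d).sql.gz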

Database Restore

# CAUTION: This replaces the current database content
gunzip -c backup_20240115_120000.sql.gz \
| docker compose exec -T postgres psql -U "${POSTGRES_USER}" "${POSTGRES_DB}"

Scaling the Backend

The FastAPI backend can run multiple workers for higher throughput:

# In docker-compose.yml, change the backend command to:
# command: gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000

For horizontal scaling (multiple backend containers), a load balancer is required in front. PostgreSQL connection pooling (PgBouncer) is recommended when running more than 4 workers.
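
As a sketch, Compose can run several backend replicas with the --scale flag; this assumes the service is named backend and that its fixed host port mapping is removed (or a load balancer publishes the port instead), otherwise the replicas will conflict.

# Run three backend containers behind your load balancer
docker compose up -d --scale backend=3

# Confirm the replicas are running
docker compose ps backend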


Operational Runbooks

Runbook A: Ingestion Not Updating Sessions

Symptom: POST to /log succeeds (200) but no sessions appear in /sessions.

Step                                Command
1. Check backend logs for errors    docker compose logs backend --tail=200
2. Verify migrations are applied    docker compose exec backend alembic current
3. Query with broad date range      curl -H "Authorization: Bearer $TOKEN" "http://localhost:8000/sessions?page=1&page_size=25"
4. Check database directly          docker compose exec postgres psql -U eviltwin -c "SELECT COUNT(*) FROM sessions;"

Runbook B: SDN Not Redirecting Suspicious Traffic

Symptom: High threat-level IPs are not being blocked/redirected.

Step                             Command
1. Check the score for the IP    curl -H "Authorization: Bearer $TOKEN" http://localhost:8000/score/203.0.113.1
2. Verify redirect threshold     Check THREAT_REDIRECT_THRESHOLD in .env
3. Inspect installed flows       curl http://localhost:8080/stats/flow/1
4. Check controller logs         docker compose logs ryu --tail=100

Runbook C: Frontend Not Receiving Live Alerts

Symptom: Dashboard shows "Disconnected" or alerts appear delayed.

Step                            Command
1. Verify backend is running    curl -s http://localhost:8000/health
2. Check WebSocket auth         Ensure browser has valid JWT (check DevTools → Application → Local Storage)
3. Verify WebSocket URL         Check VITE_WS_URL in frontend .env
4. Watch backend WS logs        docker compose logs backend --tail=50
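
As a transport-level check from the shell, you can attempt the raw WebSocket handshake against the backend. A "101 Switching Protocols" response means the endpoint is reachable; a 401/403 usually points at the JWT. Whether and how a token must be passed is specific to this deployment.

# Expect "HTTP/1.1 101 Switching Protocols" if the upgrade is accepted
curl -i -N --max-time 5 \
-H "Connection: Upgrade" \
-H "Upgrade: websocket" \
-H "Sec-WebSocket-Version: 13" \
-H "Sec-WebSocket-Key: $(openssl rand -base64 16)" \
http://localhost:8000/ws/alerts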

Runbook D: Authentication Failures (401 Unauthorized)

Symptom: API calls return 401 Unauthorized.

Step                     Action
1. Check token expiry    JWT access tokens expire after 30 minutes (configurable)
2. Refresh the token     POST /auth/refresh with your refresh token
3. Verify SECRET_KEY     If SECRET_KEY changed, all existing tokens are immediately invalidated
4. Re-login              POST /auth/login to get a new token pair
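
A refresh call looks roughly like this; the exact field name in the request body is an assumption, so check the API reference if it is rejected:

# Exchange the refresh token for a new access token
# (the "refresh_token" field name is an assumption)
curl -X POST http://localhost:8000/auth/refresh \
-H "Content-Type: application/json" \
-d '{"refresh_token": "your_refresh_token_here"}'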

Runbook E: AI Endpoints Return 503

Symptom: POST /ai/analyze or POST /ai/chat returns 503 Service Unavailable.

Step                        Action
1. Check AI status          curl -H "Authorization: Bearer $TOKEN" http://localhost:8000/ai/status
2. Verify LLM config        Check LLM_API_KEY and LLM_BASE_URL in .env
3. Test LLM connectivity    curl -H "Authorization: Bearer $LLM_API_KEY" "$LLM_BASE_URL/models"
4. Check backend logs       docker compose logs backend --tail=50