
Testing and Quality Strategy

Goals

This platform handles security-critical logic — incorrect threat scoring, broken auth, or dropped events could mean missed detections or denied access for analysts. The test suite is designed to catch regressions at every layer.

Test goals:

  • Ingestion correctness — no events dropped or corrupted during the ingest pipeline
  • Auth and RBAC — protected endpoints reject unauthenticated/unauthorised requests
  • Threat scoring — ML model and feature extraction produce correct outputs
  • Alert delivery — High/Critical events broadcast over WebSocket
  • Frontend behaviour — analyst-critical workflows (filters, pagination, auth) work correctly
  • SDN logic — flow rule installation/deletion is correct

Test Pyramid

(Diagram: the test pyramid, with unit tests at the base, integration tests in the middle, and runtime validation at the top.)

Use unit tests for logic. Use integration tests for database and HTTP interactions. Use runtime validation sparingly — only where container behaviour cannot be mocked.
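As a concrete example of keeping runtime validation opt-in, the gate can be an environment-variable skip at module level. A minimal sketch in pytest, assuming the RUN_DOCKER_VALIDATION flag shown later on this page (the compose check mirrors the containerised-validation row in the SDN table; the test body is illustrative):

import os
import subprocess

import pytest

# Skip the whole module unless the caller explicitly opts in; this keeps
# Docker-dependent checks out of ordinary local and CI runs.
pytestmark = pytest.mark.skipif(
    not os.getenv("RUN_DOCKER_VALIDATION"),
    reason="runtime validation needs Docker; set RUN_DOCKER_VALIDATION=1",
)

def test_compose_config_is_valid():
    # `docker compose config --quiet` parses the compose file and exits
    # non-zero if it is invalid, printing errors to stderr.
    result = subprocess.run(
        ["docker", "compose", "config", "--quiet"],
        capture_output=True,
    )
    assert result.returncode == 0, result.stderr.decode()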


Backend Tests

Unit Tests (backend/tests/)

| Test File | What It Covers |
| --- | --- |
| test_ai.py | Feature extraction, scorer fallback when model missing |
| test_scoring.py | ML scoring output range, level classification |
| test_alert_manager.py | Queue push/pop, overflow behaviour, broadcast |
| test_canary.py | HMAC signature verification, replay attack detection |
| test_vpn.py | VPN detection service responses and mock |
| test_train.py | Synthetic data generation, model training, serialisation |
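To illustrate the style of these unit tests, here is a minimal sketch of an HMAC verification test in the spirit of test_canary.py. The verify_signature helper is a stand-in written for this example, not the project's actual function:

import hashlib
import hmac

# Stand-in for the real verifier in the canary module (name assumed).
def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, avoiding a timing side channel
    return hmac.compare_digest(expected, signature_hex)

def test_valid_signature_accepted():
    secret, body = b"s3cret", b'{"canary_id": "abc"}'
    good_sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
    assert verify_signature(secret, body, good_sig)

def test_tampered_body_rejected():
    secret, body = b"s3cret", b'{"canary_id": "abc"}'
    good_sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Same signature, different body: verification must fail
    assert not verify_signature(secret, b'{"canary_id": "evil"}', good_sig)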

Integration Tests (backend/tests/test_api_integration.py and test_ingest.py)

These tests drive real HTTP requests through an in-process ASGI test client against a live PostgreSQL database, with each test isolated in its own schema.

Endpoints covered:

| Endpoint | What Is Verified |
| --- | --- |
| GET /health | 200 OK, database: true |
| POST /auth/register | User created, bcrypt hash stored |
| POST /auth/login | Returns access_token and refresh_token |
| POST /auth/refresh | Returns new access token |
| POST /log | Event persisted, session created, score computed |
| GET /sessions | Pagination, filters, correct response schema |
| GET /sessions/{id} | Session detail with commands and credentials |
| GET /score/{ip} | Score returned for known and unknown IPs |
| GET /dashboard/stats | 24h aggregates |
| GET /dashboard/top-attackers | Sorted attacker list |
| POST /canary/webhook | HMAC-valid and invalid requests |
| GET /ai/status | LLM availability check |
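For example, an ingest test for POST /log might look like the sketch below. The payload field names are guesses at the ingest schema, and `client` is the schema-isolated async test client described in the next subsection:

import pytest

@pytest.mark.asyncio
async def test_log_ingest_persists_and_scores(client):
    payload = {
        # Field names here are illustrative, not the real ingest schema.
        "src_ip": "198.51.100.9",
        "event_type": "command",
        "data": "cat /etc/passwd",
    }
    resp = await client.post("/log", json=payload)
    assert resp.status_code == 200
    # Per the table above, ingest should persist the event, attach it to a
    # session, and compute a score; the response shape here is assumed.
    assert "score" in resp.json()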

DB isolation strategy (see the sketch after this list):

  • Each test creates and drops its own schema prefix
  • SQLAlchemy get_db dependency is overridden to inject the test session
  • Async test client via httpx.AsyncClient with ASGI transport
  • Tests are skipped automatically when TEST_DATABASE_URL is not set — safe for CI environments without a running Postgres
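Wired together, the strategy might look like this sketch (import paths, fixture names, and the dependency-override shape are assumptions; the real conftest will differ):

import os

import httpx
import pytest

# Mirrors the skip behaviour described above: no TEST_DATABASE_URL, no run.
pytestmark = pytest.mark.skipif(
    not os.getenv("TEST_DATABASE_URL"),
    reason="integration tests need a running Postgres",
)

@pytest.mark.asyncio  # requires the pytest-asyncio plugin
async def test_health_reports_database_up(test_session):
    # `test_session` is assumed to be a fixture yielding an AsyncSession
    # bound to this test's throwaway schema.
    from app.db import get_db   # assumed import path
    from app.main import app    # assumed import path

    # Route every DB access inside the app through the per-test schema.
    app.dependency_overrides[get_db] = lambda: test_session

    # ASGITransport calls the app in-process; no network socket is opened.
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://test") as client:
        resp = await client.get("/health")

    assert resp.status_code == 200
    assert resp.json()["database"] is True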

To run integration tests:

TEST_DATABASE_URL=postgresql+asyncpg://postgres:password@localhost:5432/eviltwin_test \
pytest backend/tests -q

Auth and RBAC Tests

Key scenarios tested (sketched below):

  • GET /sessions without token → 401
  • GET /sessions with analyst token → 200
  • Admin-only endpoint with analyst token → 403
  • Expired token → 401
  • Valid refresh token → new access token
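A sketch of the RBAC checks, reusing the async client pattern above (`client` and `analyst_token` are assumed fixtures, and the admin route is hypothetical):

import pytest

@pytest.mark.asyncio
async def test_sessions_rejects_missing_token(client):
    resp = await client.get("/sessions")  # no Authorization header
    assert resp.status_code == 401

@pytest.mark.asyncio
async def test_admin_route_rejects_analyst(client, analyst_token):
    resp = await client.get(
        "/admin/users",  # hypothetical admin-only endpoint
        headers={"Authorization": f"Bearer {analyst_token}"},
    )
    # Authenticated but not authorised: the role check must return 403, not 401.
    assert resp.status_code == 403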

Frontend Tests

Test framework: Vitest (Jest-compatible, Vite-native) + React Testing Library

| Test Area | File | What Is Tested |
| --- | --- | --- |
| Alert feed | test/alertFeed.test.tsx | Zustand store updates render new alert rows |
| Auth guard | test/authGuard.test.tsx | Unauthenticated user redirected to /login |
| Sessions filters | test/sessionsFilters.test.tsx | UI controls produce correct API query params |
| Pagination | test/pagination.test.tsx | Prev/next increments page correctly |
| TopBar reconnect | test/topBar.test.tsx | Backoff state shows attempt count and countdown |
| Token refresh | test/tokenRefresh.test.tsx | 401 triggers refresh; request is retried |

Run frontend tests:

cd frontend
npm test -- --run # CI mode (single pass)
npm test # watch mode (development)
npm run build # TypeScript + Vite build — catches type errors

SDN Tests (sdn/tests/)

| Test Area | What Is Tested |
| --- | --- |
| Flow rule construction | OFPFlowMod messages have correct match and action fields |
| Flow install/delete | FlowManager calls the correct Ryu datapath methods |
| REST API — GET /flows | Returns current suspicious IP list |
| REST API — POST /flows | Adds an IP to the flow table |
| REST API — DELETE /flows/{ip} | Removes a specific IP |
| Backend score parsing | Controller correctly parses threat_level from backend response |
| Containerised validation | Compose config valid, Ryu image buildable (opt-in only) |
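Because FlowManager talks to a switch through a Ryu datapath object, the install/delete tests can run with the datapath fully mocked. A minimal sketch (the module path and method name are guesses for illustration):

from unittest.mock import MagicMock

def test_install_block_sends_one_flow_mod():
    # MagicMock stands in for a Ryu datapath: .ofproto, .ofproto_parser and
    # .send_msg are auto-created, so no switch or running Ryu is needed.
    dp = MagicMock()

    from sdn.flow_manager import FlowManager  # assumed module path
    manager = FlowManager()

    manager.install_block(dp, "203.0.113.7")  # hypothetical method name

    # Exactly one message should be handed to the datapath, and it should be
    # whatever the parser's OFPFlowMod constructor produced.
    dp.send_msg.assert_called_once()
    sent = dp.send_msg.call_args.args[0]
    assert sent is dp.ofproto_parser.OFPFlowMod.return_value

Match and action assertions can then inspect dp.ofproto_parser.OFPMatch.call_args for the expected fields.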

Run SDN tests:

pytest sdn/tests -q

# Opt-in runtime validation (requires Docker)
RUN_DOCKER_VALIDATION=1 pytest sdn/tests/validate_runtime.py -q

CI/CD Pipeline

The GitHub Actions pipeline (.github/workflows/ci.yml) runs on every push and pull request:

(Diagram: CI pipeline flow: lint, backend unit tests, backend integration tests, frontend tests, frontend build, SDN tests.)

Pipeline stages:

| Stage | Command | Failure Causes |
| --- | --- | --- |
| Lint | ruff check backend/ | Style violations, unused imports, undefined names |
| Backend unit tests | pytest backend/tests -q | Logic errors in scorer, alert manager, canary, VPN |
| Backend integration tests | pytest backend/tests -q (with Postgres service) | DB schema issues, API contract changes |
| Frontend tests | npm test -- --run | Component regression, broken hooks |
| Frontend build | npm run build | TypeScript errors, missing imports |
| SDN tests | pytest sdn/tests -q | Flow rule logic errors |

The pipeline blocks merge if any stage fails. This prevents broken code from reaching the main branch.


Quality Gate — Run Locally Before Pushing

# From repo root
ruff check backend/

pytest backend/tests -q

cd frontend && npm test -- --run && npm run build && cd ..

pytest sdn/tests -q

All four commands should exit with code 0. If any fail, fix before pushing.


Risk-Based Testing Priorities

Not all code is equally critical. These areas deserve the most thorough test coverage:

| Priority | Area | Why |
| --- | --- | --- |
| P0 | Ingest pipeline (POST /log) | Core data collection — bugs cause data loss |
| P0 | Auth and token validation | Security boundary — bugs mean unauthorised access |
| P1 | Threat scoring output | Drives automated SDN response — wrong scores cause false redirects |
| P1 | Canary HMAC + replay | Security feature — bypass = undetected canary triggers |
| P2 | WebSocket delivery | Analyst visibility — broken alerts delay response |
| P2 | Session filter queries | Analyst workflow — broken filters mean missing data |
| P3 | Dashboard aggregations | Nice-to-have — wrong numbers are cosmetic, not security-critical |