veylant/CLAUDE.md
2026-03-06 18:38:04 +01:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Veylant IA — A B2B SaaS platform acting as an intelligent proxy/gateway for enterprise AI consumption. Core value proposition: prevent Shadow AI, enforce PII anonymization, ensure GDPR/EU AI Act compliance, and control costs across all LLM usage in an organization.

Full product requirements are in docs/AI_Governance_Hub_PRD.md and the 6-month execution plan (13 sprints, 164 tasks) is in docs/AI_Governance_Hub_Plan_Realisation.md. Architecture Decision Records live in docs/adr/.

Architecture

Go module: github.com/veylant/ia-gateway · Go version: 1.24

Modular monolith (not microservices), with two distinct runtimes:

API Gateway (Traefik)
        │
Go Proxy [cmd/proxy] — chi router, zap logger, viper config
  ├── internal/auth/         Local JWT auth (HS256) — LocalJWTVerifier + LoginHandler (POST /v1/auth/login)
  ├── internal/middleware/   Auth (JWT verification), RateLimit, RequestID, SecurityHeaders
  ├── internal/router/       RBAC enforcement + provider dispatch + fallback chain
  ├── internal/routing/      Rules engine (PostgreSQL JSONB, in-memory cache, priority ASC)
  ├── internal/pii/          gRPC client to PII sidecar + /v1/pii/analyze HTTP handler
  ├── internal/auditlog/     ClickHouse append-only logger (async batch writer)
  ├── internal/compliance/   GDPR Art.30 registry + AI Act classification + PDF reports
  ├── internal/admin/        Admin REST API (/v1/admin/*) — routing rules, users, providers
  ├── internal/billing/      Token cost tracking (per provider pricing)
  ├── internal/circuitbreaker/ Failure-count breaker (threshold=5, open_ttl=60s)
  ├── internal/ratelimit/    Token-bucket limiter (per-tenant + per-user, DB overrides)
  ├── internal/flags/        Feature flags (PostgreSQL + in-memory fallback)
  ├── internal/crypto/       AES-256-GCM encryptor for prompt storage
  ├── internal/metrics/      Prometheus middleware + metrics registration
  ├── internal/provider/     Adapter interface + OpenAI/Anthropic/Azure/Mistral/Ollama impls
  ├── internal/proxy/        Core request handler (PII → upstream → audit → response)
  ├── internal/apierror/     OpenAI-format error helpers (WriteError, WriteErrorWithRequestID)
  ├── internal/health/       /healthz, /docs, /playground, /playground/analyze handlers
  └── internal/config/       Viper-based config loader (VEYLANT_* env var overrides)
        │ gRPC (<2ms) to localhost:50051
PII Detection Service [services/pii] — FastAPI + grpc.aio
  ├── HTTP health: :8091/healthz
  ├── Layer 1: Regex (IBAN, email, phone, SSN, credit cards)
  ├── Layer 2: Presidio + spaCy NER (names, addresses, orgs)
  └── Layer 3: LLM validation (V1.1, ambiguous cases)
        │
LLM Provider Adapters (OpenAI, Anthropic, Azure, Mistral, Ollama)

Data layer:

  • PostgreSQL 16 — config, users, policies, processing registry (Row-Level Security for multi-tenancy; app role: veylant_app)
  • ClickHouse — analytics and immutable audit logs
  • Redis 7 — sessions, rate limiting, PII pseudonymization mappings (AES-256-GCM + TTL)
  • Prometheus — metrics scraper on :9090; Grafana — dashboards on :3001 (admin/admin)
  • HashiCorp Vault — secrets and API key rotation (90-day cycle)

Frontend: React 18 + TypeScript + Vite, shadcn/ui, recharts. Routes protected via local JWT (stored in localStorage, auto-logout on expiry); web/src/auth/ manages the auth flow. API clients live in web/src/api/.

Documentation site (http://localhost:3000/docs): public, no auth required. Root: web/src/pages/docs/ — sections: getting-started, installation, api-reference (8 endpoints), guides (6), deployment (3), security (2), changelog. Layout components: DocLayout.tsx (sidebar + content + TOC), DocSidebar.tsx (with search), DocBreadcrumbs.tsx, DocPagination.tsx. Shared components: components/CodeBlock.tsx, Callout.tsx, ApiEndpoint.tsx, ParamTable.tsx, TableOfContents.tsx. Nav structure: web/src/pages/docs/nav.ts. Uses @tailwindcss/typography (added as devDependency) for prose rendering.

Repository Structure

cmd/proxy/           # Go main entry point — wires all modules, starts HTTP server
internal/            # All Go modules (see Architecture above for full list)
gen/                 # Generated Go gRPC stubs (buf generate → never edit manually)
services/pii/        # Python FastAPI + gRPC PII detection service
  gen/pii/v1/        # Generated Python proto stubs (run `make proto` first)
  tests/             # pytest unit tests (test_regex.py, test_pipeline.py, test_pseudo.py)
proto/pii/v1/        # gRPC .proto definitions
migrations/          # golang-migrate SQL files (up/down pairs)
  clickhouse/        # ClickHouse DDL applied at startup via ApplyDDL()
web/                 # React frontend (Vite, src/pages, src/components, src/api)
  src/pages/docs/    # Public documentation site (no auth); nav.ts defines sidebar structure
test/                # Integration tests (test/integration/, //go:build integration) + k6 load tests (test/k6/)
deploy/              # Helm, Kubernetes manifests, Terraform (EKS), Prometheus/Grafana, alertmanager
  clickhouse/        # ClickHouse config overrides for Docker (e.g. listen-ipv4.xml — forces IPv4)
docker-compose.yml   # Full local dev stack (9 services)
config.yaml          # Local dev config (overridden by VEYLANT_* env vars)

Build & Development Commands

Use make as the primary interface. The proxy runs on :8090, PII HTTP on :8091, PII gRPC on :50051.

make dev              # Start full stack (proxy + PostgreSQL + ClickHouse + Redis + Keycloak + PII)
make dev-down         # Stop and remove all containers and volumes
make dev-logs         # Tail logs from all services
make build            # go build → bin/proxy
make test             # go test -race ./...
make test-cover       # Tests with HTML coverage report (coverage.html)
make test-integration # Integration tests with testcontainers (requires Docker)
make lint             # golangci-lint + black --check + ruff check
make fmt              # gofmt + black
make proto            # buf generate — regenerates gen/ and services/pii/gen/
make proto-lint       # buf lint
make migrate-up       # Apply pending DB migrations
make migrate-down     # Roll back last migration
make migrate-status   # Show current migration version
make check            # Full pre-commit: build + vet + lint + test
make health           # curl localhost:8090/healthz
make docs             # Open http://localhost:8090/docs in browser (proxy must be running)
make helm-dry-run     # Render Helm templates without deploying
make helm-deploy      # Deploy to staging (requires IMAGE_TAG + KUBECONFIG env vars)
make load-test        # k6 load test (SCENARIO=smoke|load|stress|soak, default: smoke)
make deploy-blue      # Blue/green: deploy IMAGE_TAG to blue slot (requires kubectl + Istio)
make deploy-green     # Blue/green: deploy IMAGE_TAG to green slot
make deploy-rollback  # Roll back traffic to ACTIVE_SLOT (e.g. make deploy-rollback ACTIVE_SLOT=blue)

Frontend dev server (Vite, runs on :3000):

cd web && npm install && npm run dev    # dev server with HMR
cd web && npm run build                 # tsc + vite build → web/dist/
cd web && npm run lint                  # ESLint (max-warnings: 0)

Vite dev proxy: In dev mode, all /v1/* requests from the frontend are proxied to localhost:8090 (the Go proxy). No CORS issues during development.

Run a single Go test:

go test -run TestName ./internal/module/

Run a single Python test:

pytest services/pii/tests/test_file.py::test_function

Proto prerequisite: Run make proto before starting the PII service if gen/ or services/pii/gen/ is missing — the service will start but reject all gRPC requests otherwise.

Config override: Any config key can be overridden via an environment variable: add the VEYLANT_ prefix, uppercase the key, and replace each . with _. Example: VEYLANT_SERVER_PORT=9090 overrides server.port.

Auth config: auth.jwt_secret (env: VEYLANT_AUTH_JWT_SECRET) and auth.jwt_ttl_hours. Login endpoint: POST /v1/auth/login (public). Dev credentials: admin@veylant.dev / admin123. Tokens are HS256-signed JWTs; users stored in users table with bcrypt password hashes (migration 000010).

Provider configs: LLM provider API keys are stored encrypted (AES-256-GCM) in the provider_configs table (migration 000011). CRUD via GET|POST /v1/admin/providers, PUT|DELETE|POST-test /v1/admin/providers/{id}. Adapters hot-reload on save/update without proxy restart (router.UpdateAdapter() / RemoveAdapter()).

Tools required: buf (brew install buf), golang-migrate (brew install golang-migrate), golangci-lint, Python 3.12, black, ruff.

Tenant onboarding (after make dev):

deploy/onboarding/onboard-tenant.sh    # creates admin, seeds 4 routing templates, configures rate limits
deploy/onboarding/import-users.sh      # bulk import from CSV (email, first_name, last_name, department, role)

Development Mode Graceful Degradation

When server.env=development, the proxy degrades gracefully instead of crashing:

  • PostgreSQL unreachable → routing engine and feature flags disabled; flag store uses in-memory fallback
  • ClickHouse unreachable → audit logging disabled
  • PII service unreachable → PII disabled if pii.fail_open=true (default)

In production (server.env=production), any of the above causes a fatal startup error.

Key Technical Constraints

Latency budget: The entire PII pipeline (regex + NER + pseudonymization) must complete in <50ms. The PII gRPC call has a configurable timeout (pii.timeout_ms, default 100ms).

Streaming (SSE): The proxy must flush SSE chunks without buffering. PII anonymization applies to the request before it's sent upstream — not to the streamed response. This is the most technically complex piece of the MVP.

Multi-tenancy: Logical isolation via PostgreSQL Row-Level Security. The app connects as role veylant_app and sets app.tenant_id per session. Superuser bypasses RLS (dev only).

Immutable audit logs: ClickHouse is append-only — no DELETE operations. Retention via TTL policies only. ClickHouse DDL is applied idempotently at startup from migrations/clickhouse/.

Proxy Docker image: Uses distroless/static — no shell, no wget. CMD-SHELL health checks in docker-compose cannot work for the proxy container; dependents use condition: service_started instead.

Routing rule evaluation: Rules are sorted ascending by priority (lower = evaluated first). All conditions within a rule are AND-joined. An empty Conditions slice is a catch-all. First match wins. Supported condition fields: user.role, user.department, request.sensitivity, request.model, request.use_case, request.token_estimate. Operators: eq, neq, in, nin, gte, lte, contains, matches.
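
The evaluation order can be sketched as follows. The Rule and Condition types are hypothetical, only the eq, neq, in, and nin operators are implemented, and treating a missing field as a non-match is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// Condition is one predicate on a request attribute (hypothetical shape).
type Condition struct {
	Field    string
	Operator string   // eq, neq, in, nin in this sketch
	Value    string   // used by eq and neq
	Values   []string // used by in and nin
}

// Rule routes to Provider when all Conditions hold.
type Rule struct {
	Name       string
	Priority   int
	Conditions []Condition
	Provider   string
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// matches evaluates one condition; a missing field never matches (assumption).
func matches(c Condition, attrs map[string]string) bool {
	got, ok := attrs[c.Field]
	if !ok {
		return false
	}
	switch c.Operator {
	case "eq":
		return got == c.Value
	case "neq":
		return got != c.Value
	case "in":
		return contains(c.Values, got)
	case "nin":
		return !contains(c.Values, got)
	}
	return false // operators outside this sketch do not match
}

// evaluate sorts by ascending priority and returns the first rule whose
// conditions all match. A rule with no conditions is a catch-all.
func evaluate(rules []Rule, attrs map[string]string) (Rule, bool) {
	sorted := append([]Rule(nil), rules...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Priority < sorted[j].Priority })
	for _, r := range sorted {
		all := true
		for _, c := range r.Conditions {
			if !matches(c, attrs) {
				all = false
				break
			}
		}
		if all {
			return r, true
		}
	}
	return Rule{}, false
}

func main() {
	rules := []Rule{
		{Name: "default", Priority: 100, Provider: "openai"},
		{Name: "sensitive", Priority: 10, Provider: "ollama", Conditions: []Condition{
			{Field: "request.sensitivity", Operator: "eq", Value: "high"},
		}},
	}
	r, _ := evaluate(rules, map[string]string{"request.sensitivity": "high"})
	fmt.Println(r.Name, r.Provider)
}
```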

Conventions

Go import ordering (goimports with local-prefixes: github.com/veylant/ia-gateway): three groups — stdlib · external · github.com/veylant/ia-gateway/internal/.... gen/ is excluded from all linters (generated code).

Commits: Conventional Commits (feat:, fix:, chore:) — used for automated changelog generation.

API versioning: /v1/ prefix, OpenAI-compatible format (/v1/chat/completions) so existing OpenAI SDK clients work without modification.

LLM Provider Adapters: Each provider implements provider.Adapter (Send(), Stream(), Validate(), HealthCheck()). Add new providers by implementing this interface in internal/provider/<name>/.

Error handling: Go modules use typed errors with errors.Wrap. The proxy always returns errors in OpenAI JSON format (type, message, code).

Feature flags: PostgreSQL table (feature_flags) + in-memory fallback when DB is unavailable. No external service.

OpenAPI docs: Generated from swaggo annotations — never write API docs by hand.

Testing split: 70% unit (testing + testify / pytest) · 20% integration (testcontainers for PG/ClickHouse/Redis, lives in test/integration/, requires //go:build integration tag) · 10% E2E (Playwright for UI). Tests are written in parallel with each module, not deferred.

CI coverage thresholds: Go internal packages must maintain ≥80% coverage; Python PII service ≥75%. NER tests (test_ner.py) are excluded from CI because fr_core_news_lg (~600MB) is only available in the Docker build.

Custom Semgrep Rules (.semgrep.yml)

These are enforced in CI and represent project-specific guardrails:

  • context.Background() in HTTP handlers → use r.Context() to propagate tenant context and cancellation.
  • SQL string concatenation (db.QueryContext(ctx, query+var) or fmt.Sprintf) → use parameterized queries ($1, $2, ...).
  • Sensitive fields in logs (zap.String("password"|"api_key"|"token"|"secret"|"Authorization"|"email"|"prompt", ...)) → use redaction helpers.
  • Hardcoded API keys (string literals starting with sk-) → load from env or Vault.
  • json.NewDecoder(r.Body).Decode() without http.MaxBytesReader → wrap body first.
  • Python eval()/exec() on variables → never evaluate user-supplied data.

Security Patterns

  • Zero Trust network, mTLS between services, TLS 1.3 externally
  • All sensitive fields encrypted at application level (AES-256-GCM)
  • API keys stored as SHA-256 hashes only; prefix kept for display (e.g. sk-vyl_ab12cd34)
  • RBAC roles: admin, manager, user, auditor — per-model and per-department permissions. admin/manager have unrestricted model access; user is limited to rbac.user_allowed_models; auditor cannot call /v1/chat/completions by default.
  • Audit-of-the-audit: all accesses to audit logs are themselves logged
  • CI pipeline (.github/workflows/ci.yml): Go build/test/lint, Python format/lint/test, Semgrep SAST, Trivy container scan (CRITICAL/HIGH blocking), gitleaks, OWASP ZAP DAST (non-blocking, main only), k6 smoke test + blue/green Helm staging deploy (main only)
  • Release pipeline (.github/workflows/release.yml, on v* tag): multi-arch Docker image (amd64/arm64) → GHCR, Helm chart → GHCR OCI, GitHub Release with notes extracted from CHANGELOG.md
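
The API key pattern in the list above (SHA-256 hash plus a display prefix) can be sketched as follows. The 15-character prefix length is inferred from the sk-vyl_ab12cd34 example, and hashKey, verifyKey, and storedKey are hypothetical names.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

const (
	keyPrefix        = "sk-vyl_" // key family marker, not a credential
	displayPrefixLen = 15        // len("sk-vyl_ab12cd34"), assumed from the example
)

// storedKey is what would be persisted: never the raw key.
type storedKey struct {
	Prefix string // safe to show in the UI
	Hash   string // hex-encoded SHA-256 of the full key
}

func hashKey(raw string) storedKey {
	sum := sha256.Sum256([]byte(raw))
	prefix := raw
	if len(prefix) > displayPrefixLen {
		prefix = prefix[:displayPrefixLen]
	}
	return storedKey{Prefix: prefix, Hash: hex.EncodeToString(sum[:])}
}

// verifyKey hashes the presented key and compares in constant time.
func verifyKey(raw string, stored storedKey) bool {
	sum := sha256.Sum256([]byte(raw))
	got := hex.EncodeToString(sum[:])
	return subtle.ConstantTimeCompare([]byte(got), []byte(stored.Hash)) == 1
}

func main() {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		panic(err)
	}
	raw := keyPrefix + hex.EncodeToString(buf) // shown to the user once
	stored := hashKey(raw)
	fmt.Println(stored.Prefix, verifyKey(raw, stored), verifyKey(raw+"x", stored))
}
```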

MVP Scope (V1)

In scope: AI proxy, PII anonymization + pseudonymization, intelligent routing engine, audit logs, RBAC, React dashboard, GDPR Article 30 registry, AI Act risk classification, provider configuration wizard, integrated playground (prompt test with PII visualization).

Out of scope (V2+): ML anomaly detection, Shadow AI discovery, physical multi-tenant isolation, native SDKs, SIEM integrations.