CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
Veylant IA — A B2B SaaS platform acting as an intelligent proxy/gateway for enterprise AI consumption. Core value proposition: prevent Shadow AI, enforce PII anonymization, ensure GDPR/EU AI Act compliance, and control costs across all LLM usage in an organization.
Full product requirements are in docs/AI_Governance_Hub_PRD.md and the 6-month execution plan (13 sprints, 164 tasks) is in docs/AI_Governance_Hub_Plan_Realisation.md.
Architecture
Go module: github.com/veylant/ia-gateway · Go version: 1.24
Modular monolith (not microservices), with two distinct runtimes:
API Gateway (Traefik)
│
Go Proxy [cmd/proxy] — chi router, zap logger, viper config
├── internal/middleware/ Auth (OIDC/Keycloak), RateLimit, RequestID, SecurityHeaders
├── internal/router/ RBAC enforcement + provider dispatch + fallback chain
├── internal/routing/ Rules engine (PostgreSQL JSONB, in-memory cache, priority ASC)
├── internal/pii/ gRPC client to PII sidecar + /v1/pii/analyze HTTP handler
├── internal/auditlog/ ClickHouse append-only logger (async batch writer)
├── internal/compliance/ GDPR Art.30 registry + AI Act classification + PDF reports
├── internal/admin/ Admin REST API (/v1/admin/*) — routing rules, users, providers
├── internal/billing/ Token cost tracking (per provider pricing)
├── internal/circuitbreaker/ Failure-count breaker (threshold=5, open_ttl=60s)
├── internal/ratelimit/ Token-bucket limiter (per-tenant + per-user, DB overrides)
├── internal/flags/ Feature flags (PostgreSQL + in-memory fallback)
├── internal/crypto/ AES-256-GCM encryptor for prompt storage
├── internal/metrics/ Prometheus middleware + metrics registration
├── internal/provider/ Adapter interface + OpenAI/Anthropic/Azure/Mistral/Ollama impls
├── internal/proxy/ Core request handler (PII → upstream → audit → response)
├── internal/apierror/ OpenAI-format error helpers (WriteError, WriteErrorWithRequestID)
├── internal/health/ /healthz, /docs, /playground, /playground/analyze handlers
└── internal/config/ Viper-based config loader (VEYLANT_* env var overrides)
│ gRPC (<2ms) to localhost:50051
PII Detection Service [services/pii] — FastAPI + grpc.aio
├── HTTP health: :8091/healthz
├── Layer 1: Regex (IBAN, email, phone, SSN, credit cards)
├── Layer 2: Presidio + spaCy NER (names, addresses, orgs)
└── Layer 3: LLM validation (V1.1, ambiguous cases)
│
LLM Provider Adapters (OpenAI, Anthropic, Azure, Mistral, Ollama)
Data layer:
- PostgreSQL 16 — config, users, policies, processing registry (Row-Level Security for multi-tenancy; app role: veylant_app)
- ClickHouse — analytics and immutable audit logs
- Redis 7 — sessions, rate limiting, PII pseudonymization mappings (AES-256-GCM + TTL)
- Keycloak — IAM, SSO, SAML 2.0/OIDC federation (dev console: http://localhost:8080, admin/admin; test users: admin@veylant.dev/admin123, user@veylant.dev/user123)
- Prometheus — metrics scraper on :9090; Grafana — dashboards on :3001 (admin/admin)
- HashiCorp Vault — secrets and API key rotation (90-day cycle)
Frontend: React 18 + TypeScript + Vite, shadcn/ui, recharts. Routes protected via OIDC (Keycloak); web/src/auth/ manages the auth flow. API clients live in web/src/api/.
Repository Structure
cmd/proxy/ # Go main entry point — wires all modules, starts HTTP server
internal/ # All Go modules (see Architecture above for full list)
gen/ # Generated Go gRPC stubs (buf generate → never edit manually)
services/pii/ # Python FastAPI + gRPC PII detection service
  gen/pii/v1/ # Generated Python proto stubs (run `make proto` first)
  tests/ # pytest unit tests (test_regex.py, test_pipeline.py, test_pseudo.py)
proto/pii/v1/ # gRPC .proto definitions
migrations/ # golang-migrate SQL files (up/down pairs)
  clickhouse/ # ClickHouse DDL applied at startup via ApplyDDL()
web/ # React frontend (Vite, src/pages, src/components, src/api)
test/ # Integration tests (test/integration/, //go:build integration) + k6 load tests (test/k6/)
deploy/ # Helm, Kubernetes manifests, Terraform (EKS), Prometheus/Grafana, alertmanager
clickhouse/ # ClickHouse config overrides for Docker (e.g. listen-ipv4.xml — forces IPv4)
docker-compose.yml # Full local dev stack (9 services)
config.yaml # Local dev config (overridden by VEYLANT_* env vars)
Build & Development Commands
Use make as the primary interface. The proxy runs on :8090, PII HTTP on :8091, PII gRPC on :50051.
make dev # Start full stack (proxy + PostgreSQL + ClickHouse + Redis + Keycloak + PII)
make dev-down # Stop and remove all containers and volumes
make dev-logs # Tail logs from all services
make build # go build → bin/proxy
make test # go test -race ./...
make test-cover # Tests with HTML coverage report (coverage.html)
make test-integration # Integration tests with testcontainers (requires Docker)
make lint # golangci-lint + black --check + ruff check
make fmt # gofmt + black
make proto # buf generate — regenerates gen/ and services/pii/gen/
make proto-lint # buf lint
make migrate-up # Apply pending DB migrations
make migrate-down # Roll back last migration
make migrate-status # Show current migration version
make check # Full pre-commit: build + vet + lint + test
make health # curl localhost:8090/healthz
make docs # Open http://localhost:8090/docs in browser (proxy must be running)
make helm-dry-run # Render Helm templates without deploying
make helm-deploy # Deploy to staging (requires IMAGE_TAG + KUBECONFIG env vars)
make load-test # k6 load test (SCENARIO=smoke|load|stress|soak, default: smoke)
make deploy-blue # Blue/green: deploy IMAGE_TAG to blue slot (requires kubectl + Istio)
make deploy-green # Blue/green: deploy IMAGE_TAG to green slot
make deploy-rollback # Roll back traffic to ACTIVE_SLOT (e.g. make deploy-rollback ACTIVE_SLOT=blue)
Frontend dev server (Vite, runs on :3000):
cd web && npm install && npm run dev
Run a single Go test:
go test -run TestName ./internal/module/
Run a single Python test:
pytest services/pii/tests/test_file.py::test_function
Proto prerequisite: Run make proto before starting the PII service if gen/ or services/pii/gen/ is missing — the service will start but reject all gRPC requests otherwise.
Config override: Any config key can be overridden via env var with the VEYLANT_ prefix and . → _ replacement. Example: VEYLANT_SERVER_PORT=9090 overrides server.port.
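As a quick illustration of the naming rule, the helper below derives the override variable for a config key. envVarFor is a hypothetical name, and the upper-casing is inferred from the VEYLANT_SERVER_PORT example. With viper this behaviour is commonly obtained through SetEnvPrefix, SetEnvKeyReplacer and AutomaticEnv; whether internal/config is wired exactly that way is an assumption.

```go
package main

import "strings"

// envVarFor maps a dotted config key to its override variable name:
// the VEYLANT_ prefix, "." replaced by "_", and (assumed) upper-casing.
func envVarFor(key string) string {
	return "VEYLANT_" + strings.ToUpper(strings.ReplaceAll(key, ".", "_"))
}
```

Under this rule, pii.timeout_ms would be overridden by VEYLANT_PII_TIMEOUT_MS.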
Tools required: buf (brew install buf), golang-migrate (brew install golang-migrate), golangci-lint, Python 3.12, black, ruff.
Development Mode Graceful Degradation
When server.env=development, the proxy degrades gracefully instead of crashing:
- Keycloak unreachable → falls back to MockVerifier (JWT auth bypassed; dev user injected as admin role)
- PostgreSQL unreachable → routing engine and feature flags disabled; flag store uses in-memory fallback
- ClickHouse unreachable → audit logging disabled
- PII service unreachable → PII disabled if pii.fail_open=true (default)
In production (server.env=production), any of the above causes a fatal startup error.
Key Technical Constraints
Latency budget: The entire PII pipeline (regex + NER + pseudonymization) must complete in <50ms. The PII gRPC call has a configurable timeout (pii.timeout_ms, default 100ms).
Streaming (SSE): The proxy must flush SSE chunks without buffering. PII anonymization applies to the request before it's sent upstream — not to the streamed response. This is the most technically complex piece of the MVP.
Multi-tenancy: Logical isolation via PostgreSQL Row-Level Security. The app connects as role veylant_app and sets app.tenant_id per session. Superuser bypasses RLS (dev only).
Immutable audit logs: ClickHouse is append-only — no DELETE operations. Retention via TTL policies only. ClickHouse DDL is applied idempotently at startup from migrations/clickhouse/.
Proxy Docker image: Uses distroless/static — no shell, no wget. CMD-SHELL health checks in docker-compose cannot work for the proxy container; dependents use condition: service_started instead.
Routing rule evaluation: Rules are sorted ascending by priority (lower = evaluated first). All conditions within a rule are AND-joined. An empty Conditions slice is a catch-all. First match wins. Supported condition fields: user.role, user.department, request.sensitivity, request.model, request.use_case, request.token_estimate. Operators: eq, neq, in, nin, gte, lte, contains, matches.
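The evaluation order can be illustrated with a small sketch. The Rule and Condition shapes, the firstMatch name, and the sample sensitivity values are assumptions, and only the eq, neq and in operators are implemented. A missing field is treated as a non-match, which is also an assumption.

```go
package main

import "sort"

// Condition and Rule are hypothetical shapes for this sketch.
type Condition struct {
	Field    string
	Operator string
	Values   []string
}

type Rule struct {
	Name       string
	Priority   int
	Conditions []Condition
}

// matches evaluates one condition; unknown operators never match.
func (c Condition) matches(attrs map[string]string) bool {
	got, ok := attrs[c.Field]
	if !ok || len(c.Values) == 0 {
		return false
	}
	switch c.Operator {
	case "eq":
		return got == c.Values[0]
	case "neq":
		return got != c.Values[0]
	case "in":
		for _, v := range c.Values {
			if got == v {
				return true
			}
		}
	}
	return false
}

// firstMatch sorts by ascending priority and returns the first rule whose
// conditions all hold. A rule with no conditions is a catch-all.
func firstMatch(rules []Rule, attrs map[string]string) (Rule, bool) {
	sorted := append([]Rule(nil), rules...) // do not reorder the caller's slice
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Priority < sorted[j].Priority })
	for _, r := range sorted {
		all := true
		for _, c := range r.Conditions {
			if !c.matches(attrs) {
				all = false
				break
			}
		}
		if all {
			return r, true
		}
	}
	return Rule{}, false
}
```

A catch-all rule is typically given the highest priority number so that it is evaluated last and acts as the default route.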
Conventions
Go import ordering (goimports with local-prefixes: github.com/veylant/ia-gateway): three groups — stdlib · external · github.com/veylant/ia-gateway/internal/.... gen/ is excluded from all linters (generated code).
Commits: Conventional Commits (feat:, fix:, chore:) — used for automated changelog generation.
API versioning: /v1/ prefix, OpenAI-compatible format (/v1/chat/completions) so existing OpenAI SDK clients work without modification.
LLM Provider Adapters: Each provider implements provider.Adapter (Send(), Stream(), Validate(), HealthCheck()). Add new providers by implementing this interface in internal/provider/<name>/.
Error handling: Go modules use typed errors with errors.Wrap. The proxy always returns errors in OpenAI JSON format (type, message, code).
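For reference, an OpenAI-style error body nests the fields under an "error" key. The sketch below shows one way to produce it; encodeError and writeError are hypothetical names and are not the helpers in internal/apierror, whose signatures are not documented here.

```go
package main

import (
	"encoding/json"
	"net/http"
)

// apiError carries the three fields named above.
type apiError struct {
	Type    string `json:"type"`
	Message string `json:"message"`
	Code    string `json:"code"`
}

// errorEnvelope wraps the error under "error", as OpenAI's public API does.
type errorEnvelope struct {
	Error apiError `json:"error"`
}

// encodeError builds the JSON body for an error response.
func encodeError(errType, message, code string) ([]byte, error) {
	return json.Marshal(errorEnvelope{Error: apiError{Type: errType, Message: message, Code: code}})
}

// writeError sends the encoded error with the given HTTP status.
func writeError(w http.ResponseWriter, status int, errType, message, code string) {
	body, err := encodeError(errType, message, code)
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_, _ = w.Write(body)
}
```

Keeping this shape lets existing OpenAI SDK clients parse proxy errors the same way they parse upstream ones.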
Feature flags: PostgreSQL table (feature_flags) + in-memory fallback when DB is unavailable. No external service.
OpenAPI docs: Generated from swaggo annotations — never write API docs by hand.
Testing split: 70% unit (testing + testify / pytest) · 20% integration (testcontainers for PG/ClickHouse/Redis, lives in test/integration/, requires //go:build integration tag) · 10% E2E (Playwright for UI). Tests are written in parallel with each module, not deferred.
CI coverage thresholds: Go internal packages must maintain ≥80% coverage; Python PII service ≥75%. NER tests (test_ner.py) are excluded from CI because fr_core_news_lg (~600MB) is only available in the Docker build.
Custom Semgrep Rules (.semgrep.yml)
These are enforced in CI and represent project-specific guardrails:
- context.Background() in HTTP handlers → use r.Context() to propagate tenant context and cancellation.
- SQL string concatenation (db.QueryContext(ctx, query+var) or fmt.Sprintf) → use parameterized queries ($1, $2, ...).
- Sensitive fields in logs (zap.String("password"|"api_key"|"token"|"secret"|"Authorization"|"email"|"prompt", ...)) → use redaction helpers.
- Hardcoded API keys (string literals starting with sk-) → load from env or Vault.
- json.NewDecoder(r.Body).Decode() without http.MaxBytesReader → wrap body first.
- Python eval()/exec() on variables → never evaluate user-supplied data.
Security Patterns
- Zero Trust network, mTLS between services, TLS 1.3 externally
- All sensitive fields encrypted at application level (AES-256-GCM)
- API keys stored as SHA-256 hashes only; prefix kept for display (e.g. sk-vyl_ab12cd34)
- RBAC roles: admin, manager, user, auditor — per-model and per-department permissions. admin/manager have unrestricted model access; user is limited to rbac.user_allowed_models; auditor cannot call /v1/chat/completions by default.
- Audit-of-the-audit: all accesses to audit logs are themselves logged
- CI pipeline: Semgrep (SAST), Trivy (image scanning, CRITICAL/HIGH blocking), gitleaks (secret detection), OWASP ZAP DAST (non-blocking, main branch only)
- Release pipeline (v* tag push): multi-arch Docker image (amd64/arm64) → GHCR, Helm chart → GHCR OCI, GitHub Release with notes extracted from CHANGELOG.md
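The per-role model access rule from the RBAC item above can be sketched as a single check. canUseModel is a hypothetical function, not the one in internal/router/; userAllowed stands for the rbac.user_allowed_models setting, and per-department permissions and any override of the auditor default are left out.

```go
package main

// canUseModel reports whether a role may call /v1/chat/completions with the
// given model under the default policy described above.
func canUseModel(role, model string, userAllowed []string) bool {
	switch role {
	case "admin", "manager":
		return true // unrestricted model access
	case "user":
		for _, m := range userAllowed {
			if m == model {
				return true
			}
		}
		return false
	default:
		// auditor (by default) and unknown roles are denied
		return false
	}
}
```

Denying unknown roles by default keeps the check fail-closed if a new role is added in Keycloak before the proxy knows about it.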
MVP Scope (V1)
In scope: AI proxy, PII anonymization + pseudonymization, intelligent routing engine, audit logs, RBAC, React dashboard, GDPR Article 30 registry, AI Act risk classification, provider configuration wizard, integrated playground (prompt test with PII visualization).
Out of scope (V2+): ML anomaly detection, Shadow AI discovery, physical multi-tenant isolation, native SDKs, SIEM integrations.