# Veylant IA Proxy — Developer Integration Guide

Get up and running in under 30 minutes. The proxy is fully compatible with the OpenAI API — change one URL and your existing code works.

## Prerequisites

- Your Veylant IA proxy URL (e.g. `https://api.veylant.ai` or `http://localhost:8090` for local dev)
- A JWT token issued by your organisation's Keycloak instance

## 1. Change the base URL

### Python (openai SDK)

```python
from openai import OpenAI

client = OpenAI(
    api_key="your-jwt-token",  # pass your JWT as the API key
    base_url="https://api.veylant.ai/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarise the Q3 report."}],
)
print(response.choices[0].message.content)
```

### curl

```bash
curl -X POST https://api.veylant.ai/v1/chat/completions \
  -H "Authorization: Bearer $VEYLANT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

### Node.js (openai SDK)

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.VEYLANT_TOKEN,
  baseURL: 'https://api.veylant.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);
```

## 2. Authentication

Every request to `/v1/*` must include a `Bearer` JWT in the `Authorization` header:

```
Authorization: Bearer <your-jwt-token>
```

Tokens are issued by your organisation's Keycloak instance. Contact your admin to obtain one.

The token must contain the following claims:

- `tenant_id` — your organisation's identifier
- `user_id` — your user identifier
- `roles` — at least one of `admin`, `manager`, `user`, `auditor`

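
A token's claims can be sanity-checked client-side before you start sending requests. A minimal sketch using only the Python standard library — it decodes the payload without verifying the signature (verification remains the proxy's job), and the toy token below is an illustration, not a real Keycloak token:

```python
import base64
import json

def jwt_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature.

    Handy for checking that tenant_id, user_id and roles are present;
    never use this in place of real verification.
    """
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))

# Build a toy (unsigned) token just to exercise the helper:
header = base64.urlsafe_b64encode(json.dumps({"alg": "none"}).encode()).rstrip(b"=").decode()
payload = base64.urlsafe_b64encode(
    json.dumps({"tenant_id": "acme", "user_id": "u42", "roles": ["user"]}).encode()
).rstrip(b"=").decode()

claims = jwt_claims(f"{header}.{payload}.")
missing = {"tenant_id", "user_id", "roles"} - claims.keys()  # empty set when all required claims exist
```

If `missing` is non-empty, ask your admin to check the Keycloak client's mapper configuration before debugging anything else.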

## 3. Streaming

Streaming works identically to the OpenAI API — set `stream: true` (`stream=True` in the Python SDK):

```python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```

The proxy forwards SSE chunks from the upstream provider without buffering.


## 4. PII Anonymization (automatic)

PII anonymization is automatic and transparent. Before your prompt reaches the upstream provider:

1. Named entities (names, emails, phone numbers, IBANs, etc.) are detected
2. Each entity is replaced with a pseudonym (e.g. `Jean Dupont` becomes `[PERSON_1]`)
3. The upstream response is de-pseudonymized before being returned to you

You receive the original names back in the response — the upstream provider never sees them.

To disable PII anonymization for your tenant, ask your admin to run:

```
PUT /v1/admin/flags/pii_enabled {"enabled": false}
```

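
The round trip above can be pictured with a toy sketch. This is illustrative only: the proxy's real detector recognises entities itself rather than taking a fixed name list, and only the `[PERSON_1]`-style placeholder convention is taken from the behaviour described above:

```python
def pseudonymize(text: str, names: list[str]) -> tuple[str, dict[str, str]]:
    """Replace each known name with a [PERSON_n] placeholder (toy detector)."""
    mapping: dict[str, str] = {}
    for i, name in enumerate(names, start=1):
        placeholder = f"[PERSON_{i}]"
        mapping[placeholder] = name
        text = text.replace(name, placeholder)
    return text, mapping

def depseudonymize(text: str, mapping: dict[str, str]) -> str:
    """Restore the original names in the upstream response."""
    for placeholder, name in mapping.items():
        text = text.replace(placeholder, name)
    return text

# Outbound: the upstream only ever sees the placeholder.
masked, mapping = pseudonymize("Send the report to Jean Dupont.", ["Jean Dupont"])
# Inbound: the caller gets the original name back.
restored = depseudonymize(masked, mapping)
```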
## 5. Supported Models

The proxy routes to different providers based on model prefix:

| Model prefix | Provider |
|---|---|
| `gpt-*`, `o1-*`, `o3-*` | OpenAI |
| `claude-*` | Anthropic |
| `mistral-*`, `mixtral-*` | Mistral |
| `llama*`, `phi*`, `qwen*` | Ollama (self-hosted) |

Your admin may have configured custom routing rules that override this behaviour.

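
The table above amounts to a first-match prefix lookup. A hypothetical sketch — `route_model` is not part of the proxy's public API, and custom admin rules would take precedence over this default table:

```python
# Default routing table, taken from the table above.
ROUTES = [
    (("gpt-", "o1-", "o3-"), "openai"),
    (("claude-",), "anthropic"),
    (("mistral-", "mixtral-"), "mistral"),
    (("llama", "phi", "qwen"), "ollama"),
]

def route_model(model: str) -> str:
    """Pick the upstream provider for a model name by prefix (first match wins)."""
    for prefixes, provider in ROUTES:
        if model.startswith(prefixes):  # str.startswith accepts a tuple of prefixes
            return provider
    raise ValueError(f"no route for model {model!r}")
```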
## 6. Error Codes

All errors follow the OpenAI error format:

```json
{
  "error": {
    "type": "authentication_error",
    "message": "missing or invalid token",
    "code": null
  }
}
```

| HTTP status | Error type | Cause |
|---|---|---|
| `400` | `invalid_request_error` | Malformed JSON or missing required fields |
| `401` | `authentication_error` | Missing or expired JWT |
| `403` | `permission_error` | Model not allowed for your role (RBAC) |
| `429` | `rate_limit_error` | Too many requests — wait and retry |
| `502` | `upstream_error` | The upstream LLM provider returned an error |

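
Client code can fold the table above into a single exception type. A sketch — `ProxyError` and `raise_for_error` are hypothetical helpers assuming the error envelope shown above, with `retryable` reflecting the `429` and `502` rows:

```python
class ProxyError(Exception):
    """One exception for every non-2xx proxy response."""

    def __init__(self, status: int, error_type: str, message: str):
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.retryable = status in (429, 502)  # transient per the table above

def raise_for_error(status: int, body: dict) -> None:
    """Raise ProxyError for an error response; no-op on success."""
    if 200 <= status < 300:
        return
    err = body.get("error", {})
    raise ProxyError(status, err.get("type", "unknown"), err.get("message", ""))
```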
## 7. Rate Limits

Limits are configured per tenant. The default is 6,000 requests/minute with a burst of 1,000. Your admin can adjust this via `PUT /v1/admin/rate-limits/{tenant_id}`.

When you hit the limit you receive:

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 1
```

## 8. Health Check

Verify the proxy is reachable without authentication:

```bash
curl https://api.veylant.ai/healthz
# {"status":"ok"}
```

## 9. API Reference

Full interactive documentation is available at:

```
https://api.veylant.ai/docs
```

Or download the raw OpenAPI 3.1 spec:

```bash
curl https://api.veylant.ai/docs/openapi.yaml -o openapi.yaml
```