# Veylant IA Proxy — Developer Integration Guide
Get up and running in under 30 minutes. The proxy is fully compatible with the OpenAI API — change one URL and your existing code works.
## Prerequisites
- Your Veylant IA proxy URL (e.g. `https://api.veylant.ai` or `http://localhost:8090` for local dev)
- A JWT token issued by your organisation's Keycloak instance
## 1. Change the base URL
### Python (openai SDK)
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-jwt-token",  # pass your JWT as the API key
    base_url="https://api.veylant.ai/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarise the Q3 report."}],
)
print(response.choices[0].message.content)
```
### curl
```bash
curl -X POST https://api.veylant.ai/v1/chat/completions \
  -H "Authorization: Bearer $VEYLANT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
### Node.js (openai SDK)
```javascript
import OpenAI from 'openai';
const client = new OpenAI({
  apiKey: process.env.VEYLANT_TOKEN,
  baseURL: 'https://api.veylant.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);
```
## 2. Authentication
Every request to `/v1/*` must include a `Bearer` JWT in the `Authorization` header:
```
Authorization: Bearer <your-jwt-token>
```
Tokens are issued by your organisation's Keycloak instance. Contact your admin to obtain one.
The token must contain:
- `tenant_id` — your organisation's identifier
- `user_id` — your user identifier
- `roles` — at least one of `admin`, `manager`, `user`, `auditor`
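To check that a token you received actually carries these claims, you can decode its payload locally. A minimal sketch using only the standard library (this skips signature verification entirely, so it is for local inspection only, never for trusting a token):

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the
    signature -- useful for inspecting tenant_id/user_id/roles locally.
    Do NOT use this in place of server-side verification."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))
```

For example, `jwt_claims(token)["roles"]` shows which of the four roles your token carries before you attempt a request.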
## 3. Streaming
Streaming works identically to the OpenAI API — set `stream: true`:
```python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```
The proxy forwards SSE chunks from the upstream provider without buffering.
## 4. PII Anonymization (automatic)
PII anonymization is automatic and transparent. Before your prompt reaches the upstream provider:
1. Named entities (names, emails, phone numbers, IBAN, etc.) are detected
2. Entities are replaced with pseudonyms (e.g. `Jean Dupont` becomes `[PERSON_1]`)
3. The upstream response is de-pseudonymized before being returned to you
You receive the original names back in the response — the upstream never sees them.
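The round trip above can be sketched as follows. This is purely illustrative: the proxy's real detector uses NER and pattern matching (emails, phone numbers, IBAN), whereas this toy version matches a hard-coded name list just to show the mechanics:

```python
# Toy illustration of the pseudonymization round trip -- NOT the proxy's
# actual detector, which uses NER and pattern matching.
KNOWN_NAMES = ["Jean Dupont"]


def pseudonymize(text: str):
    """Replace known entities with placeholders; return masked text and
    the mapping needed to reverse it."""
    mapping = {}
    for i, name in enumerate(KNOWN_NAMES, start=1):
        if name in text:
            placeholder = f"[PERSON_{i}]"
            mapping[placeholder] = name
            text = text.replace(name, placeholder)
    return text, mapping


def depseudonymize(text: str, mapping: dict) -> str:
    """Restore the original entities in the upstream response."""
    for placeholder, name in mapping.items():
        text = text.replace(placeholder, name)
    return text
```

The proxy holds the mapping for the lifetime of the request, so the upstream provider only ever sees `[PERSON_1]`.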
To disable PII for your tenant, ask your admin to run:
```
PUT /v1/admin/flags/pii_enabled {"enabled": false}
```
## 5. Supported Models
The proxy routes to different providers based on model prefix:
| Model prefix | Provider |
|---|---|
| `gpt-*`, `o1-*`, `o3-*` | OpenAI |
| `claude-*` | Anthropic |
| `mistral-*`, `mixtral-*` | Mistral |
| `llama*`, `phi*`, `qwen*` | Ollama (self-hosted) |
Your admin may have configured custom routing rules that override this behaviour.
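The default routing table above amounts to a longest-prefix-free match on the model name. A sketch of that logic (illustrative only; it does not account for any custom rules your admin may have configured):

```python
# Default prefix routing from the table above (sketch, not proxy code).
ROUTES = [
    (("gpt-", "o1-", "o3-"), "openai"),
    (("claude-",), "anthropic"),
    (("mistral-", "mixtral-"), "mistral"),
    (("llama", "phi", "qwen"), "ollama"),
]


def provider_for(model: str) -> str:
    """Return the upstream provider for a model name per the default table."""
    for prefixes, provider in ROUTES:
        if model.startswith(prefixes):
            return provider
    raise ValueError(f"no route for model {model!r}")
```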
## 6. Error Codes
All errors follow the OpenAI error format:
```json
{
  "error": {
    "type": "authentication_error",
    "message": "missing or invalid token",
    "code": null
  }
}
```
| HTTP Status | Error type | Cause |
|---|---|---|
| `400` | `invalid_request_error` | Malformed JSON or missing required fields |
| `401` | `authentication_error` | Missing or expired JWT |
| `403` | `permission_error` | Model not allowed for your role (RBAC) |
| `429` | `rate_limit_error` | Too many requests — wait and retry |
| `502` | `upstream_error` | The upstream LLM provider returned an error |
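A useful way to read the table: two of these statuses are transient and worth retrying, the rest require a change on your side. A client-side sketch (illustrative only; not part of the proxy or any SDK):

```python
# Client-side dispatch on the proxy's status codes (table above).
def action_for(status: int) -> str:
    """Map an error status to the coarse action a caller should take."""
    if status in (429, 502):   # transient: rate limit or upstream failure
        return "retry"
    if status == 401:          # missing or expired JWT
        return "check-token"
    if status == 403:          # RBAC: model not allowed for your role
        return "check-access"
    return "fix-request"       # 400: fix the request body
```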
## 7. Rate Limits
Limits are configured per-tenant. The default is 6 000 requests/minute with a burst of 1 000. Your admin can adjust this via `PUT /v1/admin/rate-limits/{tenant_id}`.
When you hit the limit you receive:
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 1
```
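A polite client honours the `Retry-After` header rather than retrying immediately. A minimal sketch, where `do_request` is a placeholder for your actual HTTP call and is assumed to return `(status, headers, body)`:

```python
import time


def call_with_retry(do_request, max_attempts: int = 5):
    """Call `do_request` until it returns a non-429 status or attempts
    run out, sleeping for the server-suggested Retry-After on each 429.
    Sketch only -- `do_request` is a stand-in for your HTTP client call."""
    for _ in range(max_attempts):
        status, headers, body = do_request()
        if status != 429:
            return status, body
        time.sleep(float(headers.get("Retry-After", "1")))
    return status, body
```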
## 8. Health Check
Verify the proxy is reachable without authentication:
```bash
curl https://api.veylant.ai/healthz
# {"status":"ok"}
```
## 9. API Reference
Full interactive documentation is available at:
```
https://api.veylant.ai/docs
```
Or download the raw OpenAPI 3.1 spec:
```bash
curl https://api.veylant.ai/docs/openapi.yaml -o openapi.yaml
```