# Veylant IA Proxy — Developer Integration Guide
Get up and running in under 30 minutes. The proxy is fully compatible with the OpenAI API — change one URL and your existing code works.
## Prerequisites

- Your Veylant IA proxy URL (e.g. `https://api.veylant.ai`, or `http://localhost:8090` for local dev)
- A JWT token issued by your organisation's Keycloak instance
## 1. Change the base URL

### Python (openai SDK)

```python
from openai import OpenAI

client = OpenAI(
    api_key="your-jwt-token",  # pass your JWT as the API key
    base_url="https://api.veylant.ai/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarise the Q3 report."}],
)
print(response.choices[0].message.content)
```
### curl

```shell
curl -X POST https://api.veylant.ai/v1/chat/completions \
  -H "Authorization: Bearer $VEYLANT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
### Node.js (openai SDK)

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.VEYLANT_TOKEN,
  baseURL: 'https://api.veylant.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);
```
## 2. Authentication

Every request to `/v1/*` must include a Bearer JWT in the `Authorization` header:

```
Authorization: Bearer <your-jwt-token>
```

Tokens are issued by your organisation's Keycloak instance. Contact your admin to obtain one.

The token must contain the following claims:

- `tenant_id` — your organisation's identifier
- `user_id` — your user identifier
- `roles` — at least one of `admin`, `manager`, `user`, `auditor`
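If a request is rejected with a 401 or 403, it can help to check which claims your token actually carries. A JWT payload is base64url-encoded (not encrypted), so you can inspect it locally with the standard library alone. This sketch decodes the payload without verifying the signature, so use it only for debugging:

```python
import base64
import json

def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    payload = token.split(".")[1]
    # Restore the base64url padding that JWT encoding strips
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))

# Build a toy token (header.payload.signature) just to demonstrate decoding
claims = {"tenant_id": "acme", "user_id": "u42", "roles": ["user"]}
encoded = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("=")
token = f"eyJhbGciOiJSUzI1NiJ9.{encoded}.signature"

print(jwt_claims(token)["tenant_id"])  # acme
```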
## 3. Streaming

Streaming works identically to the OpenAI API — pass `stream=True` (`stream: true` in Node):

```python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```

The proxy forwards SSE chunks from the upstream provider without buffering.
## 4. PII Anonymization (automatic)

PII anonymization is automatic and transparent. Before your prompt reaches the upstream provider:

- Named entities (names, emails, phone numbers, IBANs, etc.) are detected
- Entities are replaced with pseudonyms (e.g. `Jean Dupont` becomes `[PERSON_1]`)
- The upstream response is de-pseudonymized before being returned to you

You receive the original names back in the response — the upstream provider never sees them.
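The detection pipeline itself is internal to the proxy, but the round trip is easy to picture. This toy sketch uses a hard-coded entity list (the real detector finds entities automatically) purely to illustrate the substitute-then-restore flow:

```python
def pseudonymize(text: str, entities: list[str]) -> tuple[str, dict]:
    """Replace each known entity with a [PERSON_n] placeholder."""
    mapping = {}
    for i, name in enumerate(entities, start=1):
        placeholder = f"[PERSON_{i}]"
        mapping[placeholder] = name
        text = text.replace(name, placeholder)
    return text, mapping

def depseudonymize(text: str, mapping: dict) -> str:
    """Restore the original entities in the upstream response."""
    for placeholder, name in mapping.items():
        text = text.replace(placeholder, name)
    return text

masked, mapping = pseudonymize("Email Jean Dupont about Q3.", ["Jean Dupont"])
print(masked)  # Email [PERSON_1] about Q3.

reply = "I drafted a note to [PERSON_1]."      # what the upstream might return
print(depseudonymize(reply, mapping))          # I drafted a note to Jean Dupont.
```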
To disable PII anonymization for your tenant, ask your admin to run:

```
PUT /v1/admin/flags/pii_enabled
{"enabled": false}
```
## 5. Supported Models

The proxy routes to different providers based on the model-name prefix:

| Model prefix | Provider |
|---|---|
| `gpt-*`, `o1-*`, `o3-*` | OpenAI |
| `claude-*` | Anthropic |
| `mistral-*`, `mixtral-*` | Mistral |
| `llama*`, `phi*`, `qwen*` | Ollama (self-hosted) |

Your admin may have configured custom routing rules that override this behaviour.
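The default rules above amount to a first-match wildcard lookup, which `fnmatch` expresses directly. This is an illustrative client-side sketch only; the proxy's actual router and any custom tenant rules live server-side:

```python
from fnmatch import fnmatch

# Default routing table from the section above; first match wins.
ROUTES = [
    (("gpt-*", "o1-*", "o3-*"), "openai"),
    (("claude-*",), "anthropic"),
    (("mistral-*", "mixtral-*"), "mistral"),
    (("llama*", "phi*", "qwen*"), "ollama"),
]

def route(model: str) -> str:
    """Return the provider a model name would route to by default."""
    for patterns, provider in ROUTES:
        if any(fnmatch(model, p) for p in patterns):
            return provider
    raise ValueError(f"no route for model {model!r}")

print(route("gpt-4o"))      # openai
print(route("qwen2.5:7b"))  # ollama
```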
## 6. Error Codes

All errors follow the OpenAI error format:

```json
{
  "error": {
    "type": "authentication_error",
    "message": "missing or invalid token",
    "code": null
  }
}
```

| HTTP status | Error type | Cause |
|---|---|---|
| 400 | `invalid_request_error` | Malformed JSON or missing required fields |
| 401 | `authentication_error` | Missing or expired JWT |
| 403 | `permission_error` | Model not allowed for your role (RBAC) |
| 429 | `rate_limit_error` | Too many requests — wait and retry |
| 502 | `upstream_error` | The upstream LLM provider returned an error |
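In practice only some of these statuses are worth retrying. A minimal client-side dispatch over the error envelope might look like this; the handling strategy per status is a suggestion, not proxy-mandated behaviour:

```python
def classify(status: int, body: dict) -> str:
    """Map a proxy error response to a coarse handling strategy."""
    err = body.get("error", {})
    if status == 429:
        return "retry"            # back off, honouring Retry-After
    if status == 502 and err.get("type") == "upstream_error":
        return "retry"            # transient provider failure
    if status in (401, 403):
        return "reauthenticate"   # refresh the JWT or check your RBAC role
    return "fail"                 # 400 etc.: fix the request itself

body = {"error": {"type": "rate_limit_error", "message": "slow down", "code": None}}
print(classify(429, body))  # retry
```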
## 7. Rate Limits

Limits are configured per tenant. The default is 6 000 requests/minute with a burst of 1 000. Your admin can adjust this via `PUT /v1/admin/rate-limits/{tenant_id}`.

When you hit the limit you receive:

```
HTTP/1.1 429 Too Many Requests
Retry-After: 1
```
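A client can honour `Retry-After` with a small retry wrapper. This sketch is HTTP-library-agnostic: it assumes your transport layer raises a hypothetical `RateLimited` exception carrying the header value on a 429 response:

```python
import time

class RateLimited(Exception):
    """Raised by the caller's HTTP layer on a 429 response."""
    def __init__(self, retry_after: float):
        self.retry_after = retry_after

def with_retries(call, max_attempts: int = 5):
    """Invoke call(), sleeping for Retry-After seconds after each 429."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            time.sleep(exc.retry_after)

# Simulated flow: fail twice with 429, then succeed.
attempts = {"n": 0}
def fake_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimited(retry_after=0.01)
    return "ok"

print(with_retries(fake_request))  # ok
```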
## 8. Health Check

Verify the proxy is reachable without authentication:

```shell
curl https://api.veylant.ai/healthz
# {"status":"ok"}
```
## 9. API Reference

Full interactive documentation is available at:

https://api.veylant.ai/docs

Or download the raw OpenAPI 3.1 spec:

```shell
curl https://api.veylant.ai/docs/openapi.yaml -o openapi.yaml
```