Chat completions
POST https://aiapiv2.pekpik.com/v1/chat/completions
OpenAI-compatible. Works with the official SDKs, LangChain, LlamaIndex, and any OpenAI-compatible client.
Request
```bash
curl https://aiapiv2.pekpik.com/v1/chat/completions \
  -H "Authorization: Bearer $PEKPIK_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user", "content": "Explain HTTP/2 in one sentence."}
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'
```

Common parameters
| Field | Description |
|---|---|
| `model` | Model ID (see Models). |
| `messages` | Array of `{role, content}`. Roles: `system`, `user`, `assistant`, `tool`. |
| `stream` | `true` to receive server-sent token chunks. |
| `temperature`, `top_p` | Sampling controls. |
| `max_tokens` | Output cap. |
| `tools`, `tool_choice` | Function / tool calling (where the model supports it). |
| `response_format` | Structured / JSON output (where supported). |
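
The fields in the table map directly onto keys of the JSON request body. A minimal sketch using only the Python standard library, assuming the key lives in the `PEKPIK_KEY` environment variable as in the curl example. `build_chat_request` and `send` are hypothetical helper names for illustration, not part of any SDK, and the placeholder key is made up.

```python
import json
import os
import urllib.request

API_URL = "https://aiapiv2.pekpik.com/v1/chat/completions"


def build_chat_request(api_key, model, messages, **params):
    """Build (but do not send) a POST request for the chat completions endpoint.

    Extra keyword arguments (temperature, max_tokens, stream, ...) are copied
    into the JSON body unchanged; arguments set to None are dropped.
    """
    body = {"model": model, "messages": messages}
    body.update({k: v for k, v in params.items() if v is not None})
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response (network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


# Same request as the curl example; the fallback key is a placeholder.
request = build_chat_request(
    os.environ.get("PEKPIK_KEY", "sk-placeholder"),
    "claude-opus-4-7",
    [
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Explain HTTP/2 in one sentence."},
    ],
    temperature=0.7,
    max_tokens=512,
)
# reply = send(request)  # uncomment to actually call the API
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network traffic happens.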
Feature support (vision, tools, structured output, streaming) mirrors the upstream model — nothing is stripped.
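
With `stream` set to `true`, OpenAI-compatible endpoints conventionally send each chunk as a server-sent-event line of the form `data: <json>` and finish with `data: [DONE]`. The sketch below assumes this gateway follows that convention; this page only says that chunks are server-sent. `collect_stream_text` is a hypothetical helper, and the sample lines are hand-written illustration data, not captured output.

```python
import json


def collect_stream_text(lines):
    """Concatenate the content deltas of the first choice from SSE lines.

    Assumes OpenAI-style chunks: each event is 'data: <json>' whose
    choices[0].delta may carry a 'content' fragment, and the stream
    ends with 'data: [DONE]'.
    """
    parts = []
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        fragment = (choices[0].get("delta") or {}).get("content")
        if fragment:
            parts.append(fragment)
    return "".join(parts)


# Illustrative, hand-written chunks (not captured from the real API).
sample = [
    b'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}\n',
    b"\n",
    b'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}\n',
    b'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}\n',
    b"data: [DONE]\n",
]
print(collect_stream_text(sample))
```

An HTTP response object returned by `urllib.request.urlopen` can be iterated line by line, so it could be passed to this function in place of `sample`.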
Response
```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "claude-opus-4-7",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 18,
    "total_tokens": 42
  }
}
```

- For very large `max_tokens`, see provider-specific limits; the gateway may clamp to a safe ceiling on some pooled channels.
- On a transient upstream failure the gateway automatically retries on another channel (see HA).
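
Reading a response mostly comes down to taking the first choice and checking why generation stopped. A minimal sketch against the response shape shown in this section; `summarize_completion` is a hypothetical helper. Treating `finish_reason == "length"` as a sign that the output hit the `max_tokens` cap (possibly a gateway-clamped one) follows the OpenAI convention and is an assumption here, not something this page states.

```python
import json


def summarize_completion(response):
    """Extract the reply text, a truncation flag, and token usage."""
    choice = response["choices"][0]
    usage = response.get("usage") or {}
    return {
        "text": choice["message"]["content"],
        # Assumption: "length" means the output cap was reached.
        "truncated": choice.get("finish_reason") == "length",
        "total_tokens": usage.get("total_tokens"),
    }


# The example response from this section, with its placeholders kept as-is.
example = json.loads("""
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "claude-opus-4-7",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 24, "completion_tokens": 18, "total_tokens": 42 }
}
""")

summary = summarize_completion(example)
print(summary)
```

If `truncated` comes back true for a request with a very large `max_tokens`, one plausible explanation is that the cap was lowered on the channel that served it.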