
Chat completions

POST https://aiapiv2.pekpik.com/v1/chat/completions

OpenAI-compatible. Works with the official SDKs, LangChain, LlamaIndex, and any OpenAI-compatible client.

curl https://aiapiv2.pekpik.com/v1/chat/completions \
  -H "Authorization: Bearer $PEKPIK_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user", "content": "Explain HTTP/2 in one sentence."}
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'
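
For readers who would rather not shell out to curl, the same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper names `build_request`, `send`, and `main` are hypothetical, and the key is assumed to live in the `PEKPIK_KEY` environment variable as in the curl example.

```python
import json
import os
import urllib.request

API_URL = "https://aiapiv2.pekpik.com/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl call."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body. Raises on HTTP errors."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same body as the curl example.
payload = {
    "model": "claude-opus-4-7",
    "messages": [
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Explain HTTP/2 in one sentence."},
    ],
    "temperature": 0.7,
    "max_tokens": 512,
}


def main() -> None:
    # Hypothetical entry point: reads the key from the environment and
    # prints the assistant's reply.
    reply = send(build_request(os.environ["PEKPIK_KEY"], payload))
    print(reply["choices"][0]["message"]["content"])
```

Calling `main()` performs the actual network request; `build_request` on its own is side-effect free, which makes the headers and body easy to inspect before anything is sent.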
| Field | Description |
| --- | --- |
| model | Model ID (see Models). |
| messages | Array of {role, content}. Roles: system, user, assistant, tool. |
| stream | true to receive server-sent token chunks. |
| temperature, top_p | Sampling controls. |
| max_tokens | Output cap. |
| tools, tool_choice | Function / tool calling (where the model supports it). |
| response_format | Structured / JSON output (where supported). |
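
With stream set to true, the reply arrives as server-sent events rather than a single JSON body. The sketch below shows one way to reassemble the text, assuming the gateway follows the usual OpenAI streaming convention: `data: {...}` lines carrying `choices[].delta.content`, terminated by `data: [DONE]`. That wire format is an assumption based on the endpoint being OpenAI-compatible and is not spelled out on this page. The function name `iter_stream_text` is hypothetical and the sample lines are hand-written for illustration, not captured gateway output.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample in the assumed format.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "HTTP/2 "}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "multiplexes streams."}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints: HTTP/2 multiplexes streams.
```

In a real client the `lines` argument would be the decoded lines of the HTTP response body, read incrementally so tokens can be shown as they arrive.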

Feature support (vision, tools, structured output, streaming) mirrors the upstream model — nothing is stripped.

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "claude-opus-4-7",
  "choices": [
    { "index": 0, "message": { "role": "assistant", "content": "..." }, "finish_reason": "stop" }
  ],
  "usage": { "prompt_tokens": 24, "completion_tokens": 18, "total_tokens": 42 }
}
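
As a small illustration of reading that response, the sketch below pulls out the assistant text, the finish reason, and the token usage. The helper name `summarize` is hypothetical; the field names are the ones shown in the example response, and the code assumes a non-streaming reply with at least one choice.

```python
import json


def summarize(response: dict) -> dict:
    """Extract the first choice's text, finish reason, and token usage."""
    choice = response["choices"][0]
    usage = response.get("usage") or {}
    return {
        "text": (choice.get("message") or {}).get("content"),
        "finish_reason": choice.get("finish_reason"),
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


# The example response from this page, with its placeholders left as-is.
raw = """
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "claude-opus-4-7",
  "choices": [
    { "index": 0, "message": { "role": "assistant", "content": "..." }, "finish_reason": "stop" }
  ],
  "usage": { "prompt_tokens": 24, "completion_tokens": 18, "total_tokens": 42 }
}
"""

summary = summarize(json.loads(raw))
print(summary["finish_reason"], summary["total_tokens"])  # prints: stop 42
```

In this example total_tokens is the sum of prompt_tokens and completion_tokens (24 + 18 = 42), which is handy when reconciling usage.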
  • For very large max_tokens values, check the provider-specific limits; on some pooled channels the gateway may clamp the value to a safe ceiling.
  • On a transient upstream failure the gateway automatically retries on another channel (see HA).
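
Because the gateway may clamp max_tokens, a client may want to notice when a reply was cut short. The sketch below is a heuristic under stated assumptions, not documented gateway behavior: it assumes truncated replies report finish_reason "length" (the usual OpenAI convention) and treats a truncated reply that used fewer completion tokens than requested as a hint that a lower ceiling was applied. The function names and the sample response are hypothetical.

```python
def was_truncated(response: dict) -> bool:
    """True if any choice stopped because it hit an output cap."""
    return any(
        choice.get("finish_reason") == "length"
        for choice in response.get("choices", [])
    )


def possibly_clamped(response: dict, requested_max_tokens: int) -> bool:
    """Heuristic: truncated, yet fewer completion tokens than were requested."""
    used = (response.get("usage") or {}).get("completion_tokens")
    if used is None:
        return False  # cannot tell without usage data
    return was_truncated(response) and used < requested_max_tokens


# Hand-written example: 100000 tokens requested, reply cut off at 4096.
example = {
    "choices": [{"index": 0, "finish_reason": "length"}],
    "usage": {"prompt_tokens": 24, "completion_tokens": 4096, "total_tokens": 4120},
}
print(was_truncated(example), possibly_clamped(example, 100000))  # prints: True True
```

Token counts can differ slightly between providers, so a result of True is best treated as a prompt to check the provider-specific limits rather than as proof of clamping.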