Chat completions

POST https://aiapiv2.pekpik.com/v1/chat/completions

OpenAI-compatible. Works with the official SDKs, LangChain, LlamaIndex, and any OpenAI-compatible client.

Request

curl https://aiapiv2.pekpik.com/v1/chat/completions \
  -H "Authorization: Bearer $PEKPIK_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user", "content": "Explain HTTP/2 in one sentence."}
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'
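The same request can be sketched in Python using only the standard library. `build_request` and `send` are hypothetical helper names introduced for illustration, and the 60-second timeout is an assumption; the URL, headers, and body mirror the curl example above.

```python
import json
import urllib.request

API_URL = "https://aiapiv2.pekpik.com/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body (timeout is an assumed value)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Same body as the curl example.
payload = {
    "model": "claude-opus-4-7",
    "messages": [
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Explain HTTP/2 in one sentence."},
    ],
    "temperature": 0.7,
    "max_tokens": 512,
}
```

To issue the call, pass the key from the environment, for example `send(build_request(os.environ["PEKPIK_KEY"], payload))` after `import os`.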

Common parameters

Field	Description
`model`	Model ID (see Models).
`messages`	Array of `{role, content}`. Roles: `system`, `user`, `assistant`, `tool`.
`stream`	`true` to stream the response as server-sent events (SSE), delivered as incremental chunks.
`temperature`, `top_p`	Sampling controls.
`max_tokens`	Maximum number of tokens to generate in the response.
`tools`, `tool_choice`	Function / tool calling (where the model supports it).
`response_format`	Structured / JSON output (where supported).

Feature support (vision, tools, structured output, streaming) mirrors the upstream model — nothing is stripped.
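With `stream` set to `true`, a client has to reassemble the reply from the chunks. The sketch below assumes the stream follows the usual OpenAI convention: each event is a `data: {json}` line whose `choices[].delta.content` carries a text fragment, and the stream ends with `data: [DONE]`. This page does not spell out that chunk shape, so treat it as an assumption. `iter_content` is a hypothetical helper, and the sample lines are hand-written for illustration, not captured gateway output.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines (assumed OpenAI-style chunks)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                yield piece


# Hand-written sample stream, for illustration only.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = "".join(iter_content(sample))
```

In a real client, the `lines` argument would be the decoded lines of the HTTP response body read incrementally.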

Response

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "claude-opus-4-7",
  "choices": [
    { "index": 0, "message": { "role": "assistant", "content": "..." }, "finish_reason": "stop" }
  ],
  "usage": { "prompt_tokens": 24, "completion_tokens": 18, "total_tokens": 42 }
}
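A minimal sketch of reading the fields shown above. `Completion` and `parse_completion` are hypothetical names, and the code only looks at the first choice; the sample body reuses the token counts from the example response, with a placeholder ID and text.

```python
import json
from dataclasses import dataclass


@dataclass
class Completion:
    text: str
    finish_reason: str
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


def parse_completion(body: str) -> Completion:
    """Extract the first choice and the usage counters from a response body."""
    data = json.loads(body)
    choice = data["choices"][0]
    usage = data.get("usage", {})
    return Completion(
        text=choice["message"]["content"],
        finish_reason=choice["finish_reason"],
        prompt_tokens=usage.get("prompt_tokens", 0),
        completion_tokens=usage.get("completion_tokens", 0),
        total_tokens=usage.get("total_tokens", 0),
    )


# Same shape as the example response; the id and content are placeholders.
sample_body = """
{
  "id": "chatcmpl-example",
  "object": "chat.completion",
  "model": "claude-opus-4-7",
  "choices": [
    { "index": 0, "message": { "role": "assistant", "content": "example text" }, "finish_reason": "stop" }
  ],
  "usage": { "prompt_tokens": 24, "completion_tokens": 18, "total_tokens": 42 }
}
"""
result = parse_completion(sample_body)
```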

Notes

For very large `max_tokens` values, check the provider-specific limits; on some pooled channels the gateway may clamp the value to a safe ceiling.
On a transient upstream failure, the gateway automatically retries the request on another channel (see HA).
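Because the gateway may clamp `max_tokens`, a client may want to notice when output stopped at a cap lower than the one it asked for. The sketch below assumes the OpenAI convention that `finish_reason` is `"length"` when generation stops at the token limit; whether the gateway reports clamped requests this way is an assumption, not something this page states. Both helper names and the sample numbers are hypothetical.

```python
def hit_output_cap(response: dict) -> bool:
    """True if any choice stopped because the token limit was reached
    (assumes the OpenAI-style "length" finish reason)."""
    return any(
        choice.get("finish_reason") == "length"
        for choice in response.get("choices", [])
    )


def possibly_clamped(response: dict, requested_max_tokens: int) -> bool:
    """True if output hit a cap while using fewer tokens than requested,
    which suggests a lower ceiling was applied somewhere upstream."""
    used = response.get("usage", {}).get("completion_tokens", 0)
    return hit_output_cap(response) and used < requested_max_tokens


# Hypothetical response: the client asked for 32000 tokens but output
# stopped at 4096 with finish_reason "length".
clamped_example = {
    "choices": [{"index": 0, "finish_reason": "length"}],
    "usage": {"completion_tokens": 4096},
}
normal_example = {
    "choices": [{"index": 0, "finish_reason": "stop"}],
    "usage": {"completion_tokens": 18},
}
```

A heuristic like this cannot tell a gateway clamp from a provider-side limit; it only flags responses worth checking against the provider-specific limits.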