Use case

Vector database RAG workflows need model routing

A vector database stores the retrieval layer, but the model layer still needs planning. Teams building RAG need to choose embedding models, decide when to re-index, pick generation models and route harder questions to stronger models. PEKPIK LLM helps keep those model decisions behind one OpenAI-compatible gateway.


Primary query

vector database RAG

Why teams search for this

Separate vector storage decisions from model access decisions.

Choose embedding models carefully before indexing large corpora.

Route generation models by answer complexity and business risk.

Keep fallback models available when retrieved context is ambiguous.
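
The routing ideas above can be sketched as a small decision function. This is a minimal illustration, not PEKPIK's routing logic: the `RoutingInput` fields, the thresholds and the `LOW_COST_MODEL` name are all hypothetical, and treating the stronger model as the fallback for ambiguous context is an assumption.

```python
from dataclasses import dataclass

# Model identifiers: the strong model is the one used in this page's example;
# the low-cost name is a placeholder, not a real catalog entry.
STRONG_MODEL = "claude-opus-4-7"
LOW_COST_MODEL = "low-cost-model"


@dataclass
class RoutingInput:
    question: str
    top_score: float  # best retrieval similarity score, assumed to lie in [0, 1]
    high_risk: bool   # e.g. legal, medical or billing documents


def choose_model(req: RoutingInput, min_score: float = 0.75,
                 max_simple_words: int = 25) -> str:
    """Pick a generation model from retrieval confidence, business risk and complexity."""
    # A weak top score suggests ambiguous context, so fall back to the stronger model.
    if req.top_score < min_score:
        return STRONG_MODEL
    # High-risk workflows always get the stronger model.
    if req.high_risk:
        return STRONG_MODEL
    # Word count is a crude stand-in for answer complexity.
    if len(req.question.split()) > max_simple_words:
        return STRONG_MODEL
    return LOW_COST_MODEL
```

In practice the complexity signal would likely come from a classifier or from evaluation data rather than word count; the point is that the routing decision lives in your code while model access stays behind one gateway.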

Where PEKPIK fits

Good fit

RAG stacks with existing vector storage but changing model needs.
Teams deciding whether to re-index with a new embedding model.
Products that need different generation models for different document workflows.

Check first

PEKPIK does not replace your vector database or retrieval layer.
Re-indexing can be operationally expensive and should be planned.
Grounding quality depends on retrieval filters, chunking and prompt design.

OpenAI-compatible example

base_url swap

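from openai import OpenAI

# Point the standard OpenAI client at the PEKPIK gateway instead of the default endpoint.
client = OpenAI(
    base_url="https://aiapiv2.pekpik.com/v1",
    api_key="sk-...",  # your gateway API key
)

# The request shape is unchanged; the model name selects the backing model.
response = client.chat.completions.create(
    model="claude-opus-4-7",
    messages=[{"role": "user", "content": "Summarize this for a product team."}],
)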
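
In a RAG workflow the user message usually carries retrieved context, not just the question. The helper below is one possible way to assemble that prompt; `build_messages` and its prompt wording are illustrative assumptions, and the chunks are expected to come from your own vector database.

```python
def build_messages(question: str, chunks: list[str]) -> list[dict]:
    """Assemble chat messages that ground the answer in retrieved chunks."""
    # Number each chunk so the model can refer to its sources.
    context = "\n\n".join(f"[{i}] {text}" for i, text in enumerate(chunks, start=1))
    system = (
        "Answer using only the numbered context. "
        "If the context does not contain the answer, say so."
    )
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

The returned list can be passed as `messages` to `client.chat.completions.create(...)` in place of the single user message in the example above.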

Suggested rollout

01. Document your vector dimensions, metadata filters and retrieval scoring.
02. Test any new embedding model on a small corpus first.
03. Evaluate answer quality with retrieved context included.
04. Route complex answers to stronger models and simple grounded answers to lower-cost options.
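
Testing a new embedding model on a small corpus can be as simple as comparing recall@k on a handful of labeled queries. The sketch below assumes the vectors were already produced by each model; the toy two-dimensional vectors, the ids and the `recall_at_k` helper are illustrative only.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_at_k(query_vecs: dict, doc_vecs: dict, relevant: dict, k: int = 1) -> float:
    """Fraction of queries whose labeled document id appears in the top-k results."""
    hits = 0
    for qid, qvec in query_vecs.items():
        # Rank all document ids by similarity to the query, best first.
        ranked = sorted(doc_vecs, key=lambda d: cosine(qvec, doc_vecs[d]), reverse=True)
        if relevant[qid] in ranked[:k]:
            hits += 1
    return hits / len(query_vecs)


# Toy vectors standing in for two embedding models run on the same small corpus.
docs_a = {"d1": [1.0, 0.0], "d2": [0.0, 1.0]}
queries_a = {"q1": [0.9, 0.1], "q2": [0.8, 0.2]}  # model A places q2 near the wrong doc
docs_b = {"d1": [1.0, 0.0], "d2": [0.0, 1.0]}
queries_b = {"q1": [0.9, 0.1], "q2": [0.1, 0.9]}
relevant = {"q1": "d1", "q2": "d2"}

recall_a = recall_at_k(queries_a, docs_a, relevant)
recall_b = recall_at_k(queries_b, docs_b, relevant)
```

A real evaluation would use your production retrieval scoring and metadata filters, since those affect which chunks reach the generation model.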

FAQ

Does PEKPIK store my vectors?

No. PEKPIK is the model gateway. Your vector database or search system stores vectors and handles retrieval.

When should I re-index a RAG corpus?

Re-index when a new embedding model measurably improves retrieval quality enough to justify the migration cost.
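
One way to make that rule explicit is a small gate that compares the measured gain with a minimum bar and the migration cost with a budget. The function name, the 0.05 default and the cost units are hypothetical; pick values that match your own evaluation and infrastructure.

```python
def should_reindex(current_recall: float, candidate_recall: float,
                   migration_cost: float, budget: float,
                   min_gain: float = 0.05) -> bool:
    """Return True when the measured gain clears the bar and the migration fits the budget."""
    gain = candidate_recall - current_recall
    return gain >= min_gain and migration_cost <= budget
```

For example, `should_reindex(0.80, 0.90, migration_cost=400, budget=500)` approves the migration, while a gain from 0.80 to 0.82 at the same cost does not.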