Use case
Vector database RAG workflows need model routing
A vector database handles the retrieval layer, but the model layer still needs planning. Teams building RAG need to choose embedding models, decide when to re-index, pick generation models and route harder questions to stronger models. PEKPIK LLM helps keep those model decisions behind one OpenAI-compatible gateway.
Why teams search for this
Where PEKPIK fits
Good fit
- RAG stacks with existing vector storage but changing model needs.
- Teams deciding whether to re-index with a new embedding model.
- Products that need different generation models for different document workflows.
Check first
- PEKPIK does not replace your vector database or retrieval layer.
- Re-indexing can be operationally expensive and should be planned.
- Grounding quality depends on retrieval filters, chunking and prompt design.
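To illustrate the prompt-design point, here is a minimal sketch of how retrieved chunks might be assembled into a grounded chat request. The `Chunk` type, the `build_grounded_messages` helper, the `max_chunks` limit and the instruction wording are all hypothetical choices for this example, not part of PEKPIK or of any vector database API.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """A retrieved passage (hypothetical shape for this sketch)."""
    source: str  # e.g. a file name or document ID from your metadata
    text: str


def build_grounded_messages(question: str, chunks: list[Chunk],
                            max_chunks: int = 4) -> list[dict]:
    """Build chat messages that keep the answer tied to retrieved context."""
    # Keep only the top-ranked chunks; the input is assumed to be sorted by score.
    selected = chunks[:max_chunks]
    # Number each passage so the model can cite it.
    context = "\n\n".join(
        f"[{i}] ({c.source}) {c.text}" for i, c in enumerate(selected, start=1)
    )
    system = (
        "Answer using only the numbered context passages. "
        "Cite passage numbers, and say so if the context does not contain the answer."
    )
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

The returned list has the same shape as the `messages` argument of an OpenAI-compatible chat completion call, so it can be passed through the gateway unchanged.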
OpenAI-compatible example
base_url swap
from openai import OpenAI
client = OpenAI(
    base_url="https://aiapiv2.pekpik.com/v1",
    api_key="sk-...",
)
response = client.chat.completions.create(
    model="claude-opus-4-7",
    messages=[{"role": "user", "content": "Summarize this for a product team."}],
)
Suggested rollout
- 01 Document your vector dimensions, metadata filters and retrieval scoring.
- 02 Test any new embedding model on a small corpus first.
- 03 Evaluate answer quality with retrieved context included.
- 04 Route complex answers to stronger models and simple grounded answers to lower-cost options.
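The routing step could be sketched roughly as follows. This is a simple heuristic under stated assumptions, not a PEKPIK feature: `route_question`, the keyword list and the thresholds are hypothetical, and `example-small-model` is a placeholder name rather than a real model ID. Only `claude-opus-4-7` comes from the example on this page.

```python
from dataclasses import dataclass

STRONG_MODEL = "claude-opus-4-7"        # model name used in the example on this page
LOW_COST_MODEL = "example-small-model"  # hypothetical placeholder, not a real model ID

# Words that loosely suggest the question needs multi-step reasoning (assumption).
REASONING_HINTS = ("compare", "why", "trade-off", "explain", "step by step")


@dataclass
class RoutingDecision:
    model: str
    reason: str


def route_question(question: str, retrieved_chunks: list[str],
                   max_simple_words: int = 25,
                   max_simple_chunks: int = 3) -> RoutingDecision:
    """Pick a model name from simple signals about the question and its context."""
    text = question.lower()
    if any(hint in text for hint in REASONING_HINTS):
        return RoutingDecision(STRONG_MODEL, "question asks for reasoning")
    if len(question.split()) > max_simple_words:
        return RoutingDecision(STRONG_MODEL, "long question")
    if len(retrieved_chunks) > max_simple_chunks:
        return RoutingDecision(STRONG_MODEL, "many passages to synthesize")
    return RoutingDecision(LOW_COST_MODEL, "short question with little context")
```

The chosen `model` string would then be passed as the `model` argument of the chat completion call. In practice the thresholds should come from your own answer-quality evaluation rather than from the defaults shown here.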
FAQ
Does PEKPIK store my vectors?
No. PEKPIK is the model gateway. Your vector database or search system stores vectors and handles retrieval.
When should I re-index a RAG corpus?
Re-index when a new embedding model measurably improves retrieval quality enough to justify the migration cost.
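One way to make "measurably improves" concrete is to compare recall@k for the current and candidate embedding models on a small labeled query set. The sketch below assumes you already have, for each test query, the ranked document IDs each model retrieved and the IDs judged relevant. The function names and the `min_gain` threshold are hypothetical, and the threshold stands in for a migration-cost judgment that only your team can make.

```python
def recall_at_k(retrieved_ids: list[str], relevant_ids: list[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)


def mean_recall_at_k(runs: list[tuple[list[str], list[str]]], k: int) -> float:
    """Average recall@k over (retrieved_ids, relevant_ids) pairs, one per query."""
    return sum(recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs) / len(runs)


def should_reindex(current_runs, candidate_runs, k: int = 5,
                   min_gain: float = 0.05) -> tuple[bool, float]:
    """Return (decision, gain), where gain is the candidate's recall@k improvement."""
    gain = mean_recall_at_k(candidate_runs, k) - mean_recall_at_k(current_runs, k)
    return gain >= min_gain, gain


# Toy data: two queries, ranked results from each embedding model.
current_runs = [(["a", "b", "c"], ["a", "z"]), (["d", "e", "f"], ["f"])]
candidate_runs = [(["a", "z", "c"], ["a", "z"]), (["f", "e", "d"], ["f"])]
decision, gain = should_reindex(current_runs, candidate_runs, k=3)
```

Recall@k measures retrieval only, so it is worth pairing with an answer-quality check on the generated responses before committing to a full re-index.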