Rate limits & TPM
TPM / RPM
Each key is provisioned with a tokens-per-minute (TPM) and requests-per-minute (RPM) allowance based on your forecast. Standard ranges run from 100 to 5,000 RPM; higher is available on request.
Because every model is backed by many upstream keys, your effective throughput is the sum across channels, so you can burst higher than any single official provider tier allows.
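To stay under a provisioned allowance, a client can pace itself before sending. The following is a minimal sketch, not part of the service: it assumes you know your key's RPM and TPM figures and can estimate the token count of each request. The class name `MinuteBudget` and its methods are hypothetical names chosen for illustration.

```python
import time
from collections import deque


class MinuteBudget:
    """Sliding 60-second window over request count and token count.

    Hypothetical client-side helper; the rpm and tpm values are whatever
    allowance was provisioned for your key.
    """

    def __init__(self, rpm, tpm, clock=time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock
        self.events = deque()      # (timestamp, tokens) per admitted request
        self.tokens_in_window = 0

    def _expire(self, now):
        # Drop entries that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Return True and record the request if it fits both budgets."""
        now = self.clock()
        self._expire(now)
        if len(self.events) + 1 > self.rpm:
            return False           # would exceed requests per minute
        if self.tokens_in_window + tokens > self.tpm:
            return False           # would exceed tokens per minute
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

When `try_acquire` returns `False`, the caller can wait briefly and try again instead of sending a request that is likely to be rejected.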
Multi-channel high availability
- Every model is fronted by a pool of upstream keys.
- Each channel is health-checked continuously; failing keys are quarantined within seconds.
- Requests automatically fail over to a healthy channel, so a single bad key never breaks your traffic.
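The failover behaviour described above can be pictured with a toy model. This is an illustration only and not the service's implementation: it quarantines a channel passively after a failed call, whereas the list above describes continuous health checks. The `ChannelPool` class, the `quarantine_seconds` value, and the callable-per-channel shape are all assumptions.

```python
import time


class ChannelPool:
    """Toy model: route each request to the first healthy channel."""

    def __init__(self, channels, quarantine_seconds=30.0, clock=time.monotonic):
        self.channels = list(channels)       # callables: request -> response
        self.quarantine_seconds = quarantine_seconds
        self.clock = clock
        self.quarantined_until = {}          # channel index -> release time

    def healthy(self):
        """Indexes of channels that are not currently quarantined."""
        now = self.clock()
        return [
            i for i in range(len(self.channels))
            if self.quarantined_until.get(i, float("-inf")) <= now
        ]

    def send(self, request):
        for i in self.healthy():
            try:
                return self.channels[i](request)
            except Exception:
                # Quarantine the failing channel and fail over to the next.
                self.quarantined_until[i] = self.clock() + self.quarantine_seconds
        raise RuntimeError("no healthy channel available")
```

In this model a failing channel costs the caller one extra attempt, after which it is skipped until its quarantine expires.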
Handling limits
- On 429, back off and retry with jitter.
- If you consistently hit limits, contact us to raise your allowance; we size to your workload.
- Avoid sending a single request with an extreme max_tokens; some pooled channels clamp to a safe ceiling.
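Backing off with jitter on a 429 can be sketched as follows. This is one common approach (exponential backoff with "full jitter"), not a retry policy prescribed by the service. The `RateLimited` exception, the `send` callable, and the `base`, `cap`, and `max_attempts` defaults are assumptions for illustration; map your HTTP client's 429 response onto `RateLimited` yourself.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller's send function when the API answers 429."""


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it succeeds or the attempts run out."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise              # out of attempts, surface the 429
            sleep(backoff_delay(attempt))
```

Randomising the delay keeps many clients that were limited at the same moment from retrying in lockstep; the cap bounds the worst-case wait.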