Rate Limits and Output Token Limits
This page describes the current rate limits and output token limits for Lazarus' hosted API.
Rate Limits
To ensure fair usage and platform stability, we enforce rate limits at the organization (orgID) level. These limits apply only to our hosted API.
Per organization:
- Bulk API calls (Bulk RikAI 2, Bulk RikAI2-Extract, Bulk RikY2):
  - Limit: 80 API calls per 5 minutes
- Total API calls (all Lazarus APIs):
  - Limit: 70 API calls per 10 seconds
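One way to stay under these limits is to throttle on the client before sending. The sketch below is illustrative only: the class and function names (SlidingWindowLimiter, acquire) are hypothetical and not part of any Lazarus SDK, and it assumes a rolling window, since this page does not say how the server measures its windows. A rolling window on the client is the conservative choice, because it also keeps every fixed window under the limit.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any rolling window of window_seconds."""

    def __init__(self, max_calls, window_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self.calls = deque()  # timestamps of calls still inside the window

    def wait_time(self):
        """Return seconds to wait before the next call (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_seconds:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        # The oldest call must leave the window before another is allowed.
        return self.window_seconds - (now - self.calls[0])

    def record(self):
        """Record that a call is being sent now."""
        self.calls.append(self.clock())


# Limits taken from this page: 70 calls / 10 s overall, 80 bulk calls / 5 min.
total_limiter = SlidingWindowLimiter(max_calls=70, window_seconds=10)
bulk_limiter = SlidingWindowLimiter(max_calls=80, window_seconds=300)


def acquire(is_bulk, sleep=time.sleep):
    """Block until a call is allowed, then count it against each limit.

    Bulk calls count toward both the bulk limit and the total limit.
    """
    limiters = [total_limiter] + ([bulk_limiter] if is_bulk else [])
    while True:
        delay = max(limiter.wait_time() for limiter in limiters)
        if delay <= 0:
            break
        sleep(delay)
    for limiter in limiters:
        limiter.record()
```

Call acquire(is_bulk=True) before each bulk request and acquire(is_bulk=False) before any other request. Because the limits are enforced per organization, a limiter inside one process only helps if that process is the organization's sole client; several clients sharing an orgID would need to coordinate or split the budget between them.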
If you exceed these limits, your API calls will receive a 429 Too Many Requests response. We recommend implementing exponential backoff with retries in your client logic to gracefully handle these responses.
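A minimal sketch of exponential backoff with retries might look like the following. It is written against a hypothetical send callable that returns a (status_code, body) pair, so it does not depend on any particular HTTP library. The retry count, base delay, delay cap, and the use of random jitter are assumptions for illustration, not values recommended by Lazarus.

```python
import random
import time


def call_with_backoff(send, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, rand=random.random):
    """Call send() until it returns a non-429 status or retries run out.

    send is a hypothetical zero-argument callable returning
    (status_code, body); wrap your real request in it.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break  # out of retries; return the last 429 to the caller
        # Delay ceiling doubles each attempt: base, 2*base, 4*base, ... capped.
        ceiling = min(max_delay, base_delay * (2 ** attempt))
        # Jitter: sleep a random fraction of the ceiling so that many
        # clients retrying at once do not all hit the API at the same moment.
        sleep(ceiling * rand())
    return status, body
```

For example, call_with_backoff(lambda: my_request()) retries my_request (a placeholder for your own function) up to five times on 429 and returns the first non-429 result. If every attempt is rate limited, the final 429 is returned so the caller can decide whether to fail or queue the work for later.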
Output Token Limits
By default, the hosted API has a maximum of 4K output tokens per response. For Lazarus' container deployments, the output token limit is adjustable.
If your request generates more tokens than this limit allows, the response will be truncated.
Please contact us at support@lazarusai.com if you have any questions or requests regarding rate limits or output token limits.