Version: 2025-07-24

Rate Limits and Output Token Limits

This page outlines the current rate limiting policies and output token constraints for Lazarus' hosted API.


Rate Limits

To ensure fair usage and platform stability, we enforce rate limits at the organization (orgID) level. These limits apply only to our hosted API.

Per organization:

If you exceed these limits, your API calls will receive a 429 Too Many Requests response. We recommend implementing exponential backoff with retries in your client logic to gracefully handle these responses.
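
The retry recommendation above can be sketched as follows. This is a minimal illustration, not an official client: `call_with_backoff`, its parameters, and the `send` callable (which stands in for one API request and returns a status code and body) are hypothetical names chosen for the example, and the default delays are assumptions rather than documented values.

```python
import random
import time
from typing import Callable, Tuple

TOO_MANY_REQUESTS = 429


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() and retry with exponential backoff while it returns 429.

    `send` is a hypothetical zero-argument callable that performs one API
    request and returns (status_code, body). The delay values are
    illustrative assumptions, not limits published by the API.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != TOO_MANY_REQUESTS:
            # Success or a non-rate-limit error: hand it back to the caller.
            return status, body
        if attempt == max_retries:
            break
        # The upper bound doubles on each attempt and is capped at max_delay.
        # A random wait below that bound ("full jitter") spreads out retries
        # from clients that were throttled at the same moment.
        upper = min(max_delay, base_delay * 2 ** attempt)
        sleep(random.uniform(0, upper))
    # Retries exhausted: return the last 429 so the caller can decide.
    return status, body
```

In practice, `send` would wrap whatever HTTP call your client already makes. The `sleep` parameter exists only so the waiting behavior can be replaced in tests; production code can leave it at its default.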


Output Token Limits

By default, the hosted API has a maximum of 4K output tokens per response. For Lazarus' container deployments, the output token limit is adjustable.

If the response to your request would exceed this limit, it will be truncated.


Please contact us at support@lazarusai.com if you have any questions or requests regarding rate limits or output token limits.