Rate limiting and throttling
Understand rate limits, avoid throttling, and optimize high-volume workloads.
What is rate limiting?
Rate limits control how many requests you can make in a time period. They prevent abuse and ensure fair resource allocation.
Runtype rate limits
Runtype enforces limits on Flow executions:
| Plan | Execution pool | Burst rate (10s) | Enforcement |
|---|---|---|---|
| Build (free) | 50/day | - | Full speed → slow mode |
| Startup | 15,000/month | 50 requests | Soft cap + overage billing |
| Pro | 60,000/month | 200 requests | Soft cap + overage billing |
| Team | 300,000/month | 500 requests | Soft cap + overage billing |
| Enterprise | Unlimited | Unlimited | - |
Burst rate is the maximum requests per 10-second window. Exceeding it returns 429 until the window resets.
Paid tiers (Startup, Pro, Team) use a soft cap: you can continue running after the monthly pool is used, and overage is billed. So 429 errors on paid tiers typically come from exceeding the burst rate, not from exhausting your monthly pool.
These limits apply to Flow executions, not to AI model calls (which have their own provider-specific limits).
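A client can stay under the burst rate by tracking its own request timestamps. The sketch below is one way to do that, not part of any Runtype SDK: `BurstLimiter` and its methods are hypothetical names. It uses a sliding window, which is deliberately stricter than a fixed 10-second window, so it should not exceed the limit regardless of how the server aligns its windows.

```python
import time
from collections import deque


class BurstLimiter:
    """Client-side guard that keeps calls under a per-window request budget.

    Hypothetical helper for illustration; not a Runtype API.
    """

    def __init__(self, max_requests, window_seconds=10.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self.sent = deque()  # timestamps of requests still inside the window

    def wait_time(self):
        """Return how many seconds to wait before the next request fits."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        # The budget frees up when the oldest request leaves the window.
        return self.window_seconds - (now - self.sent[0])

    def record(self):
        """Note that a request was just sent."""
        self.sent.append(self.clock())
```

In use, you would call `wait_time()` before each request, sleep for the returned duration, send the request, and then call `record()`. For the Pro plan in the table above, that would be `BurstLimiter(200)`.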
Slow mode (Build tier)
Build tier users get 50 Flow executions per day at full speed. After that, you enter slow mode:
Each request is delayed 10 seconds
You can make up to 10 requests per hour
Requests are not blocked entirely, so demos and light use can continue
If you exceed the hourly limit in slow mode, the API returns 429 with code SLOW_MODE_HOURLY_LIMIT_EXCEEDED and a Retry-After header. Upgrading to a paid plan removes slow mode and gives you a monthly pool and higher burst rate.
AI model provider limits
When using platform keys:
Shared across Runtype users
May be throttled during peak usage
Best for development, low-to-moderate production volume
When using BYOK:
Your provider's rate limits apply
Dedicated to your account
Best for high-volume production or when you need guaranteed response times
See provider dashboards for specific limits
Rate limit errors
Runtype API: HTTP 429 Too Many Requests with a JSON body including:
error: Description of the limit hit
code: Machine-readable code like RATE_LIMIT_EXCEEDED, DAILY_EXECUTION_LIMIT_EXCEEDED, SLOW_MODE_HOURLY_LIMIT_EXCEEDED, or MONTHLY_EXECUTION_SOFT_CAP_REACHED
Retry-After header: Seconds until retry is allowed
Clients should use the code field and Retry-After header for retry and backoff logic.
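As an illustration, the function below pulls the `code` field and the `Retry-After` value out of a 429 response. It is a minimal sketch: `parse_rate_limit_response` is a hypothetical helper, and it assumes you already have the status, a dict of headers, and the raw body text from whichever HTTP client you use.

```python
import json


def parse_rate_limit_response(status, headers, body_text):
    """Return (code, retry_after_seconds) for a 429 response, else None.

    Hypothetical helper; the field names follow the list above.
    """
    if status != 429:
        return None

    # The body is expected to be JSON with "error" and "code" fields.
    try:
        body = json.loads(body_text)
    except ValueError:
        body = {}
    code = body.get("code") if isinstance(body, dict) else None

    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        retry_after = float(lowered.get("retry-after", ""))
    except (TypeError, ValueError):
        retry_after = None  # header missing or not a number of seconds

    return code, retry_after
```

Returning `None` for the delay when the header is missing lets the caller fall back to its own backoff policy instead of guessing.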
AI providers: A RateLimitError exception or an HTTP 429 response, depending on the provider and SDK
Avoiding rate limits
Use BYOK
Switch to your own API keys for dedicated rate limits:
Go to Settings → Models
Add provider keys (OpenAI, Anthropic, etc.)
Disable platform keys
Implement backoff and retry
Add delay steps and retry logic:
Add conditional check for rate limit errors
If rate limited: Delay step (use the Retry-After header when present)
Retry the request
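If you call the API from your own code rather than from a Flow, the same check, delay, and retry steps could look like the sketch below. The names are hypothetical: `send` stands for any callable that performs the request and returns `(status, headers, body)`. When `Retry-After` is absent, the sketch falls back to exponential backoff with random jitter, which is a common choice and not something this article prescribes.

```python
import random
import time


def retry_delay(headers, attempt, base_delay=1.0, max_delay=60.0):
    """Use Retry-After when it is usable, else exponential backoff with jitter."""
    try:
        return max(0.0, float(headers.get("Retry-After", "")))
    except (TypeError, ValueError):
        pass
    ceiling = min(max_delay, base_delay * 2 ** attempt)
    return random.uniform(0, ceiling)


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable returning (status, headers, body).
    The last response is returned even if it is still a 429.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        status, headers, _body = response
        if status != 429 or attempt == max_attempts - 1:
            return response
        sleep(retry_delay(headers, attempt))
```

Capping the number of attempts keeps a persistent limit, such as an exhausted daily pool, from turning into an endless retry loop.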
Reduce request frequency
Batch operations instead of individual requests
Cache API responses in Records
Use scheduled Flows during off-peak hours
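To show the caching idea in isolation, here is a small time-based cache. It is an in-memory stand-in and does not use Runtype Records; `TTLCache` and `get_or_fetch` are hypothetical names, and the expiry time is something you would choose based on how fresh the data needs to be.

```python
import time


class TTLCache:
    """Reuse a fetched value until it is older than ttl_seconds.

    In-memory illustration only; a persistent store would replace the dict.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (stored_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only on a miss."""
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]
        value = fetch()  # this is the call that counts against rate limits
        self.entries[key] = (now, value)
        return value
```

Every cache hit is one request that never reaches the API, so repeated lookups of the same key cost a single request per expiry period.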
Spread load
For large batches:
Lower concurrency settings
Add delay steps between API calls
Split batch into multiple smaller batches
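When you drive a large batch from your own script, splitting and pausing can be sketched as follows. Both function names are hypothetical, `process_batch` stands for whatever sends one chunk of work, and the batch size and pause length are values you would tune against your plan's burst rate.

```python
import time


def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(items, process_batch, batch_size, pause_seconds, sleep=time.sleep):
    """Process items in smaller batches, pausing between consecutive batches.

    process_batch is a hypothetical callable that takes a list and
    returns a list of results.
    """
    results = []
    for index, batch in enumerate(chunked(items, batch_size)):
        if index > 0:
            sleep(pause_seconds)  # spread load instead of sending everything at once
        results.extend(process_batch(batch))
    return results
```

For example, `run_in_batches(rows, send_rows, batch_size=40, pause_seconds=10)` would send at most 40 items per 10-second pause, where `rows` and `send_rows` are your own data and sender.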
Monitoring rate limit usage
API response headers
Burst rate limit (paid tiers):
RateLimit-Limit: 200
RateLimit-Remaining: 142
RateLimit-Reset: 7
RateLimit-Limit: Maximum requests in the 10-second window
RateLimit-Remaining: Requests left in current window
RateLimit-Reset: Seconds until the current 10-second window resets (0–10)
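A client can use these headers to pace itself before it ever receives a 429. The sketch below is one possible policy, not a documented recommendation: `pacing_delay` is a hypothetical helper that spreads the remaining budget evenly over the rest of the window, and it assumes the headers arrive as a dict keyed with the capitalisation shown above.

```python
def pacing_delay(headers):
    """Suggest seconds to wait before the next request, from burst headers.

    Returns 0.0 when the headers are absent (for example on plans
    without a burst limit).
    """
    if "RateLimit-Remaining" not in headers or "RateLimit-Reset" not in headers:
        return 0.0
    remaining = int(headers["RateLimit-Remaining"])
    reset = float(headers["RateLimit-Reset"])
    if remaining <= 0:
        return reset  # window exhausted: wait until it resets
    # Spread what is left of the budget evenly over the rest of the window.
    return reset / remaining
```

With the sample values above (142 remaining, 7 seconds to reset), the suggested gap is roughly 0.05 seconds between requests.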
Execution quota:
X-Quota-Limit: 60000
X-Quota-Remaining: 42153
X-Quota-Reset: 345600
X-Quota-Policy: monthly
X-Quota-Limit: Total executions in your pool
X-Quota-Remaining: Executions left in current period
X-Quota-Reset: Seconds until daily/monthly period resets
X-Quota-Policy: daily, monthly, or unlimited
X-Quota-Warning: Present when you've used ≥80% of your quota
X-Quota-Overage: Present when you've exceeded your soft cap
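The quota headers are enough to compute how much of the pool is used and how long until it resets. The following is a minimal sketch with a hypothetical `quota_status` helper. It assumes header values are strings in a plain dict, and it makes no assumption about what `X-Quota-Limit` contains when the policy is `unlimited`, so it skips the arithmetic in that case.

```python
def quota_status(headers):
    """Summarise the X-Quota-* headers from one response."""
    policy = headers.get("X-Quota-Policy")
    status = {
        "policy": policy,
        "used_fraction": None,
        "days_until_reset": None,
        # These two headers signal by presence, so only check for the key.
        "warning": "X-Quota-Warning" in headers,
        "overage": "X-Quota-Overage" in headers,
    }
    if policy == "unlimited":
        return status

    limit = int(headers["X-Quota-Limit"])
    remaining = int(headers["X-Quota-Remaining"])
    if limit > 0:
        status["used_fraction"] = (limit - remaining) / limit
    status["days_until_reset"] = int(headers["X-Quota-Reset"]) / 86400
    return status
```

For the sample headers above, 17,847 of 60,000 executions are used (about 30%), and 345,600 seconds is exactly 4 days until the period resets.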
Build tier slow mode:
X-Slow-Mode: true
X-Slow-Mode-Remaining: 7
X-Slow-Mode-Reset: 2340
X-Slow-Mode: Indicates slow mode is active
X-Slow-Mode-Remaining: Requests left in current hour
X-Slow-Mode-Reset: Seconds until hourly limit resets
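A script running on the Build tier could read these headers to decide whether to keep going or wait out the hour. The helper below is a hypothetical sketch under the same assumption as before: headers arrive as a dict of strings with the capitalisation shown above.

```python
def slow_mode_wait(headers):
    """Return seconds to wait before the next request while in slow mode.

    Returns 0.0 when slow mode is not active or hourly requests remain.
    """
    if str(headers.get("X-Slow-Mode", "")).lower() != "true":
        return 0.0
    remaining = int(headers.get("X-Slow-Mode-Remaining", "0"))
    if remaining > 0:
        return 0.0  # still within the hourly allowance
    # Hourly allowance used up: wait until the hourly limit resets.
    return float(headers.get("X-Slow-Mode-Reset", "0"))
```

With the sample values above there are 7 requests left in the hour, so the function returns 0.0. Once the remaining count reaches 0, it would return the reset value, for example 2340 seconds (39 minutes).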
Dashboard monitoring
Check execution usage in the Runtype dashboard:
Header: Shows execution usage indicator (used/limit for the period)
Settings → Billing: Detailed usage and limits
Build users in slow mode: See a "Slow mode" indicator in the header
This helps you avoid hitting limits without inspecting API responses.
Upgrading limits
Runtype platform
Upgrade your plan from the dashboard at Settings → Billing or via upgrade prompts when near or at a limit:
Build → Startup: Move from 50/day + slow mode to 15,000 executions/month and higher burst rate
Startup → Pro → Team: Higher monthly pools (60k, 300k) and burst limits
Pro/Team → Enterprise: Custom limits and unlimited option
AI providers
Contact providers to increase limits:
OpenAI: Request tier upgrade
Anthropic: Apply for higher tier
Google: Increase quota in console
Test high-volume workflows in development with realistic data volumes. This reveals rate limit issues before production deployment.
Best practices
Cache aggressively: Store and reuse API responses
Batch when possible: One request for 100 items beats 100 individual requests
Monitor proactively: Set up alerts before hitting limits
Plan for growth: Upgrade before you need it
Use cheaper models: Cheaper, faster models often have higher provider rate limits
Emergency mitigation
If you're rate limited in production:
Pause scheduled jobs temporarily
Switch to BYOK if using platform keys
Add delays to active Flows
Contact support for temporary limit increase (Enterprise)
Next steps
Connecting AI model providers for BYOK setup
Billing and plans to upgrade limits
Running Flows in batch for batch optimization
Common errors and solutions for troubleshooting