Rate limiting and throttling

Understand rate limits, avoid throttling, and optimize high-volume workloads.

What is rate limiting?

Rate limits control how many requests you can make in a time period. They prevent abuse and ensure fair resource allocation.

Runtype rate limits

Runtype enforces limits on Flow executions:

Plan	Execution pool	Burst rate (10s)	Enforcement
Build (free)	50/day	—	Full speed → slow mode
Startup	15,000/month	50 requests	Soft cap + overage billing
Pro	60,000/month	200 requests	Soft cap + overage billing
Team	300,000/month	500 requests	Soft cap + overage billing
Enterprise	Unlimited	Unlimited	—

Burst rate is the maximum number of requests allowed per 10-second window. Exceeding it returns HTTP 429 until the window resets.

Paid tiers (Startup, Pro, Team) use a soft cap: you can keep running Flows after the monthly pool is used up, and the overage is billed. As a result, 429 errors on paid tiers typically come from exceeding the burst rate, not from exhausting your monthly pool.
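To put the two limits in proportion, the arithmetic below estimates how quickly a monthly pool would be used up if a client sent requests continuously at the burst limit. The plan figures are copied from the table above; the function name and the assumption of a perfectly sustained rate are illustrative only.

```python
# Plan figures from the table above: (monthly pool, burst limit per 10 s window).
PLANS = {
    "Startup": (15_000, 50),
    "Pro": (60_000, 200),
    "Team": (300_000, 500),
}

BURST_WINDOW_SECONDS = 10


def minutes_to_exhaust_pool(monthly_pool: int, burst_limit: int) -> float:
    """Minutes needed to use the whole pool when sending at the burst limit."""
    requests_per_second = burst_limit / BURST_WINDOW_SECONDS
    return monthly_pool / requests_per_second / 60


for plan, (pool, burst) in PLANS.items():
    minutes = minutes_to_exhaust_pool(pool, burst)
    print(f"{plan}: {minutes:.0f} minutes at full burst")
# Startup: 50 minutes at full burst
# Pro: 50 minutes at full burst
# Team: 100 minutes at full burst
```

A workload that runs near the burst limit for long stretches will therefore reach overage billing within hours, which is worth checking before scheduling large batches.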

These limits apply to Flow executions, not to AI model calls (which have their own provider-specific limits).

Slow mode (Build tier)

On the Build tier, you get 50 Flow executions per day at full speed. After that, you enter slow mode:

Each request is delayed 10 seconds
You can make up to 10 requests per hour
Requests are not blocked entirely, so demos and light use can continue

If you exceed the hourly limit in slow mode, the API returns 429 with code SLOW_MODE_HOURLY_LIMIT_EXCEEDED and a Retry-After header. Upgrading to a paid plan removes slow mode and gives you a monthly pool and higher burst rate.

AI model provider limits

When using platform keys:

Shared across Runtype users
May be throttled during peak usage
Best for development, low-to-moderate production volume

When using BYOK:

Your provider's rate limits apply
Dedicated to your account
Best for high-volume production or when you need guaranteed response times
See provider dashboards for specific limits

Rate limit errors

Runtype API: HTTP 429 Too Many Requests. The response carries two JSON body fields and one header:

error (body): Description of the limit hit
code (body): Machine-readable code such as RATE_LIMIT_EXCEEDED, DAILY_EXECUTION_LIMIT_EXCEEDED, SLOW_MODE_HOURLY_LIMIT_EXCEEDED, or MONTHLY_EXECUTION_SOFT_CAP_REACHED
Retry-After (header): Seconds until retry is allowed

Clients should use the code field and Retry-After header for retry and backoff logic.
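As a rough sketch of that logic, the function below decides whether a 429 response is worth retrying and for how long to wait. The codes are the ones listed in this section, but the choice of which codes count as retryable, the fallback delay, and the function name are assumptions for illustration, not documented Runtype behavior.

```python
from typing import Mapping, Optional

# Assumption: only short-lived limits are retried automatically. Daily and
# monthly codes are surfaced to the caller instead of being retried.
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED", "SLOW_MODE_HOURLY_LIMIT_EXCEEDED"}

# Hypothetical fallback used when Retry-After is missing or malformed.
FALLBACK_DELAY_SECONDS = 1.0


def retry_delay(
    status: int,
    body: Mapping[str, object],
    headers: Mapping[str, str],
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if no retry is advised."""
    if status != 429:
        return None
    if body.get("code") not in RETRYABLE_CODES:
        return None
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(normalized["retry-after"]))
    except (KeyError, ValueError):
        return FALLBACK_DELAY_SECONDS
```

Branching on the code field rather than on the human-readable error text keeps the client stable if the wording of the message changes.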

AI providers: typically a RateLimitError raised by the provider's SDK, or an HTTP 429 response

Avoiding rate limits

Use BYOK

Switch to your own API keys for dedicated rate limits:

Go to Settings → Models
Add provider keys (OpenAI, Anthropic, etc.)
Disable platform keys

Implement backoff and retry

Add delay steps and retry logic:

Add conditional check for rate limit errors
If rate limited: Delay step (use the Retry-After header when present)
Retry the request
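If you call the API from your own code rather than from Flow steps, the same check, delay, and retry sequence might look like the sketch below. The `send` callable, the `Response` tuple shape, the attempt count, and the backoff constants are all assumptions; the sleep function is injectable so the logic can be tested without waiting.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers): a hypothetical shape for whatever your
# HTTP client returns.
Response = Tuple[int, Mapping[str, str]]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    attempt = 0
    while True:
        status, headers = send()
        attempt += 1
        if status != 429 or attempt >= max_attempts:
            return status, headers
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            # Prefer the server's hint when it is present.
            delay = float(retry_after)
        else:
            # Exponential backoff with jitter: 1-2 s, then 2-4 s, then 4-8 s.
            delay = base_delay * 2 ** (attempt - 1) * (1 + random.random())
        sleep(delay)
```

The jitter spreads retries from parallel workers so they do not all hit the next window at the same moment.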

Reduce request frequency

Batch operations instead of individual requests
Cache API responses in Records
Use scheduled Flows during off-peak hours

Spread load

For large batches:

Lower concurrency settings
Add delay steps between API calls
Split batch into multiple smaller batches
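For clients that drive a batch from code, splitting and pacing can be sketched as follows. The helper names, the `handle` callback, and the chunk size and pause are illustrative assumptions, and the sketch assumes each item costs exactly one request.

```python
import time
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_paced_chunks(
    items: Sequence[T],
    handle: Callable[[T], None],
    chunk_size: int,
    pause_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Process items chunk by chunk, pausing between chunks but not after the last."""
    chunks = list(chunked(items, chunk_size))
    for index, chunk in enumerate(chunks):
        for item in chunk:
            handle(item)
        if index < len(chunks) - 1:
            sleep(pause_seconds)
```

With a chunk size below your plan's burst limit and a pause of at least 10 seconds, no single 10-second window can contain requests from more than one chunk. For example, `run_in_paced_chunks(items, handle, chunk_size=150, pause_seconds=10)` would stay under the Pro limit of 200.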

Monitoring rate limit usage

API response headers

Burst rate limit (paid tiers):

RateLimit-Limit: 200
RateLimit-Remaining: 142
RateLimit-Reset: 7

RateLimit-Limit: Maximum requests in the 10-second window
RateLimit-Remaining: Requests left in current window
RateLimit-Reset: Seconds until the current 10-second window resets (0–10)
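A client can use these headers to pause before it hits the limit instead of reacting to a 429. The sketch below is one possible approach; the function name and the `reserve` safety margin are assumptions, and the header mapping is expected to use the exact header names shown above.

```python
from typing import Mapping


def seconds_to_wait(headers: Mapping[str, str], reserve: int = 0) -> float:
    """Return how long to pause before the next request, based on burst headers.

    `reserve` is a hypothetical safety margin: the client pauses once this
    many requests (or fewer) remain in the current window.
    """
    try:
        remaining = int(headers["RateLimit-Remaining"])
        reset = float(headers["RateLimit-Reset"])
    except (KeyError, ValueError):
        # Headers absent or unparsable: no information, so do not wait.
        return 0.0
    return reset if remaining <= reserve else 0.0
```

With the example values above (142 remaining, reset in 7 seconds) the function returns 0.0, so the client keeps sending. Once the remaining count reaches the reserve, it returns the reset time.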

Execution quota:

X-Quota-Limit: 60000
X-Quota-Remaining: 42153
X-Quota-Reset: 345600
X-Quota-Policy: monthly

X-Quota-Limit: Total executions in your pool
X-Quota-Remaining: Executions left in current period
X-Quota-Reset: Seconds until daily/monthly period resets
X-Quota-Policy: daily, monthly, or unlimited
X-Quota-Warning: Present when you've used ≥80% of your quota
X-Quota-Overage: Present when you've exceeded your soft cap
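The quota headers can feed a simple usage check on the client side. In the example values above, 60,000 - 42,153 = 17,847 executions have been used (about 30% of the pool), and a reset value of 345,600 seconds is 4 days. The function names and the 0.8 default threshold in this sketch are assumptions chosen to mirror the 80% warning header.

```python
from typing import Mapping, Optional


def quota_used_fraction(headers: Mapping[str, str]) -> Optional[float]:
    """Return the fraction of the pool used, or None if it cannot be computed."""
    if headers.get("X-Quota-Policy") == "unlimited":
        return None
    try:
        limit = int(headers["X-Quota-Limit"])
        remaining = int(headers["X-Quota-Remaining"])
    except (KeyError, ValueError):
        return None
    if limit <= 0:
        return None
    return (limit - remaining) / limit


def should_alert(headers: Mapping[str, str], threshold: float = 0.8) -> bool:
    """True when the server flags the quota or usage crosses the threshold."""
    if "X-Quota-Warning" in headers or "X-Quota-Overage" in headers:
        return True
    used = quota_used_fraction(headers)
    return used is not None and used >= threshold
```

Checking the server-provided warning headers first means the alert still fires if the numeric headers are missing from a particular response.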

Build tier slow mode:

X-Slow-Mode: true
X-Slow-Mode-Remaining: 7
X-Slow-Mode-Reset: 2340

X-Slow-Mode: Indicates slow mode is active
X-Slow-Mode-Remaining: Requests left in current hour
X-Slow-Mode-Reset: Seconds until hourly limit resets

Dashboard monitoring

Check execution usage in the Runtype dashboard:

Header: Shows execution usage indicator (used/limit for the period)
Settings → Billing: Detailed usage and limits
Build users in slow mode: See a "Slow mode" indicator in the header

This helps you avoid hitting limits without inspecting API responses.

Upgrading limits

Runtype platform

Upgrade your plan from the dashboard at Settings → Billing or via upgrade prompts when near or at a limit:

Build → Startup: Move from 50/day + slow mode to 15,000 executions/month and higher burst rate
Startup → Pro → Team: Higher monthly pools (60k, 300k) and burst limits
Pro/Team → Enterprise: Custom limits, with an unlimited option

AI providers

Contact providers to increase limits:

OpenAI: Request tier upgrade
Anthropic: Apply for higher tier
Google: Increase quota in console

Test high-volume workflows in development with realistic data volumes. This reveals rate limit issues before production deployment.

Best practices

Cache aggressively: Store and reuse API responses
Batch when possible: One request for 100 items is better than 100 individual requests
Monitor proactively: Set up alerts before hitting limits
Plan for growth: Upgrade before you need it
Use smaller models: Cheaper, faster models often have higher rate limits

Emergency mitigation

If you're rate limited in production:

Pause scheduled jobs temporarily
Switch to BYOK if using platform keys
Add delays to active Flows
Contact support for a temporary limit increase (Enterprise)

Next steps

Connecting AI model providers for BYOK setup
Billing and plans to upgrade limits
Running Flows in batch for batch optimization
Common errors and solutions for troubleshooting
