Azure API Management Rate Limiting & Quotas
Controlling API Usage and Preventing Abuse
Introduction
Rate limiting and quotas are essential for protecting your APIs from abuse, ensuring fair usage among consumers, and preventing server overload. Azure API Management provides flexible mechanisms to control API usage at different levels.
This guide covers:
- Rate limiting — Controlling request frequency
- Quotas — Limiting total usage over time
- Key-based limiting — Per-user, per-subscription controls
- Advanced patterns — Tiered access, hybrid limiting
- Configuration — Portal and policy-based setup
Understanding Rate Limiting
How Rate Limiting Works
┌─────────────────────────────────────────────────────────────────────┐
│ RATE LIMITING FLOW │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Client A ────▶ APIM ────▶ Backend │
│ │ │ │
│ │ │ 1st request │
│ │ │ ✓ Allowed │
│ │ │ Counter: 1 │
│ │ │ │
│ │ │ 2nd-100th requests │
│ │ │ ✓ Allowed │
│ │ │ Counter: 100 │
│ │ │ │
│ │ │ 101st request │
│ │ │ ✗ Rate limit exceeded │
│ │ │ Returns: 429 Too Many Requests │
│ │ │ │
│ Client B ────▶ APIM ────▶ (same counter for shared key) │
│ │
└─────────────────────────────────────────────────────────────────────┘
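The flow in the diagram can be modeled in a few lines of Python. This is a simplified fixed-window sketch for illustration only: the class name `FixedWindowLimiter` and its `check` method are hypothetical, and APIM's own counter may behave differently, for example around window boundaries.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class FixedWindowLimiter:
    """Toy model of one rate-limit counter: `calls` per `renewal_period` seconds."""
    calls: int
    renewal_period: float
    # Maps a counter key to (window start time, calls counted in that window).
    _counts: Dict[str, Tuple[float, int]] = field(default_factory=dict)

    def check(self, key: str, now: Optional[float] = None) -> int:
        """Return the HTTP status this model would send: 200 or 429."""
        now = time.monotonic() if now is None else now
        start, count = self._counts.get(key, (now, 0))
        if now - start >= self.renewal_period:
            start, count = now, 0  # window expired: reset the counter
        if count >= self.calls:
            return 429  # limit reached: reject without incrementing
        self._counts[key] = (start, count + 1)
        return 200


limiter = FixedWindowLimiter(calls=100, renewal_period=60)
# 101 requests from clients sharing one key, all within the same window.
statuses = [limiter.check("shared-key", now=0.0) for _ in range(101)]
print(statuses.count(200), statuses[-1])  # 100 429
```

Clients that share a key share a counter, so the 101st request is rejected no matter which client sends it.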
Rate Limit Headers
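When a rate limit is exceeded, APIM responds with 429 Too Many Requests and a Retry-After header. Headers that report the limit and the remaining calls are only sent when their names are configured on the policy. With X-RateLimit-* names configured, a throttled response would include:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
Retry-After: 60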
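On the client side, the Retry-After value can drive the back-off. The helper below is a minimal sketch, assuming the header carries a number of seconds; the function name `retry_delay` and the one-second fallback are illustrative choices, not part of APIM.

```python
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str],
                default: float = 1.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if no retry is needed."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after", "")
    try:
        return max(float(raw), 0.0)
    except ValueError:
        # Header missing or not a plain number of seconds: use the fallback.
        return default


print(retry_delay(429, {"Retry-After": "60"}))  # 60.0
print(retry_delay(200, {}))                     # None
```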
Rate Limit Policy
Basic Rate Limit
<inbound>
<!-- 100 calls per 60 seconds, counted per subscription -->
<rate-limit calls="100" renewal-period="60" />
</inbound>
Rate Limit Response
<inbound>
  <!-- Custom names for the headers that report limit state to the client -->
  <rate-limit
    calls="100"
    renewal-period="60"
    remaining-calls-header-name="X-RateRemaining"
    retry-after-header-name="X-RateRetryAfter"
    total-calls-header-name="X-RateLimit" />
</inbound>
Per-Subscription Rate Limit
<inbound>
  <!-- Rate limit per subscription -->
  <rate-limit-by-key
    calls="100"
    renewal-period="60"
    counter-key="@(context.Subscription.Id)"
    increment-count="1" />
</inbound>
Rate Limit by Key
Per-User Rate Limit
<inbound>
<!-- Rate limit based on the user ID from the X-User-Id request header -->
<set-variable name="userId" value="@(context.Request.Headers.GetValueOrDefault("X-User-Id", ""))" />
<rate-limit-by-key
calls="50"
renewal-period="60"
counter-key="@(context.Variables["userId"])" />
</inbound>
Per-IP Rate Limit
<inbound>
<rate-limit-by-key
calls="10"
renewal-period="60"
counter-key="@(context.Request.IpAddress)" />
</inbound>
Per-API Key Rate Limit
<inbound>
<rate-limit-by-key
calls="1000"
renewal-period="3600"
counter-key="@(context.Subscription.PrimaryKey)" />
</inbound>
Quota Policy
Basic Quota
<inbound>
  <!-- 10,000 calls per day; the quota policy belongs in the inbound section -->
  <quota calls="10000" renewal-period="86400" />
</inbound>
Bandwidth Quota
<inbound>
  <!-- 1 GB per 31 days; bandwidth is expressed in kilobytes -->
  <quota bandwidth="1048576" renewal-period="2678400" />
</inbound>
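The quota policy's bandwidth attribute is measured in kilobytes and renewal-period in seconds, so both values usually need converting from friendlier units. A small sketch of that arithmetic follows; the helper name `quota_attributes` is hypothetical, and 1 GB is taken as 1024 × 1024 KB.

```python
KB_PER_GB = 1024 * 1024
SECONDS_PER_DAY = 24 * 60 * 60


def quota_attributes(gigabytes: float, days: int) -> dict:
    """Convert a GB-per-N-days allowance into quota attribute values."""
    return {
        "bandwidth": int(gigabytes * KB_PER_GB),      # kilobytes
        "renewal-period": days * SECONDS_PER_DAY,     # seconds
    }


print(quota_attributes(1, 31))
# {'bandwidth': 1048576, 'renewal-period': 2678400}
```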
Combined Calls and Bandwidth
<inbound>
  <!-- 10,000 calls and 500 MB per day; bandwidth is always in kilobytes -->
  <quota
    calls="10000"
    bandwidth="512000"
    renewal-period="86400" />
</inbound>
Per-Subscription Quota
<inbound>
  <!-- 100,000 calls per 30 days, tracked per subscription -->
  <quota-by-key
    calls="100000"
    renewal-period="2592000"
    counter-key="@(context.Subscription.Id)" />
</inbound>
Advanced Patterns
Tiered Rate Limiting
<inbound>
<choose>
<when condition="@(context.Subscription.Name.Contains("Premium"))">
<!-- Higher limit for premium -->
<rate-limit-by-key calls="1000" renewal-period="60"
counter-key="@(context.Subscription.Id)" />
</when>
<when condition="@(context.Subscription.Name.Contains("Standard"))">
<!-- Medium limit for standard -->
<rate-limit-by-key calls="100" renewal-period="60"
counter-key="@(context.Subscription.Id)" />
</when>
<otherwise>
<!-- Basic limit -->
<rate-limit-by-key calls="10" renewal-period="60"
counter-key="@(context.Subscription.Id)" />
</otherwise>
</choose>
</inbound>
Hybrid Rate Limit + Quota
<inbound>
  <!-- Rate limit - per minute -->
  <rate-limit-by-key
    calls="100"
    renewal-period="60"
    counter-key="@(context.Subscription.Id)" />
  <!-- Quota - per day (also evaluated in the inbound section) -->
  <quota-by-key
    calls="10000"
    renewal-period="86400"
    counter-key="@(context.Subscription.Id)" />
</inbound>
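The two checks produce different responses: an exceeded rate limit yields 429 Too Many Requests, while an exceeded quota yields 403 Forbidden. The sketch below models that decision order for the limits used in this example; `gateway_status` is a hypothetical helper for illustration, not an APIM API.

```python
def gateway_status(calls_this_minute: int, calls_today: int,
                   rate_limit: int = 100, daily_quota: int = 10_000) -> int:
    """Status for the next request, given how many calls were already counted."""
    if calls_this_minute >= rate_limit:
        return 429  # short-term rate limit hit: Too Many Requests
    if calls_today >= daily_quota:
        return 403  # long-term quota used up: Forbidden
    return 200


print(gateway_status(5, 500))      # 200
print(gateway_status(100, 500))    # 429
print(gateway_status(5, 10_000))   # 403
```

A client that sees 429 can retry within the minute, whereas 403 from the quota means waiting for the daily renewal.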
Conditional Rate Limiting
<inbound>
<!-- Only rate limit expensive endpoints -->
<choose>
<when condition="@(context.Request.Url.Path.StartsWith("/reports"))">
<rate-limit-by-key calls="10" renewal-period="60"
counter-key="@(context.Subscription.Id)" />
</when>
<when condition="@(context.Request.Url.Path.StartsWith("/search"))">
<rate-limit-by-key calls="50" renewal-period="60"
counter-key="@(context.Subscription.Id)" />
</when>
<otherwise>
<!-- No limit for other endpoints -->
</otherwise>
</choose>
</inbound>
Response When Exceeded
Default Behavior
When the rate limit is exceeded, APIM:
- Returns HTTP 429 (Too Many Requests)
- Includes a Retry-After header
- Includes rate limit headers, if their names are configured on the policy
Custom Error Response
<inbound>
<rate-limit-by-key calls="100" renewal-period="60"
counter-key="@(context.Subscription.Id)"
increment-count="1" />
</inbound>
<on-error>
<choose>
<when condition="@(context.LastError.Reason == "RateLimitExceeded")">
<return-response>
<set-status code="429" reason="Rate Limit Exceeded" />
<set-header name="Retry-After" exists-action="override">
<value>60</value>
</set-header>
<set-body>{
"error": "Rate limit exceeded",
"message": "You have exceeded the rate limit. Please try again later.",
"retryAfter": 60
}</set-body>
</return-response>
</when>
</choose>
</on-error>
Configuration Options
Policy Attributes
| Attribute | Description | Default |
|---|---|---|
| calls | Maximum calls allowed | Required |
| renewal-period | Time period in seconds | Required |
| counter-key | Key to track usage (rate-limit-by-key and quota-by-key) | Required |
| remaining-calls-header-name | Header name for remaining calls | None (header not sent) |
| total-calls-header-name | Header name for the limit | None (header not sent) |
| retry-after-header-name | Header name for retry time | Retry-After |
Increment Count
<inbound>
<!-- Count each request as 1 (the default), regardless of how many items a batch contains -->
<rate-limit-by-key calls="100" renewal-period="60"
counter-key="@(context.Subscription.Id)"
increment-count="1" />
</inbound>
Best Practices
| Practice | Description |
|---|---|
| Use rate-limit-by-key | More flexible than basic rate-limit |
| Set appropriate limits | Match backend capacity |
| Include rate limit headers | Help clients understand limits |
| Use quotas for billing | Track total usage |
| Implement tiered limits | Different limits per subscription tier |
| Test thoroughly | Verify limits work correctly |
Recommended Limits
| Tier | Rate Limit | Daily Quota |
|---|---|---|
| Free | 10/min | 100/day |
| Basic | 50/min | 1,000/day |
| Standard | 100/min | 10,000/day |
| Premium | 1,000/min | 100,000/day |
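It helps to check how each rate limit interacts with its daily quota. The sketch below computes how long a client sending at the full per-minute rate would take to use up the day's allowance; the tier values are copied from the table, and the function name `minutes_to_exhaust` is illustrative.

```python
# Tier values copied from the table above: (calls per minute, calls per day).
TIERS = {
    "Free": (10, 100),
    "Basic": (50, 1_000),
    "Standard": (100, 10_000),
    "Premium": (1_000, 100_000),
}


def minutes_to_exhaust(per_minute: int, per_day: int) -> float:
    """Minutes of sustained maximum-rate traffic before the daily quota is used up."""
    return per_day / per_minute


for tier, (per_minute, per_day) in TIERS.items():
    print(f"{tier}: {minutes_to_exhaust(per_minute, per_day):.0f} min")
# Free: 10 min
# Basic: 20 min
# Standard: 100 min
# Premium: 100 min
```

With these numbers, a Free-tier client running at full rate exhausts its quota in ten minutes, so the quota rather than the rate limit is the binding constraint for sustained traffic.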
Monitoring
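# Count gateway requests answered with HTTP 429 (my-apim is a placeholder service name)
az monitor metrics list \
--resource my-apim \
--resource-group my-rg \
--resource-type Microsoft.ApiManagement/service \
--metric "Requests" \
--filter "GatewayResponseCode eq '429'"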
Related Topics
- Caching — Response caching
- JWT Validation — Authentication
- Policy Engine — Advanced policies
Azure Integration Hub - Intermediate Level