Azure API Management Rate Limiting & Quotas

Controlling API Usage and Preventing Abuse


Introduction

Rate limiting and quotas are essential for protecting your APIs from abuse, ensuring fair usage among consumers, and preventing server overload. Azure API Management provides flexible mechanisms to control API usage at different levels.

This guide covers:

  • Rate limiting — Controlling request frequency
  • Quotas — Limiting total usage over time
  • Key-based limiting — Per-user, per-subscription controls
  • Advanced patterns — Tiered access, hybrid limiting
  • Configuration — Portal and policy-based setup

Understanding Rate Limiting

How Rate Limiting Works

┌─────────────────────────────────────────────────────────────────────┐
│                    RATE LIMITING FLOW                               │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Client A ────▶ APIM ────▶ Backend                                 │
│        │             │                                              │
│        │             │  1st request                                 │
│        │             │  ✓ Allowed                                   │
│        │             │  Counter: 1                                  │
│        │             │                                              │
│        │             │  2nd-100th requests                          │
│        │             │  ✓ Allowed                                   │
│        │             │  Counter: 100                                │
│        │             │                                              │
│        │             │  101st request                               │
│        │             │  ✗ Rate limit exceeded                       │
│        │             │  Returns: 429 Too Many Requests              │
│        │             │                                              │
│   Client B ────▶ APIM ────▶ (same counter for shared key)           │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
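
The counting logic in the diagram can be sketched in a few lines of Python. This is only an illustration, not how APIM implements throttling: the `FixedWindowLimiter` class and its method names are hypothetical, and it uses a simple fixed window for clarity, which may differ from the gateway's own algorithm.

```python
class FixedWindowLimiter:
    """Toy per-key request counter (hypothetical; not APIM's implementation)."""

    def __init__(self, calls, renewal_period):
        self.calls = calls                    # max requests per window
        self.renewal_period = renewal_period  # window length in seconds
        self._windows = {}                    # key -> (window_start, count)

    def check(self, key, now):
        """Return 200 if the request is allowed, 429 if the limit is exceeded."""
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.renewal_period:
            start, count = now, 0  # window expired: reset the counter
        if count >= self.calls:
            self._windows[key] = (start, count)
            return 429
        self._windows[key] = (start, count + 1)
        return 200


limiter = FixedWindowLimiter(calls=100, renewal_period=60)
statuses = [limiter.check("shared-key", now=0) for _ in range(101)]
print(statuses.count(200), statuses[-1])  # 100 429
```

Clients that share a key share a counter, so Client B's requests count against the same 100 calls as Client A's.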

Rate Limit Headers

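When a rate limit is exceeded, APIM rejects the request with 429 Too Many Requests and a Retry-After header. Headers reporting the limit and the remaining calls are not sent by default; they appear only when the policy names them through total-calls-header-name and remaining-calls-header-name. With those two attributes set to X-RateLimit-Limit and X-RateLimit-Remaining, a throttled response carries:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
Retry-After: 60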
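
On the client side, the Retry-After value is what tells a caller how long to back off. A minimal sketch of that decision follows; the `retry_delay` helper is hypothetical, it takes the headers as a plain dict, and it only understands the delta-seconds form of Retry-After (an HTTP-date value falls back to the default delay).

```python
def retry_delay(status_code, headers, default_delay=1.0):
    """Return seconds to wait before retrying, or None if no retry is needed."""
    if status_code != 429:
        return None
    value = headers.get("Retry-After")
    try:
        return max(float(value), 0.0)
    except (TypeError, ValueError):
        # Header missing or not a plain number of seconds.
        return default_delay


print(retry_delay(429, {"Retry-After": "60"}))  # 60.0
print(retry_delay(200, {}))                     # None
```

A real client would sleep for the returned delay before retrying, ideally with a cap on the number of attempts.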

Rate Limit Policy

Basic Rate Limit

<inbound>
    <!-- 100 calls per 60 seconds, counted separately for each subscription -->
    <rate-limit calls="100" renewal-period="60" />
</inbound>

Rate Limit Response

<inbound>
    <rate-limit 
        calls="100" 
        renewal-period="60"
        remaining-calls-header-name="X-RateRemaining"
        retry-after-header-name="X-RateRetryAfter"
        total-calls-header-name="X-RateLimit" />
</inbound>

Per-Subscription Rate Limit

<inbound>
    <!-- Rate limit per subscription -->
    <rate-limit-by-key 
        calls="100" 
        renewal-period="60" 
        counter-key="@(context.Subscription.Id)" 
        increment-count="1" />
</inbound>

Rate Limit by Key

Per-User Rate Limit

<inbound>
    <!-- Rate limit based on the user ID in the X-User-Id header; requests without the header share one counter (empty key) -->
    <set-variable name="userId" value="@(context.Request.Headers.GetValueOrDefault("X-User-Id", ""))" />
    
    <rate-limit-by-key 
        calls="50" 
        renewal-period="60" 
        counter-key="@(context.Variables["userId"])" />
</inbound>

Per-IP Rate Limit

<inbound>
    <rate-limit-by-key 
        calls="10" 
        renewal-period="60" 
        counter-key="@(context.Request.IpAddress)" />
</inbound>

Per-API Key Rate Limit

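<inbound>
    <!-- renewal-period for rate-limit-by-key is capped at 300 seconds;
         for hourly or longer windows use quota-by-key instead -->
    <rate-limit-by-key 
        calls="1000" 
        renewal-period="300" 
        counter-key="@(context.Subscription.PrimaryKey)" />
</inbound>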

Quota Policy

Basic Quota

<inbound>
    <!-- 10,000 calls per day; quota policies are evaluated in the inbound section -->
    <quota calls="10000" renewal-period="86400" />
</inbound>

Bandwidth Quota

<inbound>
    <!-- 1 GB of bandwidth per 31 days; bandwidth is specified in kilobytes -->
    <quota bandwidth="1048576" renewal-period="2678400" />
</inbound>

Combined Calls and Bandwidth

<inbound>
    <!-- 10,000 calls and 500 MB (512,000 KB) per day; bandwidth is always in kilobytes -->
    <quota 
        calls="10000" 
        bandwidth="512000" 
        renewal-period="86400" />
</inbound>

Per-Subscription Quota

<inbound>
    <!-- 100,000 calls per 30 days, tracked per subscription -->
    <quota-by-key 
        calls="100000" 
        renewal-period="2592000" 
        counter-key="@(context.Subscription.Id)" />
</inbound>
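
The renewal-period and bandwidth values in the quota policies are easy to get wrong, because periods are given in seconds and bandwidth in kilobytes. The conversions can be checked with a short sketch; the helper names are hypothetical, and 1 KB is assumed to mean 1024 bytes.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_seconds(days):
    """Convert a period in days to a renewal-period value in seconds."""
    return days * SECONDS_PER_DAY


def gib_to_kb(gib):
    """Convert GiB to kilobytes (assumes 1 KB = 1024 bytes)."""
    return gib * 1024 * 1024


print(days_to_seconds(1))   # 86400
print(days_to_seconds(30))  # 2592000
print(days_to_seconds(31))  # 2678400
print(gib_to_kb(1))         # 1048576
```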

Advanced Patterns

Tiered Rate Limiting

<inbound>
    <choose>
        <when condition="@(context.Subscription.Name.Contains("Premium"))">
            <!-- Higher limit for premium -->
            <rate-limit-by-key calls="1000" renewal-period="60" 
                counter-key="@(context.Subscription.Id)" />
        </when>
        <when condition="@(context.Subscription.Name.Contains("Standard"))">
            <!-- Medium limit for standard -->
            <rate-limit-by-key calls="100" renewal-period="60" 
                counter-key="@(context.Subscription.Id)" />
        </when>
        <otherwise>
            <!-- Basic limit -->
            <rate-limit-by-key calls="10" renewal-period="60" 
                counter-key="@(context.Subscription.Id)" />
        </otherwise>
    </choose>
</inbound>
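
The tier selection in the policy above is a first-match lookup on the subscription name. The same logic, sketched in Python with hypothetical names, makes two pitfalls easier to see: the substring match is case-sensitive, and the order of the checks decides which tier wins when a name contains more than one marker.

```python
# (marker, calls per minute), checked in order like the <when> branches.
TIER_LIMITS = [("Premium", 1000), ("Standard", 100)]
DEFAULT_LIMIT = 10  # the <otherwise> branch


def calls_per_minute(subscription_name):
    """Return the per-minute limit for a subscription name (first match wins)."""
    for marker, limit in TIER_LIMITS:
        if marker in subscription_name:
            return limit
    return DEFAULT_LIMIT


print(calls_per_minute("Contoso Premium"))   # 1000
print(calls_per_minute("Contoso Standard"))  # 100
print(calls_per_minute("contoso premium"))   # 10 (case-sensitive match fails)
```

Because a misspelled or lowercased name silently falls through to the lowest tier, a stricter naming convention for subscriptions, or a dedicated product per tier, may be worth considering.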

Hybrid Rate Limit + Quota

<inbound>
    <!-- Rate limit - per minute -->
    <rate-limit-by-key 
        calls="100" 
        renewal-period="60" 
        counter-key="@(context.Subscription.Id)" />

    <!-- Quota - per day (quota policies also belong in the inbound section) -->
    <quota-by-key 
        calls="10000" 
        renewal-period="86400" 
        counter-key="@(context.Subscription.Id)" />
</inbound>
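
In the hybrid setup, a request must pass both checks, and the two failures look different to the caller: an exceeded rate limit yields 429, while an exhausted quota yields 403. The following toy model shows the outcome for given counter values; `gateway_status` is a hypothetical helper, and the counters are passed in directly instead of being tracked over time.

```python
def gateway_status(calls_this_minute, calls_today, rate_limit=100, daily_quota=10_000):
    """Return the status a caller would see for the next request."""
    if calls_this_minute >= rate_limit:
        return 429  # short-term rate limit exceeded
    if calls_today >= daily_quota:
        return 403  # long-term quota exhausted
    return 200


print(gateway_status(5, 500))     # 200
print(gateway_status(100, 500))   # 429
print(gateway_status(5, 10_000))  # 403
```

Clients should treat the two codes differently: a 429 is worth retrying after a short pause, whereas a 403 from a quota will not clear until the quota period renews.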

Conditional Rate Limiting

<inbound>
    <!-- Only rate limit expensive endpoints -->
    <choose>
        <when condition="@(context.Request.Url.Path.StartsWith("/reports"))">
            <rate-limit-by-key calls="10" renewal-period="60" 
                counter-key="@(context.Subscription.Id)" />
        </when>
        <when condition="@(context.Request.Url.Path.StartsWith("/search"))">
            <rate-limit-by-key calls="50" renewal-period="60" 
                counter-key="@(context.Subscription.Id)" />
        </when>
        <otherwise>
            <!-- No limit for other endpoints -->
        </otherwise>
    </choose>
</inbound>

Response When Exceeded

Default Behavior

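When a rate limit is exceeded, APIM:

  • Returns HTTP 429 (Too Many Requests)
  • Includes a Retry-After header
  • Includes remaining-calls and total-calls headers only if the policy names them

When a quota is exceeded, APIM instead returns HTTP 403 (Forbidden) with a Retry-After header.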

Custom Error Response

<inbound>
    <rate-limit-by-key calls="100" renewal-period="60" 
        counter-key="@(context.Subscription.Id)" 
        increment-count="1" />
</inbound>

<on-error>
    <choose>
        <when condition="@(context.LastError.Message.Contains("Rate limit"))">
            <return-response>
                <set-status code="429" reason="Rate Limit Exceeded" />
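                <set-header name="Retry-After" exists-action="override">
                    <value>60</value>
                </set-header>
                <!-- Declare the body type so clients parse it as JSON -->
                <set-header name="Content-Type" exists-action="override">
                    <value>application/json</value>
                </set-header>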
                <set-body>{
                    "error": "Rate limit exceeded",
                    "message": "You have exceeded the rate limit. Please try again later.",
                    "retryAfter": 60
                }</set-body>
            </return-response>
        </when>
    </choose>
</on-error>

Configuration Options

Policy Attributes

Attribute                     Description                                        Default
calls                         Maximum calls allowed per renewal period           Required
renewal-period                Window length in seconds                           Required
counter-key                   Key used to track usage (rate-limit-by-key only)   Required
remaining-calls-header-name   Header name for remaining calls                    None (header not sent)
total-calls-header-name       Header name for the configured limit               None (header not sent)
retry-after-header-name       Header name for the retry interval                 Retry-After

Increment Count

<inbound>
    <!-- Each request adds 1 to the counter (the default); use a larger value or a policy expression to weight costly requests such as batches -->
    <rate-limit-by-key calls="100" renewal-period="60" 
        counter-key="@(context.Subscription.Id)" 
        increment-count="1" />
</inbound>

Best Practices

Practice                     Description
Use rate-limit-by-key        More flexible than basic rate-limit
Set appropriate limits       Match backend capacity
Include rate limit headers   Help clients understand limits
Use quotas for billing       Track total usage
Implement tiered limits      Different limits per subscription tier
Test thoroughly              Verify limits work correctly

Recommended Limits

Tier       Rate Limit   Daily Quota
Free       10/min       100/day
Basic      50/min       1,000/day
Standard   100/min      10,000/day
Premium    1,000/min    100,000/day
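
One way to sanity-check a rate limit against its daily quota is to ask how quickly a client running at the full rate would use up the day's allowance. The sketch below does that arithmetic for the tiers in the table; the `minutes_to_exhaust` helper is hypothetical.

```python
# tier -> (calls per minute, calls per day), taken from the table above
TIERS = {
    "Free": (10, 100),
    "Basic": (50, 1_000),
    "Standard": (100, 10_000),
    "Premium": (1_000, 100_000),
}


def minutes_to_exhaust(rate_per_min, daily_quota):
    """Minutes of sustained full-rate traffic before the daily quota runs out."""
    return daily_quota / rate_per_min


for name, (rate, quota) in TIERS.items():
    print(f"{name}: {minutes_to_exhaust(rate, quota):.0f} min")
# Free: 10 min
# Basic: 20 min
# Standard: 100 min
# Premium: 100 min
```

With these numbers the rate limit mainly guards against bursts, while the quota is what bounds total usage: even the Premium tier would exhaust its day in under two hours of sustained maximum traffic.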

Monitoring

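# Count requests the gateway answered with 429 (throttled).
# my-apim and my-rg are placeholder names; the Requests metric and its
# GatewayResponseCode dimension are assumed to be available on your service.
az monitor metrics list \
  --resource my-apim \
  --resource-group my-rg \
  --resource-type Microsoft.ApiManagement/service \
  --metric Requests \
  --filter "GatewayResponseCode eq '429'" \
  --aggregation Total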

Azure Integration Hub - Intermediate Level