API Management — Implementing Multi-Tier Rate Limiting

The Problem

You have an API exposed through Azure API Management (APIM) and need a different rate limit for each subscription tier:

Free tier: 100 requests/hour
Basic tier: 1,000 requests/hour
Pro tier: 10,000 requests/hour
Enterprise tier: Unlimited with dedicated backend

Plus, you need to block abusive clients automatically and provide proper error responses.
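
To make the tier limits concrete, here is a minimal Python sketch of a fixed-window counter keyed by subscription. It models the behavior the policies aim for and is not how APIM implements its counters; `TIER_LIMITS`, `FixedWindowLimiter`, and `allow` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

WINDOW_SECONDS = 3600

# Requests per hour for each tier; None means unlimited (Enterprise).
TIER_LIMITS: Dict[str, Optional[int]] = {
    "free": 100,
    "basic": 1_000,
    "pro": 10_000,
    "enterprise": None,
}


@dataclass
class FixedWindowLimiter:
    """Counts requests per key inside fixed one-hour windows."""
    # key -> (window index, requests admitted in that window)
    counters: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def allow(self, key: str, tier: str, now: float) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one request at time `now`."""
        # Unknown tier names fall back to the Free limit.
        limit = TIER_LIMITS.get(tier, TIER_LIMITS["free"])
        if limit is None:
            return True, 0
        window = int(now // WINDOW_SECONDS)
        seen_window, count = self.counters.get(key, (window, 0))
        if seen_window != window:
            count = 0  # a new window started, so the counter resets
        if count >= limit:
            retry_after = int((window + 1) * WINDOW_SECONDS - now)
            return False, retry_after
        self.counters[key] = (window, count + 1)
        return True, 0
```

With this model, the 101st Free-tier request inside one hour is rejected together with the number of seconds left until the window resets, which is the value a client would expect to see in a Retry-After header.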

Solution Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Client Requests                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────┐    ┌─────────────┐    ┌─────────────────────────┐  │
│  │ Free    │    │   APIM      │    │    Rate Limit Policy    │  │
│  │ Clients │───▶│   Gateway   │───▶│    (by subscription)    │  │
│  └─────────┘    └─────────────┘    └─────────────────────────┘  │
│        │                                                    │   │
│        ▼                                                    ▼   │
│  ┌─────────────┐                                    ┌─────────┐ │
│  │ 429 Too     │                                    │ Allow   │ │
│  │ Many        │                                    │ Request │ │
│  │ Requests    │                                    └────┬────┘ │
│  └─────────────┘                                         │      │
│                                                          ▼      │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                   Backend API                            │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘

Implementation

Step 1: Configure Rate Limit Policy

<!-- api-policy.xml -->
<policies>
    <inbound>
        <!-- Inherit policies from enclosing scopes (product, global) -->
        <base />

        <!-- Hourly allowance per subscription (Basic tier shown).
             quota-by-key is used for the one-hour window because the
             rate-limit-by-key reference lists a 300-second maximum for
             renewal-period. Only 2xx responses count against the quota. -->
        <quota-by-key calls="1000"
                      renewal-period="3600"
                      counter-key="@(context.Subscription.Id)"
                      increment-condition="@(context.Response.StatusCode >= 200 && context.Response.StatusCode < 300)" />
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
        <!-- The limiting policies reject the request themselves (quota-by-key
             with 403, rate-limit-by-key with 429) and report the reason in
             context.LastError. Replace either with one consistent 429. -->
        <choose>
            <when condition="@(context.LastError != null && (context.LastError.Reason == "QuotaExceeded" || context.LastError.Reason == "RateLimitExceeded"))">
                <return-response>
                    <set-status code="429" reason="Too Many Requests" />
                    <set-header name="Retry-After" exists-action="override">
                        <value>@(context.Response?.Headers.GetValueOrDefault("Retry-After", "3600") ?? "3600")</value>
                    </set-header>
                    <set-header name="X-Rate-Limit-Limit" exists-action="override">
                        <value>1000</value>
                    </set-header>
                    <set-header name="X-Rate-Limit-Remaining" exists-action="override">
                        <value>0</value>
                    </set-header>
                    <set-header name="Content-Type" exists-action="override">
                        <value>application/json</value>
                    </set-header>
                    <set-body>{
                        "error": "rate_limit_exceeded",
                        "message": "You have exceeded your rate limit. Retry after the number of seconds given in the Retry-After header.",
                        "details": "Contact support@company.com for tier upgrade options."
                    }</set-body>
                </return-response>
            </when>
        </choose>
    </on-error>
</policies>

Step 2: Advanced Tier-Based Rate Limiting

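<!-- advanced-rate-limit.xml -->
<policies>
    <inbound>
        <base />

        <!-- Determine the tier. context.Subscription has no custom properties,
             so this assumes one APIM product per tier, named "free", "basic",
             "pro", or "enterprise". Requests without a product count as free. -->
        <set-variable name="tier" value="@((context.Product?.Name ?? "free").ToLowerInvariant())" />

        <!-- Apply the hourly allowance for the tier. Each tier gets its own
             literal quota-by-key: policy expressions use C# 7 syntax, so C# 8
             switch expressions are not available, and the quota-by-key reference
             does not allow expressions for calls or renewal-period.
             The key is the subscription alone; adding the client IP to the key
             would give every IP address its own allowance.
             quota-by-key answers 403 Forbidden with a Retry-After header once the
             quota is used up; rewrite that in on-error if clients expect 429.
             For burst control, pair each quota with a short-window
             rate-limit-by-key (the window length is for you to choose). -->
        <choose>
            <when condition="@(context.Variables.GetValueOrDefault<string>("tier") == "pro")">
                <quota-by-key calls="10000"
                              renewal-period="3600"
                              counter-key="@(context.Subscription.Id)"
                              increment-condition="@(context.Response.StatusCode >= 200 && context.Response.StatusCode < 300)" />
            </when>
            <when condition="@(context.Variables.GetValueOrDefault<string>("tier") == "basic")">
                <quota-by-key calls="1000"
                              renewal-period="3600"
                              counter-key="@(context.Subscription.Id)"
                              increment-condition="@(context.Response.StatusCode >= 200 && context.Response.StatusCode < 300)" />
            </when>
            <when condition="@(context.Variables.GetValueOrDefault<string>("tier") != "enterprise")">
                <!-- free, and any unrecognized tier name -->
                <quota-by-key calls="100"
                              renewal-period="3600"
                              counter-key="@(context.Subscription.Id)"
                              increment-condition="@(context.Response.StatusCode >= 200 && context.Response.StatusCode < 300)" />
            </when>
            <!-- enterprise: no quota is applied -->
        </choose>
    </inbound>
</policies>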
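
The choice of counter key decides who shares an allowance. The following Python sketch (hypothetical helper names, a simplified single-window model) compares a key made of the subscription ID alone with a key made of subscription ID plus client IP, for one Free-tier subscription that spreads its traffic over three addresses.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple

Request = Tuple[str, str]  # (subscription id, client IP)


def allowed_requests(requests: Iterable[Request], limit: int,
                     key_fn: Callable[[Request], str]) -> int:
    """Count requests admitted within one window when counters are keyed by key_fn."""
    counters: Counter = Counter()
    admitted = 0
    for request in requests:
        key = key_fn(request)
        if counters[key] < limit:
            counters[key] += 1
            admitted += 1
    return admitted


# One Free-tier subscription sending 300 requests from three IP addresses.
traffic = [("sub-free", f"203.0.113.{i % 3}") for i in range(300)]

by_subscription = allowed_requests(traffic, 100, lambda r: r[0])
by_subscription_and_ip = allowed_requests(traffic, 100, lambda r: f"{r[0]}-{r[1]}")

print(by_subscription)         # 100
print(by_subscription_and_ip)  # 300
```

Keying on subscription plus IP triples the effective allowance here, so a composite key suits per-client fairness inside a subscription but not enforcement of a tier's total.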

Step 3: Client ID Rate Limiting

<!-- client-rate-limit.xml -->
<policies>
    <inbound>
        <!-- Read the client's API key; it becomes the counter key below -->
        <set-variable name="apiKey" value="@(context.Request.Headers.GetValueOrDefault("X-Api-Key", ""))" />

        <!-- Per-key limit of 10,000 calls per minute, applied only when a key is present -->
        <choose>
            <when condition="@(!string.IsNullOrEmpty(context.Variables.GetValueOrDefault<string>("apiKey")))">
                <rate-limit-by-key calls="10000" 
                                  renewal-period="60"
                                  counter-key="@(context.Variables.GetValueOrDefault<string>("apiKey"))"
                                  increment-condition="@(context.Response.StatusCode >= 200 && context.Response.StatusCode < 300)" />
            </when>
        </choose>

        <base />
    </inbound>
</policies>

Step 4: Automatic Blocking of Abusive Clients

<!-- abuse-prevention.xml -->
<policies>
    <inbound>
        <!-- Check if client is in blocklist (static example addresses) -->
        <set-variable name="isBlocked" value="@{
            var blockedIps = new[] { "192.168.1.100", "10.0.0.50" };
            return blockedIps.Contains(context.Request.IpAddress);
        }" />

        <!-- Check for suspicious patterns -->
        <set-variable name="suspiciousPattern" value="@{
            var userAgent = context.Request.Headers.GetValueOrDefault("User-Agent", "");

            // Block missing user agent
            if (string.IsNullOrEmpty(userAgent)) return true;

            // Block known scraper user agents (case-insensitive substring match).
            // This also rejects legitimate curl and Python clients, including
            // curl-based test scripts, so trim the list to suit your callers.
            var scrapers = new[] { "scrapy", "curl", "wget", "python" };
            var normalized = userAgent.ToLowerInvariant();
            return scrapers.Any(s => normalized.Contains(s));
        }" />

        <!-- Block if the address is listed or the request looks suspicious -->
        <choose>
            <when condition="@(context.Variables.GetValueOrDefault<bool>("isBlocked") || context.Variables.GetValueOrDefault<bool>("suspiciousPattern"))">
                <return-response>
                    <set-status code="403" reason="Forbidden" />
                    <set-header name="Content-Type" exists-action="override">
                        <value>application/json</value>
                    </set-header>
                    <set-body>{
                        "error": "access_denied",
                        "message": "Your request has been blocked due to suspicious activity."
                    }</set-body>
                </return-response>
            </when>
        </choose>

        <base />
    </inbound>
</policies>
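
The blocklist in Step 4 is a fixed array, so blocking a client automatically needs some record of repeated violations. Below is a minimal Python sketch of one way to do that: count rate-limit violations per client and block once a threshold is reached. The class name, the threshold, and the durations are assumptions for illustration, not values taken from APIM.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StrikeBlocklist:
    """Blocks a client after too many rate-limit violations in a time span."""
    max_strikes: int = 5            # violations needed to trigger a block (assumed)
    strike_window: float = 600.0    # seconds during which strikes are counted (assumed)
    block_seconds: float = 3600.0   # how long a block lasts (assumed)
    strikes: Dict[str, List[float]] = field(default_factory=dict)
    blocked_until: Dict[str, float] = field(default_factory=dict)

    def record_violation(self, client: str, now: float) -> None:
        """Register one 429 for `client`; block once the threshold is reached."""
        # Keep only strikes that are still inside the counting window.
        recent = [t for t in self.strikes.get(client, [])
                  if now - t < self.strike_window]
        recent.append(now)
        self.strikes[client] = recent
        if len(recent) >= self.max_strikes:
            self.blocked_until[client] = now + self.block_seconds
            self.strikes[client] = []  # start fresh after the block

    def is_blocked(self, client: str, now: float) -> bool:
        """True while the client's block has not yet expired."""
        return now < self.blocked_until.get(client, 0.0)
```

A gateway would call `record_violation` whenever it answers 429 and consult `is_blocked` before doing any other work; the block expires on its own, so no cleanup job is needed for correctness.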

Step 5: Backend Rate Limiting (Throttling Backend Calls)

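<!-- backend-throttle.xml -->
<policies>
    <inbound>
        <base />

        <!-- Throttle calls toward the backend. rate-limit-by-key belongs in the
             inbound section; keying on the API gives all callers one shared budget. -->
        <rate-limit-by-key calls="100"
                           renewal-period="60"
                           counter-key="@(context.Api.Id)" />

        <!-- Crude circuit breaker: a cached flag that the outbound section sets.
             The flag name and the 30-second duration are illustrative choices. -->
        <cache-lookup-value key="circuitBreakerOpen" variable-name="circuitBreakerOpen" default-value="@(false)" />
        <choose>
            <when condition="@(context.Variables.GetValueOrDefault<bool>("circuitBreakerOpen"))">
                <return-response>
                    <set-status code="503" reason="Service Unavailable" />
                    <set-body>{"error": "backend_overloaded", "message": "Service is experiencing high load. Please retry later."}</set-body>
                </return-response>
            </when>
        </choose>
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
        <!-- Open the breaker for 30 seconds after any backend 5xx response -->
        <choose>
            <when condition="@(context.Response.StatusCode >= 500)">
                <cache-store-value key="circuitBreakerOpen" value="@(true)" duration="30" />
            </when>
        </choose>
    </outbound>
</policies>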
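
The circuit breaker check in Step 5 only reads a flag, so it helps to see the state machine that decides when the flag is set and cleared. This is a minimal Python sketch of the general pattern, with an assumed failure threshold and open duration; it is not APIM's built-in backend circuit breaker.

```python
from dataclasses import dataclass


@dataclass
class CircuitBreaker:
    """Opens after consecutive failures and allows trial calls after a cool-down."""
    failure_threshold: int = 5   # consecutive failures before opening (assumed)
    open_seconds: float = 30.0   # how long calls are rejected once open (assumed)
    failures: int = 0
    opened_at: float = -1.0      # negative means the breaker is closed

    def allow_request(self, now: float) -> bool:
        """Return False while the breaker is open and the cool-down is running."""
        if self.opened_at < 0:
            return True
        # After the cool-down, calls pass again as trials (half-open state).
        return now - self.opened_at >= self.open_seconds

    def record_result(self, success: bool, now: float) -> None:
        """Update the state from the outcome of a backend call."""
        if success:
            self.failures = 0
            self.opened_at = -1.0  # a success closes the breaker
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now   # open, or re-open after a failed trial
```

Counting consecutive failures avoids opening the breaker on a single stray 5xx, at the cost of reacting a few requests later.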

Programmatic Management

Managing Blocklist via API

using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Management.ApiManagement;
using Microsoft.Azure.Management.ApiManagement.Models;
using Microsoft.Rest;

public class RateLimitManager
{
    private readonly ApiManagementClient _client;
    private readonly string _resourceGroup;
    private readonly string _apiMgmtName;

    public RateLimitManager(string subscriptionId, string resourceGroup, string apiMgmtName, string accessToken)
    {
        var credentials = new TokenCredentials(accessToken);
        _client = new ApiManagementClient(credentials) { SubscriptionId = subscriptionId };
        _resourceGroup = resourceGroup;
        _apiMgmtName = apiMgmtName;
    }

    public async Task BlockClientAsync(string ipAddress, string reason)
    {
        // Store the address as a named value. Named value IDs and display names
        // may contain only letters, digits, periods, dashes, and underscores.
        // A policy must read this named value for the block to take effect.
        var name = $"blocked-ip-{ipAddress.Replace(".", "-").Replace(":", "-")}";

        // Operation group and contract names as in Microsoft.Azure.Management.ApiManagement;
        // verify them against the package version you use.
        await _client.NamedValue.CreateOrUpdateAsync(
            _resourceGroup,
            _apiMgmtName,
            name,
            new NamedValueCreateContract
            {
                DisplayName = name,
                Value = ipAddress,
                Secret = false
            });
    }

    public Task<Dictionary<string, int>> GetTierLimitsAsync(string tier)
    {
        // Nothing here is asynchronous; the Task wrapper keeps the signature stable.
        // 999999999 stands in for "unlimited" on the enterprise tier.
        var limits = tier switch
        {
            "enterprise" => new Dictionary<string, int> { ["calls"] = 999999999, ["period"] = 3600 },
            "pro" => new Dictionary<string, int> { ["calls"] = 10000, ["period"] = 3600 },
            "basic" => new Dictionary<string, int> { ["calls"] = 1000, ["period"] = 3600 },
            _ => new Dictionary<string, int> { ["calls"] = 100, ["period"] = 3600 }
        };
        return Task.FromResult(limits);
    }
}

Monitoring Rate Limits

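using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Management.ApiManagement;
using Microsoft.Azure.Management.ApiManagement.Models;

// Simple result type for the figures gathered below.
public class RateLimitReport
{
    public long TotalRequests { get; set; }
    public long BlockedRequests { get; set; }
    public long ThrottledRequests { get; set; }
    public Dictionary<string, long> TopClients { get; set; }
    public Dictionary<string, long> TopAPIs { get; set; }
}

public class RateLimitMonitor
{
    private readonly ApiManagementClient _client;

    public RateLimitMonitor(ApiManagementClient client)
    {
        _client = client;
    }

    public async Task<RateLimitReport> GetRateLimitStatsAsync(string apiMgmtName, string resourceGroup)
    {
        // NOTE: treat this call as a sketch. The Reports operations in
        // Microsoft.Azure.Management.ApiManagement take an OData filter (for
        // example a timestamp range) rather than start/end arguments, and the
        // record type and count property names vary between SDK versions.
        // Adjust the call and the properties below to the version you use.
        var reports = await _client.Reports.ListByApiAsync(
            resourceGroup,
            apiMgmtName,
            startDate: DateTime.UtcNow.AddDays(-7),
            endDate: DateTime.UtcNow,
            interval: "PT1H");

        return new RateLimitReport
        {
            TotalRequests = reports.Sum(r => r.CallCount),
            BlockedRequests = reports.Sum(r => r.BlockedCallCount),
            ThrottledRequests = reports.Sum(r => r.ThrottledCallCount),
            TopClients = TopBy(reports, r => r.SubscriptionId),
            TopAPIs = TopBy(reports, r => r.ApiId)
        };
    }

    // Ten largest groups by total call count; records without a key are
    // grouped under "anonymous".
    private static Dictionary<string, long> TopBy(IEnumerable<ReportContract> reports, Func<ReportContract, string> key)
    {
        return reports
            .GroupBy(r => key(r) ?? "anonymous")
            .OrderByDescending(g => g.Sum(r => r.CallCount))
            .Take(10)
            .ToDictionary(g => g.Key, g => g.Sum(r => r.CallCount));
    }
}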

Testing Rate Limits

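#!/bin/bash
# test-rate-limit.sh

API_URL="https://your-apim.azure-api.net/your-api/endpoint"
API_KEY="your-subscription-key"

# Send more requests than the tier allows (Free tier: 100/hour)
MAX_REQUESTS=110

# The abuse-prevention policy rejects user agents containing "curl" with 403,
# so send a different User-Agent for this test.
USER_AGENT="rate-limit-test/1.0"

echo "Testing rate limit..."
for i in $(seq 1 "$MAX_REQUESTS"); do
    RESPONSE=$(curl -s -A "$USER_AGENT" -w "\n%{http_code}" -H "Ocp-Apim-Subscription-Key: $API_KEY" "$API_URL")
    HTTP_CODE=$(echo "$RESPONSE" | tail -n 1)
    # Drop the status line; sed '$d' is portable, unlike head -n -1
    BODY=$(echo "$RESPONSE" | sed '$d')
    echo "Request $i: HTTP $HTTP_CODE"
    if [ "$HTTP_CODE" = "429" ]; then
        echo "Rate limited! Response: $BODY"
        break
    fi
done

# X-Forwarded-For is only a request header. context.Request.IpAddress is the
# address of the connection to the gateway, so these requests still count as
# one client unless a policy reads the header explicitly.
echo "Testing with X-Forwarded-For set..."
for i in {1..5}; do
    curl -s -A "$USER_AGENT" \
         -H "Ocp-Apim-Subscription-Key: $API_KEY" \
         -H "X-Forwarded-For: 192.168.1.$i" \
         "$API_URL" | jq -r '.error // .message'
done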

Best Practices

Separate limits per API: Different APIs have different sensitivity
Use burst allowance: Allow short spikes without throttling
Clear error messages: Tell clients when they can retry
Monitor and alert: Track rate limit violations
Gradual rollout: Test with canary before full deployment
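
Telling clients when they can retry only helps if clients act on it. As a sketch of the client side, the Python function below reads a Retry-After header, which may carry either a number of seconds or an HTTP date. The function name and the fallback delay are assumptions, and the header mapping is assumed to use the exact name `Retry-After` (a plain dict lookup is case-sensitive).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_delay_seconds(headers: Mapping[str, str],
                        now: Optional[datetime] = None,
                        default: float = 1.0) -> float:
    """Return how long to wait before retrying, based on the Retry-After header."""
    value = headers.get("Retry-After")
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)  # delay given directly in seconds
    try:
        retry_at = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable value: fall back to the default delay
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (retry_at - now).total_seconds())
```

A client would sleep for the returned number of seconds after a 429 before sending the request again, ideally with a cap on the number of retries.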

Summary

APIM provides built-in limiting: rate-limit-by-key for short windows (up to 300 seconds) and quota-by-key for longer ones such as hourly allowances
Combine subscription, client IP, and API key for flexible limiting
Implement circuit breakers to protect the backend
Monitor usage and adjust limits based on traffic patterns