Azure API Management — Backend Load Balancing & Circuit Breaker

Backend Pools, Weighted Routing, Health Checks, and Circuit Breaker Patterns


Introduction

When building resilient API architectures, distributing load across multiple backend services is essential for high availability and scalability. Azure API Management (APIM) supports this through backend pools, which can balance load by round-robin, weight, or priority, and through a circuit breaker that stops traffic to a failing backend.

This guide covers:

  • Backend Pools — Grouping multiple backends for load distribution
  • Weighted Routing — Traffic distribution based on weight values
  • Health Checks — Automatic detection of unhealthy backends
  • Circuit Breaker — Preventing cascading failures
  • Custom Policies — Advanced routing and failover logic

Backend Pool Configuration

Why Use Backend Pools?

In enterprise scenarios, you often need to:

  • Distribute traffic across multiple API instances for horizontal scaling
  • Implement blue-green deployments with gradual traffic shifting
  • Route to different backend versions (canary releases)
  • Provide fallback when primary backends fail
  • Implement geographic routing for lower latency

Basic Backend Pool

{
  "backends": [
    {
      "url": "https://api-v1.example.com",
      "weight": 10
    },
    {
      "url": "https://api-v2.example.com",
      "weight": 5
    },
    {
      "url": "https://api-v3.example.com",
      "weight": 5
    }
  ]
}

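This configuration distributes traffic in a 10:5:5 ratio: each backend's share is its weight divided by the total weight of 20, so v1 receives 50% and v2 and v3 receive 25% each. The JSON is a simplified view of the pool; in an actual APIM backend resource, pool members are references to existing backend entities, each with a weight and an optional priority.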
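The share arithmetic can be checked with a few lines of Python. This is a minimal sketch of the weight-to-percentage calculation only; the function name traffic_shares is hypothetical and is not part of any APIM SDK.

```python
from fractions import Fraction

def traffic_shares(weights):
    """Return each backend's share of traffic as a fraction of the total weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {url: Fraction(w, total) for url, w in weights.items()}

# Weights copied from the basic pool above
pool = {
    "https://api-v1.example.com": 10,
    "https://api-v2.example.com": 5,
    "https://api-v3.example.com": 5,
}

shares = traffic_shares(pool)
for url, share in shares.items():
    print(f"{url}: {float(share):.0%}")
```

Using exact fractions avoids rounding surprises when the weights do not divide evenly.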

Backend Pool with Capacity

{
  "backends": [
    {
      "url": "https://primary-api.example.com",
      "weight": 80,
      "capacity": 1000,
      "description": "Primary API cluster - handles most traffic"
    },
    {
      "url": "https://secondary-api.example.com",
      "weight": 20,
      "capacity": 500,
      "description": "Secondary cluster - overflow traffic"
    }
  ]
}

Here the 80:20 weights send about four out of five requests to the primary cluster. The capacity and description fields annotate the intended design in this example; the weights are what determine the traffic split.

Weighted Routing Policy

Understanding Weighted Routing

Weighted routing allows you to control what percentage of traffic goes to each backend. This is crucial for:

  1. Canary Deployments — Gradually shift traffic (e.g., 5% → 20% → 100%)
  2. A/B Testing — Route percentage of users to different versions
  3. Load Distribution — Distribute based on backend capacity
  4. Disaster Recovery — Route to backup when primary fails
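
A canary rollout comes down to publishing a new pair of weights at each stage. The sketch below assumes two backends, labeled stable and canary for illustration, and uses the 5% → 20% → 100% stages from the list above; canary_weights is a hypothetical helper.

```python
def canary_weights(canary_percent):
    """Split 100 weight units between a stable and a canary backend."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return {"stable": 100 - canary_percent, "canary": canary_percent}

# Stages from the canary example: 5% -> 20% -> 100%
stages = [5, 20, 100]
plan = [canary_weights(p) for p in stages]
for percent, weights in zip(stages, plan):
    print(f"canary {percent:>3}% -> weights {weights}")
```

Keeping the total at 100 lets each weight be read directly as a percentage.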

Round-Robin with Weights

<inbound>
    <!-- APIM has no <round-robin> policy element. Weights are set on the
         members of a pool backend, and the policy only selects that pool.
         "weighted-pool" is a placeholder ID for a pool whose members
         backend1, backend2, and backend3 have weights 10, 5, and 5. -->
    <set-backend-service backend-id="weighted-pool" />
</inbound>

With these pool weights:

  • Backend1 receives 10/20 = 50% of traffic
  • Backend2 receives 5/20 = 25% of traffic
  • Backend3 receives 5/20 = 25% of traffic
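
To show how a weighted round-robin can produce exactly this split, here is a sketch of the "smooth" weighted round-robin technique. It illustrates the general idea and is not a description of APIM's internal selection algorithm; the backend names follow the list above.

```python
def smooth_weighted_round_robin(weights, n):
    """Return n backend picks whose frequencies match the weights."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every backend earns credit equal to its weight on each round
        for name, weight in weights.items():
            current[name] += weight
        # The backend with the most credit is chosen and pays back the total
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks

picks = smooth_weighted_round_robin({"backend1": 10, "backend2": 5, "backend3": 5}, 20)
counts = {name: picks.count(name) for name in ("backend1", "backend2", "backend3")}
print(picks[:4])  # ['backend1', 'backend2', 'backend3', 'backend1']
print(counts)     # {'backend1': 10, 'backend2': 5, 'backend3': 5}
```

Compared with sending ten requests in a row to backend1, this interleaving spreads each backend's share evenly over time.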

Dynamic Weight Assignment

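<inbound>
    <!-- Pool weights are fixed on the backend resource, so per-tier behavior is
         achieved by selecting a different pool. The pool IDs are placeholders;
         each pool would weight the premium and standard backends differently
         (for example 100:0 for enterprise, 80:20 for premium, 20:80 otherwise). -->
    <set-variable name="clientTier" value="@(context.Request.Headers.GetValueOrDefault("X-Client-Tier", "standard"))" />
    <choose>
        <when condition="@(context.Variables.GetValueOrDefault<string>("clientTier", "standard") == "enterprise")">
            <set-backend-service backend-id="enterprise-pool" />
        </when>
        <when condition="@(context.Variables.GetValueOrDefault<string>("clientTier", "standard") == "premium")">
            <set-backend-service backend-id="premium-pool" />
        </when>
        <otherwise>
            <set-backend-service backend-id="standard-pool" />
        </otherwise>
    </choose>
</inbound>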

Geographic Routing Example

<inbound>
    <!-- Determine user's region (X-Azure-Region is assumed to be set by an upstream component) -->
    <set-variable name="userRegion" value="@(context.Request.Headers.GetValueOrDefault("X-Azure-Region", "eastus"))" />

    <!-- Route to closest backend. The choice is made in the inbound section so
         that the default backend section still forwards the request. -->
    <choose>
        <when condition="@(context.Variables.GetValueOrDefault<string>("userRegion", "eastus") == "westus2")">
            <set-backend-service base-url="https://api-westus2.example.com" />
        </when>
        <when condition="@(context.Variables.GetValueOrDefault<string>("userRegion", "eastus") == "eastus")">
            <set-backend-service base-url="https://api-eastus.example.com" />
        </when>
        <otherwise>
            <set-backend-service base-url="https://api-default.example.com" />
        </otherwise>
    </choose>
</inbound>

Circuit Breaker Pattern

Why Circuit Breakers Matter

Without circuit breakers, when a backend fails:

  1. Requests continue hitting the failed backend
  2. Timeout delays cause slow responses
  3. Failed requests queue up, consuming resources
  4. Cascading failures affect other services

Circuit breakers prevent this by:

  • Detecting failures — Track failed requests
  • Opening the circuit — Stop sending requests to failed backend
  • Allowing recovery — Periodically test if backend recovered
  • Falling back — Route to alternative backend
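
The "detecting failures" step is usually a count of recent failures rather than a lifetime total. The sketch below assumes a 60-second window, the figure this guide uses for its circuit breaker thresholds; the FailureWindow class is hypothetical, and timestamps are passed in explicitly so the behavior is easy to test.

```python
from collections import deque

class FailureWindow:
    """Count failures seen during the last `window_seconds` seconds."""

    def __init__(self, window_seconds=60.0):
        self.window_seconds = window_seconds
        self._failures = deque()  # failure timestamps, oldest first

    def record_failure(self, now):
        self._failures.append(now)

    def count(self, now):
        # Drop timestamps that have fallen out of the window
        while self._failures and now - self._failures[0] >= self.window_seconds:
            self._failures.popleft()
        return len(self._failures)

window = FailureWindow(window_seconds=60.0)
for t in (0.0, 10.0, 20.0, 30.0, 40.0):
    window.record_failure(t)

print(window.count(now=45.0))  # 5: every failure is still inside the window
print(window.count(now=75.0))  # 3: the failures at t=0 and t=10 have expired
```

A breaker would open when count() reaches its threshold, so old failures stop counting against a backend that has since recovered.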

Retry Policy with Circuit Breaker

<backend>
    <!-- Retry up to 5 times, waiting 5 seconds between attempts, while the
         backend returns a 5xx status. An open circuit is handled in the inbound
         section by returning 503 before the request reaches this point. -->
    <retry condition="@(context.Response.StatusCode >= 500)" count="5" interval="5">
        <!-- Fallback logic (for example, switching backends on a repeat attempt) goes here -->
        <forward-request buffer-request-body="true" />
    </retry>
</backend>
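
The control flow of retry-with-fallback, independent of policy syntax, can be sketched as follows. The send callable, the fallback URL, and the injected sleep function are all assumptions made for illustration; the fake transport stands in for real HTTP calls.

```python
import time

def call_with_retry(send, primary, fallback, max_attempts=5,
                    interval_seconds=5.0, sleep=time.sleep):
    """Call `primary`; after a 5xx response, wait and retry against `fallback`."""
    status = None
    for attempt in range(1, max_attempts + 1):
        target = primary if attempt == 1 else fallback
        status = send(target)
        if status < 500:
            return target, status, attempt
        if attempt < max_attempts:
            sleep(interval_seconds)
    return None, status, max_attempts

# Fake transport: the primary always fails, the fallback succeeds
responses = {
    "https://primary.example.com": 503,
    "https://fallback.example.com": 200,
}
result = call_with_retry(
    responses.get,
    "https://primary.example.com",
    "https://fallback.example.com",
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(result)  # ('https://fallback.example.com', 200, 2)
```

Passing sleep as a parameter keeps the example fast to run and makes the wait interval easy to verify in tests.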

Advanced Circuit Breaker Implementation

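<inbound>
    <!-- Context variables exist only for the current request, so the failure
         count and last failure time are loaded from the APIM cache.
         The cache keys are placeholder names. -->
    <cache-lookup-value key="cb-failure-count" default-value="@(0)" variable-name="failureCount" />
    <cache-lookup-value key="cb-last-failure" default-value="@(string.Empty)" variable-name="lastFailureTime" />

    <!-- Check if circuit is open -->
    <set-variable name="circuitOpen" value="@{
        var failureCount = context.Variables.GetValueOrDefault<int>("failureCount", 0);
        var lastFailure = context.Variables.GetValueOrDefault<string>("lastFailureTime", "");

        // Circuit opens after 5 failures; the cached count expires after
        // 60 seconds, which approximates the 60-second failure window
        if (failureCount >= 5 && !string.IsNullOrEmpty(lastFailure))
        {
            var timeSinceLastFailure = DateTime.UtcNow - DateTime.Parse(lastFailure).ToUniversalTime();
            // Keep circuit open for 30 seconds
            return timeSinceLastFailure.TotalSeconds < 30;
        }

        return false;
    }" />

    <!-- If circuit is open, return 503 -->
    <choose>
        <when condition="@(context.Variables.GetValueOrDefault<bool>("circuitOpen", false))">
            <return-response>
                <set-status code="503" reason="Service Unavailable" />
                <set-header name="Retry-After" exists-action="override">
                    <value>30</value>
                </set-header>
                <set-body>Service temporarily unavailable. Please retry later.</set-body>
            </return-response>
        </when>
    </choose>
</inbound>

<outbound>
    <!-- Track failures once the backend response is available.
         The read-then-write update is not atomic, so counts are approximate under load. -->
    <choose>
        <when condition="@(context.Response.StatusCode >= 500)">
            <cache-store-value key="cb-failure-count" value="@(context.Variables.GetValueOrDefault<int>("failureCount", 0) + 1)" duration="60" />
            <cache-store-value key="cb-last-failure" value="@(DateTime.UtcNow.ToString("o"))" duration="60" />
        </when>
        <otherwise>
            <!-- Reset failure count on success -->
            <cache-remove-value key="cb-failure-count" />
        </otherwise>
    </choose>
</outbound>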

Circuit Breaker State Machine

┌─────────────────────────────────────────────────────────────────┐
│                    CIRCUIT BREAKER STATES                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│    CLOSED (Normal) ──▶ OPEN (Failure) ──▶ HALF-OPEN (Testing)   │
│         │                   │                    │              │
│         ▼                   ▼                    ▼              │
│   ┌──────────┐       ┌──────────┐        ┌──────────┐           │
│   │Requests  │       │Requests  │        │Requests  │           │
│   │pass      │       │blocked   │        │test      │           │
│   │through   │       │return 503│        │backend   │           │
│   └──────────┘       └──────────┘        └──────────┘           │
│                                                                 │
│   Transition:              Transition:           Transition:    │
│   5 failures               30 seconds             Success=      │
│   in 60s                   timeout                 closed       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
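
The transitions in the diagram can be written as a small state machine. This is a simplified sketch under stated assumptions: the CircuitBreaker class is hypothetical, the 60-second failure window is omitted for brevity (failures are reset only by a success), and every request is let through while half-open.

```python
class CircuitBreaker:
    CLOSED, OPEN, HALF_OPEN = "closed", "open", "half-open"

    def __init__(self, failure_threshold=5, open_seconds=30.0):
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self.state = self.CLOSED
        self.failures = 0
        self.opened_at = None

    def allow_request(self, now):
        """Return True if a request may be sent to the backend at time `now`."""
        if self.state == self.OPEN and now - self.opened_at >= self.open_seconds:
            self.state = self.HALF_OPEN  # timeout elapsed: let a test request through
        return self.state != self.OPEN

    def record_success(self):
        # A success in any state closes the circuit
        self.state = self.CLOSED
        self.failures = 0

    def record_failure(self, now):
        self.failures += 1
        # A failed test request, or too many failures, (re)opens the circuit
        if self.state == self.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = self.OPEN
            self.opened_at = now

breaker = CircuitBreaker()
for t in range(5):
    breaker.record_failure(now=float(t))
print(breaker.state)                    # open
print(breaker.allow_request(now=10.0))  # False: still inside the 30-second open period
print(breaker.allow_request(now=34.0))  # True: the breaker has moved to half-open
breaker.record_success()
print(breaker.state)                    # closed
```

Passing the clock value in as an argument, instead of reading the system time, makes each transition reproducible.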

Health Check Configuration

Built-in Health Checks

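APIM can take a failing backend out of rotation. The settings below illustrate the parameters of a probe-based health check; treat the JSON as an illustrative shape rather than a literal APIM resource definition, because APIM's built-in mechanism for this is the backend circuit breaker, which reacts to failed live requests.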

{
  "healthCheck": {
    "isEnabled": true,
    "interval": "00:00:30",
    "timeout": "00:00:10",
    "successThreshold": 1,
    "failureThreshold": 3,
    "healthStatus": "healthy",
    "unhealthyStatus": "unhealthy"
  }
}

Configuration explained:

  • interval — How often to check (every 30 seconds)
  • timeout — How long to wait for response (10 seconds)
  • successThreshold — Successful checks before marking healthy (1)
  • failureThreshold — Failed checks before marking unhealthy (3)
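
The two thresholds act on consecutive probe results, which keeps a single blip from flipping a backend's status. The sketch below assumes the values from the configuration above (successThreshold 1, failureThreshold 3); the HealthTracker class is hypothetical.

```python
class HealthTracker:
    """Flip a backend's status only after enough consecutive probe results."""

    def __init__(self, success_threshold=1, failure_threshold=3):
        self.success_threshold = success_threshold
        self.failure_threshold = failure_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that disagree with the current status

    def record_probe(self, ok):
        if ok == self.healthy:
            self._streak = 0  # result agrees with the current status
            return self.healthy
        self._streak += 1
        needed = self.success_threshold if ok else self.failure_threshold
        if self._streak >= needed:
            self.healthy = ok
            self._streak = 0
        return self.healthy

tracker = HealthTracker(success_threshold=1, failure_threshold=3)
probes = (False, False, True, False, False, False, True)
results = [tracker.record_probe(ok) for ok in probes]
print(results)  # [True, True, True, True, True, False, True]
```

Two failures followed by a success leave the backend healthy; only three failures in a row mark it unhealthy, and one success brings it back.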

Health Check Endpoint Requirements

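// Backend health check endpoint example (ASP.NET Core).
// GetCpuUsage, GetMemoryUsage, and GetAvgResponseTime are assumed helper
// methods defined elsewhere in the service.
using System;
using Microsoft.AspNetCore.Mvc;
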
[ApiController]
[Route("health")]
public class HealthController : ControllerBase
{
    [HttpGet]
    public IActionResult GetHealth()
    {
        var health = new
        {
            Status = "healthy",
            Timestamp = DateTime.UtcNow,
            Dependencies = new
            {
                Database = "healthy",
                Cache = "healthy",
                ExternalApi = "healthy"
            },
            Metrics = new
            {
                CpuUsage = GetCpuUsage(),
                MemoryUsage = GetMemoryUsage(),
                ResponseTime = GetAvgResponseTime()
            }
        };
        
        return Ok(health);
    }
}

Custom Health Check Policy

<inbound>
    <!-- Execute health check before routing. Probing on every request adds
         latency, so in practice the result would usually be cached. -->
    <send-request mode="new" timeout="5" response-variable-name="health-check" ignore-error="true">
        <set-url>https://backend.example.com/health</set-url>
        <set-method>GET</set-method>
    </send-request>

    <!-- Evaluate health check response -->
    <set-variable name="backendHealthy" value="@{
        // With ignore-error enabled, a failed probe leaves the variable null
        var response = (IResponse)context.Variables["health-check"];
        if (response == null) return false;

        if (response.StatusCode != 200) return false;

        // Parse health response body
        var health = response.Body.As<JObject>();

        return (string)health["status"] == "healthy";
    }" />

    <!-- Reject the request if the backend is unhealthy -->
    <choose>
        <when condition="@(!context.Variables.GetValueOrDefault<bool>("backendHealthy", false))">
            <return-response>
                <set-status code="503" reason="Service Unavailable" />
                <set-body>Service temporarily unavailable</set-body>
            </return-response>
        </when>
    </choose>
</inbound>

Health Check with Dependency Validation

<inbound>
    <send-request mode="new" timeout="10" response-variable-name="db-health" ignore-error="true">
        <set-url>https://backend.example.com/health/database</set-url>
        <set-method>GET</set-method>
    </send-request>

    <send-request mode="new" timeout="10" response-variable-name="cache-health" ignore-error="true">
        <set-url>https://backend.example.com/health/cache</set-url>
        <set-method>GET</set-method>
    </send-request>

    <set-variable name="allHealthy" value="@{
        // With ignore-error enabled, a failed probe leaves the variable null
        var dbResponse = (IResponse)context.Variables["db-health"];
        var cacheResponse = (IResponse)context.Variables["cache-health"];

        return dbResponse?.StatusCode == 200 && cacheResponse?.StatusCode == 200;
    }" />
</inbound>
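
On the backend side, the same rule (every dependency must answer 200) can decide what the aggregate health endpoint returns. The sketch below is hypothetical: aggregate_health and the dependency names are illustrative, and None stands for a probe that timed out or could not connect.

```python
def aggregate_health(dependency_status_codes):
    """Return (response_body, http_status) from per-dependency probe results."""
    failing = sorted(
        name for name, code in dependency_status_codes.items() if code != 200
    )
    if failing:
        # Any failing dependency makes the whole service report unhealthy
        return {"status": "unhealthy", "failing": failing}, 503
    return {"status": "healthy", "failing": []}, 200

body, status = aggregate_health({"database": 200, "cache": None})
print(status, body)  # 503 {'status': 'unhealthy', 'failing': ['cache']}
```

Returning a non-200 status, rather than only a body field, lets a gateway judge health without parsing JSON.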

Circuit Breaker State Management

Tracking Circuit State in Variables

<inbound>
    <!-- Check if circuit is open. Context variables do not survive the request,
         so the flag is read from the APIM cache ("circuit-open" is a placeholder key). -->
    <cache-lookup-value key="circuit-open" default-value="@(false)" variable-name="circuitOpen" />
</inbound>

<outbound>
    <choose>
        <when condition="@(context.Response.StatusCode >= 500)">
            <!-- Circuit open for 30 seconds after failure: the cache entry expires on its own -->
            <cache-store-value key="circuit-open" value="@(true)" duration="30" />
        </when>
    </choose>
</outbound>

State Persistence Across Requests

For production environments, consider storing circuit breaker state in Azure Cache for Redis:

// External processing (Azure Function)
using System;
using System.Globalization;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class CircuitBreakerState
{
    // "RedisConnection" is a placeholder app setting holding the connection string
    private static readonly IConnectionMultiplexer _redis =
        ConnectionMultiplexer.Connect(Environment.GetEnvironmentVariable("RedisConnection"));

    public static async Task<bool> IsCircuitOpenAsync(string backendId)
    {
        var db = _redis.GetDatabase();
        var lastFailure = await db.StringGetAsync($"circuit:{backendId}:lastFailure");

        if (lastFailure.IsNullOrEmpty) return false;

        // Timestamps use the round-trip ("o") format so parsing is culture-independent
        var failureTime = DateTime.Parse(lastFailure.ToString(), CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind);
        return DateTime.UtcNow - failureTime < TimeSpan.FromSeconds(30);
    }

    public static async Task RecordFailureAsync(string backendId)
    {
        var db = _redis.GetDatabase();
        // Store the value and its 5-minute expiry in a single call
        await db.StringSetAsync(
            $"circuit:{backendId}:lastFailure",
            DateTime.UtcNow.ToString("o", CultureInfo.InvariantCulture),
            TimeSpan.FromMinutes(5));
    }

    public static async Task ResetCircuitAsync(string backendId)
    {
        var db = _redis.GetDatabase();
        await db.KeyDeleteAsync($"circuit:{backendId}:lastFailure");
    }
}
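
The persistence logic can be tried without a Redis instance by swapping in an in-memory store with per-key expiry. The sketch below mirrors the key naming and the 30-second and 5-minute durations used above; TtlStore is a hypothetical stand-in for illustration, not a Redis client.

```python
from datetime import datetime, timedelta, timezone

class TtlStore:
    """In-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ttl, now):
        self._data[key] = (value, now + ttl)

    def get(self, key, now):
        value, expires_at = self._data.get(key, (None, None))
        if value is None or now >= expires_at:
            self._data.pop(key, None)  # expired or missing
            return None
        return value

    def delete(self, key):
        self._data.pop(key, None)

def record_failure(store, backend_id, now):
    # ISO 8601 timestamps parse the same way regardless of locale
    store.set(f"circuit:{backend_id}:lastFailure", now.isoformat(),
              timedelta(minutes=5), now)

def is_circuit_open(store, backend_id, now):
    raw = store.get(f"circuit:{backend_id}:lastFailure", now)
    if raw is None:
        return False
    return now - datetime.fromisoformat(raw) < timedelta(seconds=30)

store = TtlStore()
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
record_failure(store, "api-v1", t0)
print(is_circuit_open(store, "api-v1", t0 + timedelta(seconds=10)))  # True
print(is_circuit_open(store, "api-v1", t0 + timedelta(seconds=45)))  # False
```

Because the state lives outside any single request or gateway instance, every caller sees the same circuit status.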

Dashboard for Monitoring

[ApiController]
[Route("admin/circuit-breaker")]
public class CircuitBreakerController : ControllerBase
{
    [HttpGet("status")]
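    public IActionResult GetStatus()
    {
        // Hard-coded sample data for illustration; a real dashboard would read
        // each backend's state from the shared store
        return Ok(new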
        {
            Backends = new[]
            {
                new {
                    Name = "api-v1",
                    CircuitState = "Closed",
                    FailureCount = 0,
                    LastFailure = (DateTime?)null
                },
                new {
                    Name = "api-v2",
                    CircuitState = "Open",
                    FailureCount = 5,
                    LastFailure = DateTime.UtcNow.AddSeconds(-15)
                }
            }
        });
    }
    
    [HttpPost("reset/{backendName}")]
    public async Task<IActionResult> ResetCircuit(string backendName)
    {
        await CircuitBreakerState.ResetCircuitAsync(backendName);
        return Ok(new { Message = $"Circuit reset for {backendName}" });
    }
}

Complete Load Balancing Policy Example

Here's a comprehensive policy that combines all concepts:

<policies>
    <inbound>
        <!-- Step 1: Check circuit breaker state. State is kept in the APIM cache
             because context variables last for one request only; the cache keys
             are placeholder names. -->
        <cache-lookup-value key="circuit-open" default-value="@(false)" variable-name="circuitOpen" />
        <cache-lookup-value key="circuit-failure-count" default-value="@(0)" variable-name="failureCount" />

        <choose>
            <when condition="@(context.Variables.GetValueOrDefault<bool>("circuitOpen", false))">
                <return-response>
                    <set-status code="503" reason="Service Unavailable" />
                    <set-body>Backend temporarily unavailable - circuit breaker active</set-body>
                </return-response>
            </when>
        </choose>

        <!-- Step 2: Route to the weighted pool. Weights are defined on the pool's
             members and cannot be supplied per request; "api-pool" is a
             placeholder ID for a pool of the primary, secondary, and tertiary backends. -->
        <set-backend-service backend-id="api-pool" />
    </inbound>

    <backend>
        <!-- Step 3: Forward to the backend selected by the pool -->
        <forward-request />
    </backend>

    <outbound>
        <!-- Step 4: Track response for circuit breaker -->
        <choose>
            <when condition="@(context.Response.StatusCode >= 500)">
                <set-variable name="failureCount" value="@(context.Variables.GetValueOrDefault<int>("failureCount", 0) + 1)" />
                <cache-store-value key="circuit-failure-count" value="@(context.Variables.GetValueOrDefault<int>("failureCount", 0))" duration="60" />

                <!-- Open circuit for 30 seconds after 5 failures -->
                <choose>
                    <when condition="@(context.Variables.GetValueOrDefault<int>("failureCount", 0) >= 5)">
                        <cache-store-value key="circuit-open" value="@(true)" duration="30" />
                    </when>
                </choose>
            </when>
            <otherwise>
                <cache-remove-value key="circuit-failure-count" />
            </otherwise>
        </choose>

        <!-- Add health info to response headers -->
        <set-header name="X-Backend-Health" exists-action="override">
            <value>@(context.Response.StatusCode >= 500 ? "degraded" : "healthy")</value>
        </set-header>
    </outbound>
</policies>

Best Practices

Practice                          Description
Start with low weights            Begin with 5-10% traffic to new backends
Monitor closely                   Track error rates and latency during rollout
Set appropriate thresholds        Configure failure thresholds based on your SLA
Implement graceful degradation    Have fallback responses when backends fail
Use health checks                 Automatically detect and remove unhealthy backends
Test circuit breakers             Regularly test failure scenarios
Log everything                    Capture metrics for post-incident analysis

Common Pitfalls to Avoid

  1. Setting weights too aggressively — Don't route 100% to new backend immediately
  2. Ignoring health checks — Always use automatic health detection
  3. No timeout configuration — Always set reasonable timeouts
  4. Forgetting to reset — Ensure circuits can close after recovery
  5. No monitoring — Track circuit state changes in your dashboard

Azure Integration Hub - Advanced Level