Azure API Management — Backend Load Balancing & Circuit Breaker

Backend Pools, Weighted Routing, Health Checks, and Circuit Breaker Patterns

Introduction

When building resilient API architectures, distributing load across multiple backend services is essential for high availability and scalability. Azure API Management (APIM) provides sophisticated backend load balancing capabilities that go beyond simple round-robin routing.

This guide covers:

Backend Pools — Grouping multiple backends for load distribution
Weighted Routing — Traffic distribution based on weight values
Health Checks — Automatic detection of unhealthy backends
Circuit Breaker — Preventing cascading failures
Custom Policies — Advanced routing and failover logic

Backend Pool Configuration

Why Use Backend Pools?

In enterprise scenarios, you often need to:

Distribute traffic across multiple API instances for horizontal scaling
Implement blue-green deployments with gradual traffic shifting
Route to different backend versions (canary releases)
Provide fallback when primary backends fail
Implement geographic routing for lower latency

Basic Backend Pool

{
  "backends": [
    {
      "url": "https://api-v1.example.com",
      "weight": 10
    },
    {
      "url": "https://api-v2.example.com",
      "weight": 5
    },
    {
      "url": "https://api-v3.example.com",
      "weight": 5
    }
  ]
}

This configuration distributes traffic with a 10:5:5 ratio (50% to v1, 25% each to v2 and v3).
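Each percentage is the backend's weight divided by the sum of all weights. A minimal Python sketch of that arithmetic, using the URLs and weights from the pool above (the function name `traffic_shares` is made up for illustration and is not part of APIM):

```python
def traffic_shares(backends):
    """Return each backend's expected share of traffic as a fraction of the total weight."""
    total = sum(b["weight"] for b in backends)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {b["url"]: b["weight"] / total for b in backends}


pool = [
    {"url": "https://api-v1.example.com", "weight": 10},
    {"url": "https://api-v2.example.com", "weight": 5},
    {"url": "https://api-v3.example.com", "weight": 5},
]

shares = traffic_shares(pool)
for url, share in shares.items():
    print(f"{url}: {share:.0%}")
```

Only the ratio matters: weights of 2:1:1 would produce the same split as 10:5:5.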

Backend Pool with Capacity

{
  "backends": [
    {
      "url": "https://primary-api.example.com",
      "weight": 80,
      "capacity": 1000,
      "description": "Primary API cluster - handles most traffic"
    },
    {
      "url": "https://secondary-api.example.com",
      "weight": 20,
      "capacity": 500,
      "description": "Secondary cluster - overflow traffic"
    }
  ]
}

Weighted Routing Policy

Understanding Weighted Routing

Weighted routing allows you to control what percentage of traffic goes to each backend. This is crucial for:

Canary Deployments — Gradually shift traffic (e.g., 5% → 20% → 100%)
A/B Testing — Route percentage of users to different versions
Load Distribution — Distribute based on backend capacity
Disaster Recovery — Route to backup when primary fails
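A canary rollout can be planned as a sequence of weight pairs, one per stage. The sketch below assumes two backends (a stable one and a canary) and reuses the 5% → 20% → 100% stages from the list above; `canary_weights` is a hypothetical helper, not an APIM feature.

```python
def canary_weights(canary_percent):
    """Return (stable_weight, canary_weight) for a given canary percentage."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return 100 - canary_percent, canary_percent


stages = [5, 20, 100]
plan = [canary_weights(p) for p in stages]
for percent, (stable, canary) in zip(stages, plan):
    print(f"canary {percent:>3}% -> stable weight {stable}, canary weight {canary}")
```

At the final stage the stable weight reaches 0; in practice you would probably remove that backend from the pool rather than keep a zero-weight entry.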

Round-Robin with Weights

<inbound>
    <!-- Route to the pool; the pool's own weights decide which member receives each request -->
    <set-backend-service backend-id="my-backend-pool" />
</inbound>

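The policy only references the pool; the split comes from the weights in the pool definition. With the weights from the Basic Backend Pool example (10, 5, and 5):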

Backend1 receives 10/20 = 50% of traffic
Backend2 receives 5/20 = 25% of traffic
Backend3 receives 5/20 = 25% of traffic
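One common way to turn weights into an evenly interleaved request sequence is smooth weighted round-robin. The sketch below illustrates that technique in Python; it is not a description of how APIM selects pool members internally, and the backend names are simply the labels used above.

```python
from collections import Counter


def smooth_weighted_round_robin(weights, picks):
    """Yield backend names in proportion to their weights, spread evenly over time."""
    total = sum(weights.values())
    current = {name: 0 for name in weights}
    for _ in range(picks):
        # Every backend gains its weight; the current leader is chosen and pays back the total.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        yield chosen


weights = {"Backend1": 10, "Backend2": 5, "Backend3": 5}
sequence = list(smooth_weighted_round_robin(weights, 20))
counts = Counter(sequence)
print(sequence[:4])
print(dict(counts))
```

Over 20 picks the counts match the 10:5:5 weights, and the heavier backend is interleaved with the others instead of receiving ten requests in a row.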

Dynamic Weight Assignment

<inbound>
    <!-- Select the backend URL based on client tier (premium and enterprise share one backend) -->
    <set-variable name="tierBackend" value="@{
        var tier = context.Request.Headers.GetValueOrDefault("X-Client-Tier", "standard");
        if (tier == "premium" || tier == "enterprise") return "https://premium-api.example.com";
        return "https://standard-api.example.com";
    }" />

    <!-- Apply the selection in inbound so the request is still forwarded afterwards -->
    <set-backend-service base-url="@((string)context.Variables["tierBackend"])" />
</inbound>

<backend>
    <forward-request />
</backend>

Geographic Routing Example

<inbound>
    <!-- Determine user's region; X-Azure-Region is assumed to be set by the caller or an upstream proxy -->
    <set-variable name="userRegion" value="@(context.Request.Headers.GetValueOrDefault("X-Azure-Region", "eastus"))" />

    <choose>
        <!-- Route to closest backend -->
        <when condition="@(context.Variables.GetValueOrDefault<string>("userRegion", "") == "westus2")">
            <set-backend-service base-url="https://api-westus2.example.com" />
        </when>
        <when condition="@(context.Variables.GetValueOrDefault<string>("userRegion", "") == "eastus")">
            <set-backend-service base-url="https://api-eastus.example.com" />
        </when>
        <otherwise>
            <set-backend-service base-url="https://api-default.example.com" />
        </otherwise>
    </choose>
</inbound>

<backend>
    <forward-request />
</backend>

Circuit Breaker Pattern

Why Circuit Breakers Matter

Without circuit breakers, when a backend fails:

Requests continue hitting the failed backend
Timeout delays cause slow responses
Failed requests queue up, consuming resources
Cascading failures affect other services

Circuit breakers prevent this by:

Detecting failures — Track failed requests
Opening the circuit — Stop sending requests to failed backend
Allowing recovery — Periodically test if backend recovered
Falling back — Route to alternative backend

Retry Policy with Circuit Breaker

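<backend>
    <retry condition="@(context.Response.StatusCode >= 500)" count="3" interval="5" first-fast-retry="true">
        <choose>
            <!-- Switch to the fallback only after a failed attempt, so the primary is tried first -->
            <when condition="@(context.Response != null && context.Response.StatusCode >= 500)">
                <set-backend-service base-url="https://fallback.example.com" />
            </when>
        </choose>
        <forward-request buffer-request-body="true" />
    </retry>
</backend>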
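The control flow of a retry with fallback can be sketched outside APIM. The Python below is an illustration only: `send` is a caller-supplied stand-in for forwarding the request, and it assumes `count` means retries after the initial attempt and that `first_fast_retry` skips the wait before the first retry.

```python
import time


def call_with_fallback(send, primary, fallback, count=3, interval=5.0,
                       first_fast_retry=True, sleep=time.sleep):
    """Try the primary once, then retry up to `count` times against the fallback.

    `send(url)` is a caller-supplied function that returns an HTTP status code.
    """
    status = send(primary)
    for attempt in range(count):
        if status < 500:
            break
        # The first retry is immediate when first_fast_retry is set.
        if not (first_fast_retry and attempt == 0):
            sleep(interval)
        status = send(fallback)
    return status


calls = []


def fake_send(url):
    """Simulated backend: the primary always fails, the fallback succeeds."""
    calls.append(url)
    return 503 if "primary" in url else 200


result = call_with_fallback(fake_send, "https://primary.example.com",
                            "https://fallback.example.com", sleep=lambda s: None)
print(result, calls)
```

Injecting `sleep` keeps the example fast to run and makes the waiting behavior easy to test.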

Advanced Circuit Breaker Implementation

<inbound>
    <!-- Policy variables last for one request only, so shared state is read from the APIM cache -->
    <!-- The cache keys cb-failure-count and cb-open are example names -->
    <cache-lookup-value key="cb-failure-count" variable-name="failureCount" default-value="@(0)" />
    <cache-lookup-value key="cb-open" variable-name="circuitOpen" default-value="@(false)" />

    <!-- If circuit is open, return 503 -->
    <choose>
        <when condition="@((bool)context.Variables["circuitOpen"])">
            <return-response>
                <set-status code="503" reason="Service Unavailable" />
                <set-header name="Retry-After" exists-action="override">
                    <value>30</value>
                </set-header>
                <set-body>Service temporarily unavailable. Please retry later.</set-body>
            </return-response>
        </when>
    </choose>
</inbound>

<backend>
    <forward-request />
</backend>

<outbound>
    <!-- Track failures in outbound where context.Response is available -->
    <choose>
        <when condition="@(context.Response.StatusCode >= 500)">
            <set-variable name="failureCount" value="@((int)context.Variables["failureCount"] + 1)" />
            <!-- The stored count expires 60 seconds after the most recent failure -->
            <cache-store-value key="cb-failure-count" value="@((int)context.Variables["failureCount"])" duration="60" />
            <choose>
                <!-- Circuit opens after 5 failures and stays open for 30 seconds -->
                <when condition="@((int)context.Variables["failureCount"] >= 5)">
                    <cache-store-value key="cb-open" value="@(true)" duration="30" />
                    <cache-remove-value key="cb-failure-count" />
                </when>
            </choose>
        </when>
        <otherwise>
            <!-- Reset failure count on success -->
            <cache-remove-value key="cb-failure-count" />
        </otherwise>
    </choose>
</outbound>

Circuit Breaker State Machine

┌─────────────────────────────────────────────────────────────────┐
│                    CIRCUIT BREAKER STATES                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│    CLOSED (Normal) ──▶ OPEN (Failure) ──▶ HALF-OPEN (Testing)   │
│         │                   │                    │              │
│         ▼                   ▼                    ▼              │
│   ┌──────────┐       ┌──────────┐        ┌──────────┐           │
│   │Requests  │       │Requests  │        │Requests  │           │
│   │pass      │       │blocked   │        │test      │           │
│   │through   │       │return 503│        │backend   │           │
│   └──────────┘       └──────────┘        └──────────┘           │
│                                                                 │
│   Transition:              Transition:           Transition:    │
│   5 failures               30 seconds             Success=      │
│   in 60s                   timeout                 closed       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
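The state machine in the diagram can be sketched as a small Python class. This is an illustration under stated assumptions, not APIM's implementation: the thresholds (5 failures in 60 seconds, 30 seconds open) come from the diagram, the clock is injectable so the example runs without waiting, and the half-open to open transition on a failed trial request is an assumption the diagram does not show.

```python
import time


class CircuitBreaker:
    """Closed -> open after `threshold` failures inside `window` seconds;
    open -> half-open after `open_for` seconds; half-open -> closed on success."""

    def __init__(self, threshold=5, window=60.0, open_for=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.window = window
        self.open_for = open_for
        self.clock = clock
        self.state = "closed"
        self.failures = []      # timestamps of recent failures
        self.opened_at = None

    def allow_request(self):
        """Return True if a request may be sent to the backend right now."""
        if self.state == "open" and self.clock() - self.opened_at >= self.open_for:
            self.state = "half-open"   # let trial requests through
        return self.state != "open"

    def record_success(self):
        self.state = "closed"
        self.failures.clear()

    def record_failure(self):
        now = self.clock()
        if self.state == "half-open":
            self._open(now)            # trial request failed: reopen (assumed transition)
            return
        # Keep only failures that are still inside the window.
        self.failures = [t for t in self.failures if now - t < self.window]
        self.failures.append(now)
        if len(self.failures) >= self.threshold:
            self._open(now)

    def _open(self, now):
        self.state = "open"
        self.opened_at = now
        self.failures.clear()


now = [0.0]                                    # fake clock for the demonstration
breaker = CircuitBreaker(clock=lambda: now[0])
for _ in range(5):
    breaker.record_failure()
state_after_failures = breaker.state
blocked = not breaker.allow_request()
now[0] = 31.0                                  # the 30-second open period has passed
trial_allowed = breaker.allow_request()
state_during_trial = breaker.state
breaker.record_success()
print(state_after_failures, blocked, trial_allowed, state_during_trial, breaker.state)
```

Keeping timestamps rather than a bare counter is what makes the "within 60 seconds" part of the rule enforceable.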

Health Check Configuration

Built-in Health Checks

APIM can automatically monitor backend health and remove unhealthy backends from rotation.

{
  "healthCheck": {
    "isEnabled": true,
    "interval": "00:00:30",
    "timeout": "00:00:10",
    "successThreshold": 1,
    "failureThreshold": 3,
    "healthStatus": "healthy",
    "unhealthyStatus": "unhealthy"
  }
}

Configuration explained:

interval — How often to check (every 30 seconds)
timeout — How long to wait for response (10 seconds)
successThreshold — Successful checks before marking healthy (1)
failureThreshold — Failed checks before marking unhealthy (3)
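How the two thresholds interact is easiest to see with a small simulation. The Python sketch below assumes the thresholds count consecutive probe results, which the configuration above does not state explicitly; `HealthTracker` is a hypothetical name.

```python
class HealthTracker:
    """Track one backend's health from a stream of probe results."""

    def __init__(self, success_threshold=1, failure_threshold=3):
        self.success_threshold = success_threshold
        self.failure_threshold = failure_threshold
        self.healthy = True
        self._streak = 0   # consecutive results that contradict the current status

    def record(self, probe_ok):
        """Feed one probe result and return the (possibly updated) health status."""
        if probe_ok == self.healthy:
            self._streak = 0           # result agrees with the current status
            return self.healthy
        self._streak += 1
        needed = self.success_threshold if probe_ok else self.failure_threshold
        if self._streak >= needed:
            self.healthy = probe_ok
            self._streak = 0
        return self.healthy


tracker = HealthTracker()
probes = [False, False, True, False, False, False, True]
results = [tracker.record(ok) for ok in probes]
print(results)
```

Two failures followed by a success do not mark the backend unhealthy; three failures in a row do, and a single success afterwards restores it.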

Health Check Endpoint Requirements

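// Backend health check endpoint example.
// GetCpuUsage, GetMemoryUsage, and GetAvgResponseTime are placeholders for your own metric helpers.
using System;
using Microsoft.AspNetCore.Mvc;
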
[ApiController]
[Route("health")]
public class HealthController : ControllerBase
{
    [HttpGet]
    public IActionResult GetHealth()
    {
        var health = new
        {
            Status = "healthy",
            Timestamp = DateTime.UtcNow,
            Dependencies = new
            {
                Database = "healthy",
                Cache = "healthy",
                ExternalApi = "healthy"
            },
            Metrics = new
            {
                CpuUsage = GetCpuUsage(),
                MemoryUsage = GetMemoryUsage(),
                ResponseTime = GetAvgResponseTime()
            }
        };
        
        return Ok(health);
    }
}

Custom Health Check Policy

<inbound>
    <!-- Execute health check before routing (this adds one probe call to every request) -->
    <send-request mode="new" timeout="5" response-variable-name="health-check" ignore-error="true">
        <set-url>https://backend.example.com/health</set-url>
        <set-method>GET</set-method>
    </send-request>

    <!-- Evaluate health check response -->
    <set-variable name="backendHealthy" 
        value="@{
            // With ignore-error enabled, the variable is null when the probe fails or times out
            var response = (IResponse)context.Variables["health-check"];
            if (response == null) return false;

            if (response.StatusCode != 200) return false;

            // Parse health response body
            var body = response.Body.As<string>();
            var health = JObject.Parse(body);

            return (string)health["status"] == "healthy";
        }" />

    <!-- Reject the request when the backend reports itself unhealthy -->
    <choose>
        <when condition="@(!(bool)context.Variables["backendHealthy"])">
            <return-response>
                <set-status code="503" reason="Service Unavailable" />
                <set-body>Service temporarily unavailable</set-body>
            </return-response>
        </when>
    </choose>
</inbound>

<backend>
    <forward-request />
</backend>

Health Check with Dependency Validation

<inbound>
    <send-request mode="new" timeout="10" response-variable-name="db-health" ignore-error="true">
        <set-url>https://backend.example.com/health/database</set-url>
        <set-method>GET</set-method>
    </send-request>
    
    <send-request mode="new" timeout="10" response-variable-name="cache-health" ignore-error="true">
        <set-url>https://backend.example.com/health/cache</set-url>
        <set-method>GET</set-method>
    </send-request>
    
    <!-- A failed or timed-out probe leaves its variable null, which counts as unhealthy -->
    <set-variable name="allHealthy" value="@{
        var dbResponse = (IResponse)context.Variables["db-health"];
        var cacheResponse = (IResponse)context.Variables["cache-health"];
        
        return dbResponse?.StatusCode == 200 && cacheResponse?.StatusCode == 200;
    }" />
</inbound>

Circuit Breaker State Management

Tracking Circuit State in Variables

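<inbound>
    <!-- Variables are request-scoped, so the last failure time (UTC ticks) is loaded from the APIM cache -->
    <cache-lookup-value key="cb-last-failure" variable-name="lastFailure" default-value="@(0L)" />

    <!-- Check if circuit is open -->
    <set-variable name="circuitOpen" value="@{
        var lastFailureTicks = (long)context.Variables["lastFailure"];
        if (lastFailureTicks == 0L) return false;
        
        // Circuit open for 30 seconds after failure
        return DateTime.UtcNow.Ticks - lastFailureTicks < TimeSpan.FromSeconds(30).Ticks;
    }" />
</inbound>

<backend>
    <forward-request />
</backend>

<outbound>
    <choose>
        <when condition="@(context.Response.StatusCode >= 500)">
            <!-- Persist the failure time so later requests can see it -->
            <cache-store-value key="cb-last-failure" value="@(DateTime.UtcNow.Ticks)" duration="300" />
        </when>
    </choose>
</outbound>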

State Persistence Across Requests

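Policy variables exist only for the lifetime of a single request, so circuit breaker state has to be kept in a shared store. For production environments, consider storing it in Azure Cache for Redis: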

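// External processing (Azure Function)
using System;
using System.Globalization;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class CircuitBreakerState
{
    // REDIS_CONNECTION is an assumed app setting name that holds the Redis connection string
    private static readonly Lazy<ConnectionMultiplexer> _redis = new Lazy<ConnectionMultiplexer>(
        () => ConnectionMultiplexer.Connect(Environment.GetEnvironmentVariable("REDIS_CONNECTION")));

    private static string Key(string backendId) => $"circuit:{backendId}:lastFailure";

    public static async Task<bool> IsCircuitOpenAsync(string backendId)
    {
        var db = _redis.Value.GetDatabase();
        var lastFailure = await db.StringGetAsync(Key(backendId));

        if (lastFailure.IsNullOrEmpty) return false;

        // Round-trip ("o") format preserves the UTC kind, so local time zones cannot skew the comparison
        var failureTime = DateTime.Parse(lastFailure.ToString(), CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind);
        return DateTime.UtcNow - failureTime < TimeSpan.FromSeconds(30);
    }

    public static async Task RecordFailureAsync(string backendId)
    {
        var db = _redis.Value.GetDatabase();
        // Write the timestamp and its 5-minute expiry in a single call
        await db.StringSetAsync(Key(backendId), DateTime.UtcNow.ToString("o"), TimeSpan.FromMinutes(5));
    }

    public static async Task ResetCircuitAsync(string backendId)
    {
        var db = _redis.Value.GetDatabase();
        await db.KeyDeleteAsync(Key(backendId));
    }
}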

Dashboard for Monitoring

[ApiController]
[Route("admin/circuit-breaker")]
public class CircuitBreakerController : ControllerBase
{
    [HttpGet("status")]
    public IActionResult GetStatus()
    {
        return Ok(new
        {
            Backends = new[]
            {
                new {
                    Name = "api-v1",
                    CircuitState = "Closed",
                    FailureCount = 0,
                    LastFailure = (DateTime?)null
                },
                new {
                    Name = "api-v2",
                    CircuitState = "Open",
                    FailureCount = 5,
                    // Cast so both array elements share the same anonymous type (DateTime?)
                    LastFailure = (DateTime?)DateTime.UtcNow.AddSeconds(-15)
                }
            }
        });
    }
    
    [HttpPost("reset/{backendName}")]
    public async Task<IActionResult> ResetCircuit(string backendName)
    {
        await CircuitBreakerState.ResetCircuitAsync(backendName);
        return Ok(new { Message = $"Circuit reset for {backendName}" });
    }
}

Complete Load Balancing Policy Example

Here's a comprehensive policy that combines all concepts:

<policies>
    <inbound>
        <!-- Step 1: Check circuit breaker state (read from the APIM cache so it persists across requests) -->
        <cache-lookup-value key="cb-open" variable-name="circuitOpen" default-value="@(false)" />
        <cache-lookup-value key="cb-failure-count" variable-name="failureCount" default-value="@(0)" />
        
        <choose>
            <when condition="@((bool)context.Variables["circuitOpen"])">
                <return-response>
                    <set-status code="503" reason="Service Unavailable" />
                    <set-body>Backend temporarily unavailable - circuit breaker active</set-body>
                </return-response>
            </when>
        </choose>
        
        <!-- Step 2: Determine routing based on weight -->
        <set-variable name="targetBackend" value="@{
            var weightHeader = context.Request.Headers.GetValueOrDefault("X-Backend-Weight", "");
            int weight;
            // Fall back to 10 when the header is missing or not a number
            if (!int.TryParse(weightHeader, out weight)) { weight = 10; }
            if (weight > 50) return "https://primary.example.com";
            if (weight > 20) return "https://secondary.example.com";
            return "https://tertiary.example.com";
        }" />
        
        <set-backend-service base-url="@((string)context.Variables["targetBackend"])" />
    </inbound>
    
    <backend>
        <forward-request />
    </backend>
    
    <outbound>
        <!-- Step 3: Track response for circuit breaker -->
        <choose>
            <when condition="@(context.Response.StatusCode >= 500)">
                <set-variable name="failureCount" value="@((int)context.Variables["failureCount"] + 1)" />
                <!-- The stored count expires 60 seconds after the most recent failure -->
                <cache-store-value key="cb-failure-count" value="@((int)context.Variables["failureCount"])" duration="60" />
                
                <!-- Open circuit for 30 seconds after 5 failures; the cache entry expiring closes it again -->
                <choose>
                    <when condition="@((int)context.Variables["failureCount"] >= 5)">
                        <set-variable name="circuitOpen" value="@(true)" />
                        <cache-store-value key="cb-open" value="@(true)" duration="30" />
                    </when>
                </choose>
            </when>
            <otherwise>
                <!-- Reset failure count on success -->
                <cache-remove-value key="cb-failure-count" />
            </otherwise>
        </choose>
        
        <!-- Add health info to response headers -->
        <set-header name="X-Backend-Health" exists-action="override">
            <value>@((bool)context.Variables["circuitOpen"] ? "degraded" : "healthy")</value>
        </set-header>
    </outbound>
</policies>

Best Practices

Practice	Description
Start with low weights	Begin with 5-10% traffic to new backends
Monitor closely	Track error rates and latency during rollout
Set appropriate thresholds	Configure failure thresholds based on your SLA
Implement graceful degradation	Have fallback responses when backends fail
Use health checks	Automatically detect and remove unhealthy backends
Test circuit breakers	Regularly test failure scenarios
Log everything	Capture metrics for post-incident analysis

Common Pitfalls to Avoid

Setting weights too aggressively — Don't route 100% to new backend immediately
Ignoring health checks — Always use automatic health detection
No timeout configuration — Always set reasonable timeouts
Forgetting to reset — Ensure circuits can close after recovery
No monitoring — Track circuit state changes in your dashboard
