Azure API Management — Backend Load Balancing & Circuit Breaker
Backend Pools, Weighted Routing, Health Checks, and Circuit Breaker Patterns
Introduction
When building resilient API architectures, distributing load across multiple backend services is essential for high availability and scalability. Azure API Management (APIM) supports backend pools with round-robin, weighted, and priority-based load balancing, plus a circuit breaker on each backend.
This guide covers:
- Backend Pools — Grouping multiple backends for load distribution
- Weighted Routing — Traffic distribution based on weight values
- Health Checks — Automatic detection of unhealthy backends
- Circuit Breaker — Preventing cascading failures
- Custom Policies — Advanced routing and failover logic
Backend Pool Configuration
Why Use Backend Pools?
In enterprise scenarios, you often need to:
- Distribute traffic across multiple API instances for horizontal scaling
- Implement blue-green deployments with gradual traffic shifting
- Route to different backend versions (canary releases)
- Provide fallback when primary backends fail
- Implement geographic routing for lower latency
Basic Backend Pool
{
"backends": [
{
"url": "https://api-v1.example.com",
"weight": 10
},
{
"url": "https://api-v2.example.com",
"weight": 5
},
{
"url": "https://api-v3.example.com",
"weight": 5
}
]
}
This configuration distributes traffic with a 10:5:5 ratio (50% to v1, 25% each to v2 and v3).
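To make the arithmetic concrete, here is a minimal C# sketch (not APIM's implementation) that converts the weights above into traffic shares and maps a random roll onto a backend. The `WeightedPick` class and its method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WeightedPick
{
    // Share of traffic for one backend: its weight divided by the sum of all weights.
    public static double Share(int weight, int totalWeight) => (double)weight / totalWeight;

    // Map a roll in [0, total weight) onto a backend by walking the cumulative weights.
    public static string Pick(IReadOnlyList<(string Url, int Weight)> backends, int roll)
    {
        foreach (var (url, weight) in backends)
        {
            if (roll < weight) return url;
            roll -= weight;
        }
        throw new ArgumentOutOfRangeException(nameof(roll), "roll must be below the total weight");
    }

    public static void Main()
    {
        var backends = new List<(string Url, int Weight)>
        {
            ("https://api-v1.example.com", 10),
            ("https://api-v2.example.com", 5),
            ("https://api-v3.example.com", 5),
        };
        int total = backends.Sum(b => b.Weight); // 20

        foreach (var (url, weight) in backends)
            Console.WriteLine($"{url}: {Share(weight, total):P0}");

        // One simulated request: rolls 0-9 select v1, 10-14 select v2, 15-19 select v3.
        Console.WriteLine(Pick(backends, new Random().Next(total)));
    }
}
```

Because only the ratio matters, weights of 2:1:1 would produce the same split as 10:5:5.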
Backend Pool with Capacity
{
"backends": [
{
"url": "https://primary-api.example.com",
"weight": 80,
"capacity": 1000,
"description": "Primary API cluster - handles most traffic"
},
{
"url": "https://secondary-api.example.com",
"weight": 20,
"capacity": 500,
"description": "Secondary cluster - overflow traffic"
}
]
}
Weighted Routing Policy
Understanding Weighted Routing
Weighted routing allows you to control what percentage of traffic goes to each backend. This is crucial for:
- Canary Deployments — Gradually shift traffic (e.g., 5% → 20% → 100%)
- A/B Testing — Route percentage of users to different versions
- Load Distribution — Distribute based on backend capacity
- Disaster Recovery — Route to backup when primary fails
Round-Robin with Weights
<inbound>
<!-- "api-pool" is an assumed name for a backend resource of type Pool whose three members have weights 10, 5, and 5 -->
<set-backend-service backend-id="api-pool" />
</inbound>
Weights are defined on the pool's members rather than in the policy, which only selects the pool. With member weights of 10, 5, and 5:
- Backend1 receives 10/20 = 50% of traffic
- Backend2 receives 5/20 = 25% of traffic
- Backend3 receives 5/20 = 25% of traffic
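One way to produce such a split deterministically is smooth weighted round-robin, the interleaving scheme popularized by nginx. The sketch below is an illustration under that assumption: the source does not say which algorithm APIM uses, and the class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Smooth weighted round-robin: spreads picks evenly instead of sending bursts to one backend.
// Not thread-safe; a real gateway would need synchronization around Next().
public sealed class SmoothWeightedRoundRobin
{
    private readonly string[] _urls;
    private readonly int[] _weights;
    private readonly int[] _current;
    private readonly int _total;

    public SmoothWeightedRoundRobin(IEnumerable<(string Url, int Weight)> backends)
    {
        var list = backends.ToList();
        _urls = list.Select(b => b.Url).ToArray();
        _weights = list.Select(b => b.Weight).ToArray();
        _current = new int[list.Count];
        _total = _weights.Sum();
    }

    public string Next()
    {
        int best = 0;
        for (int i = 0; i < _urls.Length; i++)
        {
            _current[i] += _weights[i];                      // every backend earns its weight
            if (_current[i] > _current[best]) best = i;      // highest running score wins
        }
        _current[best] -= _total;                            // the winner pays the total weight
        return _urls[best];
    }
}

public static class RoundRobinDemo
{
    public static void Main()
    {
        var lb = new SmoothWeightedRoundRobin(new[]
        {
            ("https://backend1.example.com", 10),
            ("https://backend2.example.com", 5),
            ("https://backend3.example.com", 5),
        });

        var counts = new Dictionary<string, int>();
        for (int i = 0; i < 20; i++)
        {
            var url = lb.Next();
            counts[url] = counts.TryGetValue(url, out var n) ? n + 1 : 1;
        }
        foreach (var kv in counts)
            Console.WriteLine($"{kv.Key}: {kv.Value}");
    }
}
```

Over 20 requests this yields 10, 5, and 5 picks, and the first four picks are backend1, backend2, backend3, backend1, so the heavier backend is interleaved rather than hit ten times in a row.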
Dynamic Weight Assignment
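<inbound>
<!-- Set weight based on client tier: the percentage of requests sent to the premium backend -->
<set-variable name="backendWeight" value="@{
var tier = context.Request.Headers.GetValueOrDefault("X-Client-Tier", "standard");
switch (tier)
{
case "premium": return 80;
case "enterprise": return 100;
default: return 20;
}
}" />
<choose>
<!-- Random draw in [0, 100): values below the weight go to the premium backend -->
<when condition="@(new Random().Next(100) < context.Variables.GetValueOrDefault<int>("backendWeight"))">
<set-backend-service base-url="https://premium-api.example.com" />
</when>
<otherwise>
<set-backend-service base-url="https://standard-api.example.com" />
</otherwise>
</choose>
</inbound>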
Geographic Routing Example
<inbound>
<!-- Determine user's region; requests without the header default to eastus -->
<set-variable name="userRegion" value="@(context.Request.Headers.GetValueOrDefault("X-Azure-Region", "eastus"))" />
<choose>
<!-- Route to closest backend -->
<when condition="@(context.Variables.GetValueOrDefault<string>("userRegion") == "westus2")">
<set-backend-service base-url="https://api-westus2.example.com" />
</when>
<when condition="@(context.Variables.GetValueOrDefault<string>("userRegion") == "eastus")">
<set-backend-service base-url="https://api-eastus.example.com" />
</when>
<otherwise>
<!-- Any other region falls back to the default backend -->
<set-backend-service base-url="https://api-default.example.com" />
</otherwise>
</choose>
</inbound>
Circuit Breaker Pattern
Why Circuit Breakers Matter
Without circuit breakers, when a backend fails:
- Requests continue hitting the failed backend
- Timeout delays cause slow responses
- Failed requests queue up, consuming resources
- Cascading failures affect other services
Circuit breakers prevent this by:
- Detecting failures — Track failed requests
- Opening the circuit — Stop sending requests to failed backend
- Allowing recovery — Periodically test if backend recovered
- Falling back — Route to alternative backend
Retry Policy with Circuit Breaker
<backend>
<!-- Retry up to 5 times, 5 seconds apart, while the backend returns a 5xx status. -->
<!-- Retry only re-sends the request; the circuit breaker itself is configured on the backend resource. -->
<retry condition="@(context.Response.StatusCode >= 500)" count="5" interval="5" first-fast-retry="false">
<forward-request buffer-request-body="true" />
</retry>
</backend>
Advanced Circuit Breaker Implementation
<inbound>
<!-- Check if circuit is open -->
<set-variable name="circuitOpen" value="@{
var lastFailure = context.Variables.GetValueOrDefault<DateTime?>("lastFailureTime");
var failureCount = context.Variables.GetValueOrDefault<int>("failureCount");
// Circuit opens after 5 consecutive failures
if (failureCount >= 5 && lastFailure.HasValue)
{
var timeSinceLastFailure = DateTime.UtcNow - lastFailure.Value;
// Keep circuit open for 30 seconds
return timeSinceLastFailure.TotalSeconds < 30;
}
return false;
}" />
<!-- If circuit is open, return 503 -->
<choose>
<when condition="@(context.Variables.GetValueOrDefault<bool>("circuitOpen"))">
<return-response>
<set-status code="503" reason="Service Unavailable" />
<set-header name="Retry-After" exists-action="override">
<value>30</value>
</set-header>
<set-body>Service temporarily unavailable. Please retry later.</set-body>
</return-response>
</when>
</choose>
</inbound>
<backend>
<forward-request />
</backend>
<outbound>
<!-- Track failures once the backend response is available -->
<choose>
<when condition="@(context.Response.StatusCode >= 500)">
<set-variable name="failureCount" value="@(context.Variables.GetValueOrDefault<int>("failureCount") + 1)" />
<set-variable name="lastFailureTime" value="@(DateTime.UtcNow)" />
</when>
<otherwise>
<!-- Reset failure count on success -->
<set-variable name="failureCount" value="@(0)" />
</otherwise>
</choose>
</outbound>
Circuit Breaker State Machine
┌─────────────────────────────────────────────────────────────────┐
│                     CIRCUIT BREAKER STATES                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  CLOSED (Normal) ──▶ OPEN (Failure) ──▶ HALF-OPEN (Testing)     │
│         │                  │                     │              │
│         ▼                  ▼                     ▼              │
│   ┌──────────┐        ┌──────────┐         ┌──────────┐         │
│   │Requests  │        │Requests  │         │Requests  │         │
│   │pass      │        │blocked   │         │test      │         │
│   │through   │        │return 503│         │backend   │         │
│   └──────────┘        └──────────┘         └──────────┘         │
│                                                                 │
│   Transition:         Transition:          Transition:          │
│   5 failures          30 seconds           Success=             │
│   in 60s              timeout              closed               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
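The three states and their transitions can be sketched as a small C# class. This is an illustrative model, not APIM's built-in circuit breaker: the class and member names are hypothetical, the thresholds (5 failures in 60 seconds, 30 seconds open) come from the diagram, and reopening the circuit when the half-open trial fails is an assumption the diagram does not spell out.

```csharp
using System;
using System.Collections.Generic;

public enum CircuitState { Closed, Open, HalfOpen }

public sealed class CircuitBreaker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _failureWindow;
    private readonly TimeSpan _openDuration;
    private readonly Func<DateTime> _clock;                 // injectable clock keeps the class testable
    private readonly Queue<DateTime> _failures = new Queue<DateTime>();
    private DateTime _openedAt;

    public CircuitState State { get; private set; } = CircuitState.Closed;

    public CircuitBreaker(Func<DateTime> clock, int failureThreshold = 5,
                          int windowSeconds = 60, int openSeconds = 30)
    {
        _clock = clock;
        _failureThreshold = failureThreshold;
        _failureWindow = TimeSpan.FromSeconds(windowSeconds);
        _openDuration = TimeSpan.FromSeconds(openSeconds);
    }

    // True if a request may be sent to the backend right now.
    public bool AllowRequest()
    {
        if (State == CircuitState.Open && _clock() - _openedAt >= _openDuration)
            State = CircuitState.HalfOpen;                  // timeout elapsed: allow a trial request
        return State != CircuitState.Open;
    }

    public void RecordSuccess()
    {
        if (State == CircuitState.HalfOpen)
        {
            State = CircuitState.Closed;                    // trial succeeded: resume normal traffic
            _failures.Clear();
        }
    }

    public void RecordFailure()
    {
        var now = _clock();
        if (State == CircuitState.HalfOpen) { Trip(now); return; } // assumed: failed trial reopens
        _failures.Enqueue(now);
        while (_failures.Count > 0 && now - _failures.Peek() > _failureWindow)
            _failures.Dequeue();                            // forget failures outside the window
        if (_failures.Count >= _failureThreshold) Trip(now);
    }

    private void Trip(DateTime now)
    {
        State = CircuitState.Open;
        _openedAt = now;
        _failures.Clear();
    }
}

public static class CircuitBreakerDemo
{
    public static void Main()
    {
        var now = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        var breaker = new CircuitBreaker(() => now);

        for (int i = 0; i < 5; i++) breaker.RecordFailure();
        Console.WriteLine(breaker.State);          // Open
        Console.WriteLine(breaker.AllowRequest()); // False

        now = now.AddSeconds(30);
        Console.WriteLine(breaker.AllowRequest()); // True (state is now HalfOpen)
        breaker.RecordSuccess();
        Console.WriteLine(breaker.State);          // Closed
    }
}
```

Passing the clock in as a delegate lets the demo advance time without sleeping, which also makes the timeout behavior easy to unit test.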
Health Check Configuration
Built-in Health Checks
APIM can automatically monitor backend health and remove unhealthy backends from rotation.
{
"healthCheck": {
"isEnabled": true,
"interval": "00:00:30",
"timeout": "00:00:10",
"successThreshold": 1,
"failureThreshold": 3,
"healthStatus": "healthy",
"unhealthyStatus": "unhealthy"
}
}
Configuration explained:
- interval — How often to check (every 30 seconds)
- timeout — How long to wait for response (10 seconds)
- successThreshold — Successful checks before marking healthy (1)
- failureThreshold — Failed checks before marking unhealthy (3)
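The threshold logic can be sketched in a few lines of C#. This is a hypothetical model rather than APIM code: the `HealthTracker` name is invented, and it assumes the thresholds count consecutive probe results, which the settings above do not state explicitly.

```csharp
using System;

public sealed class HealthTracker
{
    private readonly int _successThreshold;
    private readonly int _failureThreshold;
    private int _consecutiveSuccesses;
    private int _consecutiveFailures;

    public bool IsHealthy { get; private set; } = true;

    public HealthTracker(int successThreshold = 1, int failureThreshold = 3)
    {
        _successThreshold = successThreshold;
        _failureThreshold = failureThreshold;
    }

    // Feed in the result of one probe (for example, one call every 30 seconds).
    public void Report(bool probeSucceeded)
    {
        if (probeSucceeded)
        {
            _consecutiveFailures = 0;
            _consecutiveSuccesses++;
            if (!IsHealthy && _consecutiveSuccesses >= _successThreshold)
                IsHealthy = true;                  // enough successes: back in rotation
        }
        else
        {
            _consecutiveSuccesses = 0;
            _consecutiveFailures++;
            if (IsHealthy && _consecutiveFailures >= _failureThreshold)
                IsHealthy = false;                 // enough failures: out of rotation
        }
    }
}

public static class HealthTrackerDemo
{
    public static void Main()
    {
        var tracker = new HealthTracker(successThreshold: 1, failureThreshold: 3);

        tracker.Report(false);
        tracker.Report(false);
        Console.WriteLine(tracker.IsHealthy); // True: only two failures so far

        tracker.Report(false);
        Console.WriteLine(tracker.IsHealthy); // False: third consecutive failure

        tracker.Report(true);
        Console.WriteLine(tracker.IsHealthy); // True: one success is enough to recover
    }
}
```

With a 30-second interval and a failure threshold of 3, a backend that goes down is removed after roughly 60 to 90 seconds, which is worth keeping in mind when choosing these values.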
Health Check Endpoint Requirements
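// Backend health check endpoint example
using System;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("health")]
public class HealthController : ControllerBase
{
[HttpGet]
public IActionResult GetHealth()
{
// Dependency statuses are hard-coded here for brevity. A real endpoint should
// probe each dependency and return a non-200 status when one is down.
var health = new
{
Status = "healthy",
Timestamp = DateTime.UtcNow,
Dependencies = new
{
Database = "healthy",
Cache = "healthy",
ExternalApi = "healthy"
},
Metrics = new
{
CpuUsage = GetCpuUsage(),
MemoryUsage = GetMemoryUsage(),
ResponseTime = GetAvgResponseTime()
}
};
return Ok(health);
}

// Placeholder metric sources; replace with real measurements.
private static double GetCpuUsage() => 0.0;
private static double GetMemoryUsage() => 0.0;
private static double GetAvgResponseTime() => 0.0;
}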
Custom Health Check Policy
<inbound>
<!-- Execute health check before routing; ignore-error leaves the variable null if the call fails -->
<send-request mode="new" timeout="5" response-variable-name="health-check" ignore-error="true">
<set-url>https://backend.example.com/health</set-url>
<set-method>GET</set-method>
</send-request>
<!-- Evaluate health check response -->
<set-variable name="backendHealthy" value="@{
var response = (IResponse)context.Variables["health-check"];
if (response == null) return false;
if (response.StatusCode != 200) return false;
// Parse health response body
var health = response.Body.As<JObject>();
return (string)health["status"] == "healthy";
}" />
<!-- Reject the request if the backend is unhealthy -->
<choose>
<when condition="@(!context.Variables.GetValueOrDefault<bool>("backendHealthy"))">
<return-response>
<set-status code="503" reason="Service Unavailable" />
<set-body>Service temporarily unavailable</set-body>
</return-response>
</when>
</choose>
</inbound>
Health Check with Dependency Validation
<inbound>
<send-request mode="new" timeout="10" response-variable-name="db-health" ignore-error="true">
<set-url>https://backend.example.com/health/database</set-url>
<set-method>GET</set-method>
</send-request>
<send-request mode="new" timeout="10" response-variable-name="cache-health" ignore-error="true">
<set-url>https://backend.example.com/health/cache</set-url>
<set-method>GET</set-method>
</send-request>
<!-- Healthy only if both dependency checks returned 200; a failed call leaves a null response -->
<set-variable name="allHealthy" value="@{
var dbResponse = (IResponse)context.Variables["db-health"];
var cacheResponse = (IResponse)context.Variables["cache-health"];
return dbResponse?.StatusCode == 200 && cacheResponse?.StatusCode == 200;
}" />
</inbound>
Circuit Breaker State Management
Tracking Circuit State in Variables
<inbound>
<!-- Check if circuit is open -->
<set-variable name="circuitOpen" value="@{
var lastFailure = context.Variables.GetValueOrDefault<DateTime?>("lastFailure");
if (!lastFailure.HasValue) return false;
// Circuit open for 30 seconds after failure
return DateTime.UtcNow - lastFailure.Value < TimeSpan.FromSeconds(30);
}" />
</inbound>
<backend>
<forward-request />
</backend>
<outbound>
<!-- Record the failure time once the backend response is available -->
<choose>
<when condition="@(context.Response.StatusCode >= 500)">
<set-variable name="lastFailure" value="@(DateTime.UtcNow)" />
</when>
</choose>
</outbound>
State Persistence Across Requests
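Variables created with set-variable exist only for the current request, so failure counts and timestamps kept there are lost when the request ends. To share circuit breaker state across requests in production, keep it in an external store such as Azure Cache for Redis: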
// External processing (Azure Function)
using System;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class CircuitBreakerState
{
// Assumes the connection string is supplied in a REDIS_CONNECTION app setting (hypothetical name)
private static readonly Lazy<IConnectionMultiplexer> _redis = new Lazy<IConnectionMultiplexer>(
() => ConnectionMultiplexer.Connect(Environment.GetEnvironmentVariable("REDIS_CONNECTION")));

public static async Task<bool> IsCircuitOpenAsync(string backendId)
{
var db = _redis.Value.GetDatabase();
var lastFailure = await db.StringGetAsync($"circuit:{backendId}:lastFailure");
if (lastFailure.IsNullOrEmpty) return false;
// Stored as UTC ticks, so reading it back does not depend on culture or time zone
var failureTime = new DateTime((long)lastFailure, DateTimeKind.Utc);
return DateTime.UtcNow - failureTime < TimeSpan.FromSeconds(30);
}
public static async Task RecordFailureAsync(string backendId)
{
var db = _redis.Value.GetDatabase();
// Write the value and its 5-minute expiry in a single call
await db.StringSetAsync($"circuit:{backendId}:lastFailure", DateTime.UtcNow.Ticks, TimeSpan.FromMinutes(5));
}
public static async Task ResetCircuitAsync(string backendId)
{
var db = _redis.Value.GetDatabase();
await db.KeyDeleteAsync($"circuit:{backendId}:lastFailure");
}
}
Dashboard for Monitoring
[ApiController]
[Route("admin/circuit-breaker")]
public class CircuitBreakerController : ControllerBase
{
[HttpGet("status")]
public IActionResult GetStatus()
{
return Ok(new
{
Backends = new[]
{
new {
Name = "api-v1",
CircuitState = "Closed",
FailureCount = 0,
LastFailure = (DateTime?)null
},
new {
Name = "api-v2",
CircuitState = "Open",
FailureCount = 5,
LastFailure = (DateTime?)DateTime.UtcNow.AddSeconds(-15) // cast so both array elements share one anonymous type
}
}
});
}
[HttpPost("reset/{backendName}")]
public async Task<IActionResult> ResetCircuit(string backendName)
{
await CircuitBreakerState.ResetCircuitAsync(backendName);
return Ok(new { Message = $"Circuit reset for {backendName}" });
}
}
Complete Load Balancing Policy Example
Here's a comprehensive policy that combines all concepts:
<policies>
<inbound>
<!-- Step 1: Read circuit breaker state from the cache; policy variables do not outlive the request -->
<cache-lookup-value key="@("circuit_" + context.Request.Url.Host + "_open")" variable-name="circuitOpen" default-value="@(false)" />
<cache-lookup-value key="@("circuit_" + context.Request.Url.Host + "_failures")" variable-name="failureCount" default-value="@(0)" />
<choose>
<when condition="@(context.Variables.GetValueOrDefault<bool>("circuitOpen", false))">
<return-response>
<set-status code="503" reason="Service Unavailable" />
<set-body>Backend temporarily unavailable - circuit breaker active</set-body>
</return-response>
</when>
</choose>
<!-- Step 2: Route to a load-balanced pool. "api-pool" is an assumed backend resource of type Pool; its members carry the weights -->
<set-backend-service backend-id="api-pool" />
</inbound>
<backend>
<!-- Step 3: Forward the request to the pool member that APIM selects -->
<forward-request />
</backend>
<outbound>
<!-- Step 4: Track response for circuit breaker -->
<choose>
<when condition="@(context.Response.StatusCode >= 500)">
<cache-store-value key="@("circuit_" + context.Request.Url.Host + "_failures")" value="@(context.Variables.GetValueOrDefault<int>("failureCount", 0) + 1)" duration="60" />
<!-- Open circuit for 30 seconds after 5 failures -->
<choose>
<when condition="@(context.Variables.GetValueOrDefault<int>("failureCount", 0) + 1 >= 5)">
<cache-store-value key="@("circuit_" + context.Request.Url.Host + "_open")" value="@(true)" duration="30" />
</when>
</choose>
</when>
<otherwise>
<!-- Reset failure count on success -->
<cache-remove-value key="@("circuit_" + context.Request.Url.Host + "_failures")" />
</otherwise>
</choose>
<!-- Add health info to response headers -->
<set-header name="X-Backend-Health" exists-action="override">
<value>@(context.Variables.GetValueOrDefault<bool>("circuitOpen", false) ? "degraded" : "healthy")</value>
</set-header>
</outbound>
</policies>
Best Practices
| Practice | Description |
|---|---|
| Start with low weights | Begin with 5-10% traffic to new backends |
| Monitor closely | Track error rates and latency during rollout |
| Set appropriate thresholds | Configure failure thresholds based on your SLA |
| Implement graceful degradation | Have fallback responses when backends fail |
| Use health checks | Automatically detect and remove unhealthy backends |
| Test circuit breakers | Regularly test failure scenarios |
| Log everything | Capture metrics for post-incident analysis |
Common Pitfalls to Avoid
- Setting weights too aggressively — Don't route 100% to new backend immediately
- Ignoring health checks — Always use automatic health detection
- No timeout configuration — Always set reasonable timeouts
- Forgetting to reset — Ensure circuits can close after recovery
- No monitoring — Track circuit state changes in your dashboard