Active-Active Multi-Region Architecture

Building Globally Distributed, Highly Available Systems


Introduction

Active-Active multi-region architecture is a deployment strategy where applications run simultaneously in multiple Azure regions, with all regions actively processing requests. Unlike active-passive (failover) architectures where secondary regions sit idle until a failure occurs, active-active deployments distribute load across all regions, providing better performance, higher availability, and resilience against regional outages.

This comprehensive guide covers:

  • Architecture patterns — Understanding active-active vs other strategies
  • Implementation approaches — Using Azure Front Door, Traffic Manager, and DNS
  • Data synchronization — Managing state across regions
  • Conflict resolution — Handling concurrent writes
  • Failover strategies — When and how to handle failures
  • Cost considerations — Optimizing for budget

Architecture Patterns Overview

Deployment Strategy Comparison

┌─────────────────────────────────────────────────────────────────────┐
│                    DEPLOYMENT STRATEGY COMPARISON                   │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ACTIVE-PASSIVE (Traditional)                                      │
│   ────────────────────────────                                      │
│                                                                     │
│   ┌─────────────┐                    ┌─────────────┐                │
│   │  PRIMARY    │                    │ SECONDARY   │                │
│   │  (East US)  │ ── Failover ──▶    │ (West US)   │                │
│   │             │                    │   STANDBY   │                │
│   │  ✅ Active  │                    │   ⏸ Idle    │                │
│   └─────────────┘                    └─────────────┘                │
│                                                                     │
│   Pros: Lower cost, simple         Cons: RTO = failover time        │
│                                                                     │
│   ACTIVE-ACTIVE (Modern)                                            │
│   ──────────────────────────                                        │
│                                                                     │
│   ┌─────────────┐                    ┌─────────────┐                │
│   │  REGION A   │                    │  REGION B   │                │
│   │  (East US)  │ ◀───────  ──────▶  │  (West US)  │                │
│   │             │     Traffic        │             │                │
│   │  ✅ Active  │                    │  ✅ Active  │                │
│   └─────────────┘                    └─────────────┘                │
│                                                                     │
│   Pros: Near-zero RTO, better latency  Cons: Higher cost, complex   │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

When to Choose Active-Active

Scenario                                            Recommended Pattern
──────────────────────────────────────────────────  ───────────────────
Global user base with low-latency requirements      Active-Active
Mission-critical with zero tolerance for downtime   Active-Active
Regulatory requirements for geographic redundancy   Active-Active
Lower traffic with moderate availability needs      Active-Passive
Disaster recovery only                              Cold Standby

Azure Active-Active Architecture

Component Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                 ACTIVE-ACTIVE ARCHITECTURE ON AZURE                 │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│                         ┌─────────────────┐                         │
│                         │   Global DNS    │                         │
│                         │  (Azure DNS)    │                         │
│                         └────────┬────────┘                         │
│                                  │                                  │
│                                  ▼                                  │
│                    ┌─────────────────────────┐                      │
│                    │    Azure Front Door     │                      │
│                    │  (Global Load Balancer) │                      │
│                    └────────────┬────────────┘                      │
│                                 │                                   │
│              ┌──────────────────┼──────────────────┐                │
│              │                  │                  │                │
│              ▼                  ▼                  ▼                │
│   ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐  │
│   │   East US        │  │   West Europe    │  │   Southeast Asia │  │
│   │   ┌────────────┐ │  │   ┌────────────┐ │  │   ┌────────────┐ │  │
│   │   │ App Service│ │  │   │ App Service│ │  │   │ App Service│ │  │
│   │   │ Functions  │ │  │   │ Functions  │ │  │   │ Functions  │ │  │
│   │   └─────┬──────┘ │  │   └─────┬──────┘ │  │   └─────┬──────┘ │  │
│   │         │        │  │         │        │  │         │        │  │
│   │         ▼        │  │         ▼        │  │         ▼        │  │
│   │   ┌──────────┐   │  │   ┌──────────┐   │  │   ┌──────────┐   │  │
│   │   │ Cosmos DB│   │  │   │ Cosmos DB│   │  │   │ Cosmos DB│   │  │
│   │   │(Multi-   │   │  │   │(Multi-   │   │  │   │(Multi-   │   │  │
│   │   │ region)  │   │  │   │ region)  │   │  │   │ region)  │   │  │
│   │   └──────────┘   │  │   └──────────┘   │  │   └──────────┘   │  │
│   └──────────────────┘  └──────────────────┘  └──────────────────┘  │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Load Balancing Options

┌─────────────────────────────────────────────────────────────────────┐
│                   LOAD BALANCING OPTIONS                            │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Azure Front Door                                                  │
│   ───────────────────                                               │
│   ✓ Layer 7 (HTTP/HTTPS)                                            │
│   ✓ WAF integration                                                 │
│   ✓ SSL termination                                                 │
│   ✓ URL-based routing                                               │
│   ✓ Best for: Web apps, APIs                                        │
│                                                                     │
│   Azure Traffic Manager                                             │
│   ──────────────────────                                            │
│   ✓ DNS-based routing                                               │
│   ✓ Multiple routing methods                                        │
│   ✓ Health checks                                                   │
│   ✓ Best for: DNS failover, complex routing                         │
│                                                                     │
│   Azure Load Balancer                                               │
│   ──────────────────────                                            │
│   ✓ Layer 4 (TCP/UDP)                                               │
│   ✓ High performance                                                │
│   ✓ Best for: Non-HTTP traffic, internal                            │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Implementation Guide

Step 1: Deploy Application to Multiple Regions

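# Create a regional App Service plan and a function app in each region.
# App Service plans are regional, so each region needs its own plan.
# The storage accounts named below must already exist in my-rg.
for region in eastus westeurope southeastasia; do
  az appservice plan create \
    --name "my-app-plan-${region}" \
    --resource-group my-rg \
    --location "$region" \
    --sku B1

  # The app is created in the region of the plan it is attached to
  az functionapp create \
    --name "myapp-${region}" \
    --resource-group my-rg \
    --storage-account "mystorage${region}" \
    --plan "my-app-plan-${region}" \
    --deployment-source-url "https://github.com/myapp"
done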

Step 2: Configure Azure Front Door

{
  "backendPools": [
    {
      "name": "app-backends",
      "backends": [
        {
          "address": "myapp-eastus.azurewebsites.net",
          "weight": 1,
          "enabled": true
        },
        {
          "address": "myapp-westeurope.azurewebsites.net",
          "weight": 1,
          "enabled": true
        },
        {
          "address": "myapp-southeastasia.azurewebsites.net",
          "weight": 1,
          "enabled": true
        }
      ]
    }
  ],
  "routingRules": [
    {
      "name": "default-rule",
      "frontendEndpoints": ["default"],
      "backendPool": "app-backends",
      "routeConfiguration": {
        "forwardingProtocol": "MatchRequest"
      }
    }
  ]
}

Step 3: Configure Health Probes

{
  "healthProbeSettings": [
    {
      "name": "app-health",
      "path": "/health",
      "protocol": "Https",
      "intervalInSeconds": 30,
      "timeoutInSeconds": 10,
      "unhealthyThreshold": 3
    }
  ]
}

Step 4: Configure Custom Domains

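# Add a custom domain to Front Door (Standard/Premium) with a managed certificate
az afd custom-domain create \
  --resource-group my-rg \
  --profile-name my-front-door \
  --custom-domain-name api-mydomain-com \
  --host-name "api.mydomain.com" \
  --certificate-type ManagedCertificate \
  --minimum-tls-version TLS12 \
  --azure-dns-zone "<resource ID of the mydomain.com Azure DNS zone>"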

Data Synchronization Strategies

Multi-Region Database Options

┌─────────────────────────────────────────────────────────────────────┐
│               DATABASE SYNC STRATEGIES                              │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Cosmos DB (Multi-Region Write)                                    │
│   ───────────────────────────────────                               │
│   ✓ Automatic replication across regions                            │
│   ✓ Last-Writer-Wins (LWW) conflict resolution by default           │
│   ✓ Custom conflict resolution                                      │
│   ✓ Best for: Globally distributed apps                             │
│                                                                     │
│   Azure SQL Database with Geo-Replication                           │
│   ─────────────────────────────────────                             │
│   ✓ Active geo-replication, up to 4 secondaries                     │
│   ✓ Readable secondaries                                            │
│   ✓ Automatic failover (failover groups)                            │
│   ✓ Best for: SQL workloads, less write contention                  │
│                                                                     │
│   Event-Driven with Service Bus                                     │
│   ─────────────────────────────────────                             │
│   ✓ Cross-region message replication                                │
│   ✓ Topic subscriptions per region                                  │
│   ✓ Best for: Event-driven architectures                            │
│                                                                     │
│   Custom Async Replication                                          │
│   ──────────────────────────────                                    │
│   ✓ Table-based replication                                         │
│   ✓ Event store synchronization                                     │
│   ✓ Best for: Specific consistency requirements                     │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
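
To make the Last-Writer-Wins (LWW) policy mentioned for Cosmos DB concrete, here is a minimal Python sketch of the idea, not Cosmos DB's actual implementation. The `Version` shape, the `ts` field (standing in for a last-modified timestamp such as `_ts`), and the tie-break by region name are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One region's copy of an item (hypothetical shape for this sketch)."""
    region: str
    ts: int       # last-modified timestamp, analogous to Cosmos DB's _ts
    status: str


def resolve_lww(versions):
    """Return the winning version: the highest timestamp wins.

    Ties are broken by region name so that every region picks the same
    winner; this tie-break rule is an assumption of the sketch.
    """
    return max(versions, key=lambda v: (v.ts, v.region))


# Two regions update order "123" at nearly the same time.
east = Version(region="eastus", ts=1700000005, status="Shipped")
west = Version(region="westeurope", ts=1700000003, status="Cancelled")

winner = resolve_lww([east, west])
print(winner.region, winner.status)  # eastus Shipped
```

Note that the losing write is discarded entirely. That is usually acceptable for overwrite-style updates, but values that must be merged (counters, appended lists) generally call for a custom resolution policy instead.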

Cosmos DB Multi-Region Configuration

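// Requires the Microsoft.Azure.Cosmos (v3) NuGet package
using System.Collections.Generic;
using Microsoft.Azure.Cosmos;

// Multi-region writes are enabled on the account itself (for example with
// `az cosmosdb update --enable-multiple-write-locations true`); the client
// only declares which regions it prefers, in priority order.
var client = new CosmosClient(
    "https://myaccount.documents.azure.com",
    "primary-key",
    new CosmosClientOptions
    {
        // Set either ApplicationRegion or ApplicationPreferredRegions, not both
        ApplicationPreferredRegions = new List<string>
        {
            Regions.EastUS,
            Regions.WestEurope,
            Regions.SoutheastAsia
        }
    });

// Write through the first available preferred region; Cosmos DB replicates to the others
var container = client.GetDatabase("mydb").GetContainer("mycontainer");
// Order is an application type, assumed to serialize its Id property as the required "id" field
await container.CreateItemAsync(new Order { Id = "123", Status = "Processing" });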

Traffic Routing Patterns

Geographic Routing

{
  "routingRules": [
    {
      "name": "geographic-routing",
      "matchConditions": [
        {
          "matchVariables": ["GeoLocation"],
          "operator": "GeoMatch",
          "negateCondition": false,
          "matchValues": ["US", "CA", "MX"]
        }
      ],
      "action": {
        "routeConfiguration": {
          "forwardingProtocol": "MatchRequest",
          "backendPool": {
            "id": "/subscriptions/xxx/resourceGroups/my-rg/providers/Microsoft.Network/frontdoors/myfd/backendPools/east-us-pool"
          }
        }
      }
    },
    {
      "name": "europe-routing",
      "matchConditions": [
        {
          "matchVariables": ["GeoLocation"],
          "operator": "GeoMatch",
          "matchValues": ["DE", "FR", "GB", "IT", "ES", "NL"]
        }
      ],
      "action": {
        "routeConfiguration": {
          "backendPool": {
            "id": "/subscriptions/xxx/.../backendPools/europe-pool"
          }
        }
      }
    }
  ]
}
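
The two rules above cover North America and part of Europe, but say nothing about callers from other countries. The following Python sketch mirrors the country lists from the configuration and adds a fallback pool. The `DEFAULT_POOL` choice and the `pick_pool` helper are hypothetical and only illustrate why a catch-all rule is worth defining.

```python
# Country-to-pool mapping taken from the routing rules above.
GEO_RULES = {
    "east-us-pool": {"US", "CA", "MX"},
    "europe-pool": {"DE", "FR", "GB", "IT", "ES", "NL"},
}

# Hypothetical fallback for countries that no rule matches.
DEFAULT_POOL = "east-us-pool"


def pick_pool(country_code):
    """Return the backend pool for an ISO 3166-1 alpha-2 country code."""
    code = country_code.upper()
    for pool, countries in GEO_RULES.items():
        if code in countries:
            return pool
    return DEFAULT_POOL


print(pick_pool("fr"))  # europe-pool
print(pick_pool("JP"))  # east-us-pool (fallback)
```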

Weighted Round-Robin

{
  "backendPools": [
    {
      "name": "weighted-pool",
      "backends": [
        {
          "address": "region-a.azurewebsites.net",
          "weight": 50
        },
        {
          "address": "region-b.azurewebsites.net",
          "weight": 30
        },
        {
          "address": "region-c.azurewebsites.net",
          "weight": 20
        }
      ]
    }
  ]
}
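
As a rough guide to what these weights mean, the Python sketch below converts them into expected traffic shares and shows how the shares shift when one backend drops out. It assumes traffic is split purely in proportion to weight and ignores any latency-based selection the load balancer may apply first; `traffic_shares` is a hypothetical helper, not part of any Azure SDK.

```python
def traffic_shares(weights, healthy=None):
    """Return each backend's expected share of requests.

    weights: mapping of backend address -> weight
    healthy: optional set of addresses currently passing health probes;
             unhealthy backends get no traffic and the rest is renormalized.
    """
    active = {b: w for b, w in weights.items() if healthy is None or b in healthy}
    total = sum(active.values())
    if total == 0:
        raise ValueError("no healthy backend with a positive weight")
    return {b: w / total for b, w in active.items()}


weights = {
    "region-a.azurewebsites.net": 50,
    "region-b.azurewebsites.net": 30,
    "region-c.azurewebsites.net": 20,
}

# All regions healthy: 50% / 30% / 20%
print(traffic_shares(weights))

# Region A down: the remaining regions absorb its share (60% / 40%)
survivors = {"region-b.azurewebsites.net", "region-c.azurewebsites.net"}
print(traffic_shares(weights, healthy=survivors))
```

The second call is the one to size for: each region needs enough headroom to carry its renormalized share when another region fails.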

Failover Strategies

Automatic Failover Configuration

{
  "healthProbeSettings": {
    "name": "probe",
    "intervalInSeconds": 10,
    "path": "/health",
    "protocol": "Https",
    "timeoutInSeconds": 5,
    "unhealthyThreshold": 3
  },
  "loadBalancingSettings": {
    "name": "loadbalancer",
    "sampleSize": 10,
    "successfulSamplesRequired": 3
  }
}
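
To see how `sampleSize` and `successfulSamplesRequired` interact, the Python sketch below simulates a sliding window of probe results. It assumes a backend counts as healthy while at least `successfulSamplesRequired` of the last `sampleSize` probes succeeded, and that the window starts full of successes. The `ProbeWindow` class is hypothetical and does not model the `unhealthyThreshold` setting.

```python
from collections import deque


class ProbeWindow:
    """Sliding window over the most recent health-probe results."""

    def __init__(self, sample_size, successful_samples_required):
        if not 1 <= successful_samples_required <= sample_size:
            raise ValueError("required successes must be between 1 and sample_size")
        self.required = successful_samples_required
        # Assume the backend starts with a full window of successful probes.
        self.samples = deque([True] * sample_size, maxlen=sample_size)

    def record(self, success):
        """Add the newest probe result, dropping the oldest one."""
        self.samples.append(success)

    @property
    def healthy(self):
        return sum(self.samples) >= self.required


def probes_until_unhealthy(sample_size, successful_samples_required):
    """Count consecutive failed probes before a healthy backend is ejected."""
    window = ProbeWindow(sample_size, successful_samples_required)
    failures = 0
    while window.healthy:
        window.record(False)
        failures += 1
    return failures


failed = probes_until_unhealthy(sample_size=10, successful_samples_required=3)
print(failed, "failed probes, about", failed * 10, "seconds at a 10 s interval")
# 8 failed probes, about 80 seconds at a 10 s interval
```

Under these assumptions, the settings above tolerate short blips but take over a minute to eject a region that is fully down. Raising `successfulSamplesRequired` or shrinking `sampleSize` shortens detection at the cost of more sensitivity to transient failures.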

Manual Failover with Traffic Manager

# Disable endpoint in failed region
# (--type is required; azureEndpoints is assumed here for App Service targets)
az network traffic-manager endpoint update \
  --profile-name my-tm \
  --resource-group my-rg \
  --name eastus-endpoint \
  --type azureEndpoints \
  --endpoint-status Disabled

# Enable standby region
az network traffic-manager endpoint update \
  --profile-name my-tm \
  --resource-group my-rg \
  --name standby-endpoint \
  --type azureEndpoints \
  --endpoint-status Enabled

Graceful Degradation

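using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;

public class HealthCheckFunction
{
    // Anonymous access so that load balancer probes (GET or HEAD) can reach the endpoint
    [FunctionName("HealthCheck")]
    public async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Anonymous, "get", "head")] HttpRequest req)
    {
        // Run the dependency checks concurrently to stay within the probe timeout.
        // The Check*Async helpers and HealthCheckResult (with IsHealthy) are
        // application-defined and not shown here.
        var checks = await Task.WhenAll(
            CheckDatabaseAsync(),
            CheckServiceBusAsync(),
            CheckStorageAsync());

        var isHealthy = checks.All(c => c.IsHealthy);

        // A 503 signals the load balancer to take this region out of rotation
        return isHealthy
            ? new OkObjectResult(new { status = "healthy" })
            : new StatusCodeResult(503);
    }
}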
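
The function above reports the region as unhealthy as soon as any dependency fails. One possible refinement, sketched here in Python for brevity, is to separate critical dependencies from optional ones so that a region keeps serving traffic in a degraded state. Which dependencies count as critical is an assumption of this example, and `summarize_health` is a hypothetical helper.

```python
def summarize_health(results, critical):
    """Map dependency check results to an HTTP status code and a label.

    results:  mapping of dependency name -> True if the check passed
    critical: set of dependency names the region cannot serve without
    """
    failed = {name for name, ok in results.items() if not ok}
    if failed & critical:
        return 503, "unhealthy"   # take the region out of rotation
    if failed:
        return 200, "degraded"    # keep serving, but raise an alert
    return 200, "healthy"


# Assumption for this sketch: only the database is critical.
CRITICAL = {"database"}

print(summarize_health({"database": True, "servicebus": True, "storage": True}, CRITICAL))
# (200, 'healthy')
print(summarize_health({"database": True, "servicebus": False, "storage": True}, CRITICAL))
# (200, 'degraded')
print(summarize_health({"database": False, "servicebus": True, "storage": True}, CRITICAL))
# (503, 'unhealthy')
```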

Cost Optimization

Cost Comparison

┌─────────────────────────────────────────────────────────────────────┐
│                 ACTIVE-ACTIVE COST ESTIMATES                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Components:                                                       │
│   ───────────                                                       │
│   ✓ Azure Front Door: ~$25/month + $0.007/requests                  │
│   ✓ App Service (3x): ~$150/month (Basic tier)                      │
│   ✓ Cosmos DB (3 regions): ~$100/month (100 RU/s)                   │
│   ✓ Storage (3 regions): ~$30/month                                 │
│   ✓ Additional services: ~$50/month                                 │
│   ───────────────────                                               │
│   Total: ~$355/month                                                │
│                                                                     │
│   vs Active-Passive:                                                │
│   ─────────────────                                                 │
│   Active-Active: ~$355/month (all regions active)                   │
│   Active-Passive: ~$250/month (primary only, standby idle)          │
│   Premium for near-zero RTO: ~$105/month additional                 │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
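
The arithmetic behind the table can be checked with a few lines of Python. The figures are the rough estimates from the table itself, not current Azure prices, and the per-request Front Door charge is left out because it depends on traffic volume.

```python
# Monthly estimates in USD, copied from the table above.
# Front Door's per-request charge is excluded.
components = {
    "Azure Front Door (base)": 25,
    "App Service (3x, Basic tier)": 150,
    "Cosmos DB (3 regions)": 100,
    "Storage (3 regions)": 30,
    "Additional services": 50,
}
ACTIVE_PASSIVE_TOTAL = 250

active_active_total = sum(components.values())
premium = active_active_total - ACTIVE_PASSIVE_TOTAL
premium_pct = 100 * premium / ACTIVE_PASSIVE_TOTAL

print(f"Active-active total: ${active_active_total}/month")    # $355/month
print(f"Premium: ${premium}/month ({premium_pct:.0f}% more)")   # $105/month (42% more)
```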

Cost Optimization Strategies

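# Function apps on the Consumption plan scale automatically with load,
# which keeps lightly used regions cheap.

# For apps on a dedicated App Service plan, attach an autoscale setting to the plan
# (autoscale requires the Standard tier or higher; repeat for each regional plan)
az monitor autoscale create \
  --resource-group my-rg \
  --resource my-app-plan \
  --resource-type Microsoft.Web/serverfarms \
  --name "scale-policy" \
  --min-count 1 \
  --max-count 10 \
  --count 1

# Scale out by one instance when average CPU exceeds 70% over 5 minutes
az monitor autoscale rule create \
  --resource-group my-rg \
  --autoscale-name "scale-policy" \
  --condition "CpuPercentage > 70 avg 5m" \
  --scale out 1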

Best Practices

Implementation Checklist

Practice            Description
──────────────────  ──────────────────────────────────────────
Health endpoints    Implement comprehensive /health checks
Connection strings  Use region-specific endpoints
Session state       Use distributed cache (Redis) or Cosmos DB
Monitoring          Configure alerts for each region
Testing             Regularly test failover scenarios
Documentation       Document failover procedures

Monitoring Dashboard

{
  "dashboard": {
    "widgets": [
      {
        "type": "Metric",
        "title": "Request Count by Region",
        "metrics": ["Frontend Request Count"],
        "filter": "Backend"
      },
      {
        "type": "Metric",
        "title": "Latency by Region",
        "metrics": ["Backend Request Latency"]
      },
      {
        "type": "Metric",
        "title": "Health Check Status",
        "metrics": ["Health Probe Failure Count"]
      }
    ]
  }
}
