Azure Service Bus — Geo-Disaster Recovery Setup

Primary/Secondary Pairing, Alias Endpoint, Failover


Introduction

Geo-disaster recovery (Geo-DR) in Azure Service Bus provides business continuity during regional outages. By pairing primary and secondary namespaces, you can ensure your messaging infrastructure remains available even when an entire Azure region experiences issues.

Key capabilities:

  • Metadata replication — Queue, topic, subscription, and rule definitions sync to the secondary
  • Alias endpoint — Clients connect via the alias, not a specific namespace
  • Manual failover — You initiate failover yourself; Azure does not trigger it automatically
  • Metadata only — Messages are not replicated; messages left in the old primary stay there until it is reachable again

Prerequisites

Tier Requirements

Geo-DR is available only on Premium tier:

  • Supports geo-pairing within same subscription
  • Cross-region pairing supported
  • No additional cost for pairing (pay for secondary namespace)

Network Requirements

  • Both namespaces must be in the same subscription
  • Premium tier required for both
  • Virtual network rules don't replicate (reconfigure after failover)
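
The tier requirement above can be checked before pairing. The following is a minimal sketch, assuming the Azure CLI is installed and logged in; `check_geo_dr_prereqs` is a hypothetical helper name, not an Azure CLI command.

```shell
# Sketch: confirm both namespaces are Premium before attempting to pair them.
# check_geo_dr_prereqs is a hypothetical helper, not part of the Azure CLI.
check_geo_dr_prereqs() {
  rg="$1"; primary="$2"; secondary="$3"

  for ns in "$primary" "$secondary"; do
    sku=$(az servicebus namespace show \
      --resource-group "$rg" --name "$ns" \
      --query sku.name --output tsv) || return 1
    if [ "$sku" != "Premium" ]; then
      echo "ERROR: $ns is on the $sku tier; Geo-DR requires Premium" >&2
      return 1
    fi
  done

  loc1=$(az servicebus namespace show \
    --resource-group "$rg" --name "$primary" \
    --query location --output tsv) || return 1
  loc2=$(az servicebus namespace show \
    --resource-group "$rg" --name "$secondary" \
    --query location --output tsv) || return 1

  # Same-region pairs give no protection against a regional outage.
  if [ "$loc1" = "$loc2" ]; then
    echo "WARN: both namespaces are in $loc1" >&2
  fi
  echo "OK: $primary ($loc1) and $secondary ($loc2) are both Premium"
}
```

Usage: `check_geo_dr_prereqs my-rg primary-namespace secondary-namespace`.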

Architecture Overview

┌───────────────────────────────────────────────────────────────────┐
│                     BEFORE FAILOVER                               │
│                                                                   │
│  ┌──────────────────┐                         ┌──────────────────┐│
│  │   Primary NS     │                         │   Secondary NS   ││
│  │   (East US)      │ ──── Replicates ────▶   │   (West Europe)  ││
│  │                  │      (async)            │                  ││
│  │  Queue: orders   │ ────────────────────▶   │  Queue: orders   ││
│  │  Topic: events   │ ────────────────────▶   │  Topic: events   ││
│  └────────┬─────────┘                         └────────┬─────────┘│
│           │                                            │          │
│           │                                            │          │
│           ▼                                            ▼          │
│  ┌──────────────────────────────────────────────────────────────┐ │
│  │                    ALIAS: shared-namespace                   │ │
│  │              (Client connects via alias)                     │ │
│  └──────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘

┌───────────────────────────────────────────────────────────────────┐
│                     AFTER FAILOVER                                │
│                                                                   │
│  ┌──────────────────┐                         ┌──────────────────┐│
│  │   Primary NS     │      INACTIVE           │   Secondary NS   ││
│  │   (East US)      │     (Standby)           │   (West Europe)  ││
│  │                  │                         │    ACTIVE        ││
│  │  Queue: orders   │                         │  Queue: orders   ││
│  │  Topic: events   │                         │  Topic: events   ││
│  └──────────────────┘                         └────────┬─────────┘│
│                                                        │          │
│                                                        ▼          │
│  ┌──────────────────────────────────────────────────────────────┐ │
│  │                    ALIAS: shared-namespace                   │ │
│  │              (Now points to West Europe)                     │ │
│  └──────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘

Setup Geo-DR

Step 1: Create Namespaces

# Create primary namespace (East US)
az servicebus namespace create \
  --name primary-namespace \
  --resource-group my-rg \
  --location eastus \
  --sku Premium \
  --capacity 1

# Create secondary namespace (West Europe)
az servicebus namespace create \
  --name secondary-namespace \
  --resource-group my-rg \
  --location westeurope \
  --sku Premium \
  --capacity 1

Step 2: Create Geo-Pairing

# Create the geo-pairing (alias); run this against the primary namespace
az servicebus georecovery-alias set \
  --resource-group my-rg \
  --namespace-name primary-namespace \
  --alias my-alias \
  --partner-namespace "/subscriptions/xxx/resourceGroups/my-rg/providers/Microsoft.ServiceBus/namespaces/secondary-namespace"
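
Rather than typing the partner namespace's resource ID by hand, it can be looked up first. A minimal sketch, assuming the secondary namespace already exists; `pair_namespaces` is a hypothetical helper name.

```shell
# Sketch: resolve the secondary namespace's ARM ID, then create the pairing.
# pair_namespaces is a hypothetical helper, not part of the Azure CLI.
pair_namespaces() {
  rg="$1"; primary="$2"; secondary="$3"; alias_name="$4"

  # The ARM resource ID of the namespace that will become the secondary
  partner_id=$(az servicebus namespace show \
    --resource-group "$rg" --name "$secondary" \
    --query id --output tsv) || return 1

  az servicebus georecovery-alias set \
    --resource-group "$rg" \
    --namespace-name "$primary" \
    --alias "$alias_name" \
    --partner-namespace "$partner_id"
}
```

Usage: `pair_namespaces my-rg primary-namespace secondary-namespace my-alias`.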

Step 3: Verify Pairing

# Check pairing status
az servicebus georecovery-alias show \
  --resource-group my-rg \
  --namespace-name primary-namespace \
  --alias my-alias

# Output (abridged):
# {
#   "name": "my-alias",
#   "partnerNamespace": "/subscriptions/xxx/resourceGroups/my-rg/providers/Microsoft.ServiceBus/namespaces/secondary-namespace",
#   "provisioningState": "Succeeded",
#   "role": "Primary"
# }
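
Pairing is not instantaneous, so a script may need to poll until `provisioningState` reports `Succeeded`. A minimal sketch with a bounded number of attempts; `wait_for_pairing` and its default attempt count and delay are illustrative assumptions.

```shell
# Sketch: poll the alias until the pairing has finished provisioning.
# wait_for_pairing is a hypothetical helper; attempts and delay are arbitrary defaults.
wait_for_pairing() {
  rg="$1"; ns="$2"; alias_name="$3"
  attempts="${4:-30}"; delay="${5:-10}"

  i=0
  while [ "$i" -lt "$attempts" ]; do
    state=$(az servicebus georecovery-alias show \
      --resource-group "$rg" --namespace-name "$ns" --alias "$alias_name" \
      --query provisioningState --output tsv) || return 1
    case "$state" in
      Succeeded) echo "Pairing ready"; return 0 ;;
      Failed)    echo "Pairing failed" >&2; return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done

  echo "Timed out waiting for pairing" >&2
  return 1
}
```

Usage: `wait_for_pairing my-rg primary-namespace my-alias`.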

Configuring Clients

Using the Alias Connection String

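// Get the alias connection strings through the management plane.
// Type names are from the Azure.ResourceManager.ServiceBus package; check them against your installed version.
// Requires: using Azure.Identity; using Azure.ResourceManager; using Azure.ResourceManager.ServiceBus;
var armClient = new ArmClient(new DefaultAzureCredential());

// "xxx" is the subscription ID placeholder used throughout this guide
var ruleId = ServiceBusDisasterRecoveryAuthorizationRuleResource.CreateResourceIdentifier(
    "xxx", "my-rg", "primary-namespace", "my-alias", "RootManageSharedAccessKey");
var keys = (await armClient
    .GetServiceBusDisasterRecoveryAuthorizationRuleResource(ruleId)
    .GetKeysAsync()).Value;

// Both strings use the alias endpoint; they differ only in which access key they carry
var primaryConnectionString = keys.AliasPrimaryConnectionString;
var secondaryConnectionString = keys.AliasSecondaryConnectionString;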
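
The alias connection string can also be read with the Azure CLI. A minimal sketch, assuming the default `RootManageSharedAccessKey` rule is used; `get_alias_connection_string` is a hypothetical helper name.

```shell
# Sketch: print the alias connection string for a given authorization rule.
# get_alias_connection_string is a hypothetical helper, not part of the Azure CLI.
get_alias_connection_string() {
  rg="$1"; ns="$2"; alias_name="$3"
  rule="${4:-RootManageSharedAccessKey}"

  az servicebus georecovery-alias authorization-rule keys list \
    --resource-group "$rg" \
    --namespace-name "$ns" \
    --alias "$alias_name" \
    --name "$rule" \
    --query aliasPrimaryConnectionString \
    --output tsv
}
```

Usage: `get_alias_connection_string my-rg primary-namespace my-alias`. The returned string contains an access key, so treat it as a secret.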

Client Connection

// Requires: using Azure.Messaging.ServiceBus;
// Use the alias connection string so clients follow the alias when it is repointed
var connectionString = "Endpoint=sb://my-alias.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=xxx";

await using var client = new ServiceBusClient(connectionString);
await using var processor = client.CreateProcessor("orders", new ServiceBusProcessorOptions
{
    AutoCompleteMessages = true,
    MaxConcurrentCalls = 10
});

// What happens around a failover:
// - Before failover, the alias resolves to the primary namespace
// - After failover, the alias resolves to the secondary; existing connections drop and clients reconnect
// - The client's retry policy covers transient errors while the alias is repointed

Configuration Example

{
  "ConnectionStrings": {
    "ServiceBus": {
      "Alias": "my-alias",
      "Primary": "sb://primary-namespace.servicebus.windows.net/",
      "Secondary": "sb://secondary-namespace.servicebus.windows.net/"
    }
  }
}

Initiating Failover

Manual Failover

# Trigger failover to secondary; the command must target the SECONDARY namespace
az servicebus georecovery-alias fail-over \
  --resource-group my-rg \
  --namespace-name secondary-namespace \
  --alias my-alias
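
Failover is issued against the secondary namespace and cannot be undone in place, so a guard that checks the target's role first can prevent a mistyped namespace from doing damage. A minimal sketch; `safe_failover` is a hypothetical helper name.

```shell
# Sketch: refuse to fail over unless the target namespace currently holds the Secondary role.
# safe_failover is a hypothetical helper, not part of the Azure CLI.
safe_failover() {
  rg="$1"; secondary="$2"; alias_name="$3"

  role=$(az servicebus georecovery-alias show \
    --resource-group "$rg" --namespace-name "$secondary" --alias "$alias_name" \
    --query role --output tsv) || return 1

  if [ "$role" != "Secondary" ]; then
    echo "Refusing to fail over: $secondary has role '$role', expected 'Secondary'" >&2
    return 1
  fi

  az servicebus georecovery-alias fail-over \
    --resource-group "$rg" \
    --namespace-name "$secondary" \
    --alias "$alias_name"
}
```

Usage: `safe_failover my-rg secondary-namespace my-alias`.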

Programmatic Failover

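using Azure.Identity;
using Azure.ResourceManager;
using Azure.ResourceManager.ServiceBus;
using Azure.ResourceManager.ServiceBus.Models;

// Uses the Azure.ResourceManager.ServiceBus package; check type names against your installed version
public class DisasterRecoveryManager
{
    private readonly ArmClient _armClient = new ArmClient(new DefaultAzureCredential());

    // Failover is issued against the SECONDARY namespace of the pairing
    public async Task FailOverAsync(
        string subscriptionId, string resourceGroup, string secondaryNamespace, string alias)
    {
        var pairing = _armClient.GetServiceBusDisasterRecoveryResource(
            ServiceBusDisasterRecoveryResource.CreateResourceIdentifier(
                subscriptionId, resourceGroup, secondaryNamespace, alias));

        // Check current status
        var current = (await pairing.GetAsync()).Value.Data;
        Console.WriteLine($"Current role: {current.Role}");
        Console.WriteLine($"Partner: {current.PartnerNamespace}");

        // Initiate failover
        await pairing.FailOverAsync();

        // Wait for failover to complete, giving up after ten minutes
        await WaitForFailoverCompletionAsync(pairing, TimeSpan.FromMinutes(10));

        Console.WriteLine("Failover completed");
    }

    private static async Task WaitForFailoverCompletionAsync(
        ServiceBusDisasterRecoveryResource pairing, TimeSpan timeout)
    {
        var deadline = DateTimeOffset.UtcNow + timeout;
        while (DateTimeOffset.UtcNow < deadline)
        {
            var data = (await pairing.GetAsync()).Value.Data;

            // Done once the former secondary no longer reports the Secondary role
            if (data.ProvisioningState == ServiceBusDisasterRecoveryProvisioningState.Succeeded &&
                data.Role != ServiceBusDisasterRecoveryRole.Secondary)
            {
                return;
            }

            await Task.Delay(TimeSpan.FromSeconds(5));
        }

        throw new TimeoutException("Failover did not complete in time");
    }
}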

Post-Failover Configuration

Reconfigure Virtual Networks

# Apply VNet rules to new primary (secondary namespace)
# Older Azure CLI versions exposed this as "az servicebus namespace network-rule add"
az servicebus namespace network-rule-set virtual-network-rule add \
  --namespace-name secondary-namespace \
  --resource-group my-rg \
  --subnet id="/subscriptions/xxx/resourceGroups/my-rg/providers/Microsoft.Network/virtualNetworks/my-vnet/subnets/function-subnet"

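Update Application Settings

Clients that connect through the alias need no change after failover. Update only settings that name a namespace directly, such as management scripts or dashboards: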

{
  "AppSettings": {
    "ServiceBus": {
      "Namespace": "secondary-namespace",
      "UpdatedAfterFailover": "2024-01-15T10:30:00Z"
    }
  }
}

Monitoring and Alerts

Azure Monitor Metrics

# Create alert on server-side errors for the primary namespace
az monitor metrics alert create \
  --name "servicebus-health-alert" \
  --resource-group my-rg \
  --scopes "/subscriptions/xxx/resourceGroups/my-rg/providers/Microsoft.ServiceBus/namespaces/primary-namespace" \
  --description "Alert on Service Bus server errors" \
  --condition "total ServerErrors > 0" \
  --window-size 5m

Log Analytics Query

// Check for failover events
AzureDiagnostics
| where TimeGenerated > ago(1d)
| where OperationName == "Geo_DR"
| project TimeGenerated, Category, OperationName, Status

Failback Process

After Primary Region Restores

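# Failback: failover breaks the pairing, so first re-pair with the original
# primary as the new secondary, then fail over to it
az servicebus georecovery-alias set \
  --resource-group my-rg --namespace-name secondary-namespace --alias my-alias \
  --partner-namespace "/subscriptions/xxx/resourceGroups/my-rg/providers/Microsoft.ServiceBus/namespaces/primary-namespace"

az servicebus georecovery-alias fail-over \
  --resource-group my-rg \
  --namespace-name primary-namespace \
  --alias my-alias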

Recommended Sequence

  1. Verify primary is healthy — Check namespace status
  2. Pause producers — Stop sending new messages
  3. Wait for processing — Allow pending messages to process
  4. Re-pair and fail over — Pair with the original primary as the new secondary, then trigger failover to it
  5. Verify — Confirm alias points to primary
  6. Resume — Start producing messages again
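
Step 3 of the sequence above can be scripted by polling a queue's active message count until it reaches zero. A minimal sketch, assuming producers are already paused; `wait_for_queue_drain` and its default attempt count and delay are illustrative assumptions.

```shell
# Sketch: wait until a queue on the currently active namespace has no active messages.
# wait_for_queue_drain is a hypothetical helper, not part of the Azure CLI.
wait_for_queue_drain() {
  rg="$1"; ns="$2"; queue="$3"
  attempts="${4:-60}"; delay="${5:-5}"

  i=0
  while [ "$i" -lt "$attempts" ]; do
    active=$(az servicebus queue show \
      --resource-group "$rg" --namespace-name "$ns" --name "$queue" \
      --query countDetails.activeMessageCount --output tsv) || return 1
    if [ "$active" = "0" ]; then
      echo "$queue drained"
      return 0
    fi
    echo "$queue: $active active messages remaining" >&2
    i=$((i + 1))
    sleep "$delay"
  done

  echo "Timed out waiting for $queue to drain" >&2
  return 1
}
```

Usage: `wait_for_queue_drain my-rg secondary-namespace orders`. Dead-lettered and scheduled messages are counted separately and are not covered by this check.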

Best Practices

Practice          Description
----------------  ------------------------------------
Use Premium tier  Geo-DR requires Premium
Use aliases       Aliases enable seamless failover
Test failover     Regularly test your DR procedures
Document process  Have runbook for failover steps
Monitor           Set up alerts for failover events
Plan for RPO      Understand data loss during failover

Important Considerations

What Replicates

Replicates          Does Not Replicate
------------------  ---------------------
Queue definitions   Message content
Topic definitions   Virtual network rules
Subscription rules  Access policies
Message types       Custom domains
Schema              Private endpoints

Recovery Point Objective (RPO)

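  • Queue/Topic messages — Not replicated; messages in the primary at failover time remain there until that namespace is reachable again
  • Metadata — Replicated asynchronously, so changes made shortly before failover may not have reached the secondary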
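
The alias resource exposes a `pendingReplicationOperationsCount` field, which gives a rough view of metadata replication lag before a planned failover. A minimal sketch; `report_replication_lag` is a hypothetical helper name, and treating an empty or `None` value as "unknown" is an assumption about how the CLI renders a null field.

```shell
# Sketch: report whether metadata replication has outstanding operations.
# report_replication_lag is a hypothetical helper, not part of the Azure CLI.
# Returns 0 when in sync, 1 when operations are pending, 2 when the count is unavailable.
report_replication_lag() {
  rg="$1"; ns="$2"; alias_name="$3"

  pending=$(az servicebus georecovery-alias show \
    --resource-group "$rg" --namespace-name "$ns" --alias "$alias_name" \
    --query pendingReplicationOperationsCount --output tsv) || return 2

  case "$pending" in
    0)
      echo "Metadata in sync"
      return 0 ;;
    ""|None)
      # Assumption: a null field is rendered as empty or "None" in tsv output
      echo "Pending replication count not reported"
      return 2 ;;
    *)
      echo "$pending replication operations pending"
      return 1 ;;
  esac
}
```

Usage: `report_replication_lag my-rg primary-namespace my-alias`.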
