Azure Service Bus — KEDA: Kubernetes Event-Driven Autoscaling

Kubernetes Pod Autoscaling via Service Bus Trigger


Introduction

KEDA (Kubernetes Event-Driven Autoscaling) enables event-driven autoscaling for containerized applications in Kubernetes. Rather than relying only on CPU or memory utilization, KEDA scales pods on external event sources such as the number of messages waiting in a queue, so the replica count follows the backlog your workload actually has to process.

This is particularly valuable for:

  • Message processing workloads — Scale based on queue depth
  • Burst handling — Rapidly scale up during high load
  • Cost optimization — Scale down to zero when idle
  • Serverless-like behavior — Pay only for what you use

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                    Kubernetes Cluster                           │
│                                                                 │
│   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐      │
│   │    Pod 1     │    │    Pod 2     │    │    Pod N     │      │
│   │ (Processing) │    │ (Processing) │    │ (Processing) │      │
│   └──────────────┘    └──────────────┘    └──────────────┘      │
│          │                   │                   │              │
│          └───────────────────┴───────────────────┘              │
│                              │                                  │ 
│                              ▼                                  │
│              ┌─────────────────────────────┐                    │
│              │      KEDA ScaledObject      │                    │
│              │  scaleTargetRef: order-app  │                    │
│              │  triggers: ServiceBus       │                    │
│              │  queueName: orders          │                    │
│              │  messageCount: 5            │                    │
│              └─────────────────────────────┘                    │
│                              │                                  │
│                              ▼                                  │
│   ┌──────────────────────────────────────────────────────┐      │
│   │              Azure Service Bus                       │      │
│   │                                                      │      │
│   │     orders queue ──────────────────▶ 150 messages    │      │
│   │                                                      │      │
│   └──────────────────────────────────────────────────────┘      │
└─────────────────────────────────────────────────────────────────┘
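
The numbers in the diagram can be worked through by hand. The sketch below assumes KEDA hands the queue length to the Horizontal Pod Autoscaler as an average-value target, so the desired replica count is roughly ceil(queueLength / messageCount), bounded by maxReplicaCount. The function name desired_replicas and the cap of 10 are illustrative and not part of KEDA; the real HPA also applies tolerances and stabilization windows that this ignores.

```shell
#!/usr/bin/env sh
# Estimate the replica count for a queue-length trigger.
# Usage: desired_replicas <queue_length> <message_count> <max_replicas>
desired_replicas() (
  queue_length=$1
  message_count=$2
  max_replicas=$3
  # Integer ceiling division: ceil(queue_length / message_count)
  wanted=$(( (queue_length + message_count - 1) / message_count ))
  # Never exceed maxReplicaCount
  if [ "$wanted" -gt "$max_replicas" ]; then
    wanted=$max_replicas
  fi
  echo "$wanted"
)

# 150 messages with messageCount 5 asks for 30 pods; a cap of 10 wins.
desired_replicas 150 5 10   # prints 10
desired_replicas 12 5 10    # prints 3
```

With the diagram's values, the queue would need to drop to 50 messages or fewer before the estimate falls below the cap of 10.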

Installing KEDA

Step 1: Install KEDA using Helm

# Add KEDA Helm repository
helm repo add kedacore https://kedacore.github.io/charts
helm repo update

# Install KEDA in keda namespace
kubectl create namespace keda
helm install keda kedacore/keda --version 2.10.0 --namespace keda

# Verify installation
kubectl get pods -n keda
# Expect keda-operator-xxxxx and keda-operator-metrics-apiserver-xxxxx with STATUS Running

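Alternative: Install KEDA using the OLM Operator

# keda.yaml
# Assumes the KEDA OLM operator is already installed (for example from OperatorHub).
# The operator deploys KEDA when it sees a KedaController named "keda" in the "keda" namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: keda
---
apiVersion: keda.sh/v1alpha1
kind: KedaController
metadata:
  name: keda
  namespace: keda
spec:
  # An empty string watches all namespaces.
  # The polling frequency is set per ScaledObject via pollingInterval, not here.
  watchNamespace: ""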
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: keda-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: keda-operator
subjects:
- kind: ServiceAccount
  name: keda-operator
  namespace: keda

Configuring Service Bus Scaling

Option 1: Using Azure Workload Identity

Create Managed Identity

# Create user-assigned managed identity
az identity create \
  --name keda-sb-identity \
  --resource-group my-rg

# Get the identity's client ID
IDENTITY_CLIENT_ID=$(az identity show \
  --name keda-sb-identity \
  --resource-group my-rg \
  --query clientId -o tsv)

# Get the identity's resource ID
IDENTITY_RESOURCE_ID=$(az identity show \
  --name keda-sb-identity \
  --resource-group my-rg \
  --query id -o tsv)

# Assign Service Bus Data Receiver role to the identity
az role assignment create \
  --assignee $IDENTITY_CLIENT_ID \
  --scope /subscriptions/xxx/resourceGroups/my-rg/providers/Microsoft.ServiceBus/namespaces/my-namespace \
  --role "Azure Service Bus Data Receiver"
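
Workload identity also needs a federated credential that lets the KEDA operator's service account obtain tokens for the managed identity; the commands above do not create one. The sketch below builds the subject string in the system:serviceaccount:<namespace>:<name> form and prints an az identity federated-credential create command for review rather than running it. It assumes KEDA runs as the keda-operator service account in the keda namespace. The credential name and the issuer placeholder are hypothetical; replace the placeholder with your cluster's OIDC issuer URL.

```shell
#!/usr/bin/env sh
# Assumed values: adjust to match your KEDA installation.
KEDA_NAMESPACE="keda"
KEDA_SERVICE_ACCOUNT="keda-operator"

# Subject format used for Kubernetes service account tokens.
federated_subject() (
  printf 'system:serviceaccount:%s:%s' "$1" "$2"
)

SUBJECT=$(federated_subject "$KEDA_NAMESPACE" "$KEDA_SERVICE_ACCOUNT")

# Print the command for review instead of executing it.
# <aks-oidc-issuer-url> is a placeholder, not a real value.
cat <<EOF
az identity federated-credential create \\
  --name keda-federated-credential \\
  --identity-name keda-sb-identity \\
  --resource-group my-rg \\
  --issuer "<aks-oidc-issuer-url>" \\
  --subject "$SUBJECT" \\
  --audiences api://AzureADTokenExchange
EOF
```

If the subject does not match the namespace and service account KEDA actually runs under, token exchange fails and the scaler reports authentication errors.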

Create TriggerAuthentication

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: servicebus-trigger-auth
  namespace: my-app
spec:
  podIdentity:
    provider: azure-workload
    # Client ID of the managed identity (the value of $IDENTITY_CLIENT_ID above)
    identityId: <identity-client-id>

Create ScaledObject

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaled
  namespace: my-app
spec:
  scaleTargetRef:
    name: order-processor  # Must match your Deployment name
  pollingInterval: 15  # Check queue every 15 seconds
  cooldownPeriod: 300  # Wait 5 minutes before scaling down
  minReplicaCount: 0   # Scale to zero when idle
  maxReplicaCount: 10 # Maximum 10 pods

  triggers:
  - type: azure-servicebus
    metadata:
      # Service Bus namespace name; required when authenticating with
      # pod identity instead of a connection string
      namespace: my-namespace
      # Queue name to monitor
      queueName: orders
      # Target number of messages per replica
      messageCount: "5"
    authenticationRef:
      name: servicebus-trigger-auth

Option 2: Using Connection String (Secrets)

Create Secret

apiVersion: v1
kind: Secret
metadata:
  name: servicebus-secrets
  namespace: my-app
type: Opaque
data:
  ServiceBusConnection: <base64-encoded-connection-string>
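
The data field of a Secret expects base64-encoded values. A minimal sketch of producing one follows; the connection string is a placeholder, and the policy name keda-policy is hypothetical.

```shell
#!/usr/bin/env sh
# Placeholder connection string; substitute your own value.
CONNECTION_STRING='Endpoint=sb://my-namespace.servicebus.windows.net/;SharedAccessKeyName=keda-policy;SharedAccessKey=REPLACE_ME'

# printf avoids the trailing newline that echo would add;
# tr strips the line wrapping some base64 implementations insert.
ENCODED=$(printf '%s' "$CONNECTION_STRING" | base64 | tr -d '\n')
echo "$ENCODED"

# Round-trip check: decoding must give back the original string.
DECODED=$(printf '%s' "$ENCODED" | base64 -d)
if [ "$DECODED" = "$CONNECTION_STRING" ]; then
  echo "round trip ok"
fi
```

Alternatively, kubectl create secret generic servicebus-secrets --from-literal=ServiceBusConnection='...' -n my-app performs the encoding for you.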

Create TriggerAuthentication

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: servicebus-trigger-auth
  namespace: my-app
spec:
  secretTargetRef:
  - parameter: connection
    name: servicebus-secrets
    key: ServiceBusConnection

Create ScaledObject

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaled
  namespace: my-app
spec:
  scaleTargetRef:
    name: order-processor  # Must match your Deployment name
  pollingInterval: 15
  cooldownPeriod: 300
  minReplicaCount: 0
  maxReplicaCount: 10

  triggers:
  - type: azure-servicebus
    metadata:
      queueName: orders
      messageCount: "5"
    authenticationRef:
      name: servicebus-trigger-auth

Advanced Scaling Configurations

Multiple Queues

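# Each trigger needs its own authentication reference.
triggers:
- type: azure-servicebus
  metadata:
    queueName: orders
    messageCount: "5"  # Target 5 orders per replica
  authenticationRef:
    name: servicebus-trigger-auth
- type: azure-servicebus
  metadata:
    queueName: urgent-orders
    messageCount: "1"  # Higher priority: one replica per urgent order
  authenticationRef:
    name: servicebus-trigger-auth
- type: azure-servicebus
  metadata:
    queueName: notifications
    messageCount: "100"  # Cheap to process, so a larger backlog per replica is fine
  authenticationRef:
    name: servicebus-trigger-auth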
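
When a ScaledObject lists several triggers, the Horizontal Pod Autoscaler computes a replica count for each metric and uses the largest one. The sketch below illustrates that rule with made-up backlog figures; the function names are illustrative, and replica limits are left out for brevity.

```shell
#!/usr/bin/env sh
# ceil(queue_length / message_count) using integer arithmetic
replicas_for() (
  echo $(( ($1 + $2 - 1) / $2 ))
)

# Largest replica count across "queue_length:message_count" pairs
max_replicas_across() (
  max=0
  for pair in "$@"; do
    length=${pair%%:*}
    target=${pair##*:}
    wanted=$(replicas_for "$length" "$target")
    if [ "$wanted" -gt "$max" ]; then
      max=$wanted
    fi
  done
  echo "$max"
)

# Hypothetical backlog: 40 orders, 3 urgent orders, 250 notifications
max_replicas_across 40:5 3:1 250:100   # prints 8
```

Here the orders queue drives the result (8 replicas), even though the urgent-orders queue has the lowest threshold.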

Topic Subscription

triggers:
- type: azure-servicebus
  metadata:
    topicName: order-events
    subscriptionName: order-processor-sub
    messageCount: "10"  # Target 10 messages per replica
  authenticationRef:
    name: servicebus-trigger-auth

Scaling with Threshold Calculation

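# Advanced: separate the activation threshold from the scaling threshold
triggers:
- type: azure-servicebus
  metadata:
    queueName: orders
    # Scaling from 1 to N: replicas = ceil(queueDepth / 10), i.e. one pod per 10 messages
    messageCount: "10"
    # Scaling from 0 to 1: the scaler activates only when the queue
    # holds MORE than this many messages
    activationMessageCount: "1"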
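
The two thresholds act at different stages: the activation value decides whether the workload leaves zero replicas, and messageCount sizes it once active. The sketch below assumes minReplicaCount is 0 and that activation requires the queue length to be strictly greater than activationMessageCount, which is how KEDA documents activation values. The function name is illustrative.

```shell
#!/usr/bin/env sh
# Usage: replicas_with_activation <queue_length> <message_count> <activation_count> <max_replicas>
replicas_with_activation() (
  queue_length=$1
  message_count=$2
  activation_count=$3
  max_replicas=$4
  if [ "$queue_length" -le "$activation_count" ]; then
    # At or below the activation threshold the workload stays at zero.
    echo 0
  else
    # Once active, size the deployment by the per-replica target.
    wanted=$(( (queue_length + message_count - 1) / message_count ))
    if [ "$wanted" -gt "$max_replicas" ]; then
      wanted=$max_replicas
    fi
    echo "$wanted"
  fi
)

replicas_with_activation 1 10 1 10     # prints 0: a single message does not activate
replicas_with_activation 2 10 1 10     # prints 1
replicas_with_activation 25 10 1 10    # prints 3
```

Set activationMessageCount to "0" if a single waiting message should be enough to start a pod.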

Configuring Your Application

Deployment with Environment Variables

apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-processor
  namespace: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: order-processor
  template:
    metadata:
      labels:
        app: order-processor
    spec:
      containers:
      - name: order-processor
        image: myregistry/order-processor:1.0
        env:
        - name: ServiceBusConnection
          valueFrom:
            secretKeyRef:
              name: servicebus-secrets
              key: ServiceBusConnection
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
          limits:
            cpu: 1000m
            memory: 2Gi

Processing Strategy

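using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.ServiceBus;
using Microsoft.Extensions.Logging;

public class OrderProcessor
{
    private readonly ServiceBusClient _client;
    private readonly ILogger<OrderProcessor> _logger;

    public OrderProcessor(
        ServiceBusClient client,
        ILogger<OrderProcessor> logger)
    {
        _client = client;
        _logger = logger;
    }

    // Messages are settled explicitly below, so set
    // "autoCompleteMessages": false for the Service Bus extension in host.json.
    [FunctionName("OrderProcessor")]
    public async Task ProcessOrdersAsync(
        [ServiceBusTrigger("orders", Connection = "ServiceBusConnection")]
        ServiceBusReceivedMessage[] messages,
        ServiceBusMessageActions messageActions)
    {
        // Each pod processes one batch at a time;
        // KEDA adds or removes pods based on queue depth.
        _logger.LogInformation("Processing {Count} messages on pod {PodId}",
            messages.Length, Environment.MachineName);

        foreach (var message in messages)
        {
            try
            {
                await ProcessMessageAsync(message);
                // Remove the message from the queue after successful processing.
                await messageActions.CompleteMessageAsync(message);
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Failed to process message {MessageId}",
                    message.MessageId);
                // Release the lock so the message can be redelivered.
                await messageActions.AbandonMessageAsync(message);
            }
        }

        _logger.LogInformation("Finished batch of {Count} messages", messages.Length);
    }

    // Placeholder for the business logic; replace with real order handling.
    private Task ProcessMessageAsync(ServiceBusReceivedMessage message)
    {
        return Task.CompletedTask;
    }
}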

Monitoring and Troubleshooting

Verify ScaledObject Status

# Check scaled object
kubectl get scaledobject -n my-app

# Get detailed status
kubectl get scaledobject order-processor-scaled -n my-app -o yaml

# Check scaling metrics
kubectl describe scaledobject order-processor-scaled -n my-app

View KEDA Logs

# KEDA operator logs
kubectl logs -n keda -l app=keda-operator

# KEDA metrics server logs
kubectl logs -n keda -l app=keda-operator-metrics-apiserver

Check HPA Status

# Get HPA (created by KEDA)
kubectl get hpa -n my-app

# Detailed HPA info
# KEDA names the HPA keda-hpa-<ScaledObject name>
kubectl describe hpa keda-hpa-order-processor-scaled -n my-app

Common Issues

Issue                   Solution
---------------------   ----------------------------------------
No scaling              Check identity has RBAC to Service Bus
Authentication errors   Verify managed identity assignment
Not scaling to zero     Check minReplicaCount is 0
Not scaling up          Verify messageCount threshold

Performance Tuning

Optimal Settings

Setting           Recommended Value   Description
---------------   -----------------   ----------------------------------
pollingInterval   15-30 seconds       Balance responsiveness vs. cost
cooldownPeriod    5-15 minutes        Prevent flapping
messageCount      5-20                Depends on message processing time
minReplicaCount   0                   Zero when idle
maxReplicaCount   10-50               Based on max throughput

Calculation Examples

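# messageCount is the target backlog per pod, not for the whole deployment.
# If one pod can process 100 messages/second
# and you accept about 5 seconds of backlog per pod:
# messageCount = 100 * 5 = 500
#
# If throughput is unknown, start with 5 messages per pod as a baseline
# and adjust based on measured processing time.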
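
Because messageCount is a per-replica target, it can be derived from per-pod throughput and the backlog you are willing to hold per pod. The sketch below uses the 100 messages/second and 5 second figures; the backlog of 4200 messages is a made-up input, and the variable names are illustrative.

```shell
#!/usr/bin/env sh
RATE_PER_POD=100            # messages per second one pod can process
TARGET_BACKLOG_SECONDS=5    # acceptable backlog per pod, in seconds
BACKLOG=4200                # hypothetical queue length

# Per-replica target to put in the ScaledObject
MESSAGE_COUNT=$(( RATE_PER_POD * TARGET_BACKLOG_SECONDS ))
echo "messageCount: \"$MESSAGE_COUNT\""

# Pods requested for this backlog: ceil(BACKLOG / MESSAGE_COUNT)
PODS=$(( (BACKLOG + MESSAGE_COUNT - 1) / MESSAGE_COUNT ))
echo "pods: $PODS"

# Rough drain time in seconds if no new messages arrive (rounded up)
TOTAL_RATE=$(( PODS * RATE_PER_POD ))
DRAIN_SECONDS=$(( (BACKLOG + TOTAL_RATE - 1) / TOTAL_RATE ))
echo "drain seconds: $DRAIN_SECONDS"
```

For these inputs the script reports a messageCount of 500, 9 pods, and a drain time of about 5 seconds, which matches the backlog target it was derived from.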

CI/CD Integration

GitHub Actions for KEDA

name: Deploy with KEDA

on:
  push:
    branches: [main]
    paths: ['k8s/**']

jobs:
  deploy:
    runs-on: ubuntu-latest
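    steps:
      # Check out the repository so the k8s/ manifests are available
      - uses: actions/checkout@v4

      - uses: azure/k8s-set-context@v1
        with:
          kubeconfig: ${{ secrets.KUBE_CONFIG }}

      # Apply the Deployment first so the ScaledObject's scale target exists
      - name: Deploy Application
        run: |
          kubectl apply -f k8s/deployment.yaml

      - name: Deploy ScaledObject
        run: |
          kubectl apply -f k8s/scaledobject.yaml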
      
      - name: Verify scaling
        run: |
          sleep 30
          kubectl get scaledobject -n my-app

Best Practices

  1. Use managed identity — Avoid storing connection strings
  2. Set appropriate thresholds — Match processing capacity
  3. Configure cooldown — Prevent rapid scaling oscillation
  4. Monitor metrics — Track queue depth and pod count
  5. Test under load — Verify scaling behavior before production
