Azure Functions Performance Tuning

Optimizing Throughput and Latency

Introduction

Tuning an Azure Functions app means adjusting host.json settings, setting concurrency limits, reusing client connections, and choosing efficient processing patterns. Done well, it raises throughput, cuts latency, and lowers cost, because each instance processes more messages with the same resources.

This guide covers:

host.json structure — Understanding configuration options
Service Bus optimization — Maximizing message throughput
Queue and Blob triggers — Efficient polling and batch processing
Concurrency tuning — Parallel processing optimization
Singleton patterns — Connection reuse strategies
Scaling considerations — Planning for production workloads

Understanding host.json Structure

Complete Configuration Structure

{
  "version": "2.0",
  "extensions": {
    "blobs": { },
    "queues": { },
    "serviceBus": { },
    "eventHub": { },
    "cosmosDB": { },
    "timer": { }
  },
  "logging": {
    "logLevel": { },
    "applicationInsights": { }
  },
  "retry": {
    "strategy": "",
    "maxRetryCount": 0,
    "delayInterval": "",
    "minInterval": "",
    "maxInterval": ""
  },
  "healthMonitor": { },
  "functionTimeout": "",
  "concurrency": { }
}

Extension Bundle Versioning

{
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[4.*, 5.0.0)"
  }
}

Service Bus Performance Optimization

Complete Service Bus Configuration

The settings below use the flat schema of Service Bus extension 5.x, which is the version installed by extension bundle 4.x. Extension 4.x and earlier nested the same options under messageHandlerOptions, sessionHandlerOptions, and batchOptions, and those nested names are ignored by the newer extension.

{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "prefetchCount": 100,
      "autoCompleteMessages": false,
      "maxConcurrentCalls": 32,
      "maxConcurrentSessions": 200,
      "maxAutoLockRenewalDuration": "00:10:00",
      "maxMessageBatchSize": 1000,
      "clientRetryOptions": {
        "tryTimeout": "00:01:00"
      }
    }
  }
}

Configuration Explained

┌─────────────────────────────────────────────────────────────────────┐
│              SERVICE BUS CONFIGURATION OPTIONS                      │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Prefetch Count (prefetchCount)                                    │
│   ─────────────────────────────                                     │
│   What: Number of messages to fetch ahead                           │
│   Why:  Reduces network round trips                                 │
│   Recommended: 100-200 for high throughput                          │
│   Warning: Higher = more memory, more lost on errors                │
│                                                                     │
│   Max Concurrent Calls (maxConcurrentCalls)                         │
│   ───────────────────────────────────────                           │
│   What: Parallel message processing                                 │
│   Why:  Process multiple messages simultaneously                    │
│   Recommended: 8-16 for CPU-bound, 16-32 for I/O-bound              │
│   Warning: Too high can cause memory pressure                       │
│                                                                     │
│   Auto Complete (autoCompleteMessages)                              │
│   ────────────────────                                              │
│   What: Auto-complete after successful processing                   │
│   When to use: Simple workflows, no complex logic                   │
│   When to avoid: Complex multi-step processing                      │
│                                                                     │
│   Max Auto Lock Renewal (maxAutoLockRenewalDuration)                │
│   ──────────────────────                                            │
│   What: Maximum time to renew message lock                          │
│   Why:  Prevents lock expiration during long processing             │
│   Recommended: 5-10 minutes                                         │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
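The prefetch warning can be made concrete with some arithmetic. Prefetched messages are locked as soon as they are fetched, so a buffer that takes longer to work through than the queue's lock duration risks expired locks and redelivery. The sketch below is a rough estimate under a steady-state assumption, not part of any Functions or Service Bus API; `PrefetchCheck`, its method names, and the sample numbers are illustrative.

```csharp
using System;

public static class PrefetchCheck
{
    // Seconds needed to work through a full prefetch buffer, assuming
    // maxConcurrentCalls messages are processed at a time and each one
    // takes avgProcessingSeconds on average.
    public static double SecondsToDrain(int prefetchCount, int maxConcurrentCalls, double avgProcessingSeconds)
    {
        if (prefetchCount < 0) throw new ArgumentOutOfRangeException(nameof(prefetchCount));
        if (maxConcurrentCalls <= 0) throw new ArgumentOutOfRangeException(nameof(maxConcurrentCalls));
        if (avgProcessingSeconds < 0) throw new ArgumentOutOfRangeException(nameof(avgProcessingSeconds));

        // Messages are handled in waves of maxConcurrentCalls (ceiling division).
        int waves = (prefetchCount + maxConcurrentCalls - 1) / maxConcurrentCalls;
        return waves * avgProcessingSeconds;
    }

    // True when the whole buffer can be processed before the lock on the
    // last prefetched message would expire.
    public static bool FitsWithinLock(int prefetchCount, int maxConcurrentCalls,
        double avgProcessingSeconds, TimeSpan lockDuration)
        => SecondsToDrain(prefetchCount, maxConcurrentCalls, avgProcessingSeconds) <= lockDuration.TotalSeconds;
}
```

With prefetchCount 100, maxConcurrentCalls 32, and 0.5 seconds per message, the buffer drains in four waves, about 2 seconds, which is well inside a 60-second lock. With prefetchCount 1000, maxConcurrentCalls 16, and 5 seconds per message, the estimate is 315 seconds, so either the prefetch count should come down or the lock duration should go up.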

C# Implementation with Options

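using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

public class ServiceBusFunction
{
    private readonly ILogger<ServiceBusFunction> _logger;

    // The isolated worker model supplies the logger through constructor injection.
    public ServiceBusFunction(ILogger<ServiceBusFunction> logger) => _logger = logger;

    // Prefetch is configured with prefetchCount in host.json; the trigger has no such property.
    [Function("ProcessOrders")]
    public async Task Run(
        [ServiceBusTrigger(
            "orders-queue",
            Connection = "ServiceBusConnection",
            IsBatched = true,               // required to bind an array of messages
            AutoCompleteMessages = false)]  // messages are settled explicitly below
        ServiceBusReceivedMessage[] messages,
        ServiceBusMessageActions messageActions)
    {
        _logger.LogInformation("Processing batch of {Count} messages", messages.Length);

        var completedCount = 0;
        var failedCount = 0;

        // Process all messages in batch
        foreach (var message in messages)
        {
            try
            {
                // Order and ProcessOrderAsync are application-specific and not shown.
                var order = JsonSerializer.Deserialize<Order>(message.Body.ToString());
                await ProcessOrderAsync(order);

                await messageActions.CompleteMessageAsync(message);
                completedCount++;
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Failed to process message {MessageId}", message.MessageId);

                // Stop retrying after five deliveries and move the message to the dead-letter queue.
                if (message.DeliveryCount >= 5)
                {
                    await messageActions.DeadLetterMessageAsync(message, new Dictionary<string, object>
                    {
                        { "Error", ex.Message }
                    });
                }
                else
                {
                    // Release the lock so the message is redelivered.
                    await messageActions.AbandonMessageAsync(message);
                }
                failedCount++;
            }
        }

        _logger.LogInformation("Batch complete. Success: {Success}, Failed: {Failed}",
            completedCount, failedCount);
    }
}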

Queue Storage Optimization

Queue Configuration

{
  "extensions": {
    "queues": {
      "maxPollingInterval": "00:00:02",
      "visibilityTimeout": "00:01:00",
      "batchSize": 32,
      "maxDequeueCount": 5,
      "newBatchThreshold": 8,
      "messageEncoding": "base64"
    }
  }
}

Understanding Queue Polling

┌─────────────────────────────────────────────────────────────────────┐
│                    QUEUE POLLING MECHANISM                          │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │                    POLLING CYCLE                            │   │
│   │                                                             │   │
│   │   ┌──────────┐     ┌──────────┐     ┌──────────┐            │   │
│   │   │ Poll #1  │────▶│ Poll #2  │────▶│ Poll #3  │            │   │
│   │   │  2 sec   │     │  2 sec   │     │  2 sec   │            │   │
│   │   └──────────┘     └──────────┘     └──────────┘            │   │
│   │        │                │                │                  │   │
│   │        ▼                ▼                ▼                  │   │
│   │   No messages      Batch found      Batch found             │   │
│   │   → wait           → process        → process               │   │
│   │                                                             │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                                                                     │
│   Batch Processing                                                  │
│   ─────────────────                                                 │
│   batchSize: 32       (messages per poll)                           │
│   newBatchThreshold: 8 (when to start next batch)                   │
│                                                                     │
│   Example: Processing 1000 messages                                 │
│   - Poll 1: Get 32 messages                                         │
│   - Process 24: Only 8 remaining                                    │
│   - Trigger Poll 2 immediately (threshold met)                      │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
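Two numbers fall out of the polling settings above. First, the per-instance ceiling on concurrently processed messages for one queue-triggered function is batchSize plus newBatchThreshold. Second, when the queue is empty the host backs off between polls until it reaches maxPollingInterval. The sketch below shows both; `QueuePollingMath` is a hypothetical helper, and the doubling backoff only illustrates the shape of the delay, since the host's actual algorithm is randomized.

```csharp
using System;

public static class QueuePollingMath
{
    // Upper bound on messages one queue-triggered function processes
    // concurrently on a single instance.
    public static int MaxConcurrentMessages(int batchSize, int newBatchThreshold)
        => batchSize + newBatchThreshold;

    // Illustrative doubling backoff for consecutive empty polls, capped at
    // maxPollingInterval. This is an approximation, not the host's exact logic.
    public static TimeSpan IdleDelay(int consecutiveEmptyPolls, TimeSpan minInterval, TimeSpan maxPollingInterval)
    {
        double ms = minInterval.TotalMilliseconds * Math.Pow(2, Math.Max(0, consecutiveEmptyPolls));
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxPollingInterval.TotalMilliseconds));
    }
}
```

With batchSize 32 and newBatchThreshold 8, one instance handles at most 40 messages at a time. Starting from an assumed 100 ms minimum, the idle delay in this sketch grows to 200 ms, 400 ms, and so on, and never exceeds the configured 2 seconds.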

Batch Processing Implementation

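A Storage queue trigger runs one invocation per message and cannot bind to an array. The host does the batching: it fetches up to batchSize messages per poll and runs the invocations in parallel, so the function body only handles a single message.

using System.Threading.Tasks;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

public class QueueBatchFunction
{
    private readonly ILogger<QueueBatchFunction> _logger;

    // The isolated worker model supplies the logger through constructor injection.
    public QueueBatchFunction(ILogger<QueueBatchFunction> logger) => _logger = logger;

    [Function("ProcessQueueBatch")]
    public async Task Run(
        [QueueTrigger("orders-queue", Connection = "StorageConnection")] string message)
    {
        // Parallelism comes from host.json: up to batchSize + newBatchThreshold
        // invocations of this function run concurrently on each instance.
        _logger.LogInformation("Processing queue message");

        await ProcessMessageAsync(message);

        _logger.LogInformation("Message processing complete");
    }

    // Placeholder for application-specific work.
    private static Task ProcessMessageAsync(string message) => Task.CompletedTask;
}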

Blob Trigger Optimization

Blob Configuration

{
  "extensions": {
    "blobs": {
      "maxDegreeOfParallelism": 8
    }
  }
}

Blob Processing Pattern

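using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

public class BlobProcessorFunction
{
    private readonly BlobServiceClient _blobServiceClient;
    private readonly ILogger<BlobProcessorFunction> _logger;

    // Both dependencies are injected; the BlobServiceClient is assumed to be
    // registered as a singleton at startup.
    public BlobProcessorFunction(
        BlobServiceClient blobServiceClient,
        ILogger<BlobProcessorFunction> logger)
    {
        _blobServiceClient = blobServiceClient;
        _logger = logger;
    }

    [Function("ProcessBlob")]
    public async Task Run(
        [BlobTrigger("input/{name}", Connection = "StorageConnection")] BlobClient inputBlob,
        string name)
    {
        _logger.LogInformation("Processing blob: {Name}", name);

        // Use BlobClient directly for more control
        var downloadResponse = await inputBlob.DownloadContentAsync();
        var content = downloadResponse.Value.Content.ToString();

        // Process content (ProcessContentAsync is application-specific and not shown)
        var result = await ProcessContentAsync(content);

        // Write output through the shared client rather than building a new
        // client and credential on every invocation
        var outputBlob = _blobServiceClient
            .GetBlobContainerClient("output")
            .GetBlobClient($"processed/{name}");

        // overwrite: true avoids a failure when the blob is reprocessed
        await outputBlob.UploadAsync(BinaryData.FromString(result), overwrite: true);

        _logger.LogInformation("Blob processed successfully");
    }
}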

Concurrency and Scaling

Concurrent Execution Patterns

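using System.Linq;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

public class ConcurrentProcessing
{
    private readonly ILogger<ConcurrentProcessing> _logger;

    public ConcurrentProcessing(ILogger<ConcurrentProcessing> logger) => _logger = logger;

    // Processes every message in the batch concurrently.
    // Prefetch is configured with prefetchCount in host.json, not on the trigger.
    [Function("HighConcurrency")]
    public async Task Run(
        [ServiceBusTrigger("orders-queue",
            Connection = "ServiceBusConnection",
            IsBatched = true)]
        ServiceBusReceivedMessage[] messages)
    {
        var tasks = messages.Select(async msg =>
        {
            var order = JsonSerializer.Deserialize<Order>(msg.Body.ToString());
            await ProcessOrderAsync(order);
            return order.OrderId;
        });

        // Task.WhenAll yields an array of results once every message has finished
        var orderIds = await Task.WhenAll(tasks);
        _logger.LogInformation("Processed {Count} orders", orderIds.Length);
    }
}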

Parallel Execution with Semaphore

public class ParallelProcessing
{
    // Caps concurrent invocations at 32 on this instance only;
    // every scaled-out instance holds its own semaphore.
    private static readonly SemaphoreSlim _semaphore = new SemaphoreSlim(32);

    [Function("ParallelQueue")]
    public async Task Run(
        [QueueTrigger("orders-queue")] string message)
    {
        await _semaphore.WaitAsync();
        
        try
        {
            await ProcessMessageAsync(message);
        }
        finally
        {
            _semaphore.Release();
        }
    }
}
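To see what a semaphore gate actually guarantees, the following self-contained sketch runs simulated work items through a `SemaphoreSlim` and records the highest concurrency observed. `ThrottleDemo` and the 20 ms delay are stand-ins chosen for illustration; nothing here depends on the Functions runtime.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Runs workItems simulated tasks with at most `limit` in flight and
    // returns the highest concurrency that was actually observed.
    public static async Task<int> RunAsync(int workItems, int limit)
    {
        using var gate = new SemaphoreSlim(limit);
        var sync = new object();
        int inFlight = 0, peak = 0;

        var tasks = Enumerable.Range(0, workItems).Select(async _ =>
        {
            await gate.WaitAsync();
            try
            {
                // Count this item as running and record the peak.
                lock (sync)
                {
                    inFlight++;
                    if (inFlight > peak) peak = inFlight;
                }

                await Task.Delay(20); // stand-in for real I/O

                lock (sync) { inFlight--; }
            }
            finally
            {
                gate.Release();
            }
        }).ToArray();

        await Task.WhenAll(tasks);
        return peak;
    }
}
```

Calling `await ThrottleDemo.RunAsync(50, 8)` returns a peak that never exceeds 8, however many items are queued. The same bound applies per process in a function app, which is why a static semaphore limits one instance but not the app as a whole.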

Singleton Client Pattern

Why Singleton Matters

┌─────────────────────────────────────────────────────────────────────┐
│                  CONNECTION MANAGEMENT COMPARISON                   │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   ❌ BAD: Creating clients per invocation                           │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │                                                             │   │
│   │  Invocation 1: Create connection → Authenticate → Use       │   │
│   │  Invocation 2: Create connection → Authenticate → Use       │   │
│   │  Invocation 3: Create connection → Authenticate → Use       │   │
│   │                                                             │   │
│   │  Result: 3 connections, 3 auth operations, slow             │   │
│   │                                                             │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                                                                     │
│   ✅ GOOD: Singleton pattern                                        │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │                                                             │   │
│   │  Startup: Create connection pool (handles, auth tokens)     │   │
│   │  Invocation 1: Use pooled connection                        │   │
│   │  Invocation 2: Use pooled connection                        │   │
│   │  Invocation 3: Use pooled connection                        │   │
│   │                                                             │   │
│   │  Result: 1 connection pool, fast, efficient                 │   │
│   │                                                             │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
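The difference in the comparison above can be reproduced without any Azure dependency. This sketch uses a hypothetical `FakeClient` that counts how many times it is constructed, and contrasts a handler that creates a client per call with one that reuses a lazily created shared instance. The class and method names are illustrative only.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for an SDK client such as BlobServiceClient.
public sealed class FakeClient
{
    private static int _created;

    // Number of FakeClient instances constructed so far in this process.
    public static int Created => Volatile.Read(ref _created);

    public FakeClient() => Interlocked.Increment(ref _created);

    public string Send(string payload) => $"sent:{payload}";
}

public static class Clients
{
    // Lazy<T> with the default mode runs the factory once, even under
    // concurrent access, and hands the same instance to every caller.
    private static readonly Lazy<FakeClient> _shared =
        new Lazy<FakeClient>(() => new FakeClient());

    public static FakeClient Shared => _shared.Value;
}

public static class Handlers
{
    // Anti-pattern: a new client (and, for a real SDK, a new connection) per call.
    public static string PerInvocation(string msg) => new FakeClient().Send(msg);

    // Preferred: every call goes through the one shared client.
    public static string Reused(string msg) => Clients.Shared.Send(msg);
}
```

Three calls to `Handlers.PerInvocation` construct three clients, while three calls to `Handlers.Reused` construct one. In a function app the same effect is usually achieved by registering clients as singletons in dependency injection rather than with a static `Lazy<T>`.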

Implementation

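// Program.cs (isolated worker model, matching the [Function] attributes used in this guide)
using System;
using Azure.Identity;
using Azure.Messaging.ServiceBus;
using Azure.Storage.Blobs;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

var host = new HostBuilder()
    .ConfigureFunctionsWorkerDefaults()
    .ConfigureServices(services =>
    {
        // One credential instance shared by every client so tokens are cached once
        var credential = new DefaultAzureCredential();

        // Register singleton clients
        services.AddSingleton(new BlobServiceClient(
            new Uri("https://mystorage.blob.core.windows.net"),
            credential));

        services.AddSingleton(new ServiceBusClient(
            "mynamespace.servicebus.windows.net",
            credential, // without this argument the namespace is parsed as a connection string
            new ServiceBusClientOptions
            {
                RetryOptions = new ServiceBusRetryOptions
                {
                    Mode = ServiceBusRetryMode.Exponential,
                    MaxRetries = 3,
                    MaxDelay = TimeSpan.FromSeconds(30)
                }
            }));
    })
    .Build();

host.Run();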

public class SingletonFunction
{
    private readonly BlobServiceClient _blobClient;
    private readonly ServiceBusClient _serviceBusClient;

    public SingletonFunction(
        BlobServiceClient blobClient,
        ServiceBusClient serviceBusClient)
    {
        _blobClient = blobClient;
        _serviceBusClient = serviceBusClient;
    }

    [Function("ProcessWithSingleton")]
    public async Task Run(
        [QueueTrigger("orders")] string message)
    {
        // Use injected singleton clients
        var container = _blobClient.GetBlobContainerClient("orders");
        // ... use clients
    }
}

Connection String Configuration

Development vs Production

Development (local.settings.json):

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "ServiceBusConnection": "Endpoint=sb://localhost;SharedAccessKeyName=..."
  }
}

Production (application setting in Azure, resolved from Key Vault):

ServiceBusConnection = @Microsoft.KeyVault(SecretUri=...)

Managed Identity Configuration

{
  "Values": {
    "AzureWebJobsStorage__accountName": "mystorage",
    "ServiceBusConnection__fullyQualifiedNamespace": "mynamespace.servicebus.windows.net"
  }
}

Logging and Monitoring

Application Insights Configuration

{
  "logging": {
    "logLevel": {
      "default": "Information",
      "Host.Results": "Information",
      "Function": "Debug",
      "Host.Aggregator": "Information"
    },
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "maxTelemetryItemsPerSecond": 20
      }
    }
  }
}

Performance Queries

// Function execution time
requests
| where timestamp > ago(1h)
| summarize avg(duration) by name, bin(timestamp, 5m)
| render timechart

// Queue message processing
requests
| where operation_Name == "ProcessQueue"
| summarize 
    messageCount = sum(toint(customDimensions.MessageCount)),
    avgDuration = avg(duration),
    p95Duration = percentile(duration, 95)
  by bin(timestamp, 5m)

// Throttling detection
traces
| where message contains "throttl" or message contains "rate limit"
| summarize count() by bin(timestamp, 1h)

Best Practices Summary

Performance Checklist

Setting	Recommended Value	When to Change
prefetchCount	100-200	Higher for high-throughput
maxConcurrentCalls	16-32	Lower for memory constraints
batchSize	16-32	Higher for batch processing
maxAutoLockRenewalDuration (maxAutoRenewDuration before extension 5.x)	5-10 min	Longer for long-running tasks
maxDegreeOfParallelism	8-16	Adjust based on workload

Optimization Patterns

┌─────────────────────────────────────────────────────────────────────┐
│                 PERFORMANCE OPTIMIZATION PATTERNS                   │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  1. PRE-FETCH MESSAGES                                              │
│     └─ Fetch messages ahead of processing                           │
│     └─ Reduces network latency                                      │
│                                                                     │
│  2. PARALLEL PROCESSING                                             │
│     └─ Process multiple messages concurrently                       │
│     └─ Use maxConcurrentCalls                                       │
│                                                                     │
│  3. BATCH PROCESSING                                                │
│     └─ Handle multiple items in single invocation                   │
│     └─ Reduces function invocation overhead                         │
│                                                                     │
│  4. CONNECTION REUSE                                                │
│     └─ Use singleton pattern for SDK clients                        │
│     └─ Avoid creating clients per invocation                        │
│                                                                     │
│  5. PROPER SCALING                                                  │
│     └─ Use Premium plan for consistent performance                  │
│     └─ Configure always-ready instances                             │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘