Azure Event Hubs — Architecture Deep Dive

Partitions, Consumer Groups, Checkpointing


Introduction

Azure Event Hubs is a fully managed real-time event streaming platform capable of processing millions of events per second. Understanding its architecture is crucial for building scalable event-driven applications.


Architecture Overview

┌─────────────────────────────────────────────────────────────────────┐
│                        Event Hubs Namespace                         │
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                    Event Hub (orders)                         │  │
│  │                                                               │  │
│  │   ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐          │  │
│  │   │ Part 0  │  │ Part 1  │  │ Part 2  │  │ Part 3  │          │  │
│  │   │         │  │         │  │         │  │         │          │  │
│  │   └─────────┘  └─────────┘  └─────────┘  └─────────┘          │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                              │                                      │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐            │
│  │ Consumer      │  │ Consumer      │  │ Consumer      │            │
│  │ Group A       │  │ Group B       │  │ Group C       │            │
│  │ (processing)  │  │ (analytics)   │  │ (archival)    │            │
│  └───────────────┘  └───────────────┘  └───────────────┘            │
└─────────────────────────────────────────────────────────────────────┘

Key Concepts

Partitions

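Partitions are the fundamental unit of parallelism in Event Hubs. Each partition is an ordered, append-only log, and events that share a partition key are hashed to the same partition, which preserves their relative order:

// Route by partition key (Azure.Messaging.EventHubs SDK);
// `producer` is an EventHubProducerClient
var eventData = new EventData(body);
var options = new SendEventOptions { PartitionKey = customerId };  // Hash-based routing

await producer.SendAsync(new[] { eventData }, options);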
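
To make hash-based routing concrete, the following is a minimal sketch of how a key can be mapped to a partition index. It is illustrative only: the real hash is computed inside the Event Hubs service, and the `PartitionRouter` class, the `PartitionFor` method, and the choice of FNV-1a as the hash are assumptions made for this example.

```csharp
using System;
using System.Text;

// Illustrative only: Event Hubs computes the key hash inside the service.
// FNV-1a is a stand-in stable hash here, not the service's algorithm.
public static class PartitionRouter
{
    public static int PartitionFor(string partitionKey, int partitionCount)
    {
        if (partitionKey is null) throw new ArgumentNullException(nameof(partitionKey));
        if (partitionCount <= 0) throw new ArgumentOutOfRangeException(nameof(partitionCount));

        uint hash = 2166136261;                  // FNV-1a 32-bit offset basis
        foreach (byte b in Encoding.UTF8.GetBytes(partitionKey))
        {
            hash ^= b;
            hash = unchecked(hash * 16777619);   // FNV-1a 32-bit prime, wraps on overflow
        }

        // Same key always yields the same index in [0, partitionCount).
        return (int)(hash % (uint)partitionCount);
    }
}
```

Calling `PartitionRouter.PartitionFor(customerId, 4)` repeatedly with the same `customerId` returns the same index, which is why all events for one customer stay in order. It also shows why a low-cardinality key (for example, a country code) can leave some partitions idle while others run hot.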

Consumer Groups

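Each consumer group is an independent view of the event stream: its readers track their own position (offset) in every partition, so one group's progress never affects another's:

// storageConnectionString, blobContainerName, and eventHubName are placeholders
var checkpointStore = new BlobContainerClient(storageConnectionString, blobContainerName);
var processor = new EventProcessorClient(
    checkpointStore,
    "processing-group-1",  // Different from "analytics-group"
    connectionString,
    eventHubName);
// Attach ProcessEventAsync and ProcessErrorAsync handlers before starting the processor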
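
The independence of consumer groups can be sketched with a toy in-memory model. This is not the Event Hubs SDK: the `PartitionLog` class and its `Append` and `TryReadNext` methods are hypothetical names that only illustrate one partition log being read at separate positions by different groups.

```csharp
using System;
using System.Collections.Generic;

// Toy in-memory model, not the Event Hubs SDK: one partition log
// and a separate read position for each consumer group.
public sealed class PartitionLog
{
    private readonly List<string> _events = new List<string>();
    private readonly Dictionary<string, int> _positions = new Dictionary<string, int>();

    public void Append(string evt) => _events.Add(evt);

    // Reads the next unread event for the given group; returns false when caught up.
    public bool TryReadNext(string consumerGroup, out string evt)
    {
        // A group that has never read starts at position 0.
        _positions.TryGetValue(consumerGroup, out int position);
        if (position >= _events.Count)
        {
            evt = string.Empty;
            return false;
        }

        evt = _events[position];
        _positions[consumerGroup] = position + 1;   // advance only this group's position
        return true;
    }
}
```

In this model, a "processing" group that has already read two events does not move the position of an "analytics" group, which still starts from the first event. Reading never removes events from the log.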

Implementation

Producer

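using System;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Producer;

public class EventHubProducer
{
    private readonly EventHubProducerClient _client;

    public EventHubProducer(EventHubProducerClient client) => _client = client;

    public async Task SendOrderEventAsync(Order order)
    {
        var eventData = new EventData(
            Encoding.UTF8.GetBytes(JsonSerializer.Serialize(order)))
        {
            ContentType = "application/json"
        };

        eventData.Properties["EventType"] = "OrderCreated";
        eventData.Properties["Timestamp"] = DateTime.UtcNow.ToString("O");

        // In Azure.Messaging.EventHubs the partition key is a send option,
        // not a settable property on EventData.
        await _client.SendAsync(
            new[] { eventData },
            new SendEventOptions { PartitionKey = order.CustomerId });
    }
}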

Consumer with Checkpointing

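using System;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Processor;
using Azure.Storage.Blobs;

public class OrderEventProcessor
{
    private readonly EventProcessorClient _processor;

    public OrderEventProcessor(BlobContainerClient checkpointStore, string consumerGroup,
        string connectionString, string eventHubName)
    {
        _processor = new EventProcessorClient(
            checkpointStore, consumerGroup, connectionString, eventHubName);
        _processor.ProcessEventAsync += ProcessEventAsync;
        _processor.ProcessErrorAsync += ProcessErrorAsync;
    }

    public Task StartAsync() => _processor.StartProcessingAsync();

    private async Task ProcessEventAsync(ProcessEventArgs args)
    {
        if (!args.HasEvent) return;

        var order = JsonSerializer.Deserialize<Order>(args.Data.EventBody.ToString());
        await ProcessOrderAsync(order);

        // Checkpoint: save this event's offset to the blob container
        await args.UpdateCheckpointAsync(args.CancellationToken);
    }

    private Task ProcessErrorAsync(ProcessErrorEventArgs args)
    {
        Console.Error.WriteLine($"Partition {args.PartitionId}: {args.Exception.Message}");
        return Task.CompletedTask;
    }

    private Task ProcessOrderAsync(Order order) => Task.CompletedTask;  // placeholder for application logic
}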
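
The effect of checkpoint frequency on reprocessing can be sketched with a small simulation. This is a toy model rather than the Event Hubs SDK: `CheckpointingReader`, its `Run` method, and the `checkpointEvery` parameter are hypothetical names. It assumes a reader that restarts from its last saved checkpoint, so any events processed after that checkpoint are delivered again.

```csharp
using System;
using System.Collections.Generic;

// Toy model of checkpoint-and-resume, not the Event Hubs SDK.
public sealed class CheckpointingReader
{
    private readonly IReadOnlyList<string> _partition;
    private readonly int _checkpointEvery;

    // Index of the first event that will be read after a restart.
    public int Checkpoint { get; private set; }

    public CheckpointingReader(IReadOnlyList<string> partition, int checkpointEvery)
    {
        if (checkpointEvery <= 0) throw new ArgumentOutOfRangeException(nameof(checkpointEvery));
        _partition = partition ?? throw new ArgumentNullException(nameof(partition));
        _checkpointEvery = checkpointEvery;
    }

    // Simulates one process lifetime: resume from the last checkpoint,
    // process at most maxEvents, then stop (as if the process crashed).
    public List<string> Run(int maxEvents)
    {
        var processed = new List<string>();
        int position = Checkpoint;

        while (position < _partition.Count && processed.Count < maxEvents)
        {
            processed.Add(_partition[position]);
            position++;

            // Persist progress only at every N-th position.
            if (position % _checkpointEvery == 0) Checkpoint = position;
        }

        return processed;
    }
}
```

With five events and `checkpointEvery` set to 2, a first run that stops after three events leaves the checkpoint at 2, so the next run processes the third event a second time. A smaller interval shrinks that replay window at the cost of more checkpoint writes, and because replays can still happen, event handlers are best kept idempotent.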

Best Practices

Practice                   Description
-------------------------  ------------------------------
Use partition key          Distribute evenly
Checkpoint frequently      Minimize reprocessing
Multiple consumer groups   Different processing pipelines
Choose partition count     Match downstream parallelism

Azure Integration Hub - Advanced Level