Azure Event Hubs — Architecture Deep Dive
Partitions, Consumer Groups, Checkpointing
Introduction
Azure Event Hubs is a fully managed real-time event streaming platform capable of processing millions of events per second. Understanding its architecture is crucial for building scalable event-driven applications.
Architecture Overview
┌─────────────────────────────────────────────────────────────────────┐
│ Event Hubs Namespace │
│ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ Event Hub (orders) │ │
│ │ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ Part 0 │ │ Part 1 │ │ Part 2 │ │ Part 3 │ │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
│ │ Consumer │ │ Consumer │ │ Consumer │ │
│ │ Group A │ │ Group B │ │ Group C │ │
│ │ (processing) │ │ (analytics) │ │ (archival) │ │
│ └───────────────┘ └───────────────┘ └───────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
Key Concepts
Partitions
Partitions are the fundamental unit of parallelism in Event Hubs:
// Send to specific partition
var eventData = new EventData(body);
eventData.PartitionKey = customerId; // Hash-based routing
await sender.SendAsync(eventData);
Consumer Groups
Each consumer group maintains its own offset:
var consumer = new EventProcessorClientBuilder()
.Connection(connectionString)
.ConsumerGroup("processing-group-1") // Different from "analytics-group"
.EventProcessor<MyEventProcessor>()
.BuildEventProcessorClient();
Implementation
Producer
public class EventHubProducer
{
private readonly EventHubProducerClient _client;
public async Task SendOrderEventAsync(Order order)
{
var eventData = new EventData(
Encoding.UTF8.GetBytes(JsonSerializer.Serialize(order)))
{
ContentType = "application/json",
PartitionKey = order.CustomerId
};
eventData.Properties["EventType"] = "OrderCreated";
eventData.Properties["Timestamp"] = DateTime.UtcNow.ToString("O");
await _client.SendAsync(eventData);
}
}
Consumer with Checkpointing
public class OrderEventProcessor : IEventProcessor
{
private BlobContainerClient _checkpointStore;
private string _consumerGroup;
public async Task ProcessEventsAsync(
PartitionContext context,
IEnumerable<EventData> events)
{
foreach (var evt in events)
{
var order = JsonSerializer.Deserialize<Order>(
Encoding.UTF8.GetString(evt.Body));
await ProcessOrderAsync(order);
}
// Checkpoint: Save offset
await context.CheckpointAsync();
}
}
Best Practices
| Practice | Description |
|---|---|
| Use partition key | Distribute evenly |
| Checkpoint frequently | Minimize reprocessing |
| Multiple consumer groups | Different processing pipelines |
| Choose partition count | Match downstream parallelism |
Azure Integration Hub - Advanced Level