Distributed Tracing
Overview
In microservice architectures, a single user request often spans multiple services. Distributed tracing connects these hops into a single end-to-end view, letting you pinpoint exactly where latency or failures occur across service boundaries.
Core Concepts
| Concept | Description |
|---|---|
| Trace | The full journey of a request across all services |
| Span | A single unit of work within a trace (one service call) |
| Operation ID | Unique identifier shared by all spans in a trace |
| Parent ID | Links a span to its parent span |
| W3C Trace Context | Standard HTTP headers for propagating trace context |
How Correlation Works in Application Insights
Application Insights uses the W3C Trace Context standard:
traceparent: 00-<trace-id>-<span-id>-<trace-flags>
Example:
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
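To make the header layout concrete, here is a minimal sketch that splits a `traceparent` value into its parts using the built-in `System.Diagnostics.ActivityContext` parser (.NET 5 or later). The class and method names (`TraceParentDemo`, `TryDescribe`) are hypothetical and exist only for illustration. In Application Insights, the trace-id portion is what appears as `operation_Id`.

```csharp
using System;
using System.Diagnostics;

public static class TraceParentDemo
{
    // Splits a traceparent value into trace-id, span-id, and the sampled flag.
    // Returns false when the value is not a valid W3C traceparent.
    public static bool TryDescribe(
        string traceparent, out string traceId, out string spanId, out bool sampled)
    {
        traceId = string.Empty;
        spanId = string.Empty;
        sampled = false;

        if (!ActivityContext.TryParse(traceparent, null, out ActivityContext context))
        {
            return false;
        }

        traceId = context.TraceId.ToHexString();   // 32 hex characters
        spanId = context.SpanId.ToHexString();     // 16 hex characters
        sampled = (context.TraceFlags & ActivityTraceFlags.Recorded) != 0;
        return true;
    }
}
```

For the example header above, this yields trace-id `4bf92f3577b34da6a3ce929d0e0e4736`, span-id `00f067aa0ba902b7`, and a sampled flag of `true` (trace-flags `01`).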
When Service A calls Service B:
- Service A generates `operation_Id` and `operation_ParentId`
- These are passed via HTTP headers (`traceparent`)
- Service B reads the headers and continues the trace
- All telemetry shares the same `operation_Id`
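The hand-off between the two services can be sketched with `System.Diagnostics.Activity` alone. This is only an illustration of what the instrumentation does on your behalf, not code you need to write: `HttpCorrelationSketch` and its method names are hypothetical, and the request is never actually sent.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Net.Http;

public static class HttpCorrelationSketch
{
    // Service A: start a span and stamp its W3C ID on the outgoing request.
    public static Activity StartOutgoingCall(HttpRequestMessage request)
    {
        var activity = new Activity("ServiceA.CallServiceB");
        activity.SetIdFormat(ActivityIdFormat.W3C);
        activity.Start();
        request.Headers.Add("traceparent", activity.Id);
        return activity;
    }

    // Service B: read the header and start a child span in the same trace.
    public static Activity StartIncomingRequest(HttpRequestMessage request)
    {
        var activity = new Activity("ServiceB.HandleRequest");
        if (request.Headers.TryGetValues("traceparent", out var values))
        {
            // The caller's ID becomes this span's parent, so the trace-id carries over.
            activity.SetParentId(values.First());
        }
        return activity.Start();
    }
}
```

After both calls, Service B's `TraceId` equals Service A's `TraceId` (the shared `operation_Id`), and Service B's `ParentSpanId` equals Service A's `SpanId` (the `operation_ParentId`).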
.NET Auto-Correlation
ASP.NET Core with Application Insights SDK automatically:
- Reads incoming `traceparent` headers
- Propagates context to outbound HTTP calls via `HttpClient`
- Correlates all telemetry within the request pipeline
// No manual code needed — just ensure both services have App Insights configured
builder.Services.AddApplicationInsightsTelemetry();
builder.Services.AddHttpClient(); // HttpClient auto-propagates trace context
Manual Correlation for Messaging
For async messaging (Service Bus, Event Hubs), there is no HTTP header to carry the trace context, so correlation requires propagating it explicitly in the message properties:
Sending (Producer)
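using System.Diagnostics;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

public class OrderPublisher
{
    private readonly ServiceBusSender _sender;

    public OrderPublisher(ServiceBusClient client)
    {
        _sender = client.CreateSender("orders");
    }

    public async Task PublishOrderAsync(string payload)
    {
        var message = new ServiceBusMessage(payload);

        // Set diagnostic ID for correlation.
        // Skip the property when no activity is active instead of storing null.
        if (Activity.Current?.Id is string diagnosticId)
        {
            message.ApplicationProperties["Diagnostic-Id"] = diagnosticId;
        }

        await _sender.SendMessageAsync(message);
    }
}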
Receiving (Consumer)
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Microsoft.ApplicationInsights;
using Microsoft.Azure.Functions.Worker;

public class ProcessOrderFunction
{
    private readonly TelemetryClient _telemetry;

    public ProcessOrderFunction(TelemetryClient telemetry)
    {
        _telemetry = telemetry;
    }

    [Function("ProcessOrder")]
    public async Task Run(
        [ServiceBusTrigger("orders", Connection = "ServiceBusConnection")]
        ServiceBusReceivedMessage message)
    {
        // Application Insights Function SDK auto-correlates Service Bus messages
        // The operation_Id is extracted from the message's Diagnostic-Id property
        _telemetry.TrackEvent("OrderProcessingStarted", new Dictionary<string, string>
        {
            ["MessageId"] = message.MessageId,
            ["CorrelationId"] = message.CorrelationId ?? ""
        });

        // Deserialize returns null for a JSON "null" body, so fail fast in that case
        var order = JsonSerializer.Deserialize<Order>(message.Body.ToString())
            ?? throw new InvalidOperationException("Message body did not contain an order.");
        await ProcessOrderAsync(order);
    }

    private Task ProcessOrderAsync(Order order)
    {
        // Business logic here
        return Task.CompletedTask;
    }
}
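Where the hosting SDK does not correlate messages for you (for example, a custom worker that reads from the queue directly), the consumer can restore the parent context itself. The following is a minimal sketch under that assumption: `MessageCorrelation` and `StartProcessing` are hypothetical names, and a plain dictionary stands in for the message's `ApplicationProperties`.

```csharp
using System.Collections.Generic;
using System.Diagnostics;

public static class MessageCorrelation
{
    public const string DiagnosticIdProperty = "Diagnostic-Id";

    // Starts a processing span parented to the producer's span when the message
    // carries a valid W3C Diagnostic-Id. Without one, the span falls back to the
    // ambient Activity.Current, or becomes the root of a new trace.
    public static Activity StartProcessing(
        IReadOnlyDictionary<string, object> applicationProperties)
    {
        var activity = new Activity("ProcessOrder");
        activity.SetIdFormat(ActivityIdFormat.W3C);

        if (applicationProperties.TryGetValue(DiagnosticIdProperty, out var value)
            && value is string diagnosticId
            && ActivityContext.TryParse(diagnosticId, null, out _))
        {
            activity.SetParentId(diagnosticId);
        }

        return activity.Start();
    }
}
```

Validating the property before using it keeps a malformed or missing `Diagnostic-Id` from silently attaching the consumer's telemetry to a bogus parent.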
Dependency Tracking
Application Insights automatically tracks outbound dependencies:
| Dependency Type | Auto-Tracked |
|---|---|
| HTTP (HttpClient) | ✅ |
| SQL (SqlClient, EF Core) | ✅ |
| Azure Storage | ✅ |
| Azure Service Bus | ✅ |
| Redis | ✅ (with StackExchange.Redis) |
| gRPC | ✅ (.NET 5+) |
Custom Dependency Tracking
For unsupported dependencies:
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.DataContracts;

public class ExternalApiService
{
    private readonly TelemetryClient _telemetry;
    private readonly HttpClient _httpClient;

    public ExternalApiService(TelemetryClient telemetry, HttpClient httpClient)
    {
        _telemetry = telemetry;
        _httpClient = httpClient;
    }

    public async Task<string> CallPartnerApiAsync(string endpoint)
    {
        // Disposing the operation at the end of the method sends the dependency telemetry
        using var operation = _telemetry.StartOperation<DependencyTelemetry>("ExternalAPI");
        operation.Telemetry.Type = "HTTP";
        operation.Telemetry.Target = "partner-api.example.com";

        try
        {
            var response = await _httpClient.GetAsync(endpoint);

            // Record the status code before EnsureSuccessStatusCode can throw,
            // so failed calls still carry their result code
            operation.Telemetry.ResultCode = ((int)response.StatusCode).ToString();
            response.EnsureSuccessStatusCode();

            operation.Telemetry.Success = true;
            return await response.Content.ReadAsStringAsync();
        }
        catch (Exception ex)
        {
            operation.Telemetry.Success = false;
            _telemetry.TrackException(ex);
            throw;
        }
    }
}
Application Map
The Application Map is a visual topology of your distributed system, automatically generated from dependency telemetry:
- Nodes = Services (identified by cloud role name)
- Edges = Dependencies between services
- Metrics = Call count, avg duration, failure rate on each edge
Setting Cloud Role Name
Each service must have a unique cloud role name:
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.Extensibility;

public class CloudRoleInitializer : ITelemetryInitializer
{
    private readonly string _roleName;

    public CloudRoleInitializer(string roleName)
    {
        _roleName = roleName;
    }

    // Called for every telemetry item before it is sent
    public void Initialize(ITelemetry telemetry)
    {
        telemetry.Context.Cloud.RoleName = _roleName;
    }
}

// Register in Program.cs
builder.Services.AddSingleton<ITelemetryInitializer>(
    new CloudRoleInitializer("OrderService"));
End-to-End Transaction View
In the Azure Portal:
- Go to Application Insights → Transaction Search
- Click any request to see the full trace
- The timeline shows all spans: request → dependencies → sub-dependencies
- Each span shows duration, status, and target
KQL — Full Trace Reconstruction
let traceId = "4bf92f3577b34da6a3ce929d0e0e4736";
union requests, dependencies, traces, exceptions
| where operation_Id == traceId
| order by timestamp asc
| project timestamp, itemType, name, duration, success,
    target = coalesce(target, ""), message = coalesce(message, "")
OpenTelemetry Integration
Azure Monitor now supports OpenTelemetry as an alternative to the classic SDK:
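using Azure.Monitor.OpenTelemetry.Exporter;
using OpenTelemetry.Trace;

builder.Services.AddOpenTelemetry()
    .WithTracing(tracing =>
    {
        tracing
            .AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddSqlClientInstrumentation()
            // APPLICATIONINSIGHTS_CONNECTION_STRING is the standard setting name for Azure Monitor
            .AddAzureMonitorTraceExporter(o =>
                o.ConnectionString = builder.Configuration["APPLICATIONINSIGHTS_CONNECTION_STRING"]);
    });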
Benefits of OpenTelemetry:
- Vendor-neutral instrumentation
- Broader ecosystem of instrumentation libraries
- W3C Trace Context native support
- Future-proof path recommended by Microsoft
Best Practices
- Set unique cloud role names: without them, Application Map collapses every service into a single node
- Use W3C Trace Context: it's the standard; avoid proprietary correlation
- Propagate context through queues: set `Diagnostic-Id` on messages
- Don't break the chain: ensure all services in the path have tracing enabled
- Use sampling consistently: if Service A samples out a trace, Service B should too
- Monitor trace completeness: missing spans indicate instrumentation gaps
- Keep span names meaningful: use route templates, not full URLs with query strings
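To illustrate why consistent sampling is achievable at all, here is a minimal sketch of a deterministic decision derived from the trace ID. It is not the algorithm Application Insights or OpenTelemetry uses; `TraceSampler` and `ShouldSample` are hypothetical names, and the sketch assumes every service is configured with the same sampling rate.

```csharp
using System;
using System.Globalization;

public static class TraceSampler
{
    // Returns true when the trace should be kept. The decision depends only on
    // the trace ID, so every service that sees the same ID and rate agrees.
    public static bool ShouldSample(string traceId, double samplingRate)
    {
        if (traceId is null || traceId.Length < 8)
        {
            throw new ArgumentException(
                "Expected a hex trace ID of at least 8 characters.", nameof(traceId));
        }

        // Map the first 32 bits of the trace ID onto the range [0, 1).
        uint bucket = uint.Parse(
            traceId.Substring(0, 8), NumberStyles.HexNumber, CultureInfo.InvariantCulture);
        double position = bucket / 4294967296.0; // 2^32

        return position < samplingRate;
    }
}
```

Because the result is a pure function of the trace ID, Service A and Service B reach the same verdict without talking to each other. For trace ID `4bf92f3577b34da6a3ce929d0e0e4736` the position is roughly 0.297, so the trace is kept at a rate of 0.5 and dropped at 0.25.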
Key Takeaways
- Distributed tracing connects multi-service requests via `operation_Id`
- W3C Trace Context headers propagate correlation automatically for HTTP
- Messaging requires explicit `Diagnostic-Id` propagation
- Application Map visualizes your service topology from trace data
- OpenTelemetry is the future-proof approach for new projects
- Every service needs a unique cloud role name for proper visualization