Logic Apps — Advanced Error Handling and Exception Management
Why Error Handling in Logic Apps Matters
Azure Logic Apps orchestrate business-critical workflows — processing orders, syncing data between systems, sending notifications, and managing approvals. When a step in your workflow fails, the consequences can range from a missed email to a corrupted database record or a lost financial transaction.
Without proper error handling, your Logic Apps will:
- Fail silently — A workflow stops mid-execution and nobody knows until a customer complains
- Lose data — Messages or records that were partially processed get stuck in limbo
- Create inconsistencies — One system gets updated but another doesn't, leading to data drift
- Overwhelm downstream services — Retrying too aggressively can cause cascading failures
- Make debugging impossible — Without proper logging, you can't determine what went wrong or why
With proper error handling, your workflows become resilient, observable, and self-healing.
Understanding Logic Apps Execution Model
Before implementing error handling, you need to understand how Logic Apps processes actions and what happens when something goes wrong.
Action Status Values
Every action in a Logic App produces one of four status values:
| Status | Meaning | What Happens Next |
|---|---|---|
| Succeeded | Action completed successfully | Next action runs normally |
| Failed | Action threw an error | Dependent actions are skipped (by default) |
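| Skipped | Action was not executed because its Run After condition was not met, typically because a preceding action failed | Actions that depend on it are also skipped (by default) |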
| TimedOut | Action exceeded its timeout | Treated as a failure |
The Default Behavior Problem
By default, if any action fails, all subsequent actions in that branch are skipped and the entire workflow run is marked as "Failed." This is problematic because:
- You can't send an alert about the failure (the alert action gets skipped too)
- You can't perform cleanup or compensation logic
- You can't log the error details for investigation
This is where Run After configuration and Scope actions become essential.
Architecture: Error Handling Patterns
┌─────────────────────────────────────────────────────────────────────────┐
│                  LOGIC APP ERROR HANDLING ARCHITECTURE                  │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │  SCOPE: "Process_Order"                                           │  │
│  │                                                                   │  │
│  │  ┌──────────┐   ┌──────────────┐   ┌───────────────────────┐      │  │
│  │  │ Validate │──▶│ Create Record│──▶│ Send Confirmation     │      │  │
│  │  │ Input    │   │ in Database  │   │ Email                 │      │  │
│  │  └──────────┘   └──────────────┘   └───────────────────────┘      │  │
│  │                                                                   │  │
│  └─────────────────────────────┬─────────────────────────────────────┘  │
│                                │                                        │
│              ┌─────────────────┼─────────────────┐                      │
│              │   (Run After)   │                 │                      │
│              ▼                 ▼                 ▼                      │
│    ┌───────────────┐  ┌────────────────┐  ┌────────────────────┐        │
│    │ On Success    │  │ On Failure     │  │ On Timeout         │        │
│    │               │  │                │  │                    │        │
│    │ Log success   │  │ Log error      │  │ Log timeout        │        │
│    │ Update status │  │ Send alert     │  │ Send alert         │        │
│    │               │  │ Move to DLQ    │  │ Schedule retry     │        │
│    └───────────────┘  │ Compensate     │  └────────────────────┘        │
│                       └────────────────┘                                │
│                                                                         │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │  SCOPE: "Finally" (Runs regardless of outcome)                    │  │
│  │                                                                   │  │
│  │  ┌──────────────┐   ┌──────────────────┐                          │  │
│  │  │ Release locks│──▶│ Update audit log │                          │  │
│  │  └──────────────┘   └──────────────────┘                          │  │
│  └───────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
Step 1: Understanding Scopes for Error Handling
A Scope in Logic Apps is a container that groups multiple actions together. The key benefit is that you can check whether the entire scope succeeded or failed, and then branch your logic accordingly — similar to a try/catch block in programming.
Why Use Scopes?
- Group related actions — Treat a set of actions as a single unit of work
- Catch errors from any action — If any action inside the scope fails, you can handle it
- Implement try/catch/finally — Using multiple scopes with Run After configuration
- Simplify error checking — Instead of checking each action individually, check the scope result
Basic Scope Pattern (Try/Catch)
{
"definition": {
"$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
"contentVersion": "1.0.0.0",
"triggers": {
"When_a_message_is_received": {
"type": "ApiConnection",
"inputs": {
"host": { "connection": { "name": "@parameters('$connections')['servicebus']['connectionId']" } },
"method": "get",
"path": "/queues/orders/messages/head"
},
"recurrence": { "frequency": "Minute", "interval": 1 }
}
},
"actions": {
"Try_Process_Order": {
"type": "Scope",
"actions": {
"Parse_Order_JSON": {
"type": "ParseJson",
"inputs": {
"content": "@triggerBody()?['ContentData']",
"schema": {
"type": "object",
"properties": {
"orderId": { "type": "string" },
"customerId": { "type": "string" },
"items": { "type": "array" },
"totalAmount": { "type": "number" }
},
"required": ["orderId", "customerId", "items"]
}
},
"runAfter": {}
},
"Validate_Order": {
"type": "If",
"expression": {
"and": [
{ "greater": ["@length(body('Parse_Order_JSON')?['items'])", 0] },
{ "greater": ["@body('Parse_Order_JSON')?['totalAmount']", 0] }
]
},
"actions": {
"Create_Order_Record": {
"type": "ApiConnection",
"inputs": {
"host": { "connection": { "name": "@parameters('$connections')['sql']['connectionId']" } },
"method": "post",
"path": "/v2/datasets/@{encodeURIComponent('default')}/tables/@{encodeURIComponent('Orders')}/items",
"body": {
"OrderId": "@body('Parse_Order_JSON')?['orderId']",
"CustomerId": "@body('Parse_Order_JSON')?['customerId']",
"TotalAmount": "@body('Parse_Order_JSON')?['totalAmount']",
"Status": "Processing",
"CreatedAt": "@utcNow()"
}
}
},
"Send_Confirmation_Email": {
"type": "ApiConnection",
"inputs": {
"host": { "connection": { "name": "@parameters('$connections')['office365']['connectionId']" } },
"method": "post",
"path": "/v2/Mail",
"body": {
"To": "@body('Parse_Order_JSON')?['customerEmail']",
"Subject": "Order @{body('Parse_Order_JSON')?['orderId']} Confirmed",
"Body": "Your order has been received and is being processed."
}
},
"runAfter": { "Create_Order_Record": ["Succeeded"] }
}
},
"else": {
"actions": {
"Terminate_Invalid_Order": {
"type": "Terminate",
"inputs": {
"runStatus": "Failed",
"runError": {
"code": "ValidationFailed",
"message": "Order validation failed: empty items or zero amount"
}
}
}
}
},
"runAfter": { "Parse_Order_JSON": ["Succeeded"] }
}
},
"runAfter": {}
},
"Catch_Process_Order_Errors": {
"type": "Scope",
"actions": {
"Get_Error_Details": {
"type": "Compose",
"inputs": {
"workflowRunId": "@workflow().run.name",
"errorTime": "@utcNow()",
"scopeResult": "@result('Try_Process_Order')",
"failedActions": "@filter(result('Try_Process_Order'), item => item['status'] == 'Failed')"
},
"runAfter": {}
},
"Send_Error_Alert": {
"type": "ApiConnection",
"inputs": {
"host": { "connection": { "name": "@parameters('$connections')['teams']['connectionId']" } },
"method": "post",
"path": "/v3/conversations/@{encodeURIComponent('channel-id')}/activities",
"body": {
"messageBody": "🚨 **Order Processing Failed**\n\nWorkflow Run: @{workflow().run.name}\nTime: @{utcNow()}\nError: @{first(body('Get_Error_Details')?['failedActions'])?['error']?['message']}"
}
},
"runAfter": { "Get_Error_Details": ["Succeeded"] }
},
"Log_Error_To_Table": {
"type": "ApiConnection",
"inputs": {
"host": { "connection": { "name": "@parameters('$connections')['azuretables']['connectionId']" } },
"method": "post",
"path": "/Tables/@{encodeURIComponent('WorkflowErrors')}/entities",
"body": {
"PartitionKey": "OrderProcessing",
"RowKey": "@{guid()}",
"WorkflowRunId": "@workflow().run.name",
"ErrorDetails": "@{string(body('Get_Error_Details'))}",
"Timestamp": "@utcNow()"
}
},
"runAfter": { "Get_Error_Details": ["Succeeded"] }
}
},
"runAfter": {
"Try_Process_Order": ["Failed", "TimedOut"]
}
}
}
}
}
Understanding the Run After Configuration
The critical piece that makes error handling work is the runAfter property. By default, an action runs after the previous action succeeds. But you can configure it to run after specific statuses:
"runAfter": {
"Try_Process_Order": ["Failed", "TimedOut"]
}
This means the "Catch" scope only executes when the "Try" scope fails or times out. You can combine multiple statuses:
["Succeeded"]— Run only on success (default)["Failed"]— Run only on failure["TimedOut"]— Run only on timeout["Skipped"]— Run only when skipped["Failed", "TimedOut"]— Run on failure OR timeout["Succeeded", "Failed", "TimedOut", "Skipped"]— Always run (finally pattern)
Step 2: Retry Policies for Transient Failures
Many failures in Logic Apps are transient — a downstream API is temporarily unavailable, a database connection times out, or a rate limit is hit. Retry policies handle these automatically without requiring manual intervention.
Understanding Retry Policy Types
Logic Apps supports four retry policy types:
| Policy Type | Behavior | Best For |
|---|---|---|
| Default | Up to 4 retries at exponentially increasing intervals that scale by 7.5 seconds and are capped between 5 and 45 seconds | Most scenarios |
| Exponential | Configurable count and intervals with exponential backoff | APIs with rate limits |
| Fixed | Retries at fixed intervals | Services with predictable recovery |
| None | No retries | Actions where retry would cause duplicates |
Configuring Exponential Backoff
Exponential backoff is the recommended strategy for most integrations. It gives the downstream service time to recover while avoiding thundering herd problems.
{
"Call_Payment_API": {
"type": "Http",
"inputs": {
"method": "POST",
"uri": "https://payment-api.example.com/v1/charges",
"headers": {
"Content-Type": "application/json",
"Authorization": "Bearer @{body('Get_API_Token')?['access_token']}"
},
"body": {
"amount": "@body('Parse_Order_JSON')?['totalAmount']",
"currency": "USD",
"customerId": "@body('Parse_Order_JSON')?['customerId']"
},
"retryPolicy": {
"type": "exponential",
"count": 4,
"interval": "PT10S",
"minimumInterval": "PT5S",
"maximumInterval": "PT1H"
}
},
"runAfter": { "Validate_Payment_Details": ["Succeeded"] }
}
}
How Exponential Backoff Works
Attempt 1: Immediate
↓ (fails)
Wait: random, 5 to 10 seconds (up to interval, at least minimumInterval)
Attempt 2 (retry 1):
↓ (fails)
Wait: random, 10 to 20 seconds (up to interval × 2)
Attempt 3 (retry 2):
↓ (fails)
Wait: random, 20 to 40 seconds (up to interval × 4)
Attempt 4 (retry 3):
↓ (fails)
Wait: random, 40 to 80 seconds (up to interval × 8, capped at maximumInterval)
Attempt 5 (retry 4, final):
↓ (fails)
→ Action marked as "Failed"
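The timeline above can also be worked out numerically. The sketch below computes, for the Call_Payment_API policy (interval PT10S, minimumInterval PT5S, maximumInterval PT1H, count 4), the window from which each retry delay would be drawn. It assumes the range rule described in Microsoft's retry policy documentation: retry n waits a random time between max(previous step, minimumInterval) and min(interval × 2^(n-1), maximumInterval). `backoff_windows` is a hypothetical helper, not part of any Logic Apps SDK.

```python
from typing import List, Tuple


def backoff_windows(count: int, interval: float, minimum: float, maximum: float) -> List[Tuple[float, float]]:
    """Return (lower, upper) wait bounds in seconds for each retry.

    Assumed rule: retry n draws a random delay between
    max(previous multiple of interval, minimum) and
    min(interval * 2**(n - 1), maximum); the previous multiple is 0 for n == 1.
    """
    windows = []
    for n in range(1, count + 1):
        previous = 0 if n == 1 else interval * 2 ** (n - 2)
        lower = max(previous, minimum)
        upper = min(interval * 2 ** (n - 1), maximum)
        windows.append((lower, upper))
    return windows


# Values from the Call_Payment_API example: PT10S, PT5S, PT1H, count 4.
for retry, (low, high) in enumerate(backoff_windows(4, 10, 5, 3600), start=1):
    print(f"retry {retry}: wait between {low:g}s and {high:g}s")
```

The upper bounds (10, 20, 40, 80 seconds) double on each retry until maximumInterval caps them; the randomness inside each window is what spreads out clients that failed at the same moment.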
Fixed Interval Retry
Use fixed intervals when the downstream service has a known recovery time (e.g., a service that restarts in exactly 30 seconds):
{
"Sync_To_Legacy_System": {
"type": "Http",
"inputs": {
"method": "POST",
"uri": "https://legacy-erp.internal/api/sync",
"body": "@body('Transform_Data')",
"retryPolicy": {
"type": "fixed",
"count": 3,
"interval": "PT30S"
}
}
}
}
Disabling Retries (Critical for Idempotency)
For actions that are NOT idempotent (e.g., charging a credit card, sending an SMS), disable retries to prevent duplicate operations:
{
"Charge_Credit_Card": {
"type": "Http",
"inputs": {
"method": "POST",
"uri": "https://payment-gateway.com/charge",
"body": {
"amount": "@body('Calculate_Total')",
"idempotencyKey": "@guid()"
},
"retryPolicy": {
"type": "none"
}
}
}
}
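Why disable retries? If the payment API processes the charge but the response times out, Logic Apps would retry and could charge the customer again. Setting "type": "none" prevents the automatic retry, and an idempotency key lets the payment API reject a repeated charge. The key only helps if it is stable for the same logical charge: a fresh guid() differs on every workflow run, so a resubmitted run would not be deduplicated. Deriving the key from the order ID is one way to make it stable.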
Step 3: The Try/Catch/Finally Pattern
For production workflows, implement the full try/catch/finally pattern using three scopes:
┌─────────────────────────────────────────────────────────┐
│  SCOPE: "Try"                                           │
│  Contains: Business logic actions                       │
│  Run After: Trigger                                     │
└────────────────────────┬────────────────────────────────┘
                         │
         ┌───────────────┼───────────────┐
         │               │               │
    (Succeeded)      (Failed)       (TimedOut)
         │               │               │
         ▼               ▼               ▼
 ┌──────────────┐  ┌──────────────┐ (same as Failed)
 │ SCOPE:       │  │ SCOPE:       │
 │ "On_Success" │  │ "Catch"      │
 │              │  │              │
 │ - Log success│  │ - Log error  │
 │ - Notify     │  │ - Alert team │
 │ - Metrics    │  │ - Compensate │
 └──────────────┘  └──────────────┘
         │               │
         └───────┬───────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│  SCOPE: "Finally"                                       │
│  Run After: On_Success and Catch, each with             │
│  [Succeeded, Failed, TimedOut, Skipped]                 │
│                                                         │
│  Contains: Cleanup actions that ALWAYS run              │
│  - Release locks                                        │
│  - Close connections                                    │
│  - Update audit trail                                   │
└─────────────────────────────────────────────────────────┘
Complete Try/Catch/Finally Implementation
{
"actions": {
"Try_Scope": {
"type": "Scope",
"actions": {
"Get_Order_From_Queue": { "type": "ApiConnection", "inputs": { "...": "..." } },
"Validate_Order": { "type": "If", "expression": "...", "actions": { "...": "..." } },
"Process_Payment": { "type": "Http", "inputs": { "...": "..." } },
"Update_Inventory": { "type": "Http", "inputs": { "...": "..." } },
"Send_Confirmation": { "type": "ApiConnection", "inputs": { "...": "..." } }
},
"runAfter": {}
},
"Catch_Scope": {
"type": "Scope",
"actions": {
"Compose_Error_Info": {
"type": "Compose",
"inputs": {
"workflowName": "@workflow().name",
"runId": "@workflow().run.name",
"timestamp": "@utcNow()",
"triggerBody": "@triggerBody()",
"failedActions": "@result('Try_Scope')",
"errorMessage": "@{first(filter(result('Try_Scope'), item => item['status'] == 'Failed'))?['error']?['message']}"
}
},
"Send_Teams_Alert": {
"type": "ApiConnection",
"inputs": {
"method": "post",
"body": {
"messageBody": "🚨 Workflow **@{workflow().name}** failed at @{utcNow()}\n\n**Error:** @{outputs('Compose_Error_Info')?['errorMessage']}\n\n**Run ID:** @{workflow().run.name}\n\n[View Run](https://portal.azure.com/#view/Microsoft.Azure.Management.Logic/LogicAppRunBlade/...)"
}
},
"runAfter": { "Compose_Error_Info": ["Succeeded"] }
},
"Dead_Letter_Message": {
"type": "ApiConnection",
"inputs": {
"method": "post",
"path": "/queues/orders-dlq/messages",
"body": {
"ContentData": "@{base64(string(outputs('Compose_Error_Info')))}",
"Properties": {
"ErrorReason": "@{outputs('Compose_Error_Info')?['errorMessage']}",
"OriginalRunId": "@{workflow().run.name}"
}
}
},
"runAfter": { "Compose_Error_Info": ["Succeeded"] }
}
},
"runAfter": {
"Try_Scope": ["Failed", "TimedOut"]
}
},
"Finally_Scope": {
"type": "Scope",
"actions": {
"Update_Audit_Log": {
"type": "ApiConnection",
"inputs": {
"method": "post",
"path": "/Tables/@{encodeURIComponent('AuditLog')}/entities",
"body": {
"PartitionKey": "@{formatDateTime(utcNow(), 'yyyy-MM-dd')}",
"RowKey": "@{workflow().run.name}",
"WorkflowName": "@{workflow().name}",
"Status": "@{if(equals(result('Try_Scope')[0]['status'], 'Succeeded'), 'Success', 'Failed')}",
"CompletedAt": "@{utcNow()}"
}
}
},
"Release_Processing_Lock": {
"type": "Http",
"inputs": {
"method": "DELETE",
"uri": "https://my-api.com/locks/@{triggerBody()?['orderId']}"
},
"runAfter": { "Update_Audit_Log": ["Succeeded", "Failed"] }
}
},
"runAfter": {
"Try_Scope": ["Succeeded", "Failed", "TimedOut", "Skipped"],
"Catch_Scope": ["Succeeded", "Failed", "TimedOut", "Skipped"]
}
}
}
}
Key Points About the Finally Scope
The Finally scope uses ["Succeeded", "Failed", "TimedOut", "Skipped"] for ALL preceding scopes. This ensures it runs regardless of what happened — just like a finally block in C# or Java. Use it for:
- Releasing distributed locks
- Closing database connections
- Updating audit trails
- Sending completion metrics
- Cleaning up temporary resources
Step 4: Compensation Logic (Undoing Partial Work)
When a multi-step workflow fails partway through, you may need to undo the steps that already completed. This is called compensation — rolling back partial changes to maintain data consistency.
Real-World Example: Order Processing
Consider an order workflow with these steps:
- ✅ Reserve inventory (succeeded)
- ✅ Charge credit card (succeeded)
- ❌ Create shipping label (failed)
Without compensation, the customer is charged but never receives their order. With compensation, you reverse steps 1 and 2:
{
"Compensate_On_Failure": {
"type": "Scope",
"actions": {
"Check_What_Succeeded": {
"type": "Compose",
"inputs": {
"inventoryReserved": "@equals(result('Reserve_Inventory')?[0]?['status'], 'Succeeded')",
"paymentCharged": "@equals(result('Charge_Payment')?[0]?['status'], 'Succeeded')",
"shippingCreated": "@equals(result('Create_Shipping_Label')?[0]?['status'], 'Succeeded')"
}
},
"Reverse_Payment_If_Charged": {
"type": "If",
"expression": { "equals": ["@outputs('Check_What_Succeeded')?['paymentCharged']", true] },
"actions": {
"Refund_Payment": {
"type": "Http",
"inputs": {
"method": "POST",
"uri": "https://payment-api.com/refunds",
"body": {
"chargeId": "@body('Charge_Payment')?['chargeId']",
"reason": "workflow_failure"
}
}
}
},
"runAfter": { "Check_What_Succeeded": ["Succeeded"] }
},
"Release_Inventory_If_Reserved": {
"type": "If",
"expression": { "equals": ["@outputs('Check_What_Succeeded')?['inventoryReserved']", true] },
"actions": {
"Release_Inventory": {
"type": "Http",
"inputs": {
"method": "POST",
"uri": "https://inventory-api.com/release",
"body": {
"reservationId": "@body('Reserve_Inventory')?['reservationId']"
}
}
}
},
"runAfter": { "Check_What_Succeeded": ["Succeeded"] }
},
"Notify_Customer_Of_Failure": {
"type": "ApiConnection",
"inputs": {
"method": "post",
"path": "/v2/Mail",
"body": {
"To": "@triggerBody()?['customerEmail']",
"Subject": "Issue with your order @{triggerBody()?['orderId']}",
"Body": "We encountered an issue processing your order. Any charges have been reversed. Please try again or contact support."
}
},
"runAfter": {
"Reverse_Payment_If_Charged": ["Succeeded", "Failed"],
"Release_Inventory_If_Reserved": ["Succeeded", "Failed"]
}
}
},
"runAfter": {
"Try_Process_Order": ["Failed", "TimedOut"]
}
}
}
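Stripped of connector details, compensation amounts to remembering which steps completed and undoing them when a later step fails. The Python sketch below illustrates that idea using reverse-order undo, a common convention that the workflow above does not enforce (its refund and release branches run in parallel). All step names and the `run_with_compensation` helper are hypothetical.

```python
from typing import Callable, List, Tuple

# Each step is (name, action, compensation).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_compensation(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, undo completed steps in reverse order.

    Returns a log describing what happened.
    """
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            for done_name, _, undo in reversed(completed):
                undo()
                log.append(f"{done_name}: compensated")
            return log
        completed.append(step)
        log.append(f"{name}: succeeded")
    return log


def fail_shipping() -> None:
    raise RuntimeError("label service unavailable")


steps: List[Step] = [
    ("reserve_inventory", lambda: None, lambda: None),
    ("charge_payment", lambda: None, lambda: None),
    ("create_shipping_label", fail_shipping, lambda: None),
]
for line in run_with_compensation(steps):
    print(line)
```

In a real workflow the compensating calls can themselves fail, so they deserve their own retry policy and alerting rather than the bare call shown here.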
Step 5: Parallel Execution with Error Handling
When running actions in parallel, error handling becomes more complex because multiple actions can fail simultaneously.
Parallel Branch Error Handling
{
"Parallel_Notifications": {
"type": "Scope",
"actions": {
"Send_Email": {
"type": "ApiConnection",
"inputs": { "method": "post", "path": "/v2/Mail", "body": { "...": "..." } },
"runAfter": {}
},
"Send_SMS": {
"type": "Http",
"inputs": { "method": "POST", "uri": "https://sms-api.com/send", "body": { "...": "..." } },
"runAfter": {}
},
"Send_Push_Notification": {
"type": "Http",
"inputs": { "method": "POST", "uri": "https://push-api.com/notify", "body": { "...": "..." } },
"runAfter": {}
}
}
},
"Handle_Notification_Failures": {
"type": "Scope",
"actions": {
"Analyze_Parallel_Results": {
"type": "Compose",
"inputs": {
"totalActions": "@length(result('Parallel_Notifications'))",
"failedActions": "@length(filter(result('Parallel_Notifications'), item => item['status'] == 'Failed'))",
"succeededActions": "@length(filter(result('Parallel_Notifications'), item => item['status'] == 'Succeeded'))",
"failures": "@filter(result('Parallel_Notifications'), item => item['status'] == 'Failed')"
}
},
"Alert_If_All_Failed": {
"type": "If",
"expression": {
"equals": ["@outputs('Analyze_Parallel_Results')?['succeededActions']", 0]
},
"actions": {
"Critical_Alert": {
"type": "Http",
"inputs": {
"method": "POST",
"uri": "https://alerts.pagerduty.com/integration/events",
"body": {
"severity": "critical",
"summary": "All notification channels failed for order @{triggerBody()?['orderId']}"
}
}
}
},
"runAfter": { "Analyze_Parallel_Results": ["Succeeded"] }
}
},
"runAfter": {
"Parallel_Notifications": ["Failed", "TimedOut"]
}
}
}
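For reasoning about partial failures outside the designer, the check above can be modeled in a few lines of Python. The sketch assumes each entry returned by result() carries name and status fields; the three severity levels (`ok`, `partial`, `critical`) are an illustrative assumption, since the workflow above only alerts when every channel failed.

```python
from collections import Counter
from typing import Dict, List


def summarize(results: List[Dict[str, str]]) -> Dict[str, object]:
    """Summarize a result()-style array of {'name': ..., 'status': ...} entries."""
    counts = Counter(item["status"] for item in results)
    not_succeeded = [item["name"] for item in results if item["status"] != "Succeeded"]
    if counts["Succeeded"] == 0:
        severity = "critical"   # no channel got through
    elif not_succeeded:
        severity = "partial"    # at least one channel got through
    else:
        severity = "ok"
    return {"counts": dict(counts), "not_succeeded": not_succeeded, "severity": severity}


sample = [
    {"name": "Send_Email", "status": "Succeeded"},
    {"name": "Send_SMS", "status": "Failed"},
    {"name": "Send_Push_Notification", "status": "TimedOut"},
]
print(summarize(sample))
```

Counting every non-succeeded status, not only Failed, keeps timed-out and skipped branches from being reported as healthy.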
Step 6: Timeout Configuration and Handling
Logic Apps actions have default timeouts that may not suit your workflow. Long-running operations (file processing, batch jobs, approval workflows) need custom timeout configuration.
Setting Action Timeouts
{
"Call_Long_Running_API": {
"type": "Http",
"inputs": {
"method": "POST",
"uri": "https://batch-processor.com/jobs",
"body": "@body('Prepare_Batch_Data')"
},
"limit": {
"timeout": "PT5M"
}
}
}
Timeout Duration Format (ISO 8601)
| Format | Duration |
|---|---|
| PT30S | 30 seconds |
| PT5M | 5 minutes |
| PT1H | 1 hour |
| PT2H30M | 2 hours 30 minutes |
| P1D | 1 day |
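To sanity-check a duration string before putting it in a workflow definition, a small Python helper can convert it to seconds. This is a minimal sketch that handles only the day and time designators used in the table (no years, months, or weeks); `parse_duration` is a hypothetical helper, not part of any Azure SDK.

```python
import re
from datetime import timedelta

# Supports the day/time subset used for timeouts (no years, months, or weeks).
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text: str) -> timedelta:
    """Convert an ISO 8601 duration such as 'PT2H30M' into a timedelta."""
    match = _DURATION.match(text)
    # Reject strings with no components, such as 'P', 'PT', or 'P1DT'.
    if not match or text == "P" or text.endswith("T"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {name: int(value) for name, value in match.groupdict().items() if value}
    return timedelta(**parts)


for value in ["PT30S", "PT5M", "PT1H", "PT2H30M", "P1D"]:
    print(value, "=", parse_duration(value).total_seconds(), "seconds")
```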
Handling Timeouts Differently from Failures
Sometimes a timeout requires different handling than a failure. For example, a timeout might mean the operation is still running (just slow), while a failure means it definitely didn't work:
{
"Handle_Timeout_Specifically": {
"type": "If",
"expression": {
"equals": ["@result('Try_Scope')?[0]?['status']", "TimedOut"]
},
"actions": {
"Check_If_Operation_Completed": {
"type": "Http",
"inputs": {
"method": "GET",
"uri": "https://api.com/operations/@{body('Start_Operation')?['operationId']}/status"
}
},
"Handle_Based_On_Status": {
"type": "Switch",
"expression": "@body('Check_If_Operation_Completed')?['status']",
"cases": {
"completed": {
"actions": { "Log_Late_Success": { "type": "Compose", "inputs": "Operation completed after timeout" } }
},
"running": {
"actions": { "Schedule_Status_Check": { "type": "Http", "inputs": { "...": "..." } } }
},
"failed": {
"actions": { "Handle_Confirmed_Failure": { "type": "Compose", "inputs": "Operation confirmed failed" } }
}
},
"runAfter": { "Check_If_Operation_Completed": ["Succeeded"] }
}
},
"runAfter": { "Try_Scope": ["TimedOut"] }
}
}
Step 7: Dead Letter Pattern for Failed Messages
When a workflow cannot process a message after all retries, move it to a dead letter queue for later investigation and reprocessing.
Implementation
{
"Move_To_Dead_Letter_Queue": {
"type": "ApiConnection",
"inputs": {
"host": { "connection": { "name": "@parameters('$connections')['servicebus']['connectionId']" } },
"method": "post",
"path": "/queues/orders-dead-letter/messages",
"body": {
"ContentData": "@{base64(triggerBody()?['ContentData'])}",
"ContentType": "application/json",
"Properties": {
"OriginalQueue": "orders",
"FailureReason": "@{outputs('Compose_Error_Info')?['errorMessage']}",
"FailedAt": "@{utcNow()}",
"WorkflowRunId": "@{workflow().run.name}",
"RetryCount": "@{coalesce(triggerBody()?['Properties']?['RetryCount'], '0')}",
"OriginalEnqueueTime": "@{triggerBody()?['EnqueuedTimeUtc']}"
}
}
},
"runAfter": { "Compose_Error_Info": ["Succeeded"] }
}
}
Reprocessing Dead Letters
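Create a separate Logic App that periodically checks the dead letter queue and attempts reprocessing. Peeking does not remove messages from the queue, so a message requeued this way must also be completed or deleted from the dead letter queue; otherwise it is requeued again on every run: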
{
"triggers": {
"Recurrence": {
"type": "Recurrence",
"recurrence": { "frequency": "Hour", "interval": 1 }
}
},
"actions": {
"Peek_Dead_Letters": {
"type": "ApiConnection",
"inputs": {
"method": "get",
"path": "/queues/orders-dead-letter/messages/head/peek",
"queries": { "count": 10 }
}
},
"For_Each_Dead_Letter": {
"type": "Foreach",
"foreach": "@body('Peek_Dead_Letters')",
"actions": {
"Check_If_Retryable": {
"type": "If",
"expression": {
"and": [
{ "less": ["@int(items('For_Each_Dead_Letter')?['Properties']?['RetryCount'])", 3] },
{ "not": { "contains": ["@items('For_Each_Dead_Letter')?['Properties']?['FailureReason']", "ValidationFailed"] } }
]
},
"actions": {
"Requeue_Message": {
"type": "ApiConnection",
"inputs": {
"method": "post",
"path": "/queues/orders/messages",
"body": {
"ContentData": "@items('For_Each_Dead_Letter')?['ContentData']",
"Properties": {
"RetryCount": "@{add(int(coalesce(items('For_Each_Dead_Letter')?['Properties']?['RetryCount'], '0')), 1)}",
"ReprocessedAt": "@{utcNow()}"
}
}
}
}
}
}
}
}
}
}
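The requeue decision in Check_If_Retryable is easy to get subtly wrong, so it can help to state it as plain code and test it. The Python sketch below mirrors that condition under the same assumptions as the workflow: message properties are strings, the retry limit is 3, and validation failures are never retried. `is_retryable` and `next_properties` are hypothetical helper names.

```python
from typing import Dict, Mapping

MAX_RETRIES = 3  # same threshold as the Check_If_Retryable condition


def is_retryable(properties: Mapping[str, str]) -> bool:
    """True when the message has attempts left and did not fail validation."""
    retry_count = int(properties.get("RetryCount") or "0")
    reason = properties.get("FailureReason") or ""
    return retry_count < MAX_RETRIES and "ValidationFailed" not in reason


def next_properties(properties: Mapping[str, str]) -> Dict[str, str]:
    """Properties to attach when requeueing: the retry count incremented by one."""
    retry_count = int(properties.get("RetryCount") or "0")
    return {**properties, "RetryCount": str(retry_count + 1)}


print(is_retryable({"RetryCount": "1", "FailureReason": "Timeout"}))           # True
print(is_retryable({"RetryCount": "3", "FailureReason": "Timeout"}))           # False
print(is_retryable({"RetryCount": "0", "FailureReason": "ValidationFailed"}))  # False
print(next_properties({"RetryCount": "1"}))
```

Carrying the counter on the message itself is what stops a poison message from cycling between the two queues forever.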
Real-World Scenarios
Scenario 1: E-Commerce Order Processing
A complete order processing workflow with full error handling:
- Trigger: New message in Service Bus queue
- Try: Parse order → Validate → Reserve inventory → Charge payment → Create shipping label → Send confirmation
- Catch: Log error → Compensate (refund + release inventory) → Dead letter the message → Alert operations team
- Finally: Update audit log → Release processing lock → Track metrics
Scenario 2: Data Synchronization Between Systems
Syncing customer data from CRM to ERP with error handling:
- Trigger: Scheduled (every 15 minutes)
- Try: Query CRM for changes → Transform data → Batch update ERP
- Catch: Log failed records → Queue for manual review → Continue with remaining records
- Finally: Update sync checkpoint → Log sync statistics
Scenario 3: Approval Workflow with Escalation
An expense approval workflow that handles timeouts:
- Trigger: New expense submitted
- Try: Send approval email → Wait for response (timeout: 48 hours)
- On Timeout: Escalate to manager's manager → Wait again (timeout: 24 hours)
- On Second Timeout: Auto-reject → Notify submitter
- Finally: Update expense record status → Log decision
Monitoring and Diagnostics
Querying Failed Runs
Use Azure Monitor to track workflow failures:
// Find all failed Logic App runs in the last 24 hours
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.LOGIC"
| where Category == "WorkflowRuntime"
| where status_s == "Failed"
| where TimeGenerated > ago(24h)
| project TimeGenerated, resource_workflowName_s, resource_runId_s, error_message_s
| order by TimeGenerated desc
// Track error patterns over time
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.LOGIC"
| where status_s == "Failed"
| where TimeGenerated > ago(7d)
| summarize FailureCount = count() by resource_workflowName_s, bin(TimeGenerated, 1h)
| render timechart
Setting Up Alerts
# Alert when failure rate exceeds threshold
az monitor metrics alert create \
--name "logic-app-high-failure-rate" \
--resource-group my-rg \
--scopes "/subscriptions/{sub}/resourceGroups/my-rg/providers/Microsoft.Logic/workflows/order-processor" \
--condition "total RunsFailed > 5" \
--window-size 15m \
--evaluation-frequency 5m \
--severity 2 \
--action ops-team
Best Practices
| Practice | Why | Implementation |
|---|---|---|
| Always use Scopes for grouping | Enables try/catch pattern | Wrap related actions in a Scope |
| Configure retry policies explicitly | Default may not suit your needs | Set type, count, and intervals |
| Disable retries for non-idempotent actions | Prevents duplicate side effects | "retryPolicy": { "type": "none" } |
| Implement compensation logic | Maintains data consistency | Undo completed steps on failure |
| Use dead letter queues | Prevents message loss | Move failed messages to DLQ |
| Log error context | Enables debugging | Capture run ID, trigger body, failed action details |
| Set appropriate timeouts | Prevents indefinite waits | Use limit.timeout on long-running actions |
| Monitor failure rates | Early warning of issues | Azure Monitor alerts on RunsFailed metric |
| Test error paths | Ensures handling works | Deliberately trigger failures in test environment |
| Use the Finally pattern | Ensures cleanup always runs | Run After with all four statuses |
Common Pitfalls
- Forgetting Run After configuration: your catch scope won't execute unless you explicitly set runAfter to include ["Failed", "TimedOut"].
- Catching too broadly: if your catch scope itself fails, you lose visibility. Keep catch logic simple and reliable.
- Not handling partial failures in parallel branches: when running actions in parallel, some may succeed while others fail. Check individual results.
- Infinite retry loops: without a maximum retry count or dead letter pattern, a poison message can trigger the workflow indefinitely.
- Missing compensation for idempotent operations: even if an action is idempotent, downstream effects (emails, notifications) may not be. Always consider the full impact.
Summary
Robust error handling in Logic Apps requires:
- Scopes to group actions and enable try/catch/finally patterns
- Run After configuration to control execution flow based on action status
- Retry policies with exponential backoff for transient failures
- Compensation logic to undo partial work when workflows fail midway
- Dead letter queues to capture messages that cannot be processed
- Monitoring and alerts to detect issues before they impact users
- Timeout handling to prevent workflows from hanging indefinitely
The investment in proper error handling pays off immediately — fewer support tickets, faster incident resolution, and confidence that your business-critical workflows are resilient.
Azure Integration Hub — Logic Apps