Logic Apps — Advanced Error Handling and Exception Management

Why Error Handling in Logic Apps Matters

Azure Logic Apps orchestrate business-critical workflows — processing orders, syncing data between systems, sending notifications, and managing approvals. When a step in your workflow fails, the consequences can range from a missed email to a corrupted database record or a lost financial transaction.

Without proper error handling, your Logic Apps will:

Fail silently — A workflow stops mid-execution and nobody knows until a customer complains
Lose data — Messages or records that were partially processed get stuck in limbo
Create inconsistencies — One system gets updated but another doesn't, leading to data drift
Overwhelm downstream services — Retrying too aggressively can cause cascading failures
Make debugging impossible — Without proper logging, you can't determine what went wrong or why

With proper error handling, your workflows become resilient, observable, and self-healing.

Understanding Logic Apps Execution Model

Before implementing error handling, you need to understand how Logic Apps processes actions and what happens when something goes wrong.

Action Status Values

Every action in a Logic App finishes with a status. The four that runAfter can react to are:

Status	Meaning	What Happens Next
Succeeded	Action completed successfully	Next action runs normally
Failed	Action threw an error	Dependent actions are skipped (by default)
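Skipped	Action was not executed because its runAfter condition was not met (for example, a preceding action failed) or its branch was not taken	Dependent actions are skipped (by default)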
TimedOut	Action exceeded its timeout	Treated as a failure

The Default Behavior Problem

By default, if any action fails, all subsequent actions in that branch are skipped and the entire workflow run is marked as "Failed." This is problematic because:

You can't send an alert about the failure (the alert action gets skipped too)
You can't perform cleanup or compensation logic
You can't log the error details for investigation

This is where Run After configuration and Scope actions become essential.

Architecture: Error Handling Patterns

┌─────────────────────────────────────────────────────────────────────────┐
│                    LOGIC APP ERROR HANDLING ARCHITECTURE                │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │  SCOPE: "Process_Order"                                         │    │
│  │                                                                 │    │
│  │  ┌──────────┐   ┌──────────────┐   ┌───────────────────────┐    │    │
│  │  │ Validate │──▶│ Create Record│──▶│ Send Confirmation     │    │    │
│  │  │ Input    │   │ in Database  │   │ Email                 │    │    │
│  │  └──────────┘   └──────────────┘   └───────────────────────┘    │    │
│  │                                                                 │    │
│  └─────────────────────────────┬───────────────────────────────────┘    │
│                                │                                        │
│              ┌─────────────────┼─────────────────┐                      │
│              │ (Run After)     │                 │                      │
│              ▼                 ▼                 ▼                      │
│  ┌───────────────┐  ┌────────────────┐  ┌────────────────────┐          │
│  │ On Success    │  │ On Failure     │  │ On Timeout         │          │
│  │               │  │                │  │                    │          │
│  │ Log success   │  │ Log error      │  │ Log timeout        │          │
│  │ Update status │  │ Send alert     │  │ Send alert         │          │
│  │               │  │ Move to DLQ    │  │ Schedule retry     │          │
│  └───────────────┘  │ Compensate     │  └────────────────────┘          │
│                     └────────────────┘                                  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │  SCOPE: "Finally" (Runs regardless of outcome)                  │    │
│  │                                                                 │    │
│  │  ┌──────────────┐   ┌──────────────────┐                        │    │
│  │  │ Release locks│──▶│ Update audit log │                        │    │
│  │  └──────────────┘   └──────────────────┘                        │    │
│  └─────────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────┘

Step 1: Understanding Scopes for Error Handling

A Scope in Logic Apps is a container that groups multiple actions together. The key benefit is that you can check whether the entire scope succeeded or failed, and then branch your logic accordingly — similar to a try/catch block in programming.

Why Use Scopes?

Group related actions — Treat a set of actions as a single unit of work
Catch errors from any action — If any action inside the scope fails, you can handle it
Implement try/catch/finally — Using multiple scopes with Run After configuration
Simplify error checking — Instead of checking each action individually, check the scope result

Basic Scope Pattern (Try/Catch)

{
  "definition": {
    "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
    "contentVersion": "1.0.0.0",
    "triggers": {
      "When_a_message_is_received": {
        "type": "ApiConnection",
        "inputs": {
          "host": { "connection": { "name": "@parameters('$connections')['servicebus']['connectionId']" } },
          "method": "get",
          "path": "/queues/orders/messages/head"
        },
        "recurrence": { "frequency": "Minute", "interval": 1 }
      }
    },
    "actions": {
      "Try_Process_Order": {
        "type": "Scope",
        "actions": {
          "Parse_Order_JSON": {
            "type": "ParseJson",
            "inputs": {
              "content": "@json(base64ToString(triggerBody()?['ContentData']))",
              "schema": {
                "type": "object",
                "properties": {
                  "orderId": { "type": "string" },
                  "customerId": { "type": "string" },
                  "customerEmail": { "type": "string" },
                  "items": { "type": "array" },
                  "totalAmount": { "type": "number" }
                },
                "required": ["orderId", "customerId", "items"]
              }
            },
            "runAfter": {}
          },
          "Validate_Order": {
            "type": "If",
            "expression": {
              "and": [
                { "greater": ["@length(body('Parse_Order_JSON')?['items'])", 0] },
                { "greater": ["@body('Parse_Order_JSON')?['totalAmount']", 0] }
              ]
            },
            "actions": {
              "Create_Order_Record": {
                "type": "ApiConnection",
                "inputs": {
                  "host": { "connection": { "name": "@parameters('$connections')['sql']['connectionId']" } },
                  "method": "post",
                  "path": "/v2/datasets/@{encodeURIComponent('default')}/tables/@{encodeURIComponent('Orders')}/items",
                  "body": {
                    "OrderId": "@body('Parse_Order_JSON')?['orderId']",
                    "CustomerId": "@body('Parse_Order_JSON')?['customerId']",
                    "TotalAmount": "@body('Parse_Order_JSON')?['totalAmount']",
                    "Status": "Processing",
                    "CreatedAt": "@utcNow()"
                  }
                }
              },
              "Send_Confirmation_Email": {
                "type": "ApiConnection",
                "inputs": {
                  "host": { "connection": { "name": "@parameters('$connections')['office365']['connectionId']" } },
                  "method": "post",
                  "path": "/v2/Mail",
                  "body": {
                    "To": "@body('Parse_Order_JSON')?['customerEmail']",
                    "Subject": "Order @{body('Parse_Order_JSON')?['orderId']} Confirmed",
                    "Body": "Your order has been received and is being processed."
                  }
                },
                "runAfter": { "Create_Order_Record": ["Succeeded"] }
              }
            },
            "else": {
              "actions": {
                "Terminate_Invalid_Order": {
                  "type": "Terminate",
                  "inputs": {
                    "runStatus": "Failed",
                    "runError": {
                      "code": "ValidationFailed",
                      "message": "Order validation failed: empty items or zero amount"
                    }
                  }
                }
              }
            },
            "runAfter": { "Parse_Order_JSON": ["Succeeded"] }
          }
        },
        "runAfter": {}
      },
      "Catch_Process_Order_Errors": {
        "type": "Scope",
        "actions": {
          "Filter_Failed_Actions": {
            "type": "Query",
            "inputs": {
              "from": "@result('Try_Process_Order')",
              "where": "@equals(item()?['status'], 'Failed')"
            },
            "runAfter": {}
          },
          "Get_Error_Details": {
            "type": "Compose",
            "inputs": {
              "workflowRunId": "@workflow().run.name",
              "errorTime": "@utcNow()",
              "scopeResult": "@result('Try_Process_Order')",
              "failedActions": "@body('Filter_Failed_Actions')"
            },
            "runAfter": { "Filter_Failed_Actions": ["Succeeded"] }
          },
          "Send_Error_Alert": {
            "type": "ApiConnection",
            "inputs": {
              "host": { "connection": { "name": "@parameters('$connections')['teams']['connectionId']" } },
              "method": "post",
              "path": "/v3/conversations/@{encodeURIComponent('channel-id')}/activities",
              "body": {
                "messageBody": "🚨 **Order Processing Failed**\n\nWorkflow Run: @{workflow().run.name}\nTime: @{utcNow()}\nError: @{first(outputs('Get_Error_Details')?['failedActions'])?['error']?['message']}"
              }
            },
            "runAfter": { "Get_Error_Details": ["Succeeded"] }
          },
          "Log_Error_To_Table": {
            "type": "ApiConnection",
            "inputs": {
              "host": { "connection": { "name": "@parameters('$connections')['azuretables']['connectionId']" } },
              "method": "post",
              "path": "/Tables/@{encodeURIComponent('WorkflowErrors')}/entities",
              "body": {
                "PartitionKey": "OrderProcessing",
                "RowKey": "@{guid()}",
                "WorkflowRunId": "@workflow().run.name",
                "ErrorDetails": "@{string(outputs('Get_Error_Details'))}",
                "Timestamp": "@utcNow()"
              }
            },
            "runAfter": { "Get_Error_Details": ["Succeeded"] }
          }
        },
        "runAfter": {
          "Try_Process_Order": ["Failed", "TimedOut"]
        }
      }
    }
  }
}

Understanding the Run After Configuration

The critical piece that makes error handling work is the runAfter property. By default, an action runs after the previous action succeeds. But you can configure it to run after specific statuses:

"runAfter": {
  "Try_Process_Order": ["Failed", "TimedOut"]
}

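This means the "Catch" scope only executes when the "Try" scope fails or times out. A Terminate action inside the "Try" scope is an exception: it ends the whole run immediately, so the "Catch" scope never gets a chance to run for that path. You can combine multiple statuses: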

["Succeeded"] — Run only on success (default)
["Failed"] — Run only on failure
["TimedOut"] — Run only on timeout
["Skipped"] — Run only when skipped
["Failed", "TimedOut"] — Run on failure OR timeout
["Succeeded", "Failed", "TimedOut", "Skipped"] — Always run (finally pattern)
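To make the evaluation rule concrete, here is a small Python model of how runAfter decides what runs. It is a sketch, not the Logic Apps engine: the function name resolve_statuses, the Finally action, and the assumption that actions are listed with predecessors first are all illustrative. The rule it encodes is that an action runs only when every listed predecessor finished with one of the allowed statuses; otherwise it is marked Skipped.

```python
ALL_STATUSES = ["Succeeded", "Failed", "TimedOut", "Skipped"]


def resolve_statuses(run_after, outcomes):
    """Model runAfter evaluation for a workflow.

    run_after: {action: {predecessor: [allowed statuses]}}, listed so that
               predecessors appear before the actions that depend on them.
    outcomes:  the status each action would end with if it actually ran.
    """
    statuses = {}
    for action, dependencies in run_after.items():
        # An action runs only if every predecessor ended in an allowed status.
        can_run = all(
            statuses[predecessor] in allowed
            for predecessor, allowed in dependencies.items()
        )
        statuses[action] = outcomes[action] if can_run else "Skipped"
    return statuses


workflow = {
    "Try_Process_Order": {},
    "Catch_Process_Order_Errors": {"Try_Process_Order": ["Failed", "TimedOut"]},
    "Finally": {  # hypothetical cleanup action, used here for illustration
        "Try_Process_Order": ALL_STATUSES,
        "Catch_Process_Order_Errors": ALL_STATUSES,
    },
}

# Every action would succeed if it ran, except the Try scope in the second case.
all_ok = {name: "Succeeded" for name in workflow}
try_fails = {**all_ok, "Try_Process_Order": "Failed"}

happy_path = resolve_statuses(workflow, all_ok)
failure_path = resolve_statuses(workflow, try_fails)

print(happy_path)
print(failure_path)
```

On the happy path the catch action is Skipped and the cleanup action still runs, because Skipped is one of its allowed statuses. On the failure path the catch action runs and the cleanup action follows it.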

Step 2: Retry Policies for Transient Failures

Many failures in Logic Apps are transient — a downstream API is temporarily unavailable, a database connection times out, or a rate limit is hit. Retry policies handle these automatically without requiring manual intervention.

Understanding Retry Policy Types

Logic Apps supports four retry policy types:

Policy Type	Behavior	Best For
Default	Retries up to 4 times at exponentially increasing intervals, which scale by 7.5 seconds and are capped between 5 and 45 seconds	Most scenarios
Exponential	Configurable count and intervals with exponential backoff	APIs with rate limits
Fixed	Retries at fixed intervals	Services with predictable recovery
None	No retries	Actions where retry would cause duplicates

Configuring Exponential Backoff

Exponential backoff is the recommended strategy for most integrations. It gives the downstream service time to recover while avoiding thundering herd problems.

{
  "Call_Payment_API": {
    "type": "Http",
    "inputs": {
      "method": "POST",
      "uri": "https://payment-api.example.com/v1/charges",
      "headers": {
        "Content-Type": "application/json",
        "Authorization": "Bearer @{body('Get_API_Token')?['access_token']}"
      },
      "body": {
        "amount": "@body('Parse_Order_JSON')?['totalAmount']",
        "currency": "USD",
        "customerId": "@body('Parse_Order_JSON')?['customerId']"
      },
      "retryPolicy": {
        "type": "exponential",
        "count": 4,
        "interval": "PT10S",
        "minimumInterval": "PT5S",
        "maximumInterval": "PT1H"
      }
    },
    "runAfter": { "Validate_Payment_Details": ["Succeeded"] }
  }
}

How Exponential Backoff Works

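Attempt 1: Immediate
    ↓ (fails)
Wait: random, 5 to 10 seconds (minimumInterval up to interval)
Attempt 2 (retry 1):
    ↓ (fails)
Wait: random, 10 to 20 seconds (interval up to interval × 2)
Attempt 3 (retry 2):
    ↓ (fails)
Wait: random, 20 to 40 seconds (interval × 2 up to interval × 4)
Attempt 4 (retry 3):
    ↓ (fails)
Wait: random, 40 to 80 seconds (interval × 4 up to interval × 8, capped at maximumInterval)
Attempt 5 (retry 4, final):
    ↓ (fails)
→ Action marked as "Failed"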
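The arithmetic behind those waits can be sketched in Python. As documented for the exponential policy, each wait is drawn at random from a range that doubles with every retry, rather than being one fixed number. The function names backoff_ranges and sample_waits are illustrative, and durations are expressed in plain seconds instead of ISO 8601 strings to keep the sketch short.

```python
import random


def backoff_ranges(count, interval, minimum, maximum):
    """Return the (low, high) wait range in seconds before each retry."""
    ranges = []
    for retry in range(1, count + 1):
        # Lower bound: 0 for the first retry, then interval * 2^(retry - 2).
        base_low = 0 if retry == 1 else interval * 2 ** (retry - 2)
        # Upper bound: interval * 2^(retry - 1).
        base_high = interval * 2 ** (retry - 1)
        ranges.append((max(base_low, minimum), min(base_high, maximum)))
    return ranges


def sample_waits(ranges, rng=random):
    """Pick one random wait from each range, as a single run might."""
    return [rng.uniform(low, high) for low, high in ranges]


# Values from the Call_Payment_API policy: PT10S, PT5S, PT1H, count 4.
ranges = backoff_ranges(count=4, interval=10, minimum=5, maximum=3600)
waits = sample_waits(ranges)

print(ranges)  # [(5, 10), (10, 20), (20, 40), (40, 80)]
print([round(wait, 1) for wait in waits])
```

The randomness is the point: if many workflow runs fail at the same moment, they do not all retry at the same moment.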

Fixed Interval Retry

Use fixed intervals when the downstream service has a known recovery time (e.g., a service that restarts in exactly 30 seconds):

{
  "Sync_To_Legacy_System": {
    "type": "Http",
    "inputs": {
      "method": "POST",
      "uri": "https://legacy-erp.internal/api/sync",
      "body": "@body('Transform_Data')",
      "retryPolicy": {
        "type": "fixed",
        "count": 3,
        "interval": "PT30S"
      }
    }
  }
}

Disabling Retries (Critical for Idempotency)

For actions that are NOT idempotent (e.g., charging a credit card, sending an SMS), disable retries to prevent duplicate operations:

{
  "Charge_Credit_Card": {
    "type": "Http",
    "inputs": {
      "method": "POST",
      "uri": "https://payment-gateway.com/charge",
      "body": {
        "amount": "@body('Calculate_Total')",
        "idempotencyKey": "@{body('Parse_Order_JSON')?['orderId']}"
      },
      "retryPolicy": {
        "type": "none"
      }
    }
  }
}

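Why disable retries? If the payment API processes the charge but the response times out, Logic Apps would retry and could charge the customer again. Setting "type": "none" stops the automatic retry, and an idempotency key lets the payment API reject a duplicate if the same order is submitted again later. For that to work, the key must be stable for the order (derived from the order ID, for example); a fresh guid() generated on every call never matches an earlier attempt.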
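A stable key is easy to derive. The following Python sketch assumes the payment API deduplicates on a client-supplied key, as the idempotencyKey field suggests; the namespace string, the function name idempotency_key, and the order IDs are hypothetical.

```python
import uuid

# Hypothetical namespace for this workflow's charge requests.
CHARGE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "payments.example.com")


def idempotency_key(order_id: str) -> str:
    """Derive the same key every time the same order is submitted."""
    return str(uuid.uuid5(CHARGE_NAMESPACE, f"charge:{order_id}"))


first_attempt = idempotency_key("ORD-1001")
second_attempt = idempotency_key("ORD-1001")
other_order = idempotency_key("ORD-1002")
random_key = str(uuid.uuid4())  # comparable to generating a fresh guid() per call

print(first_attempt == second_attempt)  # True: a resubmission is detectable
print(first_attempt == other_order)     # False: different orders stay distinct
```

Inside a workflow definition, the simplest equivalent is to pass the order ID itself, or a value built from it, as the key.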

Step 3: The Try/Catch/Finally Pattern

For production workflows, implement the full try/catch/finally pattern using three scopes:

┌─────────────────────────────────────────────────────────┐
│  SCOPE: "Try"                                           │
│  Contains: Business logic actions                       │
│  Run After: Trigger                                     │
└────────────────────────┬────────────────────────────────┘
                         │
         ┌───────────────┼───────────────┐
         │               │               │
    (Succeeded)     (Failed)        (TimedOut)
         │               │               │
         ▼               ▼               ▼
┌──────────────┐  ┌──────────────┐  (same as Failed)
│ SCOPE:       │  │ SCOPE:       │
│ "On_Success" │  │ "Catch"      │
│              │  │              │
│ - Log success│  │ - Log error  │
│ - Notify     │  │ - Alert team │
│ - Metrics    │  │ - Compensate │
└──────────────┘  └──────────────┘
         │               │
         └───────┬───────┘
                 │
                 ▼
┌─────────────────────────────────────────────────────────┐
│  SCOPE: "Finally"                                       │
│  Run After: On_Success and Catch, each with             │
│             [Succeeded, Failed, TimedOut, Skipped]      │
│                                                         │
│  Contains: Cleanup actions that ALWAYS run              │
│  - Release locks                                        │
│  - Close connections                                    │
│  - Update audit trail                                   │
└─────────────────────────────────────────────────────────┘

Complete Try/Catch/Finally Implementation

{
  "actions": {
    "Try_Scope": {
      "type": "Scope",
      "actions": {
        "Get_Order_From_Queue": { "type": "ApiConnection", "inputs": { "...": "..." } },
        "Validate_Order": { "type": "If", "expression": "...", "actions": { "...": "..." } },
        "Process_Payment": { "type": "Http", "inputs": { "...": "..." } },
        "Update_Inventory": { "type": "Http", "inputs": { "...": "..." } },
        "Send_Confirmation": { "type": "ApiConnection", "inputs": { "...": "..." } }
      },
      "runAfter": {}
    },

    "Catch_Scope": {
      "type": "Scope",
      "actions": {
        "Filter_Failed_Actions": {
          "type": "Query",
          "inputs": {
            "from": "@result('Try_Scope')",
            "where": "@equals(item()?['status'], 'Failed')"
          },
          "runAfter": {}
        },
        "Compose_Error_Info": {
          "type": "Compose",
          "inputs": {
            "workflowName": "@workflow().name",
            "runId": "@workflow().run.name",
            "timestamp": "@utcNow()",
            "triggerBody": "@triggerBody()",
            "failedActions": "@body('Filter_Failed_Actions')",
            "errorMessage": "@{first(body('Filter_Failed_Actions'))?['error']?['message']}"
          },
          "runAfter": { "Filter_Failed_Actions": ["Succeeded"] }
        },
        "Send_Teams_Alert": {
          "type": "ApiConnection",
          "inputs": {
            "method": "post",
            "body": {
              "messageBody": "🚨 Workflow **@{workflow().name}** failed at @{utcNow()}\n\n**Error:** @{outputs('Compose_Error_Info')?['errorMessage']}\n\n**Run ID:** @{workflow().run.name}\n\n[View Run](https://portal.azure.com/#view/Microsoft.Azure.Management.Logic/LogicAppRunBlade/...)"
            }
          },
          "runAfter": { "Compose_Error_Info": ["Succeeded"] }
        },
        "Dead_Letter_Message": {
          "type": "ApiConnection",
          "inputs": {
            "method": "post",
            "path": "/queues/orders-dlq/messages",
            "body": {
              "ContentData": "@{base64(string(outputs('Compose_Error_Info')))}",
              "Properties": {
                "ErrorReason": "@{outputs('Compose_Error_Info')?['errorMessage']}",
                "OriginalRunId": "@{workflow().run.name}"
              }
            }
          },
          "runAfter": { "Compose_Error_Info": ["Succeeded"] }
        }
      },
      "runAfter": {
        "Try_Scope": ["Failed", "TimedOut"]
      }
    },

    "Finally_Scope": {
      "type": "Scope",
      "actions": {
        "Update_Audit_Log": {
          "type": "ApiConnection",
          "inputs": {
            "method": "post",
            "path": "/Tables/@{encodeURIComponent('AuditLog')}/entities",
            "body": {
              "PartitionKey": "@{formatDateTime(utcNow(), 'yyyy-MM-dd')}",
              "RowKey": "@{workflow().run.name}",
              "WorkflowName": "@{workflow().name}",
              "Status": "@{if(equals(actions('Try_Scope')?['status'], 'Succeeded'), 'Success', 'Failed')}",
              "CompletedAt": "@{utcNow()}"
            }
          }
        },
        "Release_Processing_Lock": {
          "type": "Http",
          "inputs": {
            "method": "DELETE",
            "uri": "https://my-api.com/locks/@{triggerBody()?['orderId']}"
          },
          "runAfter": { "Update_Audit_Log": ["Succeeded", "Failed"] }
        }
      },
      "runAfter": {
        "Try_Scope": ["Succeeded", "Failed", "TimedOut", "Skipped"],
        "Catch_Scope": ["Succeeded", "Failed", "TimedOut", "Skipped"]
      }
    }
  }
}

Key Points About the Finally Scope

The Finally scope uses ["Succeeded", "Failed", "TimedOut", "Skipped"] for ALL preceding scopes. This ensures it runs regardless of what happened — just like a finally block in C# or Java. Use it for:

Releasing distributed locks
Closing database connections
Updating audit trails
Sending completion metrics
Cleaning up temporary resources

Step 4: Compensation Logic (Undoing Partial Work)

When a multi-step workflow fails partway through, you may need to undo the steps that already completed. This is called compensation — rolling back partial changes to maintain data consistency.

Real-World Example: Order Processing

Consider an order workflow with these steps:

✅ Reserve inventory (succeeded)
✅ Charge credit card (succeeded)
❌ Create shipping label (failed)

Without compensation, the customer is charged but never receives their order. With compensation, you reverse steps 1 and 2:

{
  "Compensate_On_Failure": {
    "type": "Scope",
    "actions": {
      "Check_What_Succeeded": {
        "type": "Compose",
        "inputs": {
          "inventoryReserved": "@equals(actions('Reserve_Inventory')?['status'], 'Succeeded')",
          "paymentCharged": "@equals(actions('Charge_Payment')?['status'], 'Succeeded')",
          "shippingCreated": "@equals(actions('Create_Shipping_Label')?['status'], 'Succeeded')"
        }
      },
      "Reverse_Payment_If_Charged": {
        "type": "If",
        "expression": { "equals": ["@outputs('Check_What_Succeeded')?['paymentCharged']", true] },
        "actions": {
          "Refund_Payment": {
            "type": "Http",
            "inputs": {
              "method": "POST",
              "uri": "https://payment-api.com/refunds",
              "body": {
                "chargeId": "@body('Charge_Payment')?['chargeId']",
                "reason": "workflow_failure"
              }
            }
          }
        },
        "runAfter": { "Check_What_Succeeded": ["Succeeded"] }
      },
      "Release_Inventory_If_Reserved": {
        "type": "If",
        "expression": { "equals": ["@outputs('Check_What_Succeeded')?['inventoryReserved']", true] },
        "actions": {
          "Release_Inventory": {
            "type": "Http",
            "inputs": {
              "method": "POST",
              "uri": "https://inventory-api.com/release",
              "body": {
                "reservationId": "@body('Reserve_Inventory')?['reservationId']"
              }
            }
          }
        },
        "runAfter": { "Check_What_Succeeded": ["Succeeded"] }
      },
      "Notify_Customer_Of_Failure": {
        "type": "ApiConnection",
        "inputs": {
          "method": "post",
          "path": "/v2/Mail",
          "body": {
            "To": "@triggerBody()?['customerEmail']",
            "Subject": "Issue with your order @{triggerBody()?['orderId']}",
            "Body": "We encountered an issue processing your order. Any charges have been reversed. Please try again or contact support."
          }
        },
        "runAfter": {
          "Reverse_Payment_If_Charged": ["Succeeded", "Failed"],
          "Release_Inventory_If_Reserved": ["Succeeded", "Failed"]
        }
      }
    },
    "runAfter": {
      "Try_Process_Order": ["Failed", "TimedOut"]
    }
  }
}
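The compensation idea itself is independent of Logic Apps and can be sketched in a few lines of Python. This is a minimal model under stated assumptions: the function run_with_compensation and the placeholder callables are hypothetical, and it undoes completed steps in reverse order, whereas the definition above runs the refund and the inventory release as parallel branches.

```python
def run_with_compensation(steps):
    """Run steps in order; on failure, undo completed steps newest first.

    steps: list of (name, do, undo) tuples, where do and undo are callables.
    Returns (succeeded, log).
    """
    completed = []
    log = []
    for name, do, undo in steps:
        try:
            do()
        except Exception as exc:
            log.append(f"{name} failed: {exc}")
            # Only steps that actually completed are compensated.
            for done_name, done_undo in reversed(completed):
                done_undo()
                log.append(f"{done_name} compensated")
            return False, log
        completed.append((name, undo))
        log.append(f"{name} succeeded")
    return True, log


def fail_shipping():
    raise RuntimeError("carrier unavailable")


undone = []
steps = [
    ("Reserve_Inventory", lambda: None, lambda: undone.append("release inventory")),
    ("Charge_Payment", lambda: None, lambda: undone.append("refund payment")),
    ("Create_Shipping_Label", fail_shipping, lambda: undone.append("void label")),
]

succeeded, log = run_with_compensation(steps)
for line in log:
    print(line)
```

The failed step is never compensated, because it never completed. A production version would also need to decide what happens when a compensating call itself fails, which this sketch leaves out.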

Step 5: Parallel Execution with Error Handling

When running actions in parallel, error handling becomes more complex because multiple actions can fail simultaneously.

Parallel Branch Error Handling

{
  "Parallel_Notifications": {
    "type": "Scope",
    "actions": {
      "Send_Email": {
        "type": "ApiConnection",
        "inputs": { "method": "post", "path": "/v2/Mail", "body": { "...": "..." } },
        "runAfter": {}
      },
      "Send_SMS": {
        "type": "Http",
        "inputs": { "method": "POST", "uri": "https://sms-api.com/send", "body": { "...": "..." } },
        "runAfter": {}
      },
      "Send_Push_Notification": {
        "type": "Http",
        "inputs": { "method": "POST", "uri": "https://push-api.com/notify", "body": { "...": "..." } },
        "runAfter": {}
      }
    }
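  },
  "Handle_Notification_Failures": {
    "type": "Scope",
    "actions": {
      "Filter_Failed_Notifications": {
        "type": "Query",
        "inputs": { "from": "@result('Parallel_Notifications')", "where": "@equals(item()?['status'], 'Failed')" },
        "runAfter": {}
      },
      "Filter_Succeeded_Notifications": {
        "type": "Query",
        "inputs": { "from": "@result('Parallel_Notifications')", "where": "@equals(item()?['status'], 'Succeeded')" },
        "runAfter": {}
      },
      "Analyze_Parallel_Results": {
        "type": "Compose",
        "inputs": {
          "totalActions": "@length(result('Parallel_Notifications'))",
          "failedActions": "@length(body('Filter_Failed_Notifications'))",
          "succeededActions": "@length(body('Filter_Succeeded_Notifications'))",
          "failures": "@body('Filter_Failed_Notifications')"
        },
        "runAfter": {
          "Filter_Failed_Notifications": ["Succeeded"],
          "Filter_Succeeded_Notifications": ["Succeeded"]
        }
      },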
      "Alert_If_All_Failed": {
        "type": "If",
        "expression": {
          "equals": ["@outputs('Analyze_Parallel_Results')?['succeededActions']", 0]
        },
        "actions": {
          "Critical_Alert": {
            "type": "Http",
            "inputs": {
              "method": "POST",
              "uri": "https://alerts.pagerduty.com/integration/events",
              "body": {
                "severity": "critical",
                "summary": "All notification channels failed for order @{triggerBody()?['orderId']}"
              }
            }
          }
        },
        "runAfter": { "Analyze_Parallel_Results": ["Succeeded"] }
      }
    },
    "runAfter": {
      "Parallel_Notifications": ["Failed", "TimedOut"]
    }
  }
}
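The analysis step boils down to counting statuses in the array that result() returns for the scope. Here is a hedged Python sketch of that logic; the sample results are made up, the function name summarize_results is illustrative, and the intermediate "warning" level is an addition that the definition above does not have.

```python
from collections import Counter


def summarize_results(results):
    """Summarize the per-action results of a scope with parallel branches."""
    counts = Counter(item["status"] for item in results)
    failures = [item["name"] for item in results if item["status"] == "Failed"]
    if counts["Succeeded"] == 0:
        severity = "critical"  # no channel got through
    elif failures:
        severity = "warning"   # partial failure: some channels got through
    else:
        severity = "none"
    return {
        "total": len(results),
        "failed": len(failures),
        "succeeded": counts["Succeeded"],
        "failures": failures,
        "severity": severity,
    }


# Made-up sample shaped like the objects result() returns for each action.
results = [
    {"name": "Send_Email", "status": "Succeeded"},
    {"name": "Send_SMS", "status": "Failed", "error": {"message": "gateway timeout"}},
    {"name": "Send_Push_Notification", "status": "Succeeded"},
]

summary = summarize_results(results)
print(summary)
```

Separating partial failure from total failure keeps the critical alert for the case where the customer received nothing at all.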

Step 6: Timeout Configuration and Handling

Logic Apps actions have default timeouts that may not suit your workflow. Long-running operations (file processing, batch jobs, approval workflows) need custom timeout configuration.

Setting Action Timeouts

{
  "Call_Long_Running_API": {
    "type": "Http",
    "inputs": {
      "method": "POST",
      "uri": "https://batch-processor.com/jobs",
      "body": "@body('Prepare_Batch_Data')"
    },
    "limit": {
      "timeout": "PT5M"
    }
  }
}

Timeout Duration Format (ISO 8601)

Format	Duration
`PT30S`	30 seconds
`PT5M`	5 minutes
`PT1H`	1 hour
`PT2H30M`	2 hours 30 minutes
`P1D`	1 day
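If you generate or validate these values outside the designer, a few lines of Python can convert them. This sketch handles only the day, hour, minute, and second designators shown in the table; the function name parse_duration is illustrative, and the full ISO 8601 grammar (weeks, months, fractional values) is deliberately out of scope.

```python
import re
from datetime import timedelta

# Matches PnDTnHnMnS where every component is optional.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Convert a simple ISO 8601 duration such as PT2H30M to a timedelta."""
    match = _DURATION.match(text)
    parts = {}
    if match:
        parts = {key: int(value) for key, value in match.groupdict().items() if value}
    if not parts:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**parts)


for value in ["PT30S", "PT5M", "PT1H", "PT2H30M", "P1D"]:
    print(value, "=", int(parse_duration(value).total_seconds()), "seconds")
```

Note that the T separator matters: PT5M is five minutes, while P5M would mean five months in the full standard and is rejected here.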

Handling Timeouts Differently from Failures

Sometimes a timeout requires different handling than a failure. For example, a timeout might mean the operation is still running (just slow), while a failure means it definitely didn't work:

{
  "Handle_Timeout_Specifically": {
    "type": "If",
    "expression": {
      "equals": ["@actions('Try_Scope')?['status']", "TimedOut"]
    },
    "actions": {
      "Check_If_Operation_Completed": {
        "type": "Http",
        "inputs": {
          "method": "GET",
          "uri": "https://api.com/operations/@{body('Start_Operation')?['operationId']}/status"
        }
      },
      "Handle_Based_On_Status": {
        "type": "Switch",
        "expression": "@body('Check_If_Operation_Completed')?['status']",
        "cases": {
          "completed": {
            "case": "completed",
            "actions": { "Log_Late_Success": { "type": "Compose", "inputs": "Operation completed after timeout" } }
          },
          "running": {
            "case": "running",
            "actions": { "Schedule_Status_Check": { "type": "Http", "inputs": { "...": "..." } } }
          },
          "failed": {
            "case": "failed",
            "actions": { "Handle_Confirmed_Failure": { "type": "Compose", "inputs": "Operation confirmed failed" } }
          }
        },
        "runAfter": { "Check_If_Operation_Completed": ["Succeeded"] }
      }
    },
    "runAfter": { "Try_Scope": ["TimedOut"] }
  }
}

Step 7: Dead Letter Pattern for Failed Messages

When a workflow cannot process a message after all retries, move it to a dead letter queue for later investigation and reprocessing.

Implementation

{
  "Move_To_Dead_Letter_Queue": {
    "type": "ApiConnection",
    "inputs": {
      "host": { "connection": { "name": "@parameters('$connections')['servicebus']['connectionId']" } },
      "method": "post",
      "path": "/queues/orders-dead-letter/messages",
      "body": {
        "ContentData": "@{triggerBody()?['ContentData']}",
        "ContentType": "application/json",
        "Properties": {
          "OriginalQueue": "orders",
          "FailureReason": "@{outputs('Compose_Error_Info')?['errorMessage']}",
          "FailedAt": "@{utcNow()}",
          "WorkflowRunId": "@{workflow().run.name}",
          "RetryCount": "@{coalesce(triggerBody()?['Properties']?['RetryCount'], '0')}",
          "OriginalEnqueueTime": "@{triggerBody()?['EnqueuedTimeUtc']}"
        }
      }
    },
    "runAfter": { "Compose_Error_Info": ["Succeeded"] }
  }
}

Reprocessing Dead Letters

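Create a separate Logic App that periodically checks the dead letter queue and attempts reprocessing. Peeking does not remove a message from the queue, so a production version must also receive and complete (or delete) each message it requeues; otherwise the same dead letters are requeued on every run. The skeleton below shows only the retry decision: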

{
  "triggers": {
    "Recurrence": {
      "type": "Recurrence",
      "recurrence": { "frequency": "Hour", "interval": 1 }
    }
  },
  "actions": {
    "Peek_Dead_Letters": {
      "type": "ApiConnection",
      "inputs": {
        "method": "get",
        "path": "/queues/orders-dead-letter/messages/head/peek",
        "queries": { "count": 10 }
      }
    },
    "For_Each_Dead_Letter": {
      "type": "Foreach",
      "foreach": "@body('Peek_Dead_Letters')",
      "actions": {
        "Check_If_Retryable": {
          "type": "If",
          "expression": {
            "and": [
              { "less": ["@int(items('For_Each_Dead_Letter')?['Properties']?['RetryCount'])", 3] },
              { "not": { "contains": ["@items('For_Each_Dead_Letter')?['Properties']?['FailureReason']", "ValidationFailed"] } }
            ]
          },
          "actions": {
            "Requeue_Message": {
              "type": "ApiConnection",
              "inputs": {
                "method": "post",
                "path": "/queues/orders/messages",
                "body": {
                  "ContentData": "@items('For_Each_Dead_Letter')?['ContentData']",
                  "Properties": {
                    "RetryCount": "@{add(int(coalesce(items('For_Each_Dead_Letter')?['Properties']?['RetryCount'], '0')), 1)}",
                    "ReprocessedAt": "@{utcNow()}"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
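The retry decision in Check_If_Retryable is worth testing in isolation, since a mistake here either loses messages or loops forever. The following Python sketch mirrors that condition; the function name requeue_decision is illustrative, and it treats a missing RetryCount as 0, as the Requeue_Message expression does.

```python
def requeue_decision(properties, max_retries=3):
    """Decide whether one dead-lettered message should be requeued."""
    retry_count = int(properties.get("RetryCount") or 0)
    reason = properties.get("FailureReason") or ""
    if retry_count >= max_retries:
        return {"requeue": False, "why": "retry limit reached"}
    if "ValidationFailed" in reason:
        # A malformed order will fail again no matter how often it is retried.
        return {"requeue": False, "why": "not retryable"}
    # Properties travel as strings, so the incremented count is stringified.
    return {"requeue": True, "RetryCount": str(retry_count + 1)}


transient = requeue_decision({"RetryCount": "1", "FailureReason": "503 Service Unavailable"})
exhausted = requeue_decision({"RetryCount": "3", "FailureReason": "503 Service Unavailable"})
invalid = requeue_decision({"RetryCount": "0", "FailureReason": "ValidationFailed: empty items"})

print(transient)
print(exhausted)
print(invalid)
```

Messages that are not requeued stay in the dead letter queue for manual review, which is what stops a poison message from cycling indefinitely.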

Real-World Scenarios

Scenario 1: E-Commerce Order Processing

A complete order processing workflow with full error handling:

Trigger: New message in Service Bus queue
Try: Parse order → Validate → Reserve inventory → Charge payment → Create shipping label → Send confirmation
Catch: Log error → Compensate (refund + release inventory) → Dead letter the message → Alert operations team
Finally: Update audit log → Release processing lock → Track metrics

Scenario 2: Data Synchronization Between Systems

Syncing customer data from CRM to ERP with error handling:

Trigger: Scheduled (every 15 minutes)
Try: Query CRM for changes → Transform data → Batch update ERP
Catch: Log failed records → Queue for manual review → Continue with remaining records
Finally: Update sync checkpoint → Log sync statistics

Scenario 3: Approval Workflow with Escalation

An expense approval workflow that handles timeouts:

Trigger: New expense submitted
Try: Send approval email → Wait for response (timeout: 48 hours)
On Timeout: Escalate to manager's manager → Wait again (timeout: 24 hours)
On Second Timeout: Auto-reject → Notify submitter
Finally: Update expense record status → Log decision

Monitoring and Diagnostics

Querying Failed Runs

Use Azure Monitor to track workflow failures:

// Find all failed Logic App runs in the last 24 hours
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.LOGIC"
| where Category == "WorkflowRuntime"
| where OperationName == "Microsoft.Logic/workflows/workflowRunCompleted"
| where status_s == "Failed"
| where TimeGenerated > ago(24h)
| project TimeGenerated, resource_workflowName_s, resource_runId_s, error_message_s
| order by TimeGenerated desc

// Track error patterns over time
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.LOGIC"
| where status_s == "Failed"
| where TimeGenerated > ago(7d)
| summarize FailureCount = count() by resource_workflowName_s, bin(TimeGenerated, 1h)
| render timechart

Setting Up Alerts

# Alert when failure rate exceeds threshold
az monitor metrics alert create \
  --name "logic-app-high-failure-rate" \
  --resource-group my-rg \
  --scopes "/subscriptions/{sub}/resourceGroups/my-rg/providers/Microsoft.Logic/workflows/order-processor" \
  --condition "total RunsFailed > 5" \
  --window-size 15m \
  --evaluation-frequency 5m \
  --severity 2 \
  --action ops-team

Best Practices

Practice	Why	Implementation
Always use Scopes for grouping	Enables try/catch pattern	Wrap related actions in a Scope
Configure retry policies explicitly	Default may not suit your needs	Set type, count, and intervals
Disable retries for non-idempotent actions	Prevents duplicate side effects	`"retryPolicy": { "type": "none" }`
Implement compensation logic	Maintains data consistency	Undo completed steps on failure
Use dead letter queues	Prevents message loss	Move failed messages to DLQ
Log error context	Enables debugging	Capture run ID, trigger body, failed action details
Set appropriate timeouts	Prevents indefinite waits	Use `limit.timeout` on long-running actions
Monitor failure rates	Early warning of issues	Azure Monitor alerts on RunsFailed metric
Test error paths	Ensures handling works	Deliberately trigger failures in test environment
Use the Finally pattern	Ensures cleanup always runs	Run After with all four statuses

Common Pitfalls

Forgetting Run After configuration — Your catch scope won't execute unless you explicitly set runAfter to include ["Failed", "TimedOut"]
Catching too broadly — If your catch scope itself fails, you lose visibility. Keep catch logic simple and reliable.
Not handling partial failures in parallel branches — When running actions in parallel, some may succeed while others fail. Check individual results.
Infinite retry loops — Without a maximum retry count or dead letter pattern, a poison message can trigger the workflow indefinitely.
Missing compensation for idempotent operations — Even if an action is idempotent, downstream effects (emails, notifications) may not be. Always consider the full impact.

Summary

Robust error handling in Logic Apps requires:

Scopes to group actions and enable try/catch/finally patterns
Run After configuration to control execution flow based on action status
Retry policies with exponential backoff for transient failures
Compensation logic to undo partial work when workflows fail midway
Dead letter queues to capture messages that cannot be processed
Monitoring and alerts to detect issues before they impact users
Timeout handling to prevent workflows from hanging indefinitely

The investment in proper error handling pays off immediately — fewer support tickets, faster incident resolution, and confidence that your business-critical workflows are resilient.
