01 What Is Azure API Management?
Azure API Management (APIM) is a fully managed API gateway that acts as the front door for all your backend services. It lets you publish, secure, transform, monitor, and scale APIs from one central place, whether backends are Azure Functions, Logic Apps, Kubernetes microservices, or legacy on-premises SOAP services.
02 Core Concepts & Architecture
Azure API Management is built around a layered architecture where each component serves a distinct purpose in the API lifecycle. Understanding these building blocks, from the gateway that processes live traffic to the management plane that handles configuration, is essential for designing scalable API solutions. In production, these components work together to enforce security, manage access, and provide observability across all your APIs. Mastering the relationship between APIs, operations, products, and subscriptions will help you model access control that maps cleanly to your business requirements.
| Component | Role |
|---|---|
| Gateway | Receives API calls, applies policies, routes to backend, returns response |
| Management Plane | Admin interface via Portal, REST API, ARM templates, Bicep |
| Developer Portal | Auto-generated website for docs, Try-It, subscriptions, key management |
| API | A group of operations pointing to a backend service |
| Operation | A specific HTTP method + path (e.g. POST /orders) |
| Product | A bundle of APIs with access policies and subscription requirements |
| Subscription | A named key pair granting access to a product or API |
| Backend | The upstream service APIM forwards requests to |
| Named Value | Global key-value store for policy constants and secrets |
| Policy | XML rules that run at inbound, backend, outbound, or on-error stages |
Request Pipeline
Every API call flows through the policy sections in order: Inbound → Backend → Outbound. If an error occurs in any of them, processing jumps to the On-Error section. Each section can inspect, transform, block, or redirect the request.
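The ordering can be pictured with a small Python model. This is only a sketch of the control flow described above, not APIM's implementation; `run_pipeline` and the stage functions are hypothetical names invented for the illustration.

```python
from typing import Callable, Dict, List

# A stage is any callable that inspects or mutates a shared context dict.
Stage = Callable[[Dict], None]


def run_pipeline(inbound: Stage, backend: Stage, outbound: Stage,
                 on_error: Stage, ctx: Dict) -> List[str]:
    """Run the sections in order; jump to on_error on the first exception."""
    visited: List[str] = []
    try:
        for name, stage in (("inbound", inbound),
                            ("backend", backend),
                            ("outbound", outbound)):
            visited.append(name)
            stage(ctx)
    except Exception as exc:
        # Remaining sections are skipped once an error is raised.
        ctx["last_error"] = str(exc)
        visited.append("on-error")
        on_error(ctx)
    return visited


def ok(ctx: Dict) -> None:
    """A stage that does nothing and succeeds."""


def failing_backend(ctx: Dict) -> None:
    raise RuntimeError("backend unavailable")


def set_500(ctx: Dict) -> None:
    ctx["status"] = 500


happy_path = run_pipeline(ok, ok, ok, set_500, {})
error_ctx: Dict = {}
error_path = run_pipeline(ok, failing_backend, ok, set_500, error_ctx)
```

In the failing run the outbound stage never executes, which is why response clean-up that must always happen is usually repeated in the on-error section.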
03 Tiers: Consumption to Premium
Choosing the right APIM tier directly impacts your cost, availability, and feature set, and changing tiers later can involve downtime or migration effort. Each tier is designed for a specific workload profile, from serverless pay-per-call for prototyping to Premium with multi-region deployment and full VNet isolation. In production, the tier determines your SLA guarantee, whether you can use private networking, and how much built-in cache capacity you get. Evaluate your traffic volume, latency requirements, and network security needs before committing to a tier.
| Feature | Consumption | Developer | Basic v2 | Standard v2 | Premium |
|---|---|---|---|---|---|
| Pricing | Per-call | Fixed/mo | Fixed/mo | Fixed/mo | Fixed/mo |
| SLA | None | None | 99.95% | 99.95% | 99.99% |
| VNet Integration | No | External | Inject | Inject | Internal |
| Multi-region | No | No | No | No | Yes |
| Availability Zones | No | No | No | No | Yes |
| Built-in Cache | No | Yes | Yes | Yes | Yes |
| Self-hosted GW | No | Yes | No | No | Yes |
| Developer Portal | No | Yes | Yes | Yes | Yes |
| Price (approx) | $3.50/1M calls | $50/mo | $150/mo | $700/mo | $3,000/mo |
04 Creating & Importing APIs
APIM supports importing APIs from a wide variety of sources, making it easy to onboard existing services regardless of their technology stack. Whether you have an OpenAPI spec, a WSDL file from a legacy SOAP service, or a live Azure Function App, APIM can auto-discover operations and generate the gateway configuration for you. In production, importing from a spec file ensures your API contract is version-controlled and reproducible across environments. Always prefer spec-based imports over manual operation definitions to keep your CI/CD pipeline clean and your API documentation accurate.
| Method | Description |
|---|---|
| OpenAPI / Swagger (.json/.yaml) | Import full API spec; auto-generates all operations |
| WSDL | Import legacy SOAP services; APIM wraps as REST |
| Azure Function App | Auto-discovers all HTTP-triggered functions |
| Logic App | Auto-discovers HTTP Request triggers |
| App Service / Container App | Import from live endpoint or OpenAPI URL |
| GraphQL Schema | Import .graphql schema and set resolvers |
| gRPC (.proto) | Import protobuf definition |
| Manual HTTP | Define operations by hand for any custom backend |
Import via CLI
# Import from OpenAPI spec file
az apim api import \
--resource-group myRG --service-name myAPIM \
--path "orders" --api-id "orders-api" \
--specification-format OpenApiJson \
--specification-path ./openapi.json \
--display-name "Orders API" --protocols https
# Import from live Function App swagger endpoint
az apim api import \
--resource-group myRG --service-name myAPIM \
--path "functions" --api-id "func-api" \
--specification-format OpenApiJson \
--specification-url "https://myfunc.azurewebsites.net/api/swagger.json?code=<key>"
05 Policy Engine: Deep Dive
Policies are the core power of APIM: XML-based rules applied at 4 pipeline stages. They are inherited from Global → Product → API → Operation scope, with <base /> controlling parent inheritance.
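How `<base />` splices parent policies into a child scope can be sketched in a few lines of Python. This is a simplified model under two assumptions: each scope is a flat list of policy names, and every scope has an explicit policy. `effective_policies` is a hypothetical helper, not an APIM API.

```python
from typing import List

BASE = "<base />"


def effective_policies(scopes: List[List[str]]) -> List[str]:
    """Expand <base /> scope by scope.

    `scopes` is ordered outermost (Global) to innermost (Operation).
    Wherever a scope contains BASE, the parent's effective list is
    spliced in at that position; a scope without BASE drops its parents.
    """
    effective: List[str] = []
    for scope in scopes:
        expanded: List[str] = []
        for statement in scope:
            if statement == BASE:
                expanded.extend(effective)
            else:
                expanded.append(statement)
        effective = expanded
    return effective


with_base = effective_policies([
    ["ip-filter"],                # Global
    [BASE, "rate-limit"],         # Product
    [BASE, "validate-jwt"],       # API
    ["cache-lookup", BASE],       # Operation (runs its own policy first)
])

without_base_at_api = effective_policies([
    ["ip-filter"],
    [BASE, "rate-limit"],
    ["validate-jwt"],             # API scope forgot <base />
    ["cache-lookup", BASE],
])
```

The second call shows the practical risk: omitting `<base />` at one scope silently removes the global IP filter and the product rate limit for everything beneath it.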
Policy Structure
<policies>
<inbound>
<base /> <!-- Inherit parent scope policies; always include this -->
<rate-limit calls="100" renewal-period="60" />
<validate-jwt header-name="Authorization" ... />
</inbound>
<backend>
<base />
<forward-request timeout="30" />
</backend>
<outbound>
<base />
<set-header name="X-Powered-By" exists-action="delete" />
</outbound>
<on-error>
<base />
<return-response>
<set-status code="500" reason="Server Error" />
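<!-- The whole body must be one expression; @(...) embedded in literal text is not evaluated -->
<set-body>@(new JObject(new JProperty("error", context.LastError.Message)).ToString())</set-body>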
</return-response>
</on-error>
</policies>
Policy Categories
| Category | Key Policies |
|---|---|
| Access restriction | rate-limit, quota, ip-filter, validate-jwt, check-header |
| Authentication | authentication-basic, authentication-certificate, authentication-managed-identity |
| Caching | cache-lookup, cache-store, cache-lookup-value, cache-store-value |
| Transformation | set-body, set-header, set-query-parameter, json-to-xml, xml-to-json, rewrite-uri |
| Routing | set-backend-service, forward-request, send-request, return-response |
| Logging | log-to-eventhub, emit-metric, trace |
| Validation | validate-content, validate-headers, validate-parameters, validate-status-code |
| Advanced | choose (if/else), retry, wait, mock-response |
06 Inbound, Backend, Outbound Policies
The three pipeline stages (inbound, backend, and outbound) give you fine-grained control over every aspect of the API request lifecycle. Inbound policies handle authentication, rate limiting, and request transformation before the call reaches your backend. Backend policies control how the request is forwarded, including retry logic and timeout settings. Outbound policies let you strip sensitive headers, add correlation IDs, and transform responses before they reach the client, which is essential for maintaining a consistent, secure API surface in production.
JWT Validation
<inbound>
<base />
<validate-jwt header-name="Authorization"
failed-validation-httpcode="401"
failed-validation-error-message="Unauthorized"
require-scheme="Bearer">
<openid-config url="https://login.microsoftonline.com/{tenant}/v2.0/.well-known/openid-configuration" />
<audiences><audience>api://my-app-id</audience></audiences>
<required-claims>
<claim name="scp" match="any" separator=" ">
<value>Orders.Read</value>
<value>Orders.Write</value>
</claim>
</required-claims>
</validate-jwt>
</inbound>
CORS Policy
<inbound>
<base />
<cors allow-credentials="true">
<allowed-origins>
<origin>https://myapp.com</origin>
<origin>https://staging.myapp.com</origin>
</allowed-origins>
<allowed-methods>
<method>GET</method><method>POST</method>
<method>PUT</method><method>DELETE</method>
</allowed-methods>
<allowed-headers>
<header>Authorization</header>
<header>Content-Type</header>
</allowed-headers>
</cors>
</inbound>
Retry Backend on Failure
<backend>
<retry condition="@(context.Response.StatusCode >= 500)"
count="3" interval="2" max-interval="10"
delta="1" first-fast-retry="true">
<forward-request timeout="30" />
</retry>
</backend>
Remove Internal Headers + Add Correlation ID
<outbound>
<base />
<set-header name="X-Powered-By" exists-action="delete" />
<set-header name="Server" exists-action="delete" />
<set-header name="X-Request-Id" exists-action="override">
<value>@(context.RequestId.ToString())</value>
</set-header>
<set-header name="X-Processing-Time" exists-action="override">
<value>@(context.Elapsed.TotalMilliseconds.ToString())</value>
</set-header>
</outbound>
Policy Expressions Reference
// Request info
context.Request.Method // GET, POST...
context.Request.Url.ToString() // Full URL
context.Request.IpAddress // Client IP
context.Request.MatchedParameters["id"] // Route param
context.Request.Url.Query.GetValueOrDefault("v", "v1") // Query param
// Body
context.Request.Body.As<JObject>(preserveContent: true)
context.Response.Body.As<string>()
// JWT claim
context.Request.Headers["Authorization"][0]
.Split(' ')[1].AsJwt()?.Claims["sub"].FirstOrDefault()
// Auth context
context.User.Id // Authenticated user ID
context.Subscription.Id // Subscription key ID
context.Product.Id // Product ID
// Utility
Guid.NewGuid().ToString()
DateTime.UtcNow.ToString("o")
Convert.ToBase64String(Encoding.UTF8.GetBytes("hello"))
07 Authentication & Authorization
APIM provides multiple authentication mechanisms that can be layered together for defense-in-depth security. In production, most enterprises combine subscription keys for consumer identification with JWT/OAuth2 validation for actual authorization, ensuring both "who is calling" and "what are they allowed to do" are enforced at the gateway. Managed Identity eliminates the need to store secrets for backend-to-backend communication, while client certificates provide the highest assurance for partner integrations. Choose your auth strategy based on your threat model: public APIs need JWT + rate limiting, internal APIs benefit from mTLS + VNet isolation.
| Method | Type | Use Case |
|---|---|---|
| Subscription Key (API Key) | Ocp-Apim-Subscription-Key header | Simple consumer identification |
| JWT / OAuth2 / OpenID Connect | Bearer token from Entra ID | Enterprise standard |
| Client Certificate (mTLS) | X.509 certificate | Partner integrations, highest assurance |
| Managed Identity → Backend | APIM to backend auth, no secrets | Recommended for backends |
| IP Allow/Block list | Network-level filtering | Known partner IPs |
Managed Identity → Backend Auth
<inbound>
<base />
<!-- Get token for any Azure resource -->
<authentication-managed-identity
resource="https://servicebus.windows.net"
output-token-variable-name="sbToken" />
<set-header name="Authorization" exists-action="override">
<value>@("Bearer " + (string)context.Variables["sbToken"])</value>
</set-header>
</inbound>
Client Certificate Validation
<inbound>
<base />
<choose>
<when condition="@(context.Request.Certificate == null
|| context.Request.Certificate.Thumbprint != "{{cert-thumbprint}}")">
<return-response>
<set-status code="403" reason="Forbidden" />
<set-body>{"error": "Valid client certificate required"}</set-body>
</return-response>
</when>
</choose>
</inbound>
08 Subscriptions & Products
Products bundle APIs with access policies. Developers typically subscribe to products rather than to individual APIs, although a subscription can also be scoped to a single API or to all APIs. Each subscription provides a primary and a secondary key.
| Subscription Scope | Description |
|---|---|
| Product (most common) | Access to all APIs in a product; use for tiered access |
| API | Access to a specific API only |
| All APIs | Access to everything; for admin use only |
Pass the key in the Ocp-Apim-Subscription-Key header, or as the query parameter ?subscription-key=<key>.
09 Rate Limiting & Throttling
Rate limiting and throttling protect your backend services from being overwhelmed by excessive traffic, whether from a misbehaving client, a DDoS attempt, or an unexpected traffic spike. APIM offers both short-term rate limits (calls per second/minute) and long-term quotas (calls per month) that can be scoped per subscription, per IP address, or per authenticated user. In production, always set rate limits even on internal APIs to prevent cascade failures, and use the Retry-After header to help well-behaved clients back off gracefully. Combine rate limiting with quotas to enforce both burst protection and fair-use policies across your API consumer tiers.
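The idea of a per-key counter with a renewal period and a Retry-After hint can be sketched in Python. This is a fixed-window approximation written for illustration; it is not a description of the algorithm APIM runs internally, and `FixedWindowLimiter` is a hypothetical class.

```python
import math
from typing import Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `calls` per `renewal_period` seconds for each key."""

    def __init__(self, calls: int, renewal_period: int) -> None:
        self.calls = calls
        self.renewal_period = renewal_period
        # key -> (window start time, calls counted in that window)
        self._windows: Dict[str, Tuple[float, int]] = {}

    def check(self, key: str, now: float) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one call at time `now`."""
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.renewal_period:
            start, count = now, 0          # window expired: start a new one
        if count >= self.calls:
            retry_after = math.ceil(self.renewal_period - (now - start))
            return False, max(retry_after, 1)
        self._windows[key] = (start, count + 1)
        return True, 0


# Two calls per 60 seconds, counted separately per client IP.
limiter = FixedWindowLimiter(calls=2, renewal_period=60)
first = limiter.check("203.0.113.7", now=0.0)
second = limiter.check("203.0.113.7", now=1.0)
third = limiter.check("203.0.113.7", now=10.0)    # over the limit
other = limiter.check("198.51.100.9", now=10.0)   # different key, own counter
later = limiter.check("203.0.113.7", now=60.0)    # window has renewed
```

The key passed to `check` plays the role of a counter key: using the client IP, the subscription ID, or a JWT subject changes who shares a budget without changing the counting logic.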
Rate Limit (per subscription)
<!-- 100 calls per 60 seconds per subscription key -->
<rate-limit calls="100" renewal-period="60"
remaining-calls-header-name="X-RateLimit-Remaining"
retry-after-header-name="Retry-After" />
Rate Limit by Key (per IP / per user)
<!-- Per client IP -->
<rate-limit-by-key calls="50" renewal-period="60"
counter-key="@(context.Request.IpAddress)" />
<!-- Per JWT subject claim (user-level) -->
<rate-limit-by-key calls="200" renewal-period="3600"
counter-key="@(context.Request.Headers["Authorization"][0]
.Split(' ')[1].AsJwt()?.Claims["sub"].FirstOrDefault())" />
Quota (long-period limits)
<!-- 10,000 calls per 30 days (2,592,000 seconds) per subscription -->
<quota-by-key calls="10000" renewal-period="2592000"
counter-key="@(context.Subscription?.Id)" />
10 Caching
Caching at the API gateway layer dramatically reduces backend load and improves response latency for frequently requested data. APIM supports both full response caching (store the entire HTTP response) and fragment caching (store individual values like tokens or lookup results). In production, cache GET responses for reference data, product catalogs, or configuration endpoints that don't change every second; even a 30-second cache can reduce backend calls by 90% under high traffic. Use fragment caching to store OAuth tokens or expensive computation results, and always set cache duration shorter than your data's acceptable staleness window.
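A minimal Python sketch of response caching with a duration and "vary by" dimensions may help make the staleness trade-off concrete. It assumes the cache key is built only from the path plus the listed headers and query parameters; `ResponseCache` is a hypothetical class, not part of APIM.

```python
from typing import Dict, Mapping, Optional, Sequence, Tuple


class ResponseCache:
    """Cache response bodies for `duration` seconds, keyed by selected inputs."""

    def __init__(self, duration: int,
                 vary_by_header: Sequence[str] = ("Accept",),
                 vary_by_query: Sequence[str] = ("category",)) -> None:
        self.duration = duration
        self.vary_by_header = tuple(vary_by_header)
        self.vary_by_query = tuple(vary_by_query)
        self._entries: Dict[Tuple, Tuple[float, str]] = {}

    def _key(self, path: str, headers: Mapping[str, str],
             query: Mapping[str, str]) -> Tuple:
        # Only the listed headers and query parameters influence the key.
        return (path,
                tuple(headers.get(h, "") for h in self.vary_by_header),
                tuple(query.get(q, "") for q in self.vary_by_query))

    def store(self, path: str, headers: Mapping[str, str],
              query: Mapping[str, str], body: str, now: float) -> None:
        self._entries[self._key(path, headers, query)] = (now, body)

    def lookup(self, path: str, headers: Mapping[str, str],
               query: Mapping[str, str], now: float) -> Optional[str]:
        entry = self._entries.get(self._key(path, headers, query))
        if entry is None:
            return None
        stored_at, body = entry
        if now - stored_at >= self.duration:
            return None                     # entry is older than the duration
        return body


cache = ResponseCache(duration=300)
json_headers = {"Accept": "application/json"}
cache.store("/products", json_headers, {"category": "books"}, "[books]", now=0.0)

hit = cache.lookup("/products", json_headers, {"category": "books"}, now=100.0)
ignored_param = cache.lookup("/products", json_headers,
                             {"category": "books", "page": "2"}, now=100.0)
other_category = cache.lookup("/products", json_headers, {"category": "toys"}, now=100.0)
expired = cache.lookup("/products", json_headers, {"category": "books"}, now=300.0)
```

The `ignored_param` lookup is the case to watch: a query parameter that changes the response but is not part of the key would be served the wrong cached body.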
Cache GET Response (5 minutes)
<!-- In inbound section -->
<cache-lookup vary-by-developer="false"
vary-by-developer-groups="false"
allow-private-response-caching="false">
<vary-by-header>Accept</vary-by-header>
<vary-by-query-parameter>category</vary-by-query-parameter>
</cache-lookup>
<!-- In outbound section -->
<cache-store duration="300" />
Fragment Cache (Token Caching)
<inbound>
<base />
<!-- Try to get cached token -->
<cache-lookup-value key="backend-token" variable-name="token" />
<choose>
<when condition="@(!context.Variables.ContainsKey("token"))">
<!-- Fetch new token from identity provider -->
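<send-request mode="new" response-variable-name="tokenResp" timeout="10">
<set-url>https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token</set-url>
<set-method>POST</set-method>
<!-- The token endpoint expects a form-encoded body -->
<set-header name="Content-Type" exists-action="override">
<value>application/x-www-form-urlencoded</value>
</set-header>
<set-body>grant_type=client_credentials&client_id={{cid}}&client_secret={{csec}}&scope=...</set-body>
</send-request>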
<set-variable name="token"
value="@(((IResponse)context.Variables["tokenResp"]).Body.As<JObject>()["access_token"].ToString())" />
<!-- Cache for 58 minutes (token valid 60min) -->
<cache-store-value key="backend-token"
value="@((string)context.Variables["token"])" duration="3480" />
</when>
</choose>
<set-header name="Authorization" exists-action="override">
<value>@("Bearer " + (string)context.Variables["token"])</value>
</set-header>
</inbound>
11 Versioning & Revisions
Versioning and revisions are two complementary mechanisms that let you evolve your APIs without breaking existing consumers. Versions handle breaking changes by running multiple API versions side-by-side (v1 and v2 coexist), while revisions let you safely test non-breaking changes before making them live. In production, use URL-path versioning for public APIs since it's the most discoverable, and use revisions to stage policy changes or add new operations without affecting current traffic. A disciplined versioning strategy prevents the "big bang migration" problem where all consumers must update simultaneously.
Versions: for breaking changes. Multiple versions coexist, for example /v1/orders and /v2/orders. The version can be selected via URL path, query string, or header.
Revisions: for non-breaking changes. Safely test changes before making them current. A non-current revision is reached by appending ;rev=2 to the API URL path (for example /orders;rev=2). Only one revision is live at a time.
| Versioning Scheme | Example |
|---|---|
| URL path (most common) | /v1/orders, /v2/orders |
| Query string | /orders?api-version=2025-01-01 |
| Header | Api-Version: 2025-01-01 |
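The three schemes in the table differ only in where the version string is read from. A small Python sketch, assuming the version is the first path segment, an `api-version` query parameter, or an `Api-Version` header as in the examples above (`resolve_version` is a hypothetical helper):

```python
import re
from typing import Mapping, Optional
from urllib.parse import parse_qs, urlsplit


def resolve_version(scheme: str, url: str,
                    headers: Mapping[str, str]) -> Optional[str]:
    """Extract the requested API version according to the versioning scheme."""
    parts = urlsplit(url)
    if scheme == "path":
        match = re.match(r"^/(v\d+)(/|$)", parts.path)
        return match.group(1) if match else None
    if scheme == "query":
        values = parse_qs(parts.query).get("api-version")
        return values[0] if values else None
    if scheme == "header":
        return headers.get("Api-Version")
    raise ValueError(f"unknown versioning scheme: {scheme}")


from_path = resolve_version("path", "https://api.example.com/v2/orders", {})
from_query = resolve_version(
    "query", "https://api.example.com/orders?api-version=2025-01-01", {})
from_header = resolve_version(
    "header", "https://api.example.com/orders", {"Api-Version": "2025-01-01"})
missing = resolve_version("path", "https://api.example.com/orders", {})
```

Path versioning is visible in every logged URL, which is one reason it is often preferred for public APIs; header versioning keeps URLs stable but is easier for clients to forget.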
12 Developer Portal
An auto-generated, fully customizable website where API consumers can browse interactive docs, test APIs in a Try-It console, request subscriptions, and manage their keys.
13 Backends & Load Balancing
APIM's backend entity lets you define upstream services with advanced traffic management capabilities including weighted load balancing, priority-based failover, and circuit breakers. In production, backend pools distribute traffic across multiple instances or regions to improve availability and reduce latency for geographically distributed users. Circuit breakers automatically stop forwarding requests to unhealthy backends, preventing cascade failures from propagating through your system. Configure priority levels so that traffic fails over to secondary regions only when primary backends are down, and use weighted routing for gradual rollouts of new backend versions.
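Priority-based failover combined with weighted routing can be sketched as follows. The model assumes that a lower priority number is preferred and that weights only apply within the best available priority group; `pick_backend` and the health set are hypothetical, and the backend IDs are illustrative.

```python
import random
from typing import Dict, List, Set

POOL: List[Dict] = [
    {"id": "/backends/orders-east", "priority": 1, "weight": 50},
    {"id": "/backends/orders-west", "priority": 1, "weight": 50},
    {"id": "/backends/orders-fallback", "priority": 2, "weight": 100},
]


def pick_backend(services: List[Dict], unhealthy: Set[str],
                 rng: random.Random) -> str:
    """Choose a backend: best healthy priority group first, then by weight."""
    healthy = [s for s in services if s["id"] not in unhealthy]
    if not healthy:
        raise RuntimeError("no healthy backend in pool")
    best_priority = min(s["priority"] for s in healthy)
    group = [s for s in healthy if s["priority"] == best_priority]
    ids = [s["id"] for s in group]
    weights = [s["weight"] for s in group]
    return rng.choices(ids, weights=weights, k=1)[0]


rng = random.Random(0)
normal_picks = {pick_backend(POOL, set(), rng) for _ in range(200)}
failover_pick = pick_backend(
    POOL, {"/backends/orders-east", "/backends/orders-west"}, rng)
```

For a gradual rollout, the same function works with weights such as 90 and 10 inside one priority group; the fallback backend stays idle until every higher-priority backend is marked unhealthy.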
Backend Pool with Load Balancing
{
"type": "Pool",
"pool": {
"services": [
{ "id": "/backends/orders-east", "priority": 1, "weight": 50 },
{ "id": "/backends/orders-west", "priority": 1, "weight": 50 },
{ "id": "/backends/orders-fallback", "priority": 2, "weight": 100 }
]
},
"loadBalancing": { "algorithm": "RoundRobin" }
}
Circuit Breaker
{
"circuitBreaker": {
"rules": [{
"failureCondition": { "count": 5, "interval": "PT10S" },
"name": "BreakerRule",
"tripDuration": "PT60S"
}]
}
}
14 Named Values & Key Vault
Named Values are global constants available in all policies via {{my-named-value}} syntax. Secrets should be stored in Key Vault and referenced, not hardcoded.
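The double-brace syntax behaves like a template substitution applied to the policy text. A rough Python sketch, assuming names consist of letters, digits, periods, dashes, and underscores (`substitute` is a hypothetical helper, and the sample policy line is illustrative):

```python
import re
from typing import Dict, Match

# {{name}} where name uses letters, digits, '.', '-' or '_'
_PLACEHOLDER = re.compile(r"\{\{([A-Za-z0-9._-]+)\}\}")


def substitute(policy_text: str, named_values: Dict[str, str]) -> str:
    """Replace every {{name}} in the policy text with its named value."""
    def replace(match: Match[str]) -> str:
        name = match.group(1)
        if name not in named_values:
            raise KeyError(f"named value not defined: {name}")
        return named_values[name]

    return _PLACEHOLDER.sub(replace, policy_text)


rendered = substitute(
    '<set-backend-service base-url="{{backend-url}}" />',
    {"backend-url": "https://api.mybackend.com"},
)
```

Failing loudly on an undefined name mirrors what you want in a deployment pipeline: a policy that references a missing value should be rejected before it reaches the gateway.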
# Create plain Named Value
# (display names may contain only letters, digits, periods, dashes, and underscores)
az apim nv create \
--resource-group myRG --service-name myAPIM \
--named-value-id "backend-url" \
--display-name "backend-url" \
--value "https://api.mybackend.com"
# Create secret Named Value (masked in portal)
az apim nv create \
--resource-group myRG --service-name myAPIM \
--named-value-id "api-secret-key" \
--display-name "api-secret-key" \
--value "s3cr3t!" --secret true
# Reference Key Vault secret (Managed Identity required)
az apim nv create \
--resource-group myRG --service-name myAPIM \
--named-value-id "jwt-signing-key" \
--display-name "jwt-signing-key" \
--key-vault-secret-identifier \
https://myKeyVault.vault.azure.net/secrets/jwt-key
15 Managed Identity
APIM's Managed Identity lets it authenticate to other Azure services (Key Vault, Service Bus, Storage, Function Apps) with zero secrets in configuration.
# Enable System Assigned Identity on APIM
az apim update \
--name myAPIM --resource-group myRG \
--enable-managed-identity true
PRINCIPAL_ID=$(az apim show \
--name myAPIM --resource-group myRG \
--query identity.principalId -o tsv)
# Grant Key Vault access
# (scopes are quoted so the <...> placeholders are not parsed as shell redirection)
az role assignment create --assignee "$PRINCIPAL_ID" \
--role "Key Vault Secrets User" \
--scope "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.KeyVault/vaults/<vault>"
# Grant Service Bus Send
az role assignment create --assignee "$PRINCIPAL_ID" \
--role "Azure Service Bus Data Sender" \
--scope "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.ServiceBus/namespaces/<namespace>"
# Grant Storage Blob
az role assignment create --assignee "$PRINCIPAL_ID" \
--role "Storage Blob Data Contributor" \
--scope "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>"
16 Function Apps Integration
Azure Functions and APIM are a natural pairing: Functions provide the serverless compute for your API logic while APIM handles cross-cutting concerns like authentication, rate limiting, and documentation. APIM can auto-discover all HTTP-triggered functions in a Function App and import them as operations, saving you from manual configuration. In production, always secure the Function App so it only accepts traffic from APIM (using Managed Identity or restricting to the APIM VNet), preventing consumers from bypassing your gateway policies. This pattern gives you the cost efficiency of serverless with the enterprise governance of a managed API gateway.
Import Function App as API
az apim api import \
--resource-group myRG --service-name myAPIM \
--path "functions" --api-id "my-functions-api" \
--specification-format OpenApiJson \
--specification-url "https://myfunc.azurewebsites.net/api/swagger.json?code=<key>"
Secure APIM → Function App
<backend>
<set-header name="x-functions-key"
exists-action="override">
<value>{{function-app-key}}</value>
</set-header>
<forward-request />
</backend>
<!-- Alternative: authenticate with Managed Identity instead of a function key -->
<inbound>
<authentication-managed-identity
resource="api://my-function-app-id"
output-token-variable-name="t" />
<set-header name="Authorization"
exists-action="override">
<value>@("Bearer "+(string)context.Variables["t"])</value>
</set-header>
</inbound>
17 Logic Apps Integration
Import Logic App HTTP Triggers as APIM operations: clients use your APIM URL, while the actual Logic App URL stays hidden. Useful for wrapping complex workflows behind a clean API.
<backend>
<set-header name="Authorization" exists-action="override">
<value>@("Bearer " + (string)context.Variables["laToken"])</value>
</set-header>
<!-- Logic Apps may take longer; increase timeout -->
<forward-request timeout="120" />
</backend>
Set the forward-request timeout accordingly, and consider returning 202 Accepted with a polling URL for very long-running workflows.
18 Service Bus Integration
APIM can publish messages to Service Bus queues and topics directly from a policy, turning a synchronous HTTP call into an async message with an immediate 202 response.
<inbound>
<base />
<authentication-managed-identity
resource="https://servicebus.windows.net"
output-token-variable-name="sbToken" />
<send-request mode="new"
response-variable-name="sbResult" timeout="30">
<set-url>https://{{sb-namespace}}.servicebus.windows.net/{{queue}}/messages</set-url>
<set-method>POST</set-method>
<set-header name="Authorization" exists-action="override">
<value>@("Bearer " + (string)context.Variables["sbToken"])</value>
</set-header>
<set-header name="Content-Type" exists-action="override">
<value>application/json</value>
</set-header>
<set-header name="BrokerProperties" exists-action="override">
<value>@{
return Newtonsoft.Json.JsonConvert.SerializeObject(new {
MessageId = context.RequestId.ToString(),
Label = "OrderCreated"
});
}</value>
</set-header>
<set-body>@(context.Request.Body.As<string>())</set-body>
</send-request>
<return-response>
<set-status code="202" reason="Accepted" />
<!-- Build the JSON with JObject: literal double braces would be read as a named-value reference -->
<set-body>@(new JObject(new JProperty("requestId", context.RequestId.ToString())).ToString())</set-body>
</return-response>
</inbound>
19 Storage Integration
APIM can interact directly with Azure Storage (uploading blobs, reading from queues, or accessing table data), all from within a policy, without needing a separate backend service. This is powerful for scenarios like file upload APIs where APIM validates the request, authenticates the caller, and streams the payload directly to Blob Storage using Managed Identity. In production, this eliminates an entire microservice layer for simple storage operations while still enforcing rate limits, file size validation, and access control at the gateway. Use this pattern for document upload endpoints, static asset serving, or audit log storage where a full backend would be overkill.
Upload Blob via APIM
<inbound>
<base />
<authentication-managed-identity
resource="https://storage.azure.com/"
output-token-variable-name="storageToken" />
<send-request mode="new"
response-variable-name="uploadResult" timeout="60">
<set-url>@($"https://{{account}}.blob.core.windows.net/uploads/{context.Request.MatchedParameters["filename"]}")</set-url>
<set-method>PUT</set-method>
<set-header name="Authorization" exists-action="override">
<value>@("Bearer " + (string)context.Variables["storageToken"])</value>
</set-header>
<set-header name="x-ms-blob-type" exists-action="override">
<value>BlockBlob</value>
</set-header>
<set-header name="x-ms-version" exists-action="override">
<value>2023-01-03</value>
</set-header>
<set-body>@(context.Request.Body.As<string>())</set-body>
</send-request>
<return-response response-variable-name="uploadResult" />
</inbound>
20 Entra ID / Azure AD Integration
Microsoft Entra ID (formerly Azure AD) is the enterprise identity provider that powers OAuth2/OpenID Connect authentication for APIM-protected APIs. By registering your API in Entra ID, you define scopes and app roles that map to business permissions; APIM's validate-jwt policy then enforces these at the gateway before requests ever reach your backend. In production, this pattern lets you centralize identity governance: one app registration defines who can access what, and APIM enforces it consistently across all operations. Extract claims from validated tokens and forward them as headers to your backend, so downstream services know the caller's identity without re-validating the token themselves.
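The claim handling described here (read the token payload, check for a required scope, forward identity claims as headers) can be sketched in Python. The sketch deliberately skips signature validation, which the gateway must do first; the helper names and the demo token are hypothetical and exist only for illustration.

```python
import base64
import json
from typing import Dict, Iterable


def _b64url_decode(segment: str) -> bytes:
    """Decode base64url text that may have had its '=' padding stripped."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def make_demo_token(claims: Dict) -> str:
    """Build an UNSIGNED token for the demo only; never accept such tokens."""
    header = _b64url_encode(json.dumps({"alg": "none"}).encode("utf-8"))
    payload = _b64url_encode(json.dumps(claims).encode("utf-8"))
    return f"{header}.{payload}.demo-signature"


def claims_from_bearer(authorization: str) -> Dict:
    """Read the payload claims from an 'Authorization: Bearer ...' value."""
    scheme, _, token = authorization.partition(" ")
    if scheme != "Bearer" or token.count(".") != 2:
        raise ValueError("expected 'Bearer <header>.<payload>.<signature>'")
    return json.loads(_b64url_decode(token.split(".")[1]))


def has_any_scope(claims: Dict, required: Iterable[str]) -> bool:
    """True if the space-separated 'scp' claim contains any required scope."""
    granted = set(str(claims.get("scp", "")).split(" "))
    return bool(granted & set(required))


def identity_headers(claims: Dict) -> Dict[str, str]:
    """Map identity claims to the headers forwarded to the backend."""
    headers: Dict[str, str] = {}
    if "oid" in claims:
        headers["X-User-Id"] = claims["oid"]
    if "email" in claims:
        headers["X-User-Email"] = claims["email"]
    return headers


demo_token = make_demo_token(
    {"oid": "1234", "email": "a@example.com", "scp": "Orders.Read Profile.Read"})
demo_claims = claims_from_bearer("Bearer " + demo_token)
forwarded = identity_headers(demo_claims)
```

Because the backend trusts these headers, it should only be reachable through the gateway; otherwise a caller could set `X-User-Id` directly.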
App Registration Setup
APIM API App Registration:
Application ID URI: api://apim-orders-api
Expose scopes:
- Orders.Read
- Orders.Write
- Admin.Full
App roles:
- Viewer, Editor, Admin
Client App Registration:
API Permissions (delegated):
- api://apim-orders-api/Orders.Read
API Permissions (application):
- api://apim-orders-api/Orders.Write
Extract Claims & Forward to Backend
<inbound>
<base />
<validate-jwt header-name="Authorization" ... />
<!-- Extract user claims and pass to backend as headers -->
<set-header name="X-User-Id" exists-action="override">
<value>@(context.Request.Headers["Authorization"][0]
.Split(' ')[1].AsJwt()?.Claims["oid"].FirstOrDefault())</value>
</set-header>
<set-header name="X-User-Email" exists-action="override">
<value>@(context.Request.Headers["Authorization"][0]
.Split(' ')[1].AsJwt()?.Claims["email"].FirstOrDefault())</value>
</set-header>
</inbound>
21 Other Azure Services
APIM integrates with a broad ecosystem of Azure services beyond the core compute and messaging platforms, enabling you to build comprehensive API solutions without custom glue code. From Application Insights for full distributed tracing to Event Hubs for real-time API call streaming, these integrations turn APIM into a central observability and governance hub. In production, connecting APIM to Azure OpenAI lets you manage AI model access with token-based rate limiting and cost attribution per consumer. Use these integrations to extend APIM's capabilities: route traffic to AKS clusters, stream analytics to Power BI, or catalog your APIs in API Center for organization-wide discoverability.
22 Self-Hosted Gateway
Deploy the APIM gateway container on-premises or in any Kubernetes cluster while keeping the management plane in Azure. Perfect for hybrid and edge scenarios.
# Kubernetes Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: apim-gateway
spec:
  replicas: 2
  # apps/v1 Deployments need a selector matching the pod template labels
  # (the "app: apim-gateway" label is an illustrative choice)
  selector:
    matchLabels:
      app: apim-gateway
  template:
    metadata:
      labels:
        app: apim-gateway
    spec:
      containers:
        - name: apim-gateway
          image: mcr.microsoft.com/azure-api-management/gateway:latest
          env:
            - name: config.service.endpoint
              value: "https://myAPIM.management.azure-api.net/subscriptions/.../gateways/my-gw?api-version=2021-08-01"
            - name: config.service.auth
              value: "GatewayKey <your-gateway-token>"
23 Monitoring & Analytics
Effective monitoring is critical for maintaining API reliability and understanding how consumers use your services. APIM provides multiple layers of observability, from built-in analytics dashboards showing top consumers and error rates to deep Application Insights integration with full distributed tracing across your entire request pipeline. In production, set up alerts on P95 latency spikes and 5xx error rate thresholds to catch issues before they impact users. Use the log-to-eventhub policy to stream every API call to your SIEM or analytics platform for compliance auditing and real-time anomaly detection.
| Source | What You Get |
|---|---|
| APIM Built-in Analytics | Total requests, failed requests, latency P50/P95/P99, top consumers |
| Application Insights | Full request traces, dependencies, exceptions, custom metrics |
| Azure Monitor Metrics | Requests, backend duration, capacity, gateway errors |
| Event Hubs (via policy) | Real-time streaming of every API call for SIEM/analytics |
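For readers who want to see what a P95 alert actually computes, here is a small Python sketch using the nearest-rank definition of a percentile. Monitoring backends may use approximate or interpolated percentiles, so treat the exact numbers as illustrative; `percentile` and `latency_alert` are hypothetical helpers.

```python
import math
from typing import List, Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: smallest value with at least p% at or below it."""
    if not values:
        raise ValueError("percentile of empty data")
    ordered: List[float] = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_alert(durations_ms: Sequence[float], threshold_ms: float) -> bool:
    """Fire when the P95 latency exceeds the threshold."""
    return percentile(durations_ms, 95) > threshold_ms


sample = [120, 80, 95, 400, 110]      # request durations in milliseconds
p50 = percentile(sample, 50)
p95 = percentile(sample, 95)
should_alert = latency_alert(sample, threshold_ms=250)
```

With only five samples a single slow request dominates P95, which is why such alerts are normally evaluated over a window containing many requests.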
KQL: APIM Latency by Operation
requests
| where cloud_RoleName == "my-apim-name"
| where timestamp > ago(1h)
| summarize
p50 = percentile(duration, 50),
p95 = percentile(duration, 95),
p99 = percentile(duration, 99)
by name
| order by p99 desc
Emit Custom Business Metric
<outbound>
<emit-metric name="OrderAmount" namespace="BusinessMetrics"
value="@(double.Parse(context.Response.Body
.As<JObject>()["amount"].ToString()))">
<dimension name="Region"
value="@(context.Request.Headers
.GetValueOrDefault("X-Region","unknown"))" />
</emit-metric>
</outbound>
24 CI/CD & DevOps
Treating your APIM configuration as code is essential for reliable, repeatable deployments across environments. Use Bicep or ARM templates to provision APIM instances, and automate API imports via CLI commands in your CI/CD pipelines so that every merge to main automatically updates your gateway. In production, this eliminates manual portal clicks that lead to configuration drift between dev, staging, and production environments. Store your OpenAPI specs, policy XML files, and infrastructure templates in version control alongside your application code; this gives you full audit history and the ability to roll back any API change with a single git revert.
Bicep: APIM Resource
resource apim 'Microsoft.ApiManagement/service@2023-05-01-preview' = {
name: 'myAPIM'
location: 'eastus'
sku: { name: 'StandardV2', capacity: 1 }
identity: { type: 'SystemAssigned' }
properties: {
publisherEmail: 'admin@mycompany.com'
publisherName: 'My Company'
}
}
GitHub Actions: Deploy API
name: Deploy APIM API
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - name: Import API definition
        run: |
          az apim api import \
            --resource-group ${{ vars.RG }} \
            --service-name ${{ vars.APIM }} \
            --api-id orders-api \
            --specification-format OpenApiJson \
            --specification-path ./apis/orders/openapi.json \
            --path orders
25 Architecture Patterns
These architecture patterns represent proven approaches for using APIM in real-world enterprise systems, each addressing different scalability, decoupling, and governance challenges. Choosing the right pattern depends on your traffic profile, client diversity, and reliability requirements β most production systems combine several patterns together. For example, an enterprise might use the API Gateway pattern as the foundation, add Backend for Frontend for mobile optimization, and layer in the Async pattern for long-running operations. Understanding these patterns helps you design systems that scale gracefully and evolve without requiring wholesale rewrites.
26 Quick Reference Cheat Sheet
This cheat sheet consolidates the most frequently needed APIM URLs, policy snippets, and service limits into a single quick-reference section. Keep this handy during development; it covers the patterns you'll reach for daily, from rate limiting and JWT validation to managed identity tokens and retry logic. In production, knowing the service limits (like max policy body size of 256 KB and max backend timeout of 240 seconds) helps you avoid hitting unexpected boundaries. Bookmark this section and use it as your go-to reference when writing policies or troubleshooting gateway behavior.
Gateway: https://<service>.azure-api.net
Management API: https://<service>.management.azure-api.net
Developer Portal: https://<service>.developer.azure-api.net
<!-- Rate limit -->
<rate-limit calls="100" renewal-period="60" />
<!-- JWT validation -->
<validate-jwt header-name="Authorization" failed-validation-httpcode="401">
<openid-config url="https://login.microsoftonline.com/{t}/v2.0/.well-known/openid-configuration" />
<audiences><audience>api://{app-id}</audience></audiences>
</validate-jwt>
<!-- Managed identity token -->
<authentication-managed-identity
resource="https://servicebus.windows.net"
output-token-variable-name="token" />
<!-- Cache response -->
<cache-lookup vary-by-developer="false" vary-by-developer-groups="false" />
<!-- + in outbound: -->
<cache-store duration="300" />
<!-- Return custom response -->
<return-response>
<set-status code="202" reason="Accepted" />
<set-body>{"status": "queued"}</set-body>
</return-response>
<!-- Retry -->
<retry condition="@(context.Response.StatusCode >= 500)"
count="3" interval="2">
<forward-request timeout="30" />
</retry>
<!-- Mock -->
<mock-response status-code="200" content-type="application/json" />
# Header (default)
Ocp-Apim-Subscription-Key: <your-subscription-key>
# Query string
GET /api/orders?subscription-key=<your-subscription-key>
| Limit | Value |
|---|---|
| Max APIM instances per subscription | 20 |
| Max policy body size | 256 KB |
| Max request/response body in policy | 512 KB |
| Max cache item size | 2 MB |
| Max backend timeout | 240 seconds |
| Max APIs per instance | Unlimited |
| Max operations per API | Unlimited |