This document discusses different types of failures that can occur in distributed systems and messaging architectures. It categorizes failures as either transient or permanent, provides examples of each, and demonstrates approaches for handling failures synchronously and asynchronously. The key approaches include retrying transient failures, logging permanent failures, and using messaging to decouple failure handling from the frontend for better retry capabilities and failure replay.
Introduction to ServiceInsight for NServiceBus: Handling Failures with Messaging
2. Outline
Types of failure
Demo app
Synchronous failure handling
Asynchronous failure handling
Messaging architecture
3. Transient Failures
Distributed architecture
Timeout / process overloaded
Temporary: should retry
Database
  Network timeout
  Pool exhaustion
REST API
  Connection timeout
  503 'Service Unavailable'
4. Permanent Failures
Contract changed
Authorization revoked
Permanent: should not retry
Database
  Procedure change
  Permission change
REST API
  400 'Bad Request'
  401 'Unauthorized'
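The transient/permanent split on the two slides above can be sketched as a small classifier. This is a minimal illustration, not code from the demo app: the names `FailureType`, `classify_http_failure` and `classify_exception` are hypothetical, and treating every status code not named on the slides as permanent is a conservative assumption.

```python
from enum import Enum


class FailureType(Enum):
    TRANSIENT = "transient"   # temporary: should retry
    PERMANENT = "permanent"   # should not retry


# Only the status codes named on the slides are listed here.
TRANSIENT_STATUSES = {503}
PERMANENT_STATUSES = {400, 401}


def classify_http_failure(status_code: int) -> FailureType:
    """Classify a non-OK HTTP status code."""
    if status_code in TRANSIENT_STATUSES:
        return FailureType.TRANSIENT
    # Assumption: anything unrecognised is treated like the permanent codes.
    return FailureType.PERMANENT


def classify_exception(exc: Exception) -> FailureType:
    """Classify a raised exception; timeouts and dropped connections are transient."""
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return FailureType.TRANSIENT
    return FailureType.PERMANENT
```

A caller would use the result to decide whether a retry is worthwhile, for example retrying only when `classify_http_failure(status)` is `FailureType.TRANSIENT`.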
5. Demo App
Web App
REST APIs
  Reliable (200)
  Unreliable (200 or 503)
  Broken (400)
6. App V1
Synchronous processing
API calls with WebClient
Exceptions from non-OK result
7. Demo V1
No failure handling
Bubble up to user
Transient & permanent
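A rough sketch of the V1 behaviour, assuming a stand-in for the HTTP call rather than the demo's actual WebClient code. `ApiError`, `call_api` and `handle_request_v1` are hypothetical names; the server's status code is passed in directly so the example runs without a network.

```python
class ApiError(Exception):
    """Raised for any non-OK result (hypothetical name)."""

    def __init__(self, status: int):
        super().__init__(f"API returned {status}")
        self.status = status


def call_api(status_from_server: int) -> str:
    # Stand-in for a synchronous HTTP call: a non-OK result becomes an exception.
    if status_from_server != 200:
        raise ApiError(status_from_server)
    return "ok"


def handle_request_v1(status_from_server: int) -> str:
    # V1: no try/except, so transient (503) and permanent (400) failures
    # both propagate unchanged to whoever made the request.
    return call_api(status_from_server)
```

With this shape, a 503 from the unreliable API and a 400 from the broken API look the same to the user: an unhandled error.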
8. Handling Failures
[Flowchart. Nodes: Failure; Type? (Transient / Permanent); Retry; OK? (Yes / No); Audit; End]
9. Handling Failures
Transient or permanent?
Retry policies
Audit process
[Flowchart. Nodes: Failure; Type? (Transient / Permanent); Retry; OK? (Yes / No); Audit; End]
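One possible reading of the flowchart, written as a loop. The slides do not spell out where the "No" branch leads, so this sketch assumes a failed retry goes back through the type check until a retry limit is reached, after which the failure is audited. `handle`, `max_retries` and the audit list are illustrative names only.

```python
from typing import Callable, List

OK = 200
TRANSIENT = {503}  # assumption: only the transient code named on the slides


def handle(operation: Callable[[], int], max_retries: int, audit: List[str]) -> bool:
    """Return True on success; otherwise record an audit entry and return False."""
    retries = 0
    while True:
        status = operation()
        if status == OK:
            return True  # OK? Yes: End
        if status in TRANSIENT and retries < max_retries:
            retries += 1  # Type? Transient: Retry
            continue
        # Type? Permanent, or the retry policy is exhausted: Audit, then End
        audit.append(f"status={status} retries={retries}")
        return False
```

The retry policy here is just a counter; a real policy would also decide how long to wait between attempts.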
10. App V2
Simple failure handling
Determine failure type
Retry transient; log permanent
try {}
catch {}
11. Demo V2
Always appears successful
Retry options limited
Audit process basic
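The V2 approach (try/catch, retry transient, log permanent) might look like the following. This is a sketch under assumptions, not the demo's code: `TransientError`, `PermanentError` and `submit_v2` are hypothetical, and the "audit" is nothing more than a log line.

```python
import logging

logger = logging.getLogger("app_v2")


class TransientError(Exception):
    """Hypothetical: for example a timeout or a 503."""


class PermanentError(Exception):
    """Hypothetical: for example a 400 or a 401."""


def submit_v2(call, max_retries=2):
    """Run call(); retry transient failures and log anything unrecoverable."""
    for attempt in range(max_retries + 1):
        try:
            call()
            break
        except TransientError as exc:
            if attempt == max_retries:
                logger.error("gave up after %d retries: %s", max_retries, exc)
        except PermanentError as exc:
            logger.error("permanent failure, not retrying: %s", exc)
            break
    # The user sees the same response either way; failures exist only in the log.
    return "Thanks, your request was received"
```

Because the retries run while the user is waiting, `max_retries` has to stay small, which is one way to read "retry options limited". The log line also carries no request data, so there is little to replay from.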
12. Replaying Failures
Business / technical fix
Failure backlog
Replay process
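To illustrate the backlog-and-replay idea, here is a minimal in-memory sketch. `FailedRequest` and `FailureBacklog` are hypothetical names, and a real backlog would be persisted rather than held in a list; the point is only that each failure keeps its full payload so it can be run again once a fix is in place.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FailedRequest:
    payload: Dict[str, str]  # everything needed to run the request again
    reason: str              # why it failed last time


@dataclass
class FailureBacklog:
    items: List[FailedRequest] = field(default_factory=list)

    def add(self, payload: Dict[str, str], reason: str) -> None:
        self.items.append(FailedRequest(payload, reason))

    def replay(self, handler: Callable[[Dict[str, str]], None]) -> int:
        """Re-run every backlog item; keep the ones that still fail."""
        pending, self.items = self.items, []
        replayed = 0
        for item in pending:
            try:
                handler(item.payload)
                replayed += 1
            except Exception as exc:
                self.items.append(FailedRequest(item.payload, str(exc)))
        return replayed
```

After the business or technical fix is deployed, calling `backlog.replay(handler)` drains whatever now succeeds and leaves the rest for the next attempt.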
13. App V3
Asynchronous messaging
Web App sends message
Handler has retry & audit policy
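The V3 split between sender and handler can be sketched with an in-process queue. This uses Python's standard `queue.Queue` purely as a stand-in for a real message queue, and it is not the NServiceBus API; `web_app_submit`, `run_handler` and `max_attempts` are hypothetical names.

```python
import queue
from typing import Any, Callable, Dict, List


def web_app_submit(q: queue.Queue, request_data: Dict[str, Any]) -> str:
    """Front-end: put the whole request on the queue and return immediately."""
    q.put({"data": request_data, "attempts": 0})
    return "accepted"


def run_handler(
    q: queue.Queue,
    process: Callable[[Dict[str, Any]], None],
    max_attempts: int,
    audit: List[Dict[str, Any]],
) -> None:
    """Handler: owns the retry and audit policy, away from the web app."""
    while not q.empty():
        message = q.get()
        message["attempts"] += 1
        try:
            process(message["data"])
        except (TimeoutError, ConnectionError):
            # Transient: requeue until the retry policy is exhausted.
            if message["attempts"] < max_attempts:
                q.put(message)
            else:
                audit.append(message)
        except Exception:
            # Permanent: audit the full message so it can be replayed later.
            audit.append(message)
```

The web app's response no longer depends on the integration succeeding, so the handler can afford more attempts than a synchronous front-end could.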
14. Demo V3
Integration happens offline
Wider scope for retry
Full details for replay
15. Messaging Architecture
Message
  All request data
  Processing data
Queue
  Ordered store
  Transactions/ACKs
Handler
  Decouples front-end
  Can be stopped
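The three parts above can be illustrated with a toy queue that only forgets a message once it has been acknowledged. This is an in-memory sketch under assumptions: `Message`, `AckQueue`, `ack` and `nack` are illustrative names, `headers` stands in for the slide's "processing data", and a real queue would persist messages to disk.

```python
import uuid
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Optional


@dataclass
class Message:
    body: Dict[str, str]                                    # all request data
    headers: Dict[str, str] = field(default_factory=dict)   # processing data
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


class AckQueue:
    """Ordered store; a received message stays in flight until acknowledged."""

    def __init__(self) -> None:
        self._ready: Deque[Message] = deque()
        self._in_flight: Dict[str, Message] = {}

    def send(self, message: Message) -> None:
        self._ready.append(message)

    def receive(self) -> Optional[Message]:
        if not self._ready:
            return None
        message = self._ready.popleft()
        self._in_flight[message.id] = message
        return message

    def ack(self, message_id: str) -> None:
        # Processing succeeded: the message can finally be discarded.
        del self._in_flight[message_id]

    def nack(self, message_id: str) -> None:
        # Processing failed or the handler stopped: put it back at the front.
        self._ready.appendleft(self._in_flight.pop(message_id))

    def __len__(self) -> int:
        return len(self._ready) + len(self._in_flight)
```

Because messages wait in the queue until acknowledged, the handler can be stopped and restarted without the front-end noticing or any request being lost.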
16. Summary
Expect failure
Handle failure in the right place
Retry & backoff damages front-end
Decoupled handlers have more options
Persisted messages can be replayed
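A small piece of arithmetic shows why retry with backoff is costly in a front-end. The function below is a hypothetical helper, and the one-second base with doubling is an assumed policy rather than one taken from the demo.

```python
from typing import List


def backoff_delays(base_seconds: float, factor: float, retries: int) -> List[float]:
    """Delay before each retry under exponential backoff."""
    return [base_seconds * factor ** i for i in range(retries)]


delays = backoff_delays(1, 2, 4)
print(delays)       # [1, 2, 4, 8]
print(sum(delays))  # 15
```

Four retries under this policy add 15 seconds of waiting before counting the calls themselves. A user of a synchronous front-end would sit through all of it, whereas a decoupled handler can wait that long, or much longer, without anyone noticing.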