The document discusses Tadej Murovec's approach to monitoring and alerting systems at HouseTrip. It recommends alerting on symptoms rather than causes, having teammates review alerts, preferring under-monitoring to over-monitoring, and using notifications to prevent alerts. It also provides an overview of the tools HouseTrip uses for monitoring, including New Relic, Datadog, Slack, and PagerDuty.
6. Rethink how to do alerting
Alerts should be urgent, important, actionable and real.
Over-monitoring is a harder problem to solve than under-monitoring.
Symptoms are a better way to capture more problems
more comprehensively and robustly with less effort. [1]
[1] Rob Ewaschuk, My Philosophy on Alerting
© Tadej Murovec, HouseTrip, 2015
7. Meet the happy path
* Guests can search for properties
* Guests can browse properties
* Users can log in or register
* Guests can send an enquiry or book a property
* Guests can pay
* Hosts can accept a booking
8. The end user does not care about the MySQL server being
unreachable, but she does care about not being able to
view a property.
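One way to picture symptom-based alerting is to attach a probe to each step of the happy path and report failures in the user's terms. The sketch below is only an illustration: `SymptomCheck`, `failing_symptoms` and the stand-in probes are hypothetical names, not part of HouseTrip's setup, and real probes would presumably call the live site instead of returning fixed values.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SymptomCheck:
    symptom: str               # what the end user would notice
    probe: Callable[[], bool]  # returns True when the user-facing step works


def failing_symptoms(checks: List[SymptomCheck]) -> List[str]:
    """Run every probe and return the user-facing symptoms that currently fail."""
    failures = []
    for check in checks:
        try:
            ok = check.probe()
        except Exception:
            # A probe that crashes counts as a broken step for the user,
            # whatever the underlying cause (database, cache, network).
            ok = False
        if not ok:
            failures.append(check.symptom)
    return failures


# Stand-in probes (assumptions); real ones might issue HTTP requests.
checks = [
    SymptomCheck("Guests cannot search for properties", lambda: True),
    SymptomCheck("Guests cannot view a property", lambda: False),
    SymptomCheck("Guests cannot pay", lambda: True),
]

print(failing_symptoms(checks))  # ['Guests cannot view a property']
```

The alert text names what the guest experiences; an unreachable MySQL server would surface here only through the steps it breaks.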
12. New Relic
Application server response time
Key transactions tracking with Apdex T
Uptime monitoring
Application segmentation into policy groups
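For readers unfamiliar with Apdex T, the sketch below computes a score using the standard Apdex definition: requests at or under the threshold T count as satisfied, requests between T and 4T count as tolerating (half weight), and anything slower counts as frustrated. The sample response times and the 0.5 second threshold are made up for illustration and are not HouseTrip's values.

```python
from typing import Iterable


def apdex(response_times: Iterable[float], t: float) -> float:
    """Apdex score for threshold t (same unit as the response times)."""
    satisfied = tolerating = total = 0
    for rt in response_times:
        total += 1
        if rt <= t:
            satisfied += 1
        elif rt <= 4 * t:
            tolerating += 1
        # Anything slower than 4T is "frustrated" and adds nothing.
    if total == 0:
        raise ValueError("need at least one sample")
    return (satisfied + tolerating / 2) / total


# Hypothetical response times in seconds.
samples = [0.2, 0.4, 0.5, 1.0, 1.9, 2.5, 0.3, 3.0]
print(apdex(samples, t=0.5))  # 0.625
```

With T = 0.5, four samples are satisfied, two are tolerating and two are frustrated, giving (4 + 2/2) / 8 = 0.625.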
15. Datadog
Monitoring scheduled errands
Background queues and services health check
Custom metrics that make sense for the business
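A custom business metric can reach Datadog through DogStatsD, whose plain-text datagram has the shape `name:value|type|#tag:value,...`. The sketch below formats such a datagram by hand so that it needs no client library. The metric name `bookings.accepted`, the `env` tag and the helper names are hypothetical, and the default address assumes an agent listening on the usual local UDP port 8125.

```python
import socket
from typing import Dict, Optional


def dogstatsd_datagram(name: str, value: float, metric_type: str,
                       tags: Optional[Dict[str, str]] = None) -> str:
    """Format one metric as a DogStatsD line: name:value|type|#k:v,k:v"""
    line = f"{name}:{value}|{metric_type}"
    if tags:
        # Sort tags so the output is deterministic.
        line += "|#" + ",".join(f"{k}:{v}" for k, v in sorted(tags.items()))
    return line


def send_metric(datagram: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send; host and port are assumptions about the agent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


# "c" is the counter type: count one accepted booking.
payload = dogstatsd_datagram("bookings.accepted", 1, "c", {"env": "production"})
print(payload)  # bookings.accepted:1|c|#env:production
```

Calling `send_metric(payload)` would hand the datagram to a local agent if one is running. In practice an official client library is likely the better choice; the hand-rolled version only shows what goes over the wire.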
23. 4 rules for efficient alerting
* Alert on symptoms not causes
* Get your teammates to pair review the alerts
* Prefer under monitoring to over monitoring
* Use notifications to prevent alerts
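The fourth rule can be read as a two-level threshold: a low-urgency notification fires early, leaving time to act before the situation warrants an alert. The sketch below is one possible interpretation, not the deck's implementation. The `Route` and `Thresholds` names and the queue-depth numbers are invented, and mapping notifications to a chat channel such as Slack and alerts to a pager such as PagerDuty is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    IGNORE = "ignore"
    NOTIFY = "notify"  # e.g. a chat message nobody is woken up for
    PAGE = "page"      # e.g. an alert to whoever is on call


@dataclass
class Thresholds:
    warn: float
    critical: float


def route(value: float, limits: Thresholds) -> Route:
    """Pick the least disruptive channel that still fits the severity."""
    if value >= limits.critical:
        return Route.PAGE
    if value >= limits.warn:
        return Route.NOTIFY
    return Route.IGNORE


# Hypothetical limits for a background queue depth.
queue_depth = Thresholds(warn=500, critical=5000)
for depth in (120, 900, 7000):
    print(depth, route(depth, queue_depth).value)
# 120 ignore
# 900 notify
# 7000 page
```

If the team reacts while the queue sits in the notify band, the page never has to fire.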
24. Be a good citizen