- The document discusses how PagerDuty's Chef runs became bloated and slow as the infrastructure grew in size and complexity.
- Measurements found Chef was using over 500MB of memory and taking around 180 seconds to complete runs.
- Several optimizations were made, including step-through searches, partial searches, result memoization, and API tarpitting, which together reduced memory usage to 60MB and run time to 30 seconds.
- The optimizations led to an 88% reduction in memory usage and 84% faster Chef runs.
Bloated Chefs: A Tale of Gluttony, and the Path to Enlightenment
3. 4/3/15
Agenda
BLOATED CHEFS
1. Chef resources in use at PD
2. Problems encountered as we grew
3. Measuring chef-client run
4. How we fixed it
5. How fast is it now?
17. 4/3/15
As we grew
CPU spikes during chef-client runs
Awkward pauses at the beginning of the run
chef-client run took several minutes
chef-client OOM
38. 4/3/15
Partial Searches
Provide hash map of desired results
Minimizes volume of node data returned/handled
hash2node
Two searches touched
90s -> 60s
30%
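The "hash map of desired results" idea can be illustrated with a small, self-contained Ruby sketch. It is not PagerDuty's code and does not call a Chef server: FULL_NODES, KEYS, and simulate_partial_search are hypothetical names invented here, and the projection runs locally only to show the shape of the data. In real Chef the same kind of key map is handed to the search call (for example through the `filter_result` option of `search`, or the `keys` option of the older partial_search cookbook), and the server returns only those attributes, which is where the savings in data returned and handled come from.

```ruby
# Hypothetical stand-ins for the full node objects a normal search returns.
# Real node objects carry far more attribute data than this.
FULL_NODES = [
  {
    'name' => 'web1',
    'ipaddress' => '10.0.0.1',
    'kernel' => { 'release' => '3.13.0', 'modules' => { 'ext4' => {}, 'nf_conntrack' => {} } },
    'packages' => { 'openssl' => '1.0.1f', 'nginx' => '1.4.6' }
  },
  {
    'name' => 'web2',
    'ipaddress' => '10.0.0.2',
    'kernel' => { 'release' => '3.13.0', 'modules' => { 'ext4' => {} } },
    'packages' => { 'openssl' => '1.0.1f' }
  }
].freeze

# Hash map of desired results: result key => path into the node attributes.
KEYS = {
  'name' => ['name'],
  'ip' => ['ipaddress'],
  'kernel_release' => %w[kernel release]
}.freeze

# Build one small row per node containing only the requested attributes.
# Assumption: this local projection mimics what a partial search returns;
# a real partial search does this work on the server side.
def simulate_partial_search(nodes, keys)
  nodes.map do |node|
    keys.each_with_object({}) do |(result_key, path), row|
      row[result_key] = node.dig(*path)
    end
  end
end

rows = simulate_partial_search(FULL_NODES, KEYS)
rows.each { |row| puts row.inspect }
```

Each returned row holds three small values instead of the whole attribute tree, so the client has less to download, parse, and keep in memory per node.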