This document covers the evolution of a metrics time series service using Cassandra. It describes the move from Graphite to Cassandra to address Graphite's single points of failure and I/O saturation, details three versions of the Graphite implementation and the issues with each, and draws out key lessons: optimizing Cassandra compaction strategies, leveraging time windows, and designing for cloud-native operations.
@WrathOfChris blog.wrathofchris.com github.com/WrathOfChris
Graphite (v2)
Problems:
• Kind of like a Dynamo, but not
• Replacing a node requires a full partition key shuffle
• Adding 5 nodes took 6 days on 1Gbps to re-replicate the ring
• Less than 50% disk free means pain during the reshuffle
Size Tiered Compaction
• Merge 4 similarly sized SSTables into 1 new SSTable
• Data migrates into larger SSTables that are less regularly compacted
• Disk space required: sum of the 4 largest SSTables (see the CQL sketch below)
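As a rough illustration of the knob involved, size-tiered compaction and its merge threshold can be set per table in CQL. The metrics.datapoints keyspace, table, and columns below are hypothetical placeholders, not schema from the talk:

    -- Size-tiered compaction: merge once 'min_threshold' similarly sized
    -- SSTables accumulate in a size bucket (4 is also the default)
    CREATE TABLE metrics.datapoints (
        metric text,
        bucket int,
        ts     timestamp,
        value  double,
        PRIMARY KEY ((metric, bucket), ts)
    ) WITH compaction = {
        'class': 'SizeTieredCompactionStrategy',
        'min_threshold': '4'
    };

The disk headroom rule above (sum of the 4 largest SSTables) is the worst case for a single merge at this threshold.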
Aside: DELETE
• DELETE is the INSERT of a TOMBSTONE to the end of a partition
• INSERTs with TTL become tombstones in the future
• Tombstones live for at least gc_grace_seconds
• Data is only deleted during compaction (CQL examples below)
(image: https://flic.kr/p/35RACf)
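To make the tombstone mechanics concrete, here is a hedged CQL sketch against the same hypothetical metrics.datapoints table; the metric names and values are made up:

    -- DELETE just writes a tombstone cell; the shadowed data stays on disk
    DELETE FROM metrics.datapoints
    WHERE metric = 'cpu.user' AND bucket = 20150401
      AND ts = '2015-04-01 00:00:00+0000';

    -- an INSERT with a TTL becomes a tombstone once the TTL expires
    INSERT INTO metrics.datapoints (metric, bucket, ts, value)
    VALUES ('cpu.user', 20150401, '2015-04-01 00:01:00+0000', 42.0)
    USING TTL 604800;

    -- tombstones must be kept for at least gc_grace_seconds (default 864000,
    -- i.e. 10 days) so they can propagate to replicas that missed the DELETE
    ALTER TABLE metrics.datapoints WITH gc_grace_seconds = 864000;

The space is only reclaimed once a compaction rewrites the SSTables holding the expired cells, which is what ties tombstones back to the choice of compaction strategy.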
Levelled Compaction
• Metrics workload writes to all partitions, every period
• New data is immediately rolled up to L1, then L2, L3, L4, and L5 (see the CQL sketch below)
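For completeness, switching the same hypothetical table to levelled compaction is a one-line change; the sketch only shows the configuration knob, not a recommendation for this write pattern:

    -- Levelled compaction keeps fixed-size SSTables organised into levels;
    -- 160 MB per SSTable is the strategy's default size
    ALTER TABLE metrics.datapoints WITH compaction = {
        'class': 'LeveledCompactionStrategy',
        'sstable_size_in_mb': '160'
    };

Because the metrics workload writes to every partition every period, each flush overlaps data in every level and is promptly re-compacted upward, which is the churn the bullets above describe.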
General
• >= 4 cores per node, always
• >= 8 cores as soon as feasible
• EC2 sweet spots:
  • m3.2xlarge (8c/160GB) for small workloads
  • i2.2xlarge (8c/1.6TB) for production
• Avoid c3.2xlarge - CPU:Mem ratio is too high