@WrathOfChris blog.wrathofchris.com github.com/WrathOfChris
Time Series Metrics
with Cassandra
About Me
• Chris Maxwell
• @WrathOfChris
• Sr Systems Engineer @ Ubiquiti Networks
• Cloud Guy
• DevOps
Mission
• Metrics service for internal services
• Deliver 90 → 60 → 30 days of system and app metrics
• Gain experience with Cassandra
History
Ancient Designs
Aging Tools
Pitfalls
https://flic.kr/p/6pqVnP
Graphite (v1)
• Single instance
• carbon-relay + 2-4 carbon-cache processes (one per CPU)
Graphite (v1)
Problems:
• Single point of SUCCESS!
• Can grow to 16-32 cores, but then hits I/O saturation
• Carbon write-amplifies 10x (flushes every 10s)
Graphite (v2)
• Frontend: carbon-relay
• Backend: carbon-relay + 4x carbon-cache
• m3.2xlarge ephemeral SSD
• Manual consistent-hash by IP
• Replication 3
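To make "consistent-hash by IP, replication 3" concrete, here is a minimal sketch of a hash ring that maps a metric name to three distinct backend IPs. It is a generic illustration, not carbon-relay's actual algorithm: the `HashRing` class, the MD5-based `_hash` helper, and the `vnodes` count are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Place a string on the ring as a 64-bit integer (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:16], 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes and N-way replication."""

    def __init__(self, nodes, vnodes=100, replication=3):
        unique = set(nodes)
        self.replication = min(replication, len(unique))
        # Each node owns `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{node}:{i}"), node) for node in unique for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def nodes_for(self, metric: str):
        """Walk clockwise from the metric's hash, collecting distinct nodes."""
        start = bisect.bisect(self._points, _hash(metric))
        owners = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == self.replication:
                    break
        return owners


ring = HashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"])
print(ring.nodes_for("servers.web01.cpu.user"))  # three distinct backend IPs
```

Because the node list is part of the hash input, changing the list of IPs changes which backends own a given metric, so the data on disk has to follow.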
Graphite (v2)
Problems:
• Kind of like a Dynamo, but not
• Replacing node requires full partition key shuffle
• Adding 5 nodes took 6 days on 1Gbps to re-replicate ring
• Less than 50% disk free means pain during reshuffle
Limitations
• Cloud Native
• Avoid Manual Intervention
• Ephemeral SSD > EBS
https://flic.kr/p/2hZy6P
Design
What we set out to build
https://flic.kr/p/2spiXb
Graphite (v3)
…it got complicated…
Graphite (v3)
Ingest:
• carbon-c-relay
  https://github.com/grobian/carbon-c-relay
• cyanite
  https://github.com/pyr/cyanite
• cassandra
Graphite (v3)
Retrieval:
• graphite-api
  https://github.com/brutasse/graphite-api
• grafana
  https://github.com/grafana/grafana
• cyanite
  https://github.com/pyr/cyanite
• elasticsearch (metric path cache)
Journey
Lessons learned along the way
https://flic.kr/p/hjY15L
Size Tiered Compaction
• Sorted String Table (SSTable) is an immutable data file
• New data written to small SSTables
• Periodically merged into larger SSTables
Size Tiered Compaction
• Merge 4 similarly sized SSTables into 1 new SSTable
• Data migrates into larger SSTables that are compacted less regularly
• Disk space required: sum of 4 largest SSTables
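The merge rule above can be sketched as a small simulation. This is a simplified model, not Cassandra's implementation: SSTables are represented only by their sizes, a merge is assumed to produce a table as large as the sum of its inputs (no overwrites or expired data), and the function names are hypothetical. The `min_threshold`, `bucket_low`, and `bucket_high` defaults mirror the size-tiered strategy's option names.

```python
def size_tiered_compact(sstables, min_threshold=4, bucket_low=0.5, bucket_high=1.5):
    """Merge groups of `min_threshold` similarly sized SSTables until none remain.

    `sstables` is a list of sizes in any unit; returns the surviving sizes.
    """
    tables = sorted(sstables)
    while True:
        # Group tables whose size is within [bucket_low, bucket_high] x bucket average.
        buckets = []
        for size in tables:
            for bucket in buckets:
                avg = sum(bucket) / len(bucket)
                if bucket_low * avg <= size <= bucket_high * avg:
                    bucket.append(size)
                    break
            else:
                buckets.append([size])
        full = next((b for b in buckets if len(b) >= min_threshold), None)
        if full is None:
            return tables
        merged = full[:min_threshold]
        for size in merged:
            tables.remove(size)
        tables.append(sum(merged))  # assumption: nothing shrinks during the merge
        tables.sort()


def worst_case_headroom(sstables, min_threshold=4):
    """Free space needed to rewrite the largest tables: sum of the 4 largest."""
    return sum(sorted(sstables, reverse=True)[:min_threshold])


print(size_tiered_compact([1] * 16))                        # [16]
print(worst_case_headroom([1, 4, 4, 16, 16, 16, 16]))       # 64
```

Sixteen equal flushes collapse into one table after two rounds of 4-way merges, and the headroom figure grows with the largest tier, which is why free disk space matters more as data ages.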
Size Tiered Compaction
• Updating a partition frequently may cause it to be spread between SSTables
• Metrics workload writes to all partitions, every period
Size Tiered Compaction
• Metrics workload writes to all partitions, every period
• Range queries that spanned 50+ SSTables!!!
Size Tiered Compaction
• Getting to the older data…
• Ingest 25% more data
• Major Compaction:
  • Requires 50% free space
  • Compacts all SSTables into 1 large SSTable
Aside: DELETE
• DELETE is the INSERT of a TOMBSTONE to the end of a partition
• INSERTs with TTL become tombstones in the future
• Tombstones live for at least gc_grace_seconds
• Data is only deleted during compaction
https://flic.kr/p/35RACf
gc_grace_seconds
Grace is getting something you don't deserve
(time to nodetool repair a node that is down)
gc_grace_seconds
deleted data reappears!
Time To Live
• INSERT with TTL becomes tombstone after expiry
• 10s for 6 hours
• 60s for 3 days
• 300s for 30 days
https://flic.kr/p/6Fxv7M
TTL
• gc_grace_seconds is 10 days (by default)
• 10s for 6 hours → 10.25 days
• 60s for 3 days → 13 days
• 300s for 30 days → 40 days
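The figures above are just TTL plus gc_grace_seconds: expired data turns into a tombstone, and the tombstone cannot be dropped until the grace period has passed. A small sketch of the arithmetic, where `days_on_disk` and the `RETENTION` table are hypothetical names and the result is a lower bound (data is only removed when a compaction actually touches it):

```python
DAY = 86_400
GC_GRACE_SECONDS = 864_000  # Cassandra default: 10 days


def days_on_disk(ttl_seconds, gc_grace_seconds=GC_GRACE_SECONDS):
    """Minimum days a TTL'd write occupies disk: TTL, then tombstone grace."""
    return (ttl_seconds + gc_grace_seconds) / DAY


# Resolution -> TTL in seconds, matching the retention tiers on the slide.
RETENTION = {"10s": 6 * 3600, "60s": 3 * DAY, "300s": 30 * DAY}

for resolution, ttl in RETENTION.items():
    print(f"{resolution}: kept {ttl / DAY:g} days, on disk at least {days_on_disk(ttl):g} days")
```

Lowering gc_grace_seconds on the table would shrink the gap, at the cost of less time to repair a node that has been down.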
https://flic.kr/p/gBLHYf
https://flic.kr/p/4LNiXg
https://flic.kr/p/35RACf
1.4TB Disks
Levelled Compaction
based on Google's LevelDB implementation
Levelled Compaction
• Data is ingested at Level 0
• Immediately compacted and merged with L1
• Partitions are merged up to Ln
• 90% of partition data guaranteed to be in same level
Levelled Compaction
• Metrics workload writes to all partitions, every period
• Immediately rolled up to L1
• Immediately rolled up to L2
• Immediately rolled up to L3
• Immediately rolled up to L4
• Immediately rolled up to L5
Levelled Compaction
• Metrics workload writes to all partitions, every period
• 1 batch of writes → 5 writes
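A rough way to see why one batch turns into five writes is to count how many levels the data set occupies. The sketch below is a back-of-envelope model, not Cassandra's accounting: it assumes each level holds 10x the previous one, that L1 holds ten SSTables of `sstable_mb` each, and that every level a batch passes through costs one rewrite. Both function names are hypothetical.

```python
def levels_in_use(data_mb, sstable_mb=160, fanout=10):
    """Number of levels needed to hold `data_mb`, filling L1 upward."""
    level = 1
    capacity = fanout * sstable_mb  # assumed L1 capacity
    remaining = data_mb
    while remaining > capacity:
        remaining -= capacity
        level += 1
        capacity *= fanout
    return level


def disk_write_rate(ingest_per_sec, levels):
    """Writes hitting disk if every batch is rewritten once per level."""
    return ingest_per_sec * levels


for data_mb in (1_000, 100_000, 1_400_000):
    levels = levels_in_use(data_mb)
    print(f"{data_mb} MB -> {levels} levels, "
          f"{disk_write_rate(30_000, levels)} disk writes/sec for 30000 ingested")
```

Under these assumptions the ingest rate stays flat while the disk write rate steps up each time the data set spills into a new level.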
Increasing Write rate
Constant Ingest rate
https://flic.kr/p/4LNiXg
compaction_throughput_mb_per_sec: 128
…then 0 (unlimited)
Speeding Compactions
… Don't Do This …
multithreaded: true
cassandra_in_memory_compaction_limit_in_mb: 256M
Date Tiered Compaction
• Written by Björn Hegerfors at Spotify
• Experimental!
• Released in 2.0.11 / 2.1.1
• Group data by time
• Compact by time
• Drop expired data by time
Compact SSTables by date window
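The idea of compacting and expiring by date window can be sketched as follows. This is a deliberately simplified model with fixed-size windows, not the actual DateTieredCompactionStrategy algorithm (which uses growing windows); SSTables are represented as hypothetical `(name, max_timestamp)` pairs and both function names are made up for the example.

```python
from collections import defaultdict


def group_by_window(sstables, window_seconds):
    """Bucket SSTables by the time window their newest write falls into."""
    windows = defaultdict(list)
    for name, max_ts in sstables:
        windows[max_ts - max_ts % window_seconds].append(name)
    return dict(windows)


def fully_expired(sstables, now, ttl_seconds, gc_grace_seconds):
    """SSTables whose newest write has passed TTL + grace can be dropped whole."""
    return [
        name
        for name, max_ts in sstables
        if max_ts + ttl_seconds + gc_grace_seconds <= now
    ]


tables = [("a", 10), ("b", 3599), ("c", 3600), ("d", 7300)]
print(group_by_window(tables, 3600))   # {0: ['a', 'b'], 3600: ['c'], 7200: ['d']}
print(fully_expired(tables, now=4000, ttl_seconds=300, gc_grace_seconds=100))  # ['a', 'b', 'c']
```

Only tables in the same window are merged with each other, so old windows stop being rewritten, and a table containing nothing but expired data can be deleted without reading it.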
- but the docs say 8GB maximum heap!
MAX_HEAP_SIZE=16G
HEAP_NEWSIZE=2048M
- Rick Branson, Instagram
http://www.slideshare.net/planetcassandra/cassandra-summit-2014-cassandra-at-instagram-2014
-XX:+CMSScavengeBeforeRemark
-XX:CMSMaxAbortablePrecleanTime=60000
-XX:CMSWaitDuration=30000
All systems normal
Inadvertently tested 30,000 writes/sec during launch
Cloud Native
http://wattsupwiththat.com/2015/03/17/spaceship-lenticular-cloud-maybe-the-coolest-cloud-picture-evah/
Cloud Native
Ec2MultiRegionSnitch
Cloud Native
Ephemeral RAID0
-Djava.io.tmpdir=/mnt/cassandra/tmp
Disable AutoScaling Terminate Process:
aws autoscaling suspend-processes --scaling-processes Terminate
Cloud Native
This design works up to 50 instances per region
Security Groups
IAM instance-profile role
Security Group + (per region) Security Group
Management (OpsCenter)
IAM instance-profile role
Security Group + (per region) Security Group
Internode Encryption
server_encryption_options:
    internode_encryption: all
• keytool -genkeypair -alias test-cass -keyalg RSA -validity 3650 -keystore test-cass.keystore
• keytool -export -alias test-cass -keystore test-cass.keystore -rfc -file test-cass.crt
• keytool -import -alias test-cass -file test-cass.crt -keystore test-cass.truststore
Seeds
Cheated…
Seeds
• selects first 3 nodes from each region using Autoscale Group order
• ignores (self) as a seed for bootstrapping first 3 nodes in each region
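The two seed rules above can be sketched as a pure function. This is an illustration under stated assumptions, not the deck's actual tooling: `select_seeds` is a hypothetical name, and `instances_by_region` is assumed to already hold each region's instance IPs in Autoscale Group order (how that list is fetched is out of scope here).

```python
def select_seeds(instances_by_region, self_ip, per_region=3):
    """Pick the first `per_region` nodes of every region, skipping this node.

    instances_by_region: dict of region name -> list of IPs in group order.
    self_ip: this node's own address, never returned as its own seed.
    """
    seeds = []
    for region in sorted(instances_by_region):
        for ip in instances_by_region[region][:per_region]:
            if ip != self_ip:
                seeds.append(ip)
    return seeds


regions = {
    "us-east-1": ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"],
    "us-west-2": ["10.1.0.1", "10.1.0.2"],
}
# Node 10.0.0.2 is itself one of the first three, so it is left out of its own list.
print(select_seeds(regions, self_ip="10.0.0.2"))
# A later node sees all seed candidates from both regions.
print(select_seeds(regions, self_ip="10.0.0.4"))
```

Since every node derives the list from the same group ordering, no seed addresses need to be hard-coded in configuration.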
General
• >= 4 Cores per node always
• >= 8 Cores as soon as feasible
• EC2 sweet spots:
  • m3.2xlarge (8c/160GB) for small workloads
  • i2.2xlarge (8c/1.6TB) for production
• Avoid c3.2xlarge - CPU:Mem ratio is too high
Breaking News!
Dense-storage Instances for EC2
Questions?
d2 instances
Joining a node - system/network
d2 instances
Joining a node - disk performance
General
Metrics
General
Cassandra Metrics
Metrics
CPU - DateTiered
Metrics
JVM - DateTiered
Metrics
Compaction/CommitLog - DateTiered


Cassandra meetup 20150331