A ScyllaDB Community
Replacing RocksDB with
ScyllaDB in Kafka Streams
Almog Gavra
Co-Founder, Responsive
Realtime Search Ingest @
Stream Processing @
Co-Founder @
Almog Gavra
Agenda
About Kafka Streams
Why Introduce ScyllaDB?
Architecture Deep Dive
Lessons Learned & Practical Tips
About Kafka Streams
Kafka
A storage backend optimized for storing
ordered “event” data
Kafka Streams
A library for building event-driven
applications: realtime, responsive & stateful
Kafka Streams is Everywhere
Realtime Inference
Walmart uses Kafka Streams
to power fraud detection and
purchase recommendations
Logistics
Michelin uses Kafka Streams to
handle their tire distribution, ensuring
delivery is tracked in realtime
Liquidity Management
Why Introduce ScyllaDB?
Original Design Goals
Just a Library
Kafka Streams should be easy
to integrate with existing apps.
Complete API
All stream processing use cases
should be possible to write using
Kafka Streams.
Depend only on Kafka
There should be no dependencies on
external systems (such as HDFS or
YARN).
Original Design Goals
Availability
Source of truth is in a “changelog” topic in Kafka. If
assignment changes, state must be restored.
Flexibility
Dynamic scaling is impractical. Most deployments
are provisioned for peak throughput.
Performance
Difficult to properly attribute resources to the
stream processing vs. the RocksDB storage subsystem
What Changed?
2016
Kafka Streams is released with
Apache Kafka 0.10
2017
Cloud databases gain
mainstream adoption
2019
Kubernetes wins out as de
facto orchestration system
2020
Kafka Streams popularity soars,
widening possible use cases
Today
Some assumptions guiding the original
Kafka Streams design are outdated
Lifting the “Kafka Only Requirement”
Deep Dive: Metronome
Metronome powers billing for companies like
OpenAI, NVIDIA and Databricks using Kafka
Streams.
Mission Critical Feature: Realtime Spend Limits
Key Results Migrating
from RocksDB to Scylla
Availability: 99.99%
Going from regular incidents to “not thinking about
Kafka Streams”
Throughput Growth: 3x
ScyllaDB scaled without hiccup as their data size
and throughput scaled
Scale Potential: ∞
Decoupled compute and storage means we’ve been
able to scale the number of Kafka Partitions and
ScyllaDB cluster size independently
Architecture Deep Dive
Data Model
CREATE TABLE key_value (
    partitionKey INT,
    dataKey BLOB,
    dataValue BLOB,
    epoch BIGINT,
    offset BIGINT,
    PRIMARY KEY ((partitionKey), dataKey)
);
Raw Data
Since Kafka Streams deals only with serialized
bytes (users supply the serializers), storing
data is easy and general purpose
Primary Key
Using the Kafka partition allows us to implement
LWTs per partition and the data key allows us to
implement efficient lookups and range scans
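To make the primary-key design more concrete, here is a minimal in-memory sketch in Python. It is not the Responsive or ScyllaDB implementation. It only assumes that rows inside one Kafka partition are kept sorted by dataKey (as a clustering key would be), so that a point lookup and a range scan each touch a single partition. The class and method names (`KeyValueTable`, `put`, `get`, `range_scan`) are hypothetical.

```python
import bisect


class KeyValueTable:
    """Toy model of key_value: rows grouped by partitionKey, sorted by dataKey."""

    def __init__(self):
        self._sorted_keys = {}  # partitionKey -> sorted list of dataKeys
        self._values = {}       # partitionKey -> {dataKey: dataValue}

    def put(self, partition_key, data_key, data_value):
        keys = self._sorted_keys.setdefault(partition_key, [])
        values = self._values.setdefault(partition_key, {})
        if data_key not in values:
            bisect.insort(keys, data_key)  # keep dataKeys ordered within the partition
        values[data_key] = data_value

    def get(self, partition_key, data_key):
        """Point lookup: only the named partition is consulted."""
        return self._values.get(partition_key, {}).get(data_key)

    def range_scan(self, partition_key, start, end):
        """Return (dataKey, dataValue) pairs with start <= dataKey < end."""
        keys = self._sorted_keys.get(partition_key, [])
        values = self._values.get(partition_key, {})
        lo = bisect.bisect_left(keys, start)
        hi = bisect.bisect_left(keys, end)
        return [(k, values[k]) for k in keys[lo:hi]]


table = KeyValueTable()
table.put(1, b"user-b", b"2")
table.put(1, b"user-a", b"1")
table.put(1, b"user-c", b"3")
table.put(2, b"user-a", b"from another Kafka partition")

print(table.get(1, b"user-b"))
print(table.range_scan(1, b"user-a", b"user-c"))
```

Because the scan bounds are found by binary search over already-sorted keys, the cost of a range scan depends on the size of the range rather than on the size of the partition.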
Restoring State
Storing the offset with a sentinel dataKey enables
efficient client hand-offs in failure scenarios
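One way to picture the hand-off is the Python sketch below. It is a simulation under stated assumptions, not the actual client code: each flush is assumed to store the data rows and the last processed Kafka offset together, and a node that takes over a partition is assumed to read that offset and resume from the next record. `OFFSET_SENTINEL`, `flush`, `resume_offset` and `process` are hypothetical names, and the "processing" is a simple per-key count.

```python
OFFSET_SENTINEL = b"__offset__"  # hypothetical sentinel dataKey, not the real one


def flush(store, partition, writes, last_offset):
    """Persist a batch of rows together with the last Kafka offset they cover."""
    rows = store.setdefault(partition, {})
    rows.update(writes)
    rows[OFFSET_SENTINEL] = last_offset


def resume_offset(store, partition):
    """Offset of the next record to process (0 if nothing was flushed yet)."""
    rows = store.get(partition, {})
    if OFFSET_SENTINEL not in rows:
        return 0
    return rows[OFFSET_SENTINEL] + 1


def process(store, partition, log, upto):
    """Count keys in log[resume_offset:upto], then flush counts and the new offset."""
    start = resume_offset(store, partition)
    rows = store.get(partition, {})
    counts = {}
    for key in log[start:upto]:
        counts[key] = counts.get(key, rows.get(key, 0)) + 1
    if upto > start:
        flush(store, partition, counts, upto - 1)


log = [b"a", b"b", b"a", b"a", b"b"]  # records of one Kafka partition
store = {}                            # stands in for the remote table

process(store, 0, log, upto=3)        # node A handles offsets 0..2, then fails
process(store, 0, log, upto=5)        # node B takes over and resumes at offset 3

print(store[0][b"a"], store[0][b"b"], resume_offset(store, 0))
```

The second node never replays offsets 0 to 2: the stored offset tells it where the persisted state ends, so no changelog restore is needed in this sketch.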
What’s This? (the epoch column)
Fencing Zombies
[Diagram: timelines for Node A and Node B, each showing the steps process, Kafka commit and write to ScyllaDB, followed by one further write to ScyllaDB.]
The Kafka commit prevents other nodes from committing to Kafka (but not writing to Scylla!)
The further write to ScyllaDB might overwrite data from the previous ScyllaDB write!
BEGIN BATCH starts an Atomic Batch (not
for speed, but for LWT)
BEGIN BATCH
UPDATE key_value
SET epoch = 12
WHERE
partitionKey = 1
AND dataKey = metadata_key
IF epoch <= 12;
INSERT INTO key_value VALUES …;
INSERT INTO key_value VALUES …;
INSERT INTO key_value VALUES …;
INSERT INTO key_value VALUES …;
APPLY BATCH;
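The effect of the conditional batch can be simulated in a few lines of Python. This is a sketch of the semantics only, with no database involved. It assumes that a node taking over a partition writes with a higher epoch than the previous owner, and it treats a missing metadata row as epoch 0, which is a simplification. `METADATA_KEY` and `apply_batch` are hypothetical names.

```python
METADATA_KEY = b"__metadata__"  # hypothetical stand-in for metadata_key


def apply_batch(partition, writer_epoch, writes):
    """Apply all writes, but only if no newer epoch has claimed the partition.

    Mirrors the shape of `UPDATE ... SET epoch = ? ... IF epoch <= ?`
    followed by the INSERTs: either everything is applied or nothing is.
    """
    stored_epoch = partition.get(METADATA_KEY, 0)  # simplification: missing row = 0
    if stored_epoch > writer_epoch:
        return False  # fenced: a writer with a newer epoch owns this partition
    partition[METADATA_KEY] = writer_epoch
    partition.update(writes)
    return True


partition = {}  # rows of one partitionKey: dataKey -> dataValue

print(apply_batch(partition, 12, {b"k": b"written by node A"}))
print(apply_batch(partition, 13, {b"k": b"written by node B"}))
# Node A is now a zombie: its late write carries the old epoch and is rejected.
print(apply_batch(partition, 12, {b"k": b"stale write from node A"}))
print(partition[b"k"])
```

The condition uses `<=` rather than `<` so that the current owner can keep writing with its own epoch, while any writer holding an older epoch is rejected.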
Lessons Learned
Latency Issues? Check Disk!
Bloom Filter Usage
An increase in disk reads is often a symptom
that something else is wrong. For us, there were a few
times Bloom Filters were not properly constructed
(once due to a config, once due to a bug)!
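As a rough illustration of why a missing or badly built Bloom filter shows up as extra disk reads, the Python sketch below compares lookups of absent keys against a toy on-disk file with and without a filter. This is not ScyllaDB's implementation: `BloomFilter`, `SSTable`, the filter size and the hash construction are all assumptions chosen to keep the example short.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SSTable:
    """Toy data file: every lookup that gets past the filter counts as a disk read."""

    def __init__(self, rows, use_filter=True):
        self.rows = rows
        self.disk_reads = 0
        self.filter = BloomFilter() if use_filter else None
        if self.filter is not None:
            for key in rows:
                self.filter.add(key)

    def get(self, key):
        if self.filter is not None and not self.filter.might_contain(key):
            return None  # answered from memory, no disk access
        self.disk_reads += 1
        return self.rows.get(key)


rows = {f"key-{i}".encode(): b"v" for i in range(10)}
with_filter = SSTable(rows, use_filter=True)
without_filter = SSTable(rows, use_filter=False)

for i in range(100):  # look up keys that were never written
    missing = f"absent-{i}".encode()
    with_filter.get(missing)
    without_filter.get(missing)

print("disk reads with filter:", with_filter.disk_reads)
print("disk reads without filter:", without_filter.disk_reads)
```

In this model, a table whose filter is absent pays one disk read for every lookup of a key it does not hold, which is the kind of read amplification the slide describes as a symptom.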
Use LWT Only When Necessary
Cost of Atomic Batches
Atomic Batches significantly slow down write
throughput, even if they’re contained to only a
single partition
[Chart: Throughput Before/After Removing LWTs]
Selecting Node Size
Node Size
How to choose your ScyllaDB Cloud node type?
Consistency
Don’t Be Inconsistent!
Make sure you’ve set your Read / Write consistency
levels. The default is ONE / ONE, which can give
inconsistent results!
[Chart: Read / Write settings plotted on two axes, Favor Fast Reads vs. Favor Fast Writes and Inconsistent vs. Consistent. Plotted points: ONE/ONE, QUORUM/QUORUM, and ONE/ALL (risky, but has niche use cases).]
Consistency
QUORUM/QUORUM
Need Faster Reads
Option: use ONE/ALL temporarily
Migrate to ALL / ALL
This is an intermediate state necessary
to maintain consistency
Run Repair
Ensure all data written during the QUORUM / QUORUM
period is available on all nodes
Enable ONE / ALL
Speedy reads! This can help you
temporarily, and migrating back to
QUORUM / QUORUM is safe.
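The read / write combinations above can be checked with the usual overlap rule for replicated stores: a read is guaranteed to reach at least one replica that acknowledged the latest write when the number of replicas read plus the number of replicas written exceeds the replication factor. The Python sketch below applies that rule. The replication factor of 3 is an assumption, and `replicas_required` and `reads_see_latest_write` are hypothetical helper names.

```python
def replicas_required(level, replication_factor):
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def reads_see_latest_write(read_level, write_level, replication_factor=3):
    """True when every read set must overlap every write set (R + W > RF)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"),
                                ("ALL", "ALL"), ("ONE", "ALL")]:
    print(read_level, write_level, reads_see_latest_write(read_level, write_level))
```

The rule only covers data written at the stated write level. Rows written earlier at QUORUM may still be missing from one replica, which is why the steps above pass through ALL / ALL and a repair before reads drop to ONE.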
Stay in Touch
Almog Gavra
almog@responsive.dev
@almog.ai
@agavra
/in/agavra/


Replacing RocksDB with ScyllaDB in Kafka Streams by Almog Gavra

  • 1. A ScyllaDB Community Replacing RocksDB with ScyllaDB in Kafka Streams Almog Gavra Co-Founder, Responsive
  • 2. Realtime Search Ingest @ Stream Processing @ Co-Founder @ Almog Gavra
  • 3. Agenda: About Kafka Streams; Why Introduce ScyllaDB?; Architecture Deep Dive; Lessons Learned & Practical Tips
  • 6. Kafka: A storage backend optimized for storing ordered “event” data. Kafka Streams: A library for building event-driven applications: realtime, responsive & stateful.
  • 9. Kafka Streams is Everywhere. Realtime Inference: Walmart uses Kafka Streams to power fraud detection and purchase recommendations. Logistics: Michelin uses Kafka Streams to handle their tier distribution, ensuring delivery is tracked in realtime. Liquidity Management: Michelin uses Kafka Streams to handle their tier distribution, ensuring delivery is tracked in realtime.
  • 13. Original Design Goals. Just a Library: Kafka Streams should be easy to integrate with existing apps. Complete API: All stream processing use cases should be possible to write using Kafka Streams. Depend only on Kafka: There should be no dependencies on external systems (such as HDFS or YARN).
  • 17. Original Design Goals. Availability: Source of truth is in a “changelog” topic in Kafka. If assignment changes, state must be restored. Flexibility: Dynamic scaling is impractical. Most deployments are provisioned for peak throughput. Performance: Difficult to properly attribute resources to the stream processing vs. the RocksDB storage subsystem.
  • 22. What Changed? 2016: Kafka Streams is released with Apache Kafka 0.10. 2017: Cloud databases gain mainstream adoption. 2019: Kubernetes wins out as de facto orchestration system. 2020: Kafka Streams popularity soars, widening possible use cases. Today: Some assumptions guiding the original Kafka Streams design are outdated.
  • 25. Lifting the “Kafka Only Requirement”
  • 26. Deep Dive: Metronome. Metronome powers billing for companies like OpenAI, NVIDIA and Databricks using Kafka Streams. Mission Critical Feature: Realtime Spend Limits.
  • 27. Key Results: Migrating from RocksDB to Scylla. Availability (99.99%): Going from regular incidents to “not thinking about Kafka Streams”. Throughput Growth (3x): ScyllaDB scaled without hiccup as their data size and throughput scaled. Scale Potential (∞): Decoupled compute and storage means we’ve been able to scale number of Kafka Partitions and ScyllaDB cluster size independently.
  • 29. Data Model: CREATE TABLE key_value (partitionKey INT, dataKey BLOB, dataValue BLOB, epoch BIGINT, offset BIGINT, PRIMARY KEY ((partitionKey), dataKey)); Raw Data: Since Kafka Streams deals only with serialized bytes (users supply the serializers), storing data is easy and general purpose.
  • 30. Data Model, Primary Key: Using the Kafka partition allows us to implement LWTs per partition, and the data key allows us to implement efficient lookups and range scans.
  • 31. Data Model, Restoring State: Storing the offset with a sentinel dataKey enables efficient client hand-offs in failure scenarios.
  • 32. Data Model: What’s This?
  • 37. Fencing Zombies (diagram): Node A and Node B each run process, Kafka commit, write to ScyllaDB. The Kafka commit prevents other nodes from committing to Kafka (but not writing to Scylla!).
  • 39. Fencing Zombies (diagram): A further write to ScyllaDB might overwrite data from the previous ScyllaDB write!
  • 40. Data Model: BEGIN BATCH starts an Atomic Batch (not for speed, but for LWT).
  • 44. Data Model: BEGIN BATCH UPDATE key_value SET epoch = 12 WHERE partitionKey = 1 AND dataKey = metadata_key IF epoch <= 12; INSERT INTO key_value VALUES …; INSERT INTO key_value VALUES …; INSERT INTO key_value VALUES …; INSERT INTO key_value VALUES …; APPLY BATCH;
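
The conditional batch above can be read as a compare-and-set on the partition's epoch. Here is a minimal in-memory sketch of that idea in Python, not the Responsive implementation. `PartitionStore` and `apply_batch` are hypothetical names, and the sketch assumes that a node taking over a partition does so with a higher epoch than the previous owner, which the slides do not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionStore:
    """In-memory stand-in for one partitionKey of the key_value table (hypothetical)."""
    epoch: int = 0                          # epoch kept on the metadata row
    rows: dict = field(default_factory=dict)

    def apply_batch(self, writer_epoch: int, writes: dict) -> bool:
        """Apply all writes only if the stored epoch is <= writer_epoch.

        Mirrors the `IF epoch <= 12` condition: the epoch check and the
        data writes are accepted or rejected together.
        """
        if self.epoch > writer_epoch:
            return False                    # a newer owner exists: reject the whole batch
        self.epoch = writer_epoch
        self.rows.update(writes)
        return True


store = PartitionStore()
# Node A owns the partition at epoch 12 and flushes a batch.
a_first = store.apply_batch(12, {b"k1": b"from-A"})
# Assumption: after a rebalance, Node B takes over with a higher epoch.
b_write = store.apply_batch(13, {b"k1": b"from-B"})
# Node A is now a zombie; its late batch must not overwrite B's data.
a_late = store.apply_batch(12, {b"k1": b"stale-from-A"})

print(a_first, b_write, a_late)   # True True False
print(store.rows[b"k1"])          # b'from-B'
```

Because the epoch check and the data writes travel in one batch, a rejected check discards the zombie's data writes as well.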
  • 46. Latency Issues? Check Disk! Bloom Filter Usage: An increase in disk reads is often a symptom that something else is wrong. For us, there were a few times Bloom Filters were not properly constructed (once due to a config, once due to a bug)!
  • 47. Use LWT Only When Necessary. Cost of Atomic Batches: Atomic Batches significantly slow down write throughput, even if they’re contained to only a single partition. (Chart: Throughput Before/After Removing LWTs)
  • 48. Selecting Node Size: How to choose your ScyllaDB Cloud node type?
  • 51. Consistency: Don’t Be Inconsistent! Make sure you’ve set your Read / Write consistency levels. The default is ONE / ONE, which can give inconsistent results! (Chart: axes Favor Fast Reads vs. Favor Fast Writes and Inconsistent vs. Consistent; plotted points are ONE/ONE, QUORUM/QUORUM, and ONE/ALL, which is risky, but has niche use cases.)
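
One way to read the consistency chart is through the usual replica-overlap rule: a read is guaranteed to see the latest acknowledged write when the replicas contacted by the read plus the replicas contacted by the write exceed the replication factor. The Python sketch below checks the three Read / Write pairs named on the slide. The replication factor of 3 is an assumption (the deck does not state one), and the function names are hypothetical.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def read_overlaps_write(read_level: str, write_level: str, replication_factor: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    needed = (replicas_required(read_level, replication_factor)
              + replicas_required(write_level, replication_factor))
    return needed > replication_factor


RF = 3  # assumed replication factor, not taken from the slides
for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    print(f"{read_level}/{write_level}:", read_overlaps_write(read_level, write_level, RF))
# ONE/ONE: False
# QUORUM/QUORUM: True
# ONE/ALL: True
```

ONE/ALL passes the overlap check, but every write then needs all replicas to respond, which is one reason the slide calls it risky.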
  • 54. Consistency: Starting from QUORUM/QUORUM and need faster reads? Option: use ONE/ALL temporarily. (1) Migrate to ALL / ALL: This is an intermediate state necessary to maintain consistency. (2) Run Repair: Ensure all data is available on all nodes from the QUORUM / QUORUM time. (3) Enable ONE / ALL: Speedy reads! This can help you temporarily, and migrating back to QUORUM / QUORUM is safe.
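
To illustrate why the repair step matters before reading at ONE, here is a toy Python model of three replicas. It is a sketch under simplifying assumptions: three replicas, a worst-case QUORUM write that reached only two of them, and a `repair` that just copies missing keys without any timestamp reconciliation. All names are hypothetical and none of this is ScyllaDB's actual repair logic.

```python
# Three replicas of one table, modelled as plain dicts (assumed RF = 3).
replicas = [{}, {}, {}]


def write(key, value, replica_indexes):
    """Store the value on the replicas that acknowledged the write."""
    for i in replica_indexes:
        replicas[i][key] = value


def read_one(key, replica_index):
    """A read at ONE consults a single replica."""
    return replicas[replica_index].get(key)


def repair():
    """Toy repair: copy every known key to every replica (no timestamps)."""
    merged = {}
    for replica in replicas:
        merged.update(replica)
    for replica in replicas:
        replica.update(merged)


# Written during the QUORUM / QUORUM period: worst case, only 2 of 3 replicas have it.
write("k", "v1", [0, 1])
before_repair = read_one("k", 2)     # the lagging replica has no value yet

repair()
after_repair = read_one("k", 2)      # old data is now on every replica

# New writes at ALL land on every replica, so any ONE read sees them.
write("k", "v2", [0, 1, 2])
after_all_write = [read_one("k", i) for i in range(3)]

print(before_repair, after_repair, after_all_write)
# None v1 ['v2', 'v2', 'v2']
```

In this model, a read at ONE before the repair can land on the replica that missed the QUORUM-era write, which is the gap the ALL / ALL step and the repair are meant to close.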
  • 55. Stay in Touch Almog Gavra almog@responsive.dev @almog.ai @agavra /in/agavra/