際際滷

際際滷Share a Scribd company logo
A.D.T
焔谿 (Chan @KakaoBank)
(Gordon @Kakao)
About Speakers
 焔谿
 KTH, 磯, 豺伎拘, ()豺伎拘る DBA
 轟伎: ADT 襦 譴 朱 
 A.k.a 覦一 (ル.)
 
 2007: (譯)伎 螻給 谿曙
 2012: 豺伎拘れ 語(豺伎拘る朱 覈 覲蟆)
 豺伎拘れ 螻  讌企 譴
What is ADT?
Almighty
Data
Transmitter
History
 2015 譴覦
- 朱 觜れ MySQL る 蟲
Range Sharding (1/3)
Shard1 Shard2 Shard3
ID: 1 - 1000
ID: 1001 - 2000
ID: 2001 - 3000
Range Sharding (2/3)
Shard1 Shard2 Shard3
ID: 1 - 1000
10,000,000 rows
5,000 QPS
ID: 1001 - 2000
5,000,000 rows
1,000 QPS
ID: 2001 - 3000
1,000 rows
10,000 QPS
new shard
Range Sharding (3/3)
- PROS.
- CONS.
Unbalanced rows and traffic
Easy to attach next shard 
without any side effect
Modulus Sharding (1/3)
Shard1 Shard2 Shard3
ID mod 3 : 0
ID mod 3 : 1
ID mod 3 : 2
Modulus Sharding (2/3)
Shard1 Shard2 Shard3
ID % 3 == 0 ID % 3 == 1 ID % 3 == 2
Shard1 Shard2 Shard3
ID % 4 == 0 ID % 4 == 1 ID % 4 == 2
Shard4
ID % 4 == 3
Add New?
Modulus Sharding (3/3)
- PROS.
- CONS.
Difficult to attach new shard
Better resource balancing
History
 2015 襴
- 朱 觜れ MySQL る 蟲  
 2015 覦蠍: 襦 
- 覈: MySQL る 蟲 ( 企朱...)
 Chan
MySQL Binary Log襯 伎伎    蟆
る 蟲 語 襷讌 蟾?
History
 2015 襴
- 朱 觜れ MySQL る 蟲  
 2015 覦蠍: 襦 
- 覈: MySQL る 蟲
- 覈: れ 襷願係伎 (ETL+CDC ?)
 企 觜れ 企 MySQL 覯
覦MB/min 襦 朱 Binlog襯
螳 螳麹  蟾?
History
 2015 襴
- 朱 觜れ MySQL る 蟲  
 2015 覦蠍: 襦 
- 覈: MySQL る 蟲
- れ 襷願係伎
- 觜襯 豌襴 螳
Goals & Concepts
Goals
Dynamic Migration
Performance
Support only MySQL in first release
Dynamic Migration
 One-time Migration
- e.g. Shard Rebalancing
 Real-time Migration (as a Service?)
- e.g. MySQL => HBase in Real-time
Ex: Add new shard
Shard1 Shard2
Shard1 Shard3Shard2
Ex: Change shard rule
ID: 1 - 1000
ID: 1001 - 2000
ID: 2001 - 3000
Range
ID mod 3: 0
ID mod 3: 1
ID mod 3: 2
Modulus
Ex: Copy to diff. DBMS
MySQL
HBase
MongoDB
NO side effect
Ex: Copy to diff. schema
MySQL
MySQL
MySQL
ID AGE CNT V
1 30 5 msg1
2 29 10 msg2
ID CNT V
1 5 msg1
ID CNT V
2 10 msg2
NO side effect
Ex: Write w/ custom processing
MySQL
MySQL
MySQL
Events
ID AGE CNT V
1 30 5 msg1
2 29 10 msg2
1) REQ
INFO2)
3) GET INFO
4) MERGE
ID FRIENDS
1 4,5,6,7,8,9
2 3,4,5,8,10
Performance (1/3)
Source
Dest1 Dest2
Data Copy Layer
Performance (2/3)
Single-threaded Applier
Dest1 Dest2
1ms
1ms
2ms
2ms
2ms
Performance (3/3)
Multi-threaded Applier
Dest1 Dest2
Parallel Processing (1/3)
Different Row ID
Parallel Processing
Same Row ID
Sequential Processing
Parallel Processing (2/3)
AS-IS (SQL Thread)
1 : a=a+1
3 : a=a+2
2 : a=a+1
3 : a=a+4
2 : a=a+1
1 : a=a+2
Binary Log
1 : a=a+1
3 : a=a+2
2 : a=a+1
3 : a=a+4
2 : a=a+1
1 : a=a+2
Sequential Apply
Parallel Processing (3/3)
TO-BE (ADT)
1 : a=a+1
3 : a=a+2
2 : a=a+1
3 : a=a+4
2 : a=a+1
1 : a=a+2
1 : a=a+1 3 : a=a+22 : a=a+1
3 : a=a+42 : a=a+11 : a=a+2
Thread1 Thread2 Thread3
Binary Log
ADT Features
Features
 Table Crawler
 SELECT 貎朱Μ 覦覲汲
SELECT * FROM ? [ WHERE id > ? ] LIMIT ?;
 Binlog Receiver
 MySQL Replication 襦貊
 Custom Data Handler
 讌 一危一 豌襴 覿覿
e.g. Shard reconstruction handler
  る   ろ
Features: Binlog Processor
Binlog
Recv
Queue #1
Custom
Data
Handler
Queue #2
Queue #3
Queue #n

ADT Binlog Processor
Binlog Receiver
1. PK/UK 螳 曙
2. 曙 螳 蠍一朱 
Queue l
3. 企 Queue襯 豌襴 企
Thread 讌
1 Thread / 1 Queue
- 螳 Queue: 谿 豌襴
- るジ Queue: 覲 豌襴
Features: Data Crawler
1. SELECT  LIMIT n
2. 危殊企
SELECT 螳 豌襴蠍
 るジ 磯襯 蟾 
 SELECT ろ
SELECT 襦
Custom Handler螳 語
讌 
Select Crawl from Last PK
Custom Data Handler
ADT Data Crawler
Thread
Pool
ADT
Requirements
Req 1. Row Format
螳 Binlog Before, After 螳 
N/A 1 : a=1, b=2, c=3
Before After
INSERT
1 : a=1, b=99, c=99 1 : a=1, b=2, c=3UPDATE
1 : a=1, b=99, c=99 N/ADELETE
Req 2. Primary Key
DELETE
Source Destination
INSERT
UPDATE
Delete Old/Insert New
Delete Old/Insert New
Delete Old
覈 Table Primary Key螳 譟伎伎
ADT
Req 3. Custom Data Handler
MySQL
Binlog
Processor
MySQL
Data
Crawler
Custom Data Handler
(: Shard 覿覦 碁る)
New
Masters
Load
Current
Master
Data
 覿覿 讌 蟲伎
蠍壱 蟲/ 
 Millisecond  覿螳
 Alter Table れ螳 覦  
 蠍壱 焔...
Integrity
Constraints
Types of Integrity
 Domain Integrity
 Column Type, NOT NULL, CHECK, 
 Entity Integrity
 Primary Key, Unique Key
 Referential Integrity
 Foreign Key
るジ data DML
レ 譯朱
Integrity襷 螻
Constraints
Unique Key
Same PK(Row ID) or UK
Sequential Processing
Foreign Key
Not Considered
set foreign_key_checks = 0
Why Ignore FK?
 る 蟲 り  螻褐 焔レ  FK襯  磯 螻褐伎伎
 FK constraint 豌危 not null, check 螳 master 企 
朱襦 slave   り 
 FK螳 一危 覲蟆曙 覦讌  蟆曙: 螻 X
 FK襦 誤 るジ 企 螳 覲蟆暑 蟆曙
 table るゴ覃 覲 豌襴  conflict 螳レ煙 
 FK襦 誤 螳 企 螳 覲蟆暑 蟆曙
 伎姶 parent, child row   るジ constraint 覓語螳 
 蟆曙一襷 binlog 蠍磯覩襦 PK, UK襷 螻
Error Handling
 Chan & Gordon
伎   襷襦  蟆 螻,
    螻ろ   螳蟆 螳.
蠏 螳 覦覯 企給
豕蠏 襭 覿 
 螳讌
覈 伎郁鍵襦 れ
Shard Rebalancing
Handler
How Data Crawler Works
 SELECT
 SELECT using PK of ex-selected rows
 INSERT
 INSERT IGNORE is required
if Binlog Processor runs together
How to Handle Binlog? (1/4)
Row Event
Type
Query to Dest. (normally)
WRITE insert( after )
DELETE delete( before )
UPDATE update( before, after )
Normally binlog events are handled like this.
How to Handle Binlog? (2/4)
 However, we should consider
- Unexpected restart
- Data inserted by Crawler
Overwriting!
How to Handle Binlog? (3/4)
Row Event
Type
Query to Dest.
(Overwriting)
WRITE replace( after )
DELETE delete( before )
UPDATE
if( before.pk!=after.pk ){
delete( before )
}
replace( after )
How to Handle Binlog? (4/4)
 Normal Query
UPDATE  SET @1=after.1, @2=after.2,
WHERE pk_col=before.pk
 Transformation 1: Unrolling
DELETE FROM  WHERE pk_col=before.pk;
INSERT INTO  VALUES(after.1, after.2,);
 Transformation 2: Overwriting
DELETE FROM  WHERE pk_col=before.pk;
REPLACE INTO  VALUES(after.1, after.2,);
 Transformation 3: Reducing
 Delete [before] only if PK is changed
Strategy 1
 Run sequentially
1. Crawl Data
2. Process Binary Log (after 1 is finished)
 Binlog file (created before starting) is required
 If crawling takes more than 3 days, then?
Strategy 2
 Run parallel with master DB
1. Start binlog processor
2. Start Data Crawler (ASAP after 1 is started)
with INSERT IGNORE
 Problem: conflicts
Strategy 2: Conflicts
src
dest
Binlog Processor Data Crawler
SELECT PK:1
INSERT PK:1
PK:1 is deleted
DELETE PK:1
Strategy 2-1
 Data Crawler
- SELECT  FOR UPDATE when crawling
- ROLLBACK after INSERT IGNORE
 Binlog Processor
- Just act normally
Because theres no logs for locked rows
 Problem: Multi row lock is dangerous for master DB
Strategy 2-2
 Binlog Processor
- Cache deleted history during a few minutes
 Data Cralwer
- If delete history exists, no INSERT
 Problems
- Complicated: Lock is neccesary for history cache
Strategy 2-3
 Same as Strategy 2-1, except using slave DB
 If sync is finished with slave, 
restart with new config
- Receive binlog from master
Test
Test Scenario
Split into 2 shards from 1 master DB
 Binary log only
Query many updates into master DB
 1K active sessions with random DML & data
Make errors
 Master DB: ifdown > mysql restart > ifup
Compare data
Test DML List
INSERT INTO 
INSERT IGNORE INTO 
INSERT INTO  ON DUPLICATE KEY 
REPLACE INTO 
UPDATE 
DELETE
Test Table Schema
Column
Name
Type
Table 1 Table 2
Value
(when exec. DML)Primary Unique Primary Unique
no int 1 2 1 Random(1~n)
seq int 2 Random(1~n)
uk int 1 1 Random(1~n)
update_cnt int Incr. when update
random_val text Random String
(危 )
Step1. Split into 2 Shards
MySQL
Source
Query
Tool
ADT
Binlog
Events
Write
MySQL
Dest.
Split
Shards
Step2. Make Error
MySQL
Source
ADT
Binlog
Events
ifdown eth0
service mysql restart
ifup eth0
If any problem is detected,
ADT restarts and re-sync
Step3. Compare Data
MySQL
Source
Comparator MySQL
Dest.
Lock &
Select All
Remained
Binlogs
Select All
Compare
Unlock
Compare Periodically
Write
Test result is
No Error
during 2 weeks
TODO
Wish to Apply for
Shard reconstruction (default)
MySQL binary log > NoSQL
Copy data change history into OLAP
MySQL binary log > Push Notification
Re-construct shards by GPS Point (Kakao Taxi?)
A.D.T
Change Replicate Rule
Sharded by USER_ID
Sharded by Location
4-nodes
1-nodes
2-nodes
Ex: Shard By GPS Point
Next Dev. Plans
 Change language: Java > GoLang
 Control Tower: Admin & Monitoring
 Is ADT alive?
 Save checkpoint for ungraceful restart
 Support Multiple DB Types
 Redis, PgSQL,
Thank You!
Any Question?

More Related Content

Intro KaKao ADT (Almighty Data Transmitter)

  • 2. About Speakers 焔谿 KTH, 磯, 豺伎拘, ()豺伎拘る DBA 轟伎: ADT 襦 譴 朱 A.k.a 覦一 (ル.) 2007: (譯)伎 螻給 谿曙 2012: 豺伎拘れ 語(豺伎拘る朱 覈 覲蟆) 豺伎拘れ 螻 讌企 譴
  • 4. History 2015 譴覦 - 朱 觜れ MySQL る 蟲
  • 5. Range Sharding (1/3) Shard1 Shard2 Shard3 ID: 1 - 1000 ID: 1001 - 2000 ID: 2001 - 3000
  • 6. Range Sharding (2/3) Shard1 Shard2 Shard3 ID: 1 - 1000 10,000,000 rows 5,000 QPS ID: 1001 - 2000 5,000,000 rows 1,000 QPS ID: 2001 - 3000 1,000 rows 10,000 QPS new shard
  • 7. Range Sharding (3/3) - PROS. - CONS. Unbalanced rows and traffic Easy to attach next shard without any side effect
  • 8. Modulus Sharding (1/3) Shard1 Shard2 Shard3 ID mod 3 : 0 ID mod 3 : 1 ID mod 3 : 2
  • 9. Modulus Sharding (2/3) Shard1 Shard2 Shard3 ID % 3 == 0 ID % 3 == 1 ID % 3 == 2 Shard1 Shard2 Shard3 ID % 4 == 0 ID % 4 == 1 ID % 4 == 2 Shard4 ID % 4 == 3 Add New?
  • 10. Modulus Sharding (3/3) - PROS. - CONS. Difficult to attach new shard Better resource balancing
  • 11. History 2015 襴 - 朱 觜れ MySQL る 蟲 2015 覦蠍: 襦 - 覈: MySQL る 蟲 ( 企朱...)
  • 12. Chan MySQL Binary Log襯 伎伎 蟆 る 蟲 語 襷讌 蟾?
  • 13. History 2015 襴 - 朱 觜れ MySQL る 蟲 2015 覦蠍: 襦 - 覈: MySQL る 蟲 - 覈: れ 襷願係伎 (ETL+CDC ?)
  • 14. 企 觜れ 企 MySQL 覯 覦MB/min 襦 朱 Binlog襯 螳 螳麹 蟾?
  • 15. History 2015 襴 - 朱 觜れ MySQL る 蟲 2015 覦蠍: 襦 - 覈: MySQL る 蟲 - れ 襷願係伎 - 觜襯 豌襴 螳
  • 18. Dynamic Migration One-time Migration - e.g. Shard Rebalancing Real-time Migration (as a Service?) - e.g. MySQL => HBase in Real-time
  • 19. Ex: Add new shard Shard1 Shard2 Shard1 Shard3Shard2
  • 20. Ex: Change shard rule ID: 1 - 1000 ID: 1001 - 2000 ID: 2001 - 3000 Range ID mod 3: 0 ID mod 3: 1 ID mod 3: 2 Modulus
  • 21. Ex: Copy to diff. DBMS MySQL HBase MongoDB NO side effect
  • 22. Ex: Copy to diff. schema MySQL MySQL MySQL ID AGE CNT V 1 30 5 msg1 2 29 10 msg2 ID CNT V 1 5 msg1 ID CNT V 2 10 msg2 NO side effect
  • 23. Ex: Write w/ custom processing MySQL MySQL MySQL Events ID AGE CNT V 1 30 5 msg1 2 29 10 msg2 1) REQ INFO2) 3) GET INFO 4) MERGE ID FRIENDS 1 4,5,6,7,8,9 2 3,4,5,8,10
  • 27. Parallel Processing (1/3) Different Row ID Parallel Processing Same Row ID Sequential Processing
  • 28. Parallel Processing (2/3) AS-IS (SQL Thread) 1 : a=a+1 3 : a=a+2 2 : a=a+1 3 : a=a+4 2 : a=a+1 1 : a=a+2 Binary Log 1 : a=a+1 3 : a=a+2 2 : a=a+1 3 : a=a+4 2 : a=a+1 1 : a=a+2 Sequential Apply
  • 29. Parallel Processing (3/3) TO-BE (ADT) 1 : a=a+1 3 : a=a+2 2 : a=a+1 3 : a=a+4 2 : a=a+1 1 : a=a+2 1 : a=a+1 3 : a=a+22 : a=a+1 3 : a=a+42 : a=a+11 : a=a+2 Thread1 Thread2 Thread3 Binary Log
  • 31. Features Table Crawler SELECT 貎朱Μ 覦覲汲 SELECT * FROM ? [ WHERE id > ? ] LIMIT ?; Binlog Receiver MySQL Replication 襦貊 Custom Data Handler 讌 一危一 豌襴 覿覿 e.g. Shard reconstruction handler る ろ
  • 32. Features: Binlog Processor Binlog Recv Queue #1 Custom Data Handler Queue #2 Queue #3 Queue #n ADT Binlog Processor Binlog Receiver 1. PK/UK 螳 曙 2. 曙 螳 蠍一朱 Queue l 3. 企 Queue襯 豌襴 企 Thread 讌 1 Thread / 1 Queue - 螳 Queue: 谿 豌襴 - るジ Queue: 覲 豌襴
  • 33. Features: Data Crawler 1. SELECT LIMIT n 2. 危殊企 SELECT 螳 豌襴蠍 るジ 磯襯 蟾 SELECT ろ SELECT 襦 Custom Handler螳 語 讌 Select Crawl from Last PK Custom Data Handler ADT Data Crawler Thread Pool
  • 35. Req 1. Row Format 螳 Binlog Before, After 螳 N/A 1 : a=1, b=2, c=3 Before After INSERT 1 : a=1, b=99, c=99 1 : a=1, b=2, c=3UPDATE 1 : a=1, b=99, c=99 N/ADELETE
  • 36. Req 2. Primary Key DELETE Source Destination INSERT UPDATE Delete Old/Insert New Delete Old/Insert New Delete Old 覈 Table Primary Key螳 譟伎伎
  • 37. ADT Req 3. Custom Data Handler MySQL Binlog Processor MySQL Data Crawler Custom Data Handler (: Shard 覿覦 碁る) New Masters Load Current Master Data 覿覿 讌 蟲伎
  • 38. 蠍壱 蟲/ Millisecond 覿螳 Alter Table れ螳 覦 蠍壱 焔...
  • 40. Types of Integrity Domain Integrity Column Type, NOT NULL, CHECK, Entity Integrity Primary Key, Unique Key Referential Integrity Foreign Key るジ data DML レ 譯朱 Integrity襷 螻
  • 41. Constraints Unique Key Same PK(Row ID) or UK Sequential Processing Foreign Key Not Considered set foreign_key_checks = 0
  • 42. Why Ignore FK? る 蟲 り 螻褐 焔レ FK襯 磯 螻褐伎伎 FK constraint 豌危 not null, check 螳 master 企 朱襦 slave り FK螳 一危 覲蟆曙 覦讌 蟆曙: 螻 X FK襦 誤 るジ 企 螳 覲蟆暑 蟆曙 table るゴ覃 覲 豌襴 conflict 螳レ煙 FK襦 誤 螳 企 螳 覲蟆暑 蟆曙 伎姶 parent, child row るジ constraint 覓語螳 蟆曙一襷 binlog 蠍磯覩襦 PK, UK襷 螻
  • 44. Chan & Gordon 伎 襷襦 蟆 螻, 螻ろ 螳蟆 螳.
  • 45. 蠏 螳 覦覯 企給 豕蠏 襭 覿 螳讌 覈 伎郁鍵襦 れ
  • 47. How Data Crawler Works SELECT SELECT using PK of ex-selected rows INSERT INSERT IGNORE is required if Binlog Processor runs together
  • 48. How to Handle Binlog? (1/4) Row Event Type Query to Dest. (normally) WRITE insert( after ) DELETE delete( before ) UPDATE update( before, after ) Normally binlog events are handled like this.
  • 49. How to Handle Binlog? (2/4) However, we should consider - Unexpected restart - Data inserted by Crawler Overwriting!
  • 50. How to Handle Binlog? (3/4) Row Event Type Query to Dest. (Overwriting) WRITE replace( after ) DELETE delete( before ) UPDATE if( before.pk!=after.pk ){ delete( before ) } replace( after )
  • 51. How to Handle Binlog? (4/4) Normal Query UPDATE SET @1=after.1, @2=after.2, WHERE pk_col=before.pk Transformation 1: Unrolling DELETE FROM WHERE pk_col=before.pk; INSERT INTO VALUES(after.1, after.2,); Transformation 2: Overwriting DELETE FROM WHERE pk_col=before.pk; REPLACE INTO VALUES(after.1, after.2,); Transformation 3: Reducing Delete [before] only if PK is changed
  • 52. Strategy 1 Run sequentially 1. Crawl Data 2. Process Binary Log (after 1 is finished) Binlog file (created before starting) is required If crawling takes more than 3 days, then?
  • 53. Strategy 2 Run parallel with master DB 1. Start binlog processor 2. Start Data Crawler (ASAP after 1 is started) with INSERT IGNORE Problem: conflicts
  • 54. Strategy 2: Conflicts src dest Binlog Processor Data Crawler SELECT PK:1 INSERT PK:1 PK:1 is deleted DELETE PK:1
  • 55. Strategy 2-1 Data Crawler - SELECT FOR UPDATE when crawling - ROLLBACK after INSERT IGNORE Binlog Processor - Just act normally Because theres no logs for locked rows Problem: Multi row lock is dangerous for master DB
  • 56. Strategy 2-2 Binlog Processor - Cache deleted history during a few minutes Data Cralwer - If delete history exists, no INSERT Problems - Complicated: Lock is neccesary for history cache
  • 57. Strategy 2-3 Same as Strategy 2-1, except using slave DB If sync is finished with slave, restart with new config - Receive binlog from master
  • 58. Test
  • 59. Test Scenario Split into 2 shards from 1 master DB Binary log only Query many updates into master DB 1K active sessions with random DML & data Make errors Master DB: ifdown > mysql restart > ifup Compare data
  • 60. Test DML List INSERT INTO INSERT IGNORE INTO INSERT INTO ON DUPLICATE KEY REPLACE INTO UPDATE DELETE
  • 61. Test Table Schema Column Name Type Table 1 Table 2 Value (when exec. DML)Primary Unique Primary Unique no int 1 2 1 Random(1~n) seq int 2 Random(1~n) uk int 1 1 Random(1~n) update_cnt int Incr. when update random_val text Random String (危 )
  • 62. Step1. Split into 2 Shards MySQL Source Query Tool ADT Binlog Events Write MySQL Dest. Split Shards
  • 63. Step2. Make Error MySQL Source ADT Binlog Events ifdown eth0 service mysql restart ifup eth0 If any problem is detected, ADT restarts and re-sync
  • 64. Step3. Compare Data MySQL Source Comparator MySQL Dest. Lock & Select All Remained Binlogs Select All Compare Unlock Compare Periodically Write
  • 65. Test result is No Error during 2 weeks
  • 66. TODO
  • 67. Wish to Apply for Shard reconstruction (default) MySQL binary log > NoSQL Copy data change history into OLAP MySQL binary log > Push Notification Re-construct shards by GPS Point (Kakao Taxi?)
  • 68. A.D.T Change Replicate Rule Sharded by USER_ID Sharded by Location 4-nodes 1-nodes 2-nodes Ex: Shard By GPS Point
  • 69. Next Dev. Plans Change language: Java > GoLang Control Tower: Admin & Monitoring Is ADT alive? Save checkpoint for ungraceful restart Support Multiple DB Types Redis, PgSQL,