27. More Data Sources
• Also Collect Server Logs
• Periodically Upload to S3
• Stuff into Redshift
• External Analytics Data Too
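The upload-then-load flow above can be sketched as two small helpers: one that picks a dated S3 key for a server's log file, and one that builds the Redshift COPY statement for that prefix. This is a minimal sketch, not the deck's actual pipeline. The key layout (`logs/YYYY/MM/DD/host.log.gz`), the table name, the bucket name, and the role ARN are all hypothetical, and the files are assumed to be gzip-compressed CSV.

```python
from datetime import date


def s3_key_for_log(hostname: str, day: date, prefix: str = "logs") -> str:
    """Return a dated S3 key for one server's daily log (hypothetical layout)."""
    return f"{prefix}/{day:%Y/%m/%d}/{hostname}.log.gz"


def build_copy_statement(table: str, bucket: str, prefix: str, iam_role: str) -> str:
    """Build a Redshift COPY statement that loads gzip-compressed CSV files
    found under s3://bucket/prefix/ into the given table."""
    return (
        f"COPY {table}\n"
        f"FROM 's3://{bucket}/{prefix}/'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "CSV GZIP;"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(s3_key_for_log("web-01", date(2014, 1, 24)))
    print(build_copy_statement(
        "events", "my-game-logs", "logs/2014/01/24",
        "arn:aws:iam::123456789012:role/RedshiftCopyRole",
    ))
```

The statement would then be run against the cluster with any SQL client. Loading a whole prefix per COPY lets Redshift read the files in parallel.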
28. Dealing With Messy Data
• Different File Formats
• Device vs Apache vs CDN
• Cleanup with EMR Job
• Output to Clean Bucket
• Load into Redshift
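As a rough illustration of the cleanup step, the function below normalizes two of the formats into one row shape. It is a sketch under stated assumptions: the device log is assumed to be one JSON object per line with `ts` and `event` fields, the Apache log is assumed to be in common log format, and the output row `(date, source, event)` is a hypothetical target schema. CDN logs are left out. In an EMR job, logic like this would live in the mapper; here it is a plain function.

```python
import json
import re
from datetime import datetime

# Apache common log format: host ident user [timestamp] "request" status bytes
APACHE_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def normalize(line: str):
    """Return (date, source, event) for a raw log line, or None if unrecognized."""
    line = line.strip()
    if line.startswith("{"):
        # Assumed device format: {"ts": "<ISO 8601>", "event": "<name>"}
        rec = json.loads(line)
        ts = datetime.fromisoformat(rec["ts"])
        return (ts.strftime("%Y-%m-%d"), "device", rec["event"])
    match = APACHE_RE.match(line)
    if match:
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        # The request path stands in for the event name.
        return (ts.strftime("%Y-%m-%d"), "apache", match.group("path"))
    return None  # Unknown format: drop the line or route it to an error bucket.


if __name__ == "__main__":
    print(normalize('{"ts": "2014-01-24T10:00:00", "event": "login"}'))
    print(normalize('127.0.0.1 - - [24/Jan/2014:10:00:00 +0000] "GET /login HTTP/1.1" 200 512'))
```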
29. Direct From DynamoDB
• Integrate Game DB
• Load Directly into Redshift
• Redshift does Intelligent Merge
• Tracks Hash Keys, Columns
30. Direct From DynamoDB
• Integrate Game DB
• Load Directly into Redshift
• Redshift does Intelligent Merge
• Tracks Hash Keys, Columns
• Or Stream into EMR
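Loading directly from DynamoDB uses the same COPY command with a `dynamodb://` source. COPY matches DynamoDB attribute names to Redshift column names, and `READRATIO` caps the share of the table's provisioned read throughput the load may consume. The helper below only builds the statement. It is a sketch: the table names and role ARN are hypothetical.

```python
def build_dynamodb_copy(redshift_table: str, dynamodb_table: str,
                        iam_role: str, read_ratio: int = 50) -> str:
    """Build a Redshift COPY statement that reads from a DynamoDB table.

    read_ratio is the percentage of the DynamoDB table's provisioned read
    throughput the load may use; Redshift accepts integers from 1 to 200.
    """
    if not 1 <= read_ratio <= 200:
        raise ValueError("read_ratio must be between 1 and 200")
    return (
        f"COPY {redshift_table} "
        f"FROM 'dynamodb://{dynamodb_table}' "
        f"IAM_ROLE '{iam_role}' "
        f"READRATIO {read_ratio};"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(build_dynamodb_copy(
        "players", "GamePlayers",
        "arn:aws:iam::123456789012:role/RedshiftCopyRole",
    ))
```

A low ratio keeps the load from starving the live game of read capacity, at the cost of a slower copy.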
32. Back To Basics
2014-01-24,nateware,e4df,login
2014-01-24,nateware,e4df,gamestart
2014-01-24,nateware,e4df,gameend
2014-01-25,nateware,a88c,login
2014-01-25,nateware,a88c,friendlist
2014-01-25,nateware,a88c,gamestart
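A flat event log like this is easy to work with. The sketch below groups the events by session using only the standard library. The column meanings (date, user, session ID, event) are an assumption read off the sample rows; the deck does not name them.

```python
import csv
import io
from collections import defaultdict

SAMPLE = """\
2014-01-24,nateware,e4df,login
2014-01-24,nateware,e4df,gamestart
2014-01-24,nateware,e4df,gameend
2014-01-25,nateware,a88c,login
2014-01-25,nateware,a88c,friendlist
2014-01-25,nateware,a88c,gamestart
"""


def events_by_session(text: str) -> dict:
    """Group event names by (date, user, session ID), preserving log order."""
    sessions = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 4:
            continue  # Skip blank or malformed lines.
        day, user, session_id, event = row
        sessions[(day, user, session_id)].append(event)
    return dict(sessions)


if __name__ == "__main__":
    for key, events in events_by_session(SAMPLE).items():
        print(key, events)
```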
33. Back To Basics [Dubstep Remix]
• Always Batch Due to S3
34. Need Data Faster!
• Stream Data With Kinesis
• Multiple Writers and Readers
• Still Output to Redshift
35. Lots of Ins and Outs
• Stream Data With Kinesis
• Multiple Writers and Readers
• Still Output to Redshift
• Stream to Spark on EMR
• Storm via Kinesis Spout
• Custom EC2 Workers
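On the writer side, streaming an event amounts to one `put_record` call per event. The sketch below takes the client as a parameter, so it runs without AWS credentials. In real use the client would be `boto3.client("kinesis")`, whose `put_record` takes the same `StreamName`, `Data`, and `PartitionKey` arguments. The stream name, the event fields, and the `RecordingClient` stand-in are hypothetical.

```python
import json


def put_event(client, stream_name: str, event: dict):
    """Send one event to a Kinesis stream as UTF-8 encoded JSON.

    Partitioning by user keeps each user's events in order on one shard.
    """
    return client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=event["user"],
    )


class RecordingClient:
    """Hypothetical stand-in for the Kinesis client that records calls locally."""

    def __init__(self):
        self.records = []

    def put_record(self, StreamName, Data, PartitionKey):
        self.records.append((StreamName, Data, PartitionKey))
        return {"ShardId": "shardId-000000000000"}


if __name__ == "__main__":
    client = RecordingClient()  # Swap for boto3.client("kinesis") in real use.
    put_event(client, "game-events",
              {"date": "2014-01-24", "user": "nateware", "session": "e4df", "event": "login"})
    print(client.records)
```

Because readers consume the stream independently, the same records can feed the Redshift loader, a Spark or Storm job, and custom EC2 workers at once.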
38. Clash of Clans
Kinesis: Real-time data stream of in-game activity
Multiple Kinesis applications: Dashboards, analytics and storage
Redshift: Business intelligence reporting and interactive queries
S3 and Glacier: Data storage and long term archival
[Diagram labels: In-game activity; Amazon Kinesis; Kinesis-enabled apps on EC2 (real-time clickstream processing app; in-game engagement trends dashboard); Clickstream archive; S3 aggregate statistics; Redshift; Business-intelligence user]