Loading and Analyzing Behavioral
Data in Amazon Redshift
Presented by Segment, AWS & XO Group Inc.
March 3, 2015
Today's Speakers
Jon Hawkins, Dir. of Analytics & SEO, XO Group Inc.
Scott Ward, Solutions Architect, AWS
Peter Reinhardt, CEO, Segment
Today's agenda
• Amazon Redshift Architecture
• Segment SQL Quick Demo
• XO Group's Story
Amazon Redshift
Fast, simple, petabyte-scale data warehousing for less than $1,000/TB/Year
Amazon Redshift is Easy to Use
• Provision in minutes
• Monitor query performance
• Point-and-click resize
• Built-in security
• Automatic backups
Amazon Redshift Architecture
• Leader Node
  - SQL endpoint
  - Stores metadata
  - Coordinates query execution
• Compute Nodes
  - Local, columnar storage
  - Execute queries in parallel
  - Load, backup, restore via Amazon S3
  - Parallel load from Amazon DynamoDB, Amazon EMR, Amazon S3, HDFS/SSH
• Two hardware platforms
  - Optimized for data processing
  - DW1: HDD; scale from 2 TB to 1.6 PB
  - DW2: SSD; scale from 160 GB to 256 TB
[Diagram: SQL clients/BI tools connect to the leader node over JDBC/ODBC; the leader node and three compute nodes (each shown with 128 GB RAM, 16 TB disk, 16 cores) are linked by 10 GigE (HPC); ingestion, backup, and restore go through Amazon S3.]
Amazon Redshift Dramatically Reduces I/O
• Column storage
• Data compression
• Zone maps
• Direct-attached storage

• With row storage you do unnecessary I/O
• To get the total amount, you have to read everything

ID   Age  State  Amount
123  20   CA     500
345  25   WA     250
678  40   FL     125
957  37   WA     375
Amazon Redshift Dramatically Reduces I/O
Column storage
• With column storage, you only read the data you need

ID   Age  State  Amount
123  20   CA     500
345  25   WA     250
678  40   FL     125
957  37   WA     375
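The row-versus-column point above can be sketched in a few lines of Python. This is only an in-memory illustration using the four rows from the slide; the function names and the "values read" counter are assumptions made for the example and say nothing about how Redshift lays out blocks on disk.

```python
# The four rows shown on the slide: (ID, Age, State, Amount).
rows = [
    (123, 20, "CA", 500),
    (345, 25, "WA", 250),
    (678, 40, "FL", 125),
    (957, 37, "WA", 375),
]


def total_amount_row_store(rows):
    """Row layout: summing one column still touches every field of every row."""
    values_read = 0
    total = 0
    for row in rows:
        values_read += len(row)  # the whole row is read together
        total += row[3]          # Amount is the fourth field
    return total, values_read


# Column layout: one list per column, so a query can read a single list.
columns = {
    "id": [r[0] for r in rows],
    "age": [r[1] for r in rows],
    "state": [r[2] for r in rows],
    "amount": [r[3] for r in rows],
}


def total_amount_column_store(columns):
    """Column layout: only the Amount column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


print(total_amount_row_store(rows))        # (1250, 16)
print(total_amount_column_store(columns))  # (1250, 4)
```

Both layouts return the same total, but the column layout reads 4 values instead of 16, and the gap widens as the table gains columns.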
Amazon Redshift Dramatically Reduces I/O
Data compression
• COPY compresses automatically
• You can analyze and override
• More performance, less cost

analyze compression listing;

 Table   | Column         | Encoding
---------+----------------+----------
 listing | listid         | delta
 listing | sellerid       | delta32k
 listing | eventid        | delta32k
 listing | dateid         | bytedict
 listing | numtickets     | bytedict
 listing | priceperticket | delta32k
 listing | totalprice     | mostly32
 listing | listtime       | raw
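Several columns in the output above are assigned a delta encoding. The general idea behind delta encoding is to store the difference between consecutive values rather than the values themselves. A minimal Python sketch follows; the sample IDs are made up, and the code is not a description of Redshift's actual storage format.

```python
def delta_encode(values):
    """Keep the first value, then store each later value as a difference."""
    if not values:
        return []
    encoded = [values[0]]
    for previous, current in zip(values, values[1:]):
        encoded.append(current - previous)
    return encoded


def delta_decode(encoded):
    """Rebuild the original values by accumulating the differences."""
    values = []
    running = 0
    for delta in encoded:
        running += delta
        values.append(running)
    return values


# Hypothetical, slowly increasing IDs (not taken from the slides).
list_ids = [1001, 1002, 1003, 1005, 1008]

encoded = delta_encode(list_ids)
print(encoded)                # [1001, 1, 1, 2, 3]
print(delta_decode(encoded))  # [1001, 1002, 1003, 1005, 1008]
```

After the first entry, every stored number is small, which is why this kind of encoding suits columns whose neighbouring values are close together.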
Amazon Redshift Dramatically Reduces I/O
Zone maps
• Track the minimum and maximum value for each block
• Skip over blocks that don't contain relevant data

Block values                               Min   Max
10 | 13 | 14 | 26 | … | 100 | 245 | 324    10    324
375 | 393 | 417 | … | 512 | 549 | 623      375   623
637 | 712 | 809 | … | 834 | 921 | 959      637   959
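Block skipping with a zone map can be sketched as follows. The blocks hold only the values visible on the slide (the elided middle values are left out), and `scan_range` is a hypothetical helper written for this illustration, not a Redshift API.

```python
# Three sorted blocks, using only the values visible on the slide.
blocks = [
    [10, 13, 14, 26, 100, 245, 324],
    [375, 393, 417, 512, 549, 623],
    [637, 712, 809, 834, 921, 959],
]

# The zone map keeps just the minimum and maximum of each block.
zone_map = [(min(block), max(block)) for block in blocks]


def scan_range(low, high):
    """Return values in [low, high] and the number of blocks actually read."""
    matches = []
    blocks_read = 0
    for (block_min, block_max), block in zip(zone_map, blocks):
        if block_max < low or block_min > high:
            continue  # the zone map shows no value here can match
        blocks_read += 1
        matches.extend(v for v in block if low <= v <= high)
    return matches, blocks_read


print(zone_map)              # [(10, 324), (375, 623), (637, 959)]
print(scan_range(400, 600))  # ([417, 512, 549], 1)
```

For the range 400 to 600, only the middle block is read; the other two are ruled out by their minimum and maximum alone.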
Amazon Redshift Dramatically Reduces I/O
Direct-attached storage
• >2 GB/s scan rate
• Optimized for data processing
• High disk density

DW.HS1.8XL: 128 GB RAM, 16 cores, 16 TB disk
DW.HS1.XL: 16 GB RAM, 2 cores, 2 TB disk
Amazon Redshift Parallelizes and Distributes Everything
• Query
• Load
• Backup/Restore
• Resize
Amazon Redshift Parallelizes and Distributes Everything
Load
• Parallel load from Amazon DynamoDB, Amazon EMR, Amazon S3, HDFS/SSH
• Kinesis integration
• Data automatically distributed and sorted according to DDL
• Scales linearly with number of nodes
[Diagram: Amazon S3/DynamoDB loading in parallel into three compute nodes (each 128 GB RAM, 16 TB disk, 16 cores).]
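The idea that rows are spread across nodes by a key, so that each node can work on its own share, can be sketched as below. The modulo rule is a stand-in chosen for readability; Redshift's real hashing and slice assignment are internal and not shown in the slides. The rows and the `distribute` and `parallel_total` helpers are hypothetical.

```python
from collections import defaultdict


def distribute(rows, key_index, node_count):
    """Assign each row to a node from its distribution key (simple modulo)."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[row[key_index] % node_count].append(row)
    return dict(nodes)


def parallel_total(nodes, amount_index):
    """Each node sums its own rows; the partial sums are then combined."""
    partials = {
        node: sum(row[amount_index] for row in nodes[node])
        for node in sorted(nodes)
    }
    return partials, sum(partials.values())


# Hypothetical (user_id, amount) rows, not taken from the slides.
rows = [(1, 500), (2, 250), (3, 125), (4, 375), (5, 100), (6, 50)]

nodes = distribute(rows, key_index=0, node_count=3)
partials, total = parallel_total(nodes, amount_index=1)
print(partials)  # {0: 175, 1: 875, 2: 350}
print(total)     # 1400
```

Because each node holds a disjoint share of the rows, adding nodes shrinks the share each one has to scan, which is the intuition behind the linear-scaling bullet.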
Amazon Redshift Parallelizes and Distributes Everything
Backup/Restore
• Backups to Amazon S3 are automatic, continuous, and incremental
• Back up your cluster to a second region
• Configurable system snapshot retention period; take user snapshots on demand
• Streaming restores enable you to resume querying faster
[Diagram: three compute nodes (each 128 GB RAM, 16 TB disk, 16 cores) and Amazon S3.]
Amazon Redshift Parallelizes and Distributes Everything
Resize
• Add/remove nodes or change node type while remaining online
• Provision a new cluster and copy data in parallel from node to node
• Only charged for source cluster until SQL endpoint has automatically been switched over via DNS
[Diagram: SQL clients/BI tools and two clusters, one with a leader node and three compute nodes and one with a leader node and four compute nodes (each node 128 GB RAM, 48 TB disk, 16 cores).]
Amazon Redshift Has Security Built In
• SSL to secure data in transit
• Encryption to secure data at rest
  - AES-256; hardware accelerated
  - All blocks on disks and in Amazon S3 encrypted
  - HSM/CloudHSM
• No direct access to compute nodes
• Amazon VPC support
[Diagram: SQL clients/BI tools reach the leader node over JDBC/ODBC; the leader node and three compute nodes (each 128 GB RAM, 16 TB disk, 16 cores) are linked by 10 GigE (HPC); ingestion, backup, and restore use Amazon S3 / Amazon DynamoDB. Other labels: Customer VPC, Internal, Security Group.]
Segment SQL
A single hub to store, transform, and send your customer data.
All these tools run on three types of data.
IDENTIFY: Who are your users?
TRACK: What are they doing?
PAGE: Where are they?
// Record an "Added Product" event along with its properties.
analytics.track('Added Product', {
  name: 'Monopoly: 3rd Edition',
  price: 18.99,
  quantity: 3
});
Replace all your tags and tracking with a single API.
A single hub to store, transform, and send your customer data.
That same data can flow effortlessly (for you) into a petabyte-scale data warehouse.
Segment SQL
All of your data in a hosted, schematized Amazon Redshift instance.
Segment SQL
Query user, page, and event tables.
Segment SQL Partners
XO Group Inc.
Our Analytics Journey
XO Group Inc. + Segment
Individual product teams wanted isolated access to their own analytics.
Segment + Mixpanel + Customer.io + Optimizely + Uservoice
XO Group Inc. + Segment
Still needed a solution to connect Segment data from multiple products and platforms into a single view.
Segment SQL + Mode Analytics
3 Case Studies
• Version support
• Sharing functionality
• Membership analysis
Membership Analysis
Existing database didn't give any context into how a user became a member in the app.
Couldn't analyze which features were driving membership upgrades, marketing attribution, or cross-product pollination.
SELECT userID,
       source,
       reason
FROM xogrp.signed_up;
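One way a query like the one above might be extended to ask which reasons drive sign-ups is sketched below with Python's built-in `sqlite3` module standing in for Redshift. The table and column names come from the slide; the rows, the `source` and `reason` values, and the GROUP BY aggregation are all made up for illustration and do not reflect XO Group's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so the schema-qualified name
# xogrp.signed_up from the slide can be used unchanged.
conn.execute("ATTACH DATABASE ':memory:' AS xogrp")
conn.execute(
    "CREATE TABLE xogrp.signed_up (userID TEXT, source TEXT, reason TEXT)"
)

# Hypothetical sample rows; none of these values come from the webinar.
conn.executemany(
    "INSERT INTO xogrp.signed_up VALUES (?, ?, ?)",
    [
        ("u1", "app_a", "save_content"),
        ("u2", "app_a", "save_content"),
        ("u3", "app_b", "share"),
        ("u4", "app_a", "save_content"),
    ],
)

# Count sign-ups per reason, most common first.
signups_by_reason = conn.execute(
    """
    SELECT reason, COUNT(*) AS signups
    FROM xogrp.signed_up
    GROUP BY reason
    ORDER BY signups DESC, reason
    """
).fetchall()

print(signups_by_reason)  # [('save_content', 3), ('share', 1)]
```

The same GROUP BY pattern should carry over to Redshift, since it is standard SQL; only the connection setup here is specific to SQLite.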
Membership Analysis
[Diagram: isolated views for App A and App B.]
Membership Analysis
A desire to save our content drives our membership.
Experiencing certain apps before downloading a second significantly improved LTV.
We made it easier to save and organize our content.
We optimized promotion strategy to advertise the stickiest apps first to current members.
Share Functionality
Facebook & Twitter drove the most inbound traffic.
So, we guessed they'd also be the most popular sharing options.
SELECT title,
       shareOption,
       quantity
FROM xogrp.shared_article;
Share Functionality
More considerate "Share" options
SMS address from desktop or mobile web browser.
One-click email yourself the details of a venue.
Version Support
Couldn't answer a very simple question about which rollup browser versions, devices, and operating systems customers were using.
Caused a lot of debate over which versions to support and which test devices to buy.
Version Support
Chrome 40 +
Chrome 40 +
Chrome 40 =
No, just no!
AWS Segment XO Group Joint webinar
Version Support
Now we can easily see what browser versions and OSs are most popular.
Product and engineering teams can make data-driven decisions on which versions to support.
Takeaways
• Set up your tracking and analytics database to answer any question.
• Save valuable engineering time with prebuilt software.
• Listen to each team's needs and create the analytics view they want.
Thank you!
Schedule a demo today.
segment.com/contact/sql