THE SIDEKICK PATTERN:
USING SMALL DATA TO MULTIPLY
THE VALUE OF BIG DATA
@AbeGong
Data Scientist, Jawbone
Strata - February 2014

Wednesday, February 12, 14
DATA SIDEKICKS

EX: HIEROGLYPH
TRANSLATION

EX: CAMPAIGN TARGETING

EX: SLEEP CONTEXT

[DATA ART EXAMPLE]
SUB-TITLE

EXAMPLES, PLEASE:
WHICH DATA STREAMS GET BIG?
(...AND BESIDES SIZE, WHAT ELSE DO THEY HAVE IN COMMON?)

BIG, RICH, MESSY

BIG, RICH, MESSY
CAREFULLY CURATED

TRANSMUTATION!

EX: HUFFPO MODERATION

WHEN SHOULD I USE THE
SIDEKICK PATTERN?

To separate munging and cleaning from scaling.
To bootstrap new data products.
To leverage variety against volume.

EX: SLEEP RECOVERY

LEVELS OF ABSTRACTION

QUESTIONS? COMMENTS?

@AbeGong
Data Scientist, Jawbone
Strata - February 2014

Big                Small
Rich               Focused
Messy              Curated

Sensory            Abstract
User experience    Business logic
External-facing    Internal-facing

Qualitative        Quantitative
Story-making       Science-making

TRANSMUTATION EXAMPLES
Example                                   Property
Rosetta stone                             Synonyms/Comparability
Bridge cases in IRT scaling models        Relative ranking
Campaign targeting                        Demographic categories
Sentiment analysis                        Categories
Sleep context                             Context
Pretty much all supervised learning       Categories/Scales
Instrumental variables                    Causality
...
HuffPo moderation                         Credibility
Sleep recovery                            Clean examples
Economic mobility                         Continuity
Crowdflower gold                          Credibility

RECOMMENDED READING
Paco Nathan: http://www.slideshare.net/pacoid/using-cascalog-to-build-an-appbased-on-city-of-palo-alto-open-data
Jay Kreps: http://engineering.linkedin.com/distributed-systems/log-what-everysoftware-engineer-should-know-about-real-time-datas-unifying
Joseph Turian: http://files.meetup.com/1542972/20120202-more-data-samemodels-STUDY-SLIDES.pdf
Pete Skomoroch: http://www.slideshare.net/pskomoroch/strataendorsements-16939466
Me: http://blog.abegong.com/2014/02/wanted-good-examples-of-datasidekicks.html
