Sprez.za.tura
Roelof van Zwol
Netflix
Sprez.za.tura
It is an art which does not seem to be an art. One
must avoid affectation and practice in all things. A
certain sprezzatura, disdain or carelessness, so as
to conceal art, and make whatever is done or said
appear to be without effort and almost without any
thought about it ... obvious effort is the antithesis
of grace.
Baldassare Castiglione (1478-1529)
Is machine learning an art?
When done well, recommendations are perceived as a natural extension of the service
Spot the Algorithms!
Introducing new content
- Who will watch the show?
- How many members will watch the show?
- Which canvas to use?
- When to promote?
Overview
- Correlation ≠ Causation
- Online learning
- Incrementality
Correlation ≠ Causation
Should you stop buying margarine,
to save your marriage?
Correlation(X, Y) is high. Does it mean X causes Y? Or that Y causes X?
In general, neither!
Most common reason: an unobserved confounder.
[Diagram: an unobserved confounder C drives both observed variables X and Y (omitted variable bias).]
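The confounder argument can be checked with a small simulation. This is a minimal sketch under assumed conditions, not something taken from the slides: C, X and Y are Gaussian, the noise scale 0.3 and the sample size are arbitrary, and `pearson` and `simulate` are illustrative names. X and Y both depend only on C, so they correlate strongly, yet setting X by randomization leaves Y unchanged.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Return (observational, interventional) correlation between X and Y."""
    rng = random.Random(seed)
    c = [rng.gauss(0, 1) for _ in range(n)]        # unobserved confounder C
    x = [ci + 0.3 * rng.gauss(0, 1) for ci in c]   # X depends only on C
    y = [ci + 0.3 * rng.gauss(0, 1) for ci in c]   # Y depends only on C, not on X
    # Intervention: X is assigned at random, independently of C.
    # Y is unaffected because X never appears in its definition.
    x_do = [rng.gauss(0, 1) for _ in range(n)]
    return pearson(x, y), pearson(x_do, y)


observational_corr, interventional_corr = simulate()
print(f"observational corr(X, Y):  {observational_corr:.2f}")
print(f"interventional corr(X, Y): {interventional_corr:.2f}")
```

The observational correlation is high while the interventional one is close to zero, which is the pattern an unobserved confounder produces.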
Advertising
- High probability of conversion the day before weekly groceries, irrespective of the adverts shown.
- The effect of Pampers ads is null in this case.
Traditional (correlational) machine learning will fail and waste $ on useless ads.
[Diagram: timeline W1 to W5 showing the probability of buying and where advertising money ($) is spent.]
In practice, Cost-Per-Incremental-Acquisition can be > 100x Cost-Per-Acquisition!
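The gap between the two cost metrics can be made concrete with a little arithmetic. The sketch below assumes a randomized holdout group that saw no ads. The function names and all numbers are hypothetical and chosen only to reproduce a 100x ratio; they are not figures from the talk.

```python
def cost_per_acquisition(spend, exposed_conversions):
    """Spend divided by all conversions among users who saw the ad."""
    return spend / exposed_conversions


def cost_per_incremental_acquisition(spend, exposed_conversions, exposed_users,
                                     holdout_conversions, holdout_users):
    """Spend divided by conversions that would NOT have happened without the ad.

    The baseline is the holdout conversion count scaled to the exposed group size.
    """
    baseline = holdout_conversions * exposed_users / holdout_users
    incremental = exposed_conversions - baseline
    if incremental <= 0:
        return float("inf")  # the ads produced no measurable lift
    return spend / incremental


# Hypothetical campaign: 500 of 10,000 exposed users convert,
# but 495 of 10,000 holdout users convert without seeing any ad.
cpa = cost_per_acquisition(1000.0, 500)
cpia = cost_per_incremental_acquisition(1000.0, 500, 10_000, 495, 10_000)
print(cpa, cpia, cpia / cpa)
```

With these assumed numbers only 5 of the 500 conversions are incremental, so the cost per incremental acquisition is 100 times the naive cost per acquisition.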
Netflix Promotions
The Netflix homepage is expensive real estate (opportunity cost):
- so many titles to promote
- so few opportunities to win a moment of truth
Traditional (correlational) ML systems:
- take action if the probability of a positive reward is high, irrespective of the reward base rate
- do not model the incremental effect of taking the action
[Diagram: days D1 to D5, with a promote decision per day.]
Surely we can do better!
CASE STUDY: Content promotion through Billboard
Online Learning
Background and notation
- Title t belongs to the pool of candidate titles T, eligible for promotion in Billboard when member m visits the homepage.
- Let x_{m,t} be a context vector for member m and title t.
- Let y_{m,t} be the label indicating a play of title t by member m from the homepage, after having seen a billboard.
What (sequence of) actions will maximize the cumulative reward?
- Reinforcement Learning
- Multi-Armed Bandits
  - Acknowledge the need for balancing exploration and exploitation.
  - Allow sub-optimal actions, to collect unbiased treatment effects and learn the probability distributions over the space of possible actions.
ε-greedy policy
- Explore: collect experimental data
  - With probability ε, select a title at random for promotion in Billboard
  - Log the context (x_{m,t})
  - Causal observations of play feedback (y_{m,t})
- Exploit: train on the experimental data
  - With probability (1-ε), select the optimal title for promotion
- Alternatives: UCB, Thompson Sampling
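The explore/exploit split above can be sketched in a few lines. This is an illustrative sketch rather than the production policy: `epsilon_greedy_select`, the title names and the predicted play probabilities are hypothetical, and the scoring function stands in for whatever exploit model is trained on the exploration data.

```python
import random


def epsilon_greedy_select(candidates, predicted_play_prob, epsilon, rng):
    """Pick a title for the billboard; return (title, explored).

    With probability epsilon a title is drawn uniformly at random (explore);
    otherwise the title with the highest predicted score is chosen (exploit).
    """
    if rng.random() < epsilon:
        return rng.choice(candidates), True
    return max(candidates, key=predicted_play_prob), False


# Hypothetical predicted play probabilities for one member.
scores = {"title_a": 0.10, "title_b": 0.30, "title_c": 0.05}
rng = random.Random(0)

picks = [epsilon_greedy_select(list(scores), scores.get, 0.1, rng)
         for _ in range(10_000)]
explore_fraction = sum(explored for _, explored in picks) / len(picks)
exploit_titles = {title for title, explored in picks if not explored}
print(explore_fraction, exploit_titles)
```

Only the impressions flagged as explored are uniformly random, so those are the records to log as experimental data for training and for offline replay.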
Greedy exploit model
- Learn a model per title to predict the likelihood of play:
  P(y_{m,t} | x_{m,t}, T) = σ( f(x_{m,t}, Θ) )
- Pick the winning title:
  t* = argmax_{t ∈ T} P(y_{m,t} | x_{m,t}, T)
- Various models can be used to predict the probability of play, such as logistic regression, GBDT, and neural networks.
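As one possible instance of the greedy exploit model, the sketch below uses a logistic regression per title, with a sigmoid applied to a linear score, and picks the title with the highest predicted probability of play. The weights and context vectors are made up for illustration and are assumed to be already learned. `play_probability` and `pick_winner` are hypothetical names.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def play_probability(weights, context):
    """Predicted probability of play for one title, given its context vector."""
    return sigmoid(sum(w * x for w, x in zip(weights, context)))


def pick_winner(models, contexts):
    """Return the title with the highest predicted probability of play.

    models:   title -> weight vector of that title's model
    contexts: title -> context vector for the current member and that title
    """
    return max(contexts, key=lambda t: play_probability(models[t], contexts[t]))


# Hypothetical per-title weights and member/title context vectors.
models = {"title_a": [0.5, -1.0], "title_b": [1.5, 0.2]}
contexts = {"title_a": [1.0, 0.5], "title_b": [1.0, 0.5]}

winner = pick_winner(models, contexts)
print(winner, play_probability(models[winner], contexts[winner]))
```

A GBDT or a neural network could replace `play_probability` without changing the argmax step.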
Considerations for the ε-greedy policy
- Explore
  - Bandwidth allocation and cost of exploration
  - New vs existing titles
- Exploit
  - Model synchronisation
  - Title availability (group censoring)
  - Observation window
  - Frequency of model update
  - Incremental updates vs batch training
  - Stationarity of title popularities
Online learning works great for title cold-start scenarios, but...
MABs are greedy, not lift-based!
Incrementality
Incrementality-based policy
- Goal: select the title for promotion that benefits most from being shown in Billboard
  - A member can play the title from other sections on the homepage or from search
  - Popular titles are likely to appear on the homepage anyway: Trending Now
  - Better utilize the most expensive real estate on the homepage!
- Define the policy to be incremental with respect to the probability of play
t* = argmax_{t ∈ T} [ P(y_{m,t} | x_{m,t}, T, b=1) - P(y_{m,t} | x_{m,t}, T, b=0) ]
where b is an indicator for the treatment: the title is shown in Billboard (b=1) versus not shown in Billboard (b=0).
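The difference between the incrementality-based policy and a greedy one shows up in a small example. The sketch assumes that two probabilities per title are already available, one with the billboard (b=1) and one without (b=0). How they are estimated is left open, and the titles and numbers are hypothetical.

```python
def pick_greedy(predictions):
    """Title with the highest probability of play when shown in the billboard."""
    return max(predictions, key=lambda t: predictions[t]["b1"])


def pick_incremental(predictions):
    """Title with the largest lift: P(play | b=1) - P(play | b=0)."""
    return max(predictions, key=lambda t: predictions[t]["b1"] - predictions[t]["b0"])


# Hypothetical predictions for one member.
predictions = {
    "title_a": {"b0": 0.02, "b1": 0.10},  # low baseline, large lift
    "title_c": {"b0": 0.30, "b1": 0.32},  # high baseline, small lift
}

greedy_choice = pick_greedy(predictions)
incremental_choice = pick_incremental(predictions)
print(greedy_choice, incremental_choice)
```

The greedy policy promotes the title that would likely be played anyway, while the incrementality-based policy promotes the title whose probability of play changes the most.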
Offline evaluation: Replay [Li et al, 2010]
- Relies upon uniform exploration data.
- Each record in the uniform exploration log has the form {context, title k shown, reward, list of candidates}.
- For every record:
  - Evaluate the trained model for all the titles in the candidate pool.
  - Pick the winning title k'.
  - Keep the record in the history if k' = k (the title impressed in the logged data), else discard it.
- Compute the metrics from the history.
[Diagram: the uniform exploration data (records of context, title, reward) is split into train data and evaluation data, which gives an unbiased evaluation. For each evaluation record, the context x is revealed to the trained model, which picks a winner title k'. The logged reward is used only if k' = k.]
Take Rate = # Plays / # Matches
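The replay procedure and the take-rate metric can be sketched as follows. The record layout mirrors the log format described in the slides, but the field names, the toy log and the `always_a` policy are assumptions made for illustration.

```python
def replay_take_rate(exploration_log, policy):
    """Replay evaluation on uniformly collected exploration data.

    A record contributes only when the policy picks the title that was
    actually shown. Returns (take_rate, matches).
    """
    matches = 0
    plays = 0
    for record in exploration_log:
        chosen = policy(record["context"], record["candidates"])
        if chosen == record["shown"]:      # keep the record
            matches += 1
            plays += record["reward"]      # 1 if the member played, else 0
        # otherwise the record is discarded
    take_rate = plays / matches if matches else float("nan")
    return take_rate, matches


# Toy exploration log with hypothetical fields.
log = [
    {"context": [1.0], "candidates": ["A", "B"], "shown": "A", "reward": 1},
    {"context": [0.5], "candidates": ["A", "B"], "shown": "B", "reward": 0},
    {"context": [0.2], "candidates": ["A", "B"], "shown": "A", "reward": 0},
    {"context": [0.9], "candidates": ["A", "B"], "shown": "B", "reward": 1},
]


def always_a(context, candidates):
    """Trivial policy used only to exercise the evaluator."""
    return "A"


take_rate, matches = replay_take_rate(log, always_a)
print(take_rate, matches)
```

Because the logged titles were chosen uniformly at random, the matched records form an unbiased sample of what the policy would have experienced online.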
Offline replay
Greedy exploit has a higher replay take rate than the incrementality-based model.
The incrementality-based policy sacrifices replay take rate by selecting a lesser-known title that would benefit from being shown on the Billboard.
[Figure: lift in replay for the various algorithms, compared to the random baseline.]
Which titles benefit from Billboard promotion?
Title A has a low baseline probability of play; however, when the billboard is shown, the probability of play increases substantially!
Title C has a higher baseline probability and may not benefit as much from being shown on the Billboard.
[Figure: scatter plot of incremental vs baseline probability of play for various members.]
Online observations
- Online take rates follow the offline patterns.
- Our implementation of incrementality is able to shift engagement within the candidate pool.
In Summary
Correlation, causation, and incrementality
- Most ML algorithms are correlational, i.e. based on observational data.
- In this context, the explore-exploit models are causal: we train models on experimental data, where we are in control of the randomization.
- Incrementality can be defined as the causal lift in a metric of interest, for instance the change in probability of play for a title in a session when a billboard is shown for that title to a member.
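When the billboard treatment is randomized, the causal lift for a title can be estimated as a simple difference in play rates. The sketch below assumes session records with a hypothetical `billboard` flag (1 if the title was shown in the billboard) and a `played` flag. The data is invented for illustration.

```python
def estimate_lift(sessions):
    """Difference in play rate between sessions with and without the billboard.

    This is a causal estimate only if the billboard assignment was randomized.
    """
    shown = [s["played"] for s in sessions if s["billboard"] == 1]
    not_shown = [s["played"] for s in sessions if s["billboard"] == 0]
    if not shown or not not_shown:
        raise ValueError("need sessions in both treatment and control")
    return sum(shown) / len(shown) - sum(not_shown) / len(not_shown)


# Hypothetical randomized sessions for a single title.
sessions = (
    [{"billboard": 1, "played": p} for p in (1, 1, 0, 0)]
    + [{"billboard": 0, "played": p} for p in (1, 0, 0, 0)]
)

lift = estimate_lift(sessions)
print(lift)
```

Here the play rate is 0.50 with the billboard and 0.25 without, so the estimated lift is 0.25.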
Sprezzatura - Roelof van Zwol - May 2018
