IR & E
Personalized News Article Recommendation (Stream Data Based)
Monsoon 17, IIIT Hyderabad
Keywords
 Contextual Bandit
 Web Service
 Personalization
 Recommender Systems
 Exploration/Exploitation dilemma
Example of Learning through Exploration
Repeatedly:
1. A user comes to Yahoo! (with a history of previous visits, IP address, and data related to their Yahoo! account).
2. Yahoo! chooses information to present (from URLs, ads, news stories).
3. The user reacts to the presented information (clicks on something, clicks, comes back and clicks again, etc.).
Yahoo! wants to interactively choose content and use the observed feedback to improve future content choices.
Another Example: Clinical Decision Making
Repeatedly:
1. A patient comes to a doctor with symptoms, medical history, test results
2. The doctor chooses and suggests a treatment
3. The patient responds to it
The doctor wants a policy for choosing targeted treatments for individual patients.
Current Scenario
Which article to feature?
Challenges:
 A lot of new users and articles.
 Incorporation of content.
 Changing relevance of articles.
Goal:
"Quickly" identify relevant news stories on
personal level.
The Contextual Bandit Setting
For t = 1, ..., T:
1. The world produces some context $x_t \in X$.
2. The learner chooses an action $a_t \in \{1, \dots, K\}$.
3. The world reacts with reward $r_t(a_t) \in [0, 1]$.
Goal: Learn a good policy for choosing actions given context.
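The three-step protocol above can be sketched as a short simulation. This is only an illustrative toy, not the method from the slides: the world (`world_reward`), the number of actions `K`, and the epsilon-greedy learner are all assumptions introduced here to make the loop runnable. The key point it shows is that the learner only ever sees the reward of the action it actually chose.

```python
import random

K = 3  # number of actions; a toy value chosen for this sketch


def world_reward(x_t, a_t):
    # Hypothetical world: reward is 1 only when the action matches the context.
    return 1.0 if a_t == x_t else 0.0


class EpsilonGreedyLearner:
    """Keeps a running mean reward per (context, action) pair."""

    def __init__(self, n_actions, epsilon, rng):
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.rng = rng
        self.counts = {}
        self.means = {}

    def choose(self, x):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)  # explore
        # Exploit: best estimated action for this context (ties go to the lowest index).
        return max(range(self.n_actions), key=lambda a: self.means.get((x, a), 0.0))

    def update(self, x, a, r):
        n = self.counts.get((x, a), 0) + 1
        self.counts[(x, a)] = n
        m = self.means.get((x, a), 0.0)
        self.means[(x, a)] = m + (r - m) / n  # incremental mean


def run(T, seed=0):
    rng = random.Random(seed)
    learner = EpsilonGreedyLearner(K, epsilon=0.1, rng=rng)
    total = 0.0
    for _ in range(T):
        x_t = rng.randrange(K)        # 1. the world produces a context
        a_t = learner.choose(x_t)     # 2. the learner chooses an action
        r_t = world_reward(x_t, a_t)  # 3. only r_t(a_t) is revealed
        learner.update(x_t, a_t, r_t)
        total += r_t
    return total / T  # average reward per round
```

Calling `run(5000)` returns the average reward per round; in this toy world it should approach the value of the best context-dependent policy, minus the cost of the 10% random exploration.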
The Contextual Bandit Setting (Contd.)
What does learning mean?
Efficiently competing with a large reference class of possible policies $\Pi = \{\pi : X \to \{1, \dots, K\}\}$.
Some Remarks
This is not a supervised learning problem.
 We don't know the reward of actions not taken, so the loss function is unknown even at training time.
 Exploration is needed to succeed.
 Still, it is simpler than reinforcement learning: we know which action is responsible for each reward.
Some Remarks (Contd.)
This is not a bandit problem.
 In the bandit setting, there is no x, and the goal is to compete with the set of constant actions.
 Too weak in practice.
 Generalization across x is required to succeed.
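A tiny worked example may help show why competing only with constant actions is too weak. The two user types, the reward table, and the traffic mix below are made-up numbers for illustration only.

```python
# Two user types (contexts) with opposite preferences over two articles.
# reward[x][a] is the (deterministic, hypothetical) reward of action a in context x.
reward = {"sports_fan": [1.0, 0.0], "finance_fan": [0.0, 1.0]}
contexts = ["sports_fan", "finance_fan"] * 50  # equal traffic from both types


def average_reward(policy):
    """Average reward of a policy, i.e. a function from context to action."""
    return sum(reward[x][policy(x)] for x in contexts) / len(contexts)


# Context-free bandit: the best we can do is the best single (constant) action.
best_constant = max(average_reward(lambda x, a=a: a) for a in range(2))

# Contextual policy: pick the article that matches the user type.
contextual = average_reward(lambda x: 0 if x == "sports_fan" else 1)
```

Here `best_constant` is 0.5 while `contextual` is 1.0: no constant action can do better than serving half the users well, which is why generalization across x is needed.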
Mapping to our current problem
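For each time t = 1, 2, 3, ..., T, the news page is loaded:
1. Arms or actions are the articles that can be shown to the user. The context (environment) is the available user and article information.
2. If article a is clicked, $r_{t,a} = 1$; otherwise $r_{t,a} = 0$.
3. Improve the article-selection strategy using the observed feedback.
Goal: Maximize the expected click-through rate, i.e., the expected number of clicks per page load, $\frac{1}{T}\,E\left[\sum_{t=1}^{T} r_{t,a_t}\right]$.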
Balancing Exploration and Exploitation
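LinUCB (Disjoint Linear Model)
Assumption: The expected reward for action a is a linear function of the features of the context, i.e.:
$E[r_{t,a} \mid x_{t,a}] = x_{t,a}^T \theta_a^*$
1. In each trial t, for each $a \in A_t$, estimate $\theta_a$ via regularized linear regression (ridge regression) using the feature matrix $D_a$.
2. Choose $a_t$ according to the standard LinUCB upper-confidence-bound rule:
$a_t = \arg\max_{a \in A_t} \left( x_{t,a}^T \hat{\theta}_a + \alpha \sqrt{x_{t,a}^T A_a^{-1} x_{t,a}} \right)$
where $\hat{\theta}_a$ is the estimate from step 1, $A_a = D_a^T D_a + I_d$, and $\alpha \ge 0$ controls how much weight the confidence width (exploration) receives.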
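The two steps above can be sketched in NumPy as follows. This is a minimal illustration of disjoint LinUCB under the usual ridge-regression bookkeeping ($A_a = D_a^T D_a + I_d$, $b_a = D_a^T c_a$, with $c_a$ the observed rewards for arm a); the class name, method names, and the fixed arm set are assumptions of this sketch, not something given in the slides.

```python
import numpy as np


class DisjointLinUCB:
    """Sketch of LinUCB with one independent linear model per arm."""

    def __init__(self, n_arms, d, alpha=1.0):
        self.alpha = alpha
        # A[a] = D_a^T D_a + I_d and b[a] = D_a^T c_a, maintained incrementally.
        self.A = [np.eye(d) for _ in range(n_arms)]
        self.b = [np.zeros(d) for _ in range(n_arms)]

    def theta(self, a):
        # Ridge-regression estimate of theta_a.
        return np.linalg.solve(self.A[a], self.b[a])

    def choose(self, X):
        """X has shape (n_arms, d): one feature vector x_{t,a} per arm."""
        scores = []
        for a, x in enumerate(X):
            A_inv = np.linalg.inv(self.A[a])
            mean = x @ (A_inv @ self.b[a])                  # x^T theta_hat_a
            width = self.alpha * np.sqrt(x @ A_inv @ x)     # confidence width
            scores.append(mean + width)
        return int(np.argmax(scores))

    def update(self, a, x, r):
        # Only the chosen arm's statistics change.
        self.A[a] += np.outer(x, x)
        self.b[a] += r * x
```

A typical round would call `a = bandit.choose(X)`, show article `a`, observe the click `r`, and then call `bandit.update(a, X[a], r)`. A larger `alpha` favours arms with little data; a smaller one favours arms with a high estimated reward.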
LinUCB (Hybrid Model)
Assumption: The expected reward for action a is the sum of two linear terms, one whose coefficients are shared by all actions and one that is specific to each action, i.e.:
$E[r_{t,a} \mid x_{t,a}] = z_{t,a}^T \beta^* + x_{t,a}^T \theta_a^*$
The algorithm works similarly to the disjoint LinUCB algorithm.
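To make the hybrid assumption concrete, the sketch below evaluates only the reward model itself, not the full hybrid LinUCB update. The coefficient and feature values are made up for illustration. It shows one practical consequence of the shared term: a brand-new article with no arm-specific estimate yet ($\theta_a = 0$) still gets a non-trivial prediction from $\beta$.

```python
import numpy as np


def hybrid_expected_reward(z, x, beta, theta_a):
    """E[r | x] = z^T beta + x^T theta_a under the hybrid linear model."""
    return float(z @ beta + x @ theta_a)


# Hypothetical numbers, for illustration only.
beta = np.array([0.2, 0.1])        # coefficients shared by all articles
z = np.array([1.0, 1.0])           # shared (user/article interaction) features
x = np.array([1.0, 0.0])           # arm-specific features

theta_old = np.array([0.4, 0.0])   # an article with some click history
theta_new = np.zeros(2)            # a new article: nothing learned yet

reward_old = hybrid_expected_reward(z, x, beta, theta_old)
reward_new = hybrid_expected_reward(z, x, beta, theta_new)
```

With these values, the established article scores about 0.7, and the new article still scores about 0.3 from the shared term alone, whereas a purely disjoint model would predict 0 for it.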
Evaluation
 Testing on Live Data?
 TOO EXPENSIVE.
 Then, testing offline?
 DIFFERENT LOGGING POLICY
 Then, simulator-based approach?
 BIASED.
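One commonly used way around the logging-policy mismatch is replay evaluation on logged data. The slides do not spell out the procedure, so the sketch below is an assumption about how it could look: it presumes each logged action was chosen uniformly at random, and the function name, the `(context, action, reward)` event layout, and the toy events are all hypothetical. Events where the evaluated policy disagrees with the log are skipped; the rest are used both to score and to update the policy.

```python
def replay_evaluate(choose, update, logged_events):
    """Estimate the click-through rate of a policy from logged events.

    logged_events: iterable of (context, logged_action, reward) tuples,
    assumed to come from a uniformly random logging policy.
    """
    matched, total_reward = 0, 0.0
    for context, logged_action, reward in logged_events:
        if choose(context) == logged_action:
            # The log tells us what would have happened: keep the event.
            update(context, logged_action, reward)
            matched += 1
            total_reward += reward
        # Otherwise the reward of the policy's choice is unknown: skip the event.
    ctr = total_reward / matched if matched else float("nan")
    return ctr, matched


# Toy log: context 0 prefers action 0, context 1 prefers action 1.
events = [(0, 0, 1.0), (0, 1, 0.0), (1, 0, 0.0), (1, 1, 1.0)]


def no_update(context, action, reward):
    pass  # a fixed (non-learning) policy needs no update


ctr_match, n_match = replay_evaluate(lambda x: x, no_update, events)
ctr_const, n_const = replay_evaluate(lambda x: 0, no_update, events)
```

On this toy log, the context-matching policy gets an estimated CTR of 1.0 from 2 matched events, and the constant policy gets 0.5 from 2 matched events. Only a fraction of the log is usable, which is one reason a large logged dataset helps.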
Results
 Training Set: 4.7 million events
 Test Set: 36 million events
 Articles and users are each clustered into 5 clusters, giving two 6-dimensional feature vectors (one of the six features is a constant).
Questions?
Ask in the comment section.
