R PACKAGE
DEVELOPMENT
• Dominant in statistics research.
• Interpreted language: no need to compile before running.
• At its core an imperative language;
  also supports functional programming
  and object-oriented programming.
WHAT IS THE R LANGUAGE?
R HELLO WORLD
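A minimal sketch of what the hello-world slide presumably shows:

```r
# The canonical first R program: no compilation, just run it.
print("Hello, world!")
```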
Imperative: explore data
FP: data analysis
OOP: building tools
SO, WHICH PARADIGM TO USE IN R?
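A hedged sketch of the three paradigms on one small task (grouped means); all names here are illustrative:

```r
df <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 3))

# Imperative (explore data): loop and accumulate.
means <- c()
for (g in unique(df$g)) {
  means[g] <- mean(df$x[df$g == g])
}

# Functional (data analysis): apply a function over the split data.
means_fp <- sapply(split(df$x, df$g), mean)

# Object-oriented (building tools): an S3 method dispatching on class.
group_means <- function(obj, ...) UseMethod("group_means")
group_means.data.frame <- function(obj, ...) sapply(split(obj$x, obj$g), mean)
group_means(df)
```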
ALSO, R IS REAL GOOD WITH VECTORS
(and matrices)
Vectorized operations: often 10 to 100 times faster than explicit loops.
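A quick sketch of that speed difference (exact timings vary by machine):

```r
x <- runif(1e6)

# Explicit loop: handles one element per iteration.
system.time({
  y <- numeric(length(x))                # pre-allocate the result
  for (i in seq_along(x)) y[i] <- x[i] * 2
})

# Vectorized: one call, the looping happens in C.
system.time(y2 <- x * 2)
```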
CLASS SYSTEMS
• S3: minimal
• S4: very verbose
• R5 (reference classes): slow
• C++: fast, but not platform independent and needs boilerplate
• New: R6 (default at Microsoft)
• New kid on the block
• Lightweight and fast
• Public and private methods
• Active bindings
• Mature inheritance
MY PREFERENCE: R6
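A sketch of those R6 features in one toy class (the class and its methods are invented for illustration):

```r
library(R6)

Counter <- R6Class("Counter",
  private = list(
    n = 0                          # private state, hidden from callers
  ),
  public = list(
    increment = function(by = 1) { # public method
      private$n <- private$n + by
      invisible(self)              # returning self enables chaining
    }
  ),
  active = list(
    count = function() private$n   # active binding: read like a field
  )
)

# Inheritance: extend Counter with a reset method.
ResettableCounter <- R6Class("ResettableCounter",
  inherit = Counter,
  public = list(
    reset = function() {
      private$n <- 0
      invisible(self)
    }
  )
)

c1 <- ResettableCounter$new()
c1$increment()$increment(2)
c1$count   # 3
```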
ALSO, R6 JUST MAKES ME FEEL
RIGHT AT HOME
• SEMANTIC DEV SKILLS
• SYNTACTIC DEV SKILLS
• DOMAIN KNOWLEDGE
R DEVELOPMENT IN 3D
Semantic: what is a Multi-Armed Bandit?
• Origin: a gambler in a casino wants to maximize winnings by playing slot machines
• Balance exploration vs exploitation (also: learning vs earning)
• Objective: given a set of K distinct arms, each with an unknown reward distribution, maximize the sum of rewards
• Example: 3 slot machines (arms); after two exploratory pulls each, what now?
Translation to a health-related problem
• 1. A patient arrives at the physician with symptoms and a medical history.
• 2. The physician prescribes treatment A or treatment B.
• 3. The patient's health responds (e.g., improves, worsens).
• 4. Depending on the results, the physician updates their opinion on the best treatment option for this kind of patient.
• Goal: prescribe treatments that yield good health outcomes.
What's the challenge for the physician here?
• Fundamental dilemma:
  Exploit what has been learned
  Explore to find which behaviors lead to high rewards
• Need to use context and arm history effectively:
  Different actions are preferred under different contexts
  Might not see the same context twice
Solution: use a smart rule: a policy
• Policy: a rule mapping context to action
• Allows choosing different good actions in different contexts
• E.g.:
  If (sex = male) choose action 1
  Else if (age > 45) choose action 2
  Else choose action 3
• Policy π: context → action; plus HISTORY → adapt
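That example rule as a plain R function (a sketch; the list fields `sex` and `age` are assumed):

```r
choose_action <- function(context) {
  if (context$sex == "male") {
    1L                        # action 1 for male patients
  } else if (context$age > 45) {
    2L                        # action 2 for older patients
  } else {
    3L                        # action 3 otherwise
  }
}

choose_action(list(sex = "female", age = 50))  # returns 2
```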
Let's formalize, to make it easier to compare and apply…
• Use an adaptive policy π with distribution parameters θ to make a choice.
• For t = 1, 2, …, T:
  1. Observe context x_t
  2. Choose action a_t ∈ {a_1, a_2, …, a_K} using the current θ of π
  3. Collect reward r_t(a_t)
  4. Using reward r_t(a_t), adapt θ as suggested by π
• Goal: find the π that chooses actions with high cumulative reward ∑_{t=1}^{T} r_t(a_t)
For example, map each of the four steps to a function:
1. get_something()
2. do_procedure()
3. get_value()
4. …
FIRST SKETCH, THEN CODE
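A sketch of how that loop might look in R; the `bandit`/`policy` objects and their method names are invented here for illustration:

```r
simulate <- function(bandit, policy, horizon = 1000) {
  rewards <- numeric(horizon)                   # pre-allocate results
  for (t in seq_len(horizon)) {
    context <- bandit$get_context(t)            # 1. observe context x_t
    action  <- policy$get_action(context)       # 2. choose a_t using current theta
    reward  <- bandit$get_reward(t, action)     # 3. collect reward r_t(a_t)
    policy$set_reward(context, action, reward)  # 4. adapt theta
    rewards[t] <- reward
  }
  cumsum(rewards)                               # cumulative reward over time
}
```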
CONTEXTUAL:
UML DIAGRAMS
CLEAN CODE
Keep It Simple, Stupid
You Aren't Gonna Need It
Don't Repeat Yourself!
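A small sketch of "Don't Repeat Yourself" in R (all names illustrative):

```r
# Before: the same expression copy-pasted for every column.
#   df$a <- (df$a - mean(df$a)) / sd(df$a)
#   df$b <- (df$b - mean(df$b)) / sd(df$b)

# After: one definition, reused everywhere.
standardize <- function(x) (x - mean(x)) / sd(x)

df <- data.frame(a = rnorm(10), b = rnorm(10))
df[] <- lapply(df, standardize)   # apply to every column, keep the data.frame
```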
ADDING LI BANDIT: EASY!
PSEUDO CODE:
YAY!
ADDING LI BANDIT: EASY!
• Choices were offered fully at random in a real-life setting
• People made choices with known context
• That data is used by the bandit
• Check whether the policy makes the same choice the person originally made
• If so, this row (context included) can be used to test the policy
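A sketch of that replay check, assuming a logged data frame with `context`, `action`, and `reward` columns and a policy object as before (all names illustrative):

```r
replay_evaluate <- function(log, policy) {
  total   <- 0
  matched <- 0
  for (i in seq_len(nrow(log))) {
    chosen <- policy$get_action(log$context[[i]])
    if (chosen == log$action[i]) {          # same choice as the person made?
      matched <- matched + 1
      total   <- total + log$reward[i]      # only then may the reward be used
      policy$set_reward(log$context[[i]], chosen, log$reward[i])
    }
  }
  total / matched                           # average reward on matched rows
}
```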
RSTUDIO: REAL USEFUL
PROFVIS PROFILING
PRE-ALLOCATE DATA STRUCTURES
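A sketch of what profiling tends to reveal; `profvis::profvis()` wraps an expression and shows where the time goes:

```r
library(profvis)

profvis({
  # Slow: growing a vector copies it on every iteration.
  grown <- c()
  for (i in 1:50000) grown <- c(grown, i^2)

  # Fast: pre-allocate once, then fill in place.
  filled <- numeric(50000)
  for (i in 1:50000) filled[i] <- i^2
})
```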
VERSION CONTROL / GITHUB
your safety net
(and makes collaboration easy)
ALSO HELPS AUTOMATE
DEVELOPMENT-RELATED
PROCESSES
ZENODO
Research Data Repository
On releases: automatic DOI generation (commit → release → DOI)
CONTINUOUS INTEGRATION:
DOES YOUR CODEBASE STILL WORK?
CODECOV.IO INTEGRATION:
DO TESTS COVER ALL OF YOUR CODE?
CODE COVERAGE
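A sketch using the covr package, which is what typically feeds codecov.io for R packages (token setup on the CI side is assumed):

```r
library(covr)

cov <- package_coverage()    # run the package tests and track executed lines
report(cov)                  # browse a per-file, per-line coverage report
codecov(coverage = cov)      # upload to codecov.io (CODECOV_TOKEN must be set)
```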
PARALLEL PROCESSING ON AWS
THE ART OF PARALLEL PROCESSING
Are 120 cores faster than 58 cores? Not always: it is a balance of overhead and network against more processing power.
k3 × d3 × 5 policies × 300 × 10,000:
58 cores: 132 seconds; 120 cores: 390 seconds
k3 × d3 × 5 policies × 3,000 × 10,000:
58 cores: 930 seconds; 120 cores: 691 seconds
More cores only pay off once the workload is large enough to amortize the distribution overhead.
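A sketch of one common way to spread simulations over cores in R, using foreach with doParallel; `run_one_simulation()` is a hypothetical worker function:

```r
library(foreach)
library(doParallel)

cl <- makeCluster(detectCores() - 1)     # leave one core for the OS
registerDoParallel(cl)

results <- foreach(sim = 1:300, .combine = rbind) %dopar% {
  run_one_simulation(sim)                # hypothetical per-simulation function
}

stopCluster(cl)                          # always release the workers
```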
• More documentation, cleaner result printouts
• More paper writing (!)
• Implement famous papers, reproduce their results?
• Refactor again: focus less on optimization, more on readability, particularly SyntheticBandit (although…)
WHAT IS NEXT?
R BEGINNERS
R in Action - Robert Kabacoff
MORE ADVANCED
R Packages - Hadley Wickham
Advanced R - Hadley Wickham
CLEAN CODE
Code Complete - Steve McConnell
Clean Code - Robert C. Martin
ALSO INTERESTING
The R Inferno - Patrick Burns
LITERATURE