R PACKAGE
DEVELOPMENT
• Dominant in statistics research.
• Interpreted language: no need to compile before running.
• At its core an imperative language;
  also supports functional programming
  and object-oriented programming.
WHAT IS THE R LANGUAGE?
R HELLO WORLD
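A minimal sketch of what the hello-world slide presumably shows:

```r
# The canonical first R program: no compilation, just run it.
print("Hello, world!")
```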
Imperative: explore data
FP: data analysis
OOP: building tools
SO, WHICH PARADIGM TO USE IN R?
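A hedged sketch of the three paradigms on one small task (grouped means); all names here are illustrative:

```r
df <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 3))

# Imperative (explore data): loop and accumulate.
means <- c()
for (g in unique(df$g)) {
  means[g] <- mean(df$x[df$g == g])
}

# Functional (data analysis): apply a function over the split data.
means_fp <- sapply(split(df$x, df$g), mean)

# Object-oriented (building tools): an S3 method dispatching on class.
group_means <- function(obj, ...) UseMethod("group_means")
group_means.data.frame <- function(obj, ...) sapply(split(obj$x, obj$g), mean)
group_means(df)
```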
ALSO, R IS REAL GOOD WITH VECTORS
(and matrices)
Vectorized operations: often 10 to 100 times faster than explicit loops.
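A quick sketch of that speed difference (exact timings vary by machine):

```r
x <- runif(1e6)

# Explicit loop: handles one element per iteration.
system.time({
  y <- numeric(length(x))                # pre-allocate the result
  for (i in seq_along(x)) y[i] <- x[i] * 2
})

# Vectorized: one call, the looping happens in C.
system.time(y2 <- x * 2)
```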
CLASS SYSTEMS
• S3: minimal
• S4: very verbose
• R5 (reference classes): slow
• C++: fast, but not platform independent and needs boilerplate
• New: R6 (default at Microsoft)
• New kid on the block
• Lightweight and fast
• Public and private methods
• Active bindings
• Mature inheritance
MY PREFERENCE: R6
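A sketch of those R6 features in one toy class (the class and its methods are invented for illustration):

```r
library(R6)

Counter <- R6Class("Counter",
  private = list(
    n = 0                          # private state, hidden from callers
  ),
  public = list(
    increment = function(by = 1) { # public method
      private$n <- private$n + by
      invisible(self)              # returning self enables chaining
    }
  ),
  active = list(
    count = function() private$n   # active binding: read like a field
  )
)

# Inheritance: extend Counter with a reset method.
ResettableCounter <- R6Class("ResettableCounter",
  inherit = Counter,
  public = list(
    reset = function() {
      private$n <- 0
      invisible(self)
    }
  )
)

c1 <- ResettableCounter$new()
c1$increment()$increment(2)
c1$count   # 3
```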
ALSO, R6 JUST MAKES ME FEEL
RIGHT AT HOME
• SEMANTIC DEV SKILLS
• SYNTACTIC DEV SKILLS
• DOMAIN KNOWLEDGE
R DEVELOPMENT IN 3D
Semantic: what is a Multi-Armed Bandit?
• Origin: a gambler in a casino wants to maximize winnings by playing slot machines
• Balance exploration vs exploitation (also: learning vs earning)
• Objective: given a set of K distinct arms, each with an unknown reward distribution, maximize the sum of rewards
• Example: 3 slot machines (arms); after two exploratory pulls each, what now?
Translation to a health-related problem
• 1. A patient arrives at the physician with symptoms and a medical history.
• 2. The physician prescribes treatment A or treatment B.
• 3. The patient's health responds (e.g., improves, worsens).
• 4. Depending on the results, the physician updates their opinion on the best treatment option for this kind of patient.
• Goal: prescribe treatments that yield good health outcomes.
What's the challenge for the physician here?
• Fundamental dilemma:
  Exploit what has been learned
  Explore to find which behaviors lead to high rewards
• Need to use context and arm history effectively:
  Different actions are preferred under different contexts
  Might not see the same context twice
Solution: use a smart rule: a policy
• Policy: a rule mapping context to action
• Allows choosing different good actions in different contexts
• E.g.:
  If (sex = male) choose action 1
  Else if (age > 45) choose action 2
  Else choose action 3
• Policy π: context → action; plus HISTORY → adapt
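That example rule as a plain R function (a sketch; the list fields `sex` and `age` are assumed):

```r
choose_action <- function(context) {
  if (context$sex == "male") {
    1L                        # action 1 for male patients
  } else if (context$age > 45) {
    2L                        # action 2 for older patients
  } else {
    3L                        # action 3 otherwise
  }
}

choose_action(list(sex = "female", age = 50))  # returns 2
```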
Let's formalize, to make it easier to compare and apply…
• Use an adaptive policy π with distribution parameters θ to make a choice.
• For t = 1, 2, …, T:
  1. Observe context x_t
  2. Choose action a_t ∈ {a_1, a_2, …, a_K} using the current θ of π
  3. Collect reward r_t(a_t)
  4. Using reward r_t(a_t), adapt θ as suggested by π
• Goal: find the π that chooses actions with high cumulative reward ∑_{t=1}^{T} r_t(a_t)
For example, map each of the four steps to a function:
1. get_something()
2. do_procedure()
3. get_value()
4. …
FIRST SKETCH, THEN CODE
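A sketch of how that loop might look in R; the `bandit`/`policy` objects and their method names are invented here for illustration:

```r
simulate <- function(bandit, policy, horizon = 1000) {
  rewards <- numeric(horizon)                   # pre-allocate results
  for (t in seq_len(horizon)) {
    context <- bandit$get_context(t)            # 1. observe context x_t
    action  <- policy$get_action(context)       # 2. choose a_t using current theta
    reward  <- bandit$get_reward(t, action)     # 3. collect reward r_t(a_t)
    policy$set_reward(context, action, reward)  # 4. adapt theta
    rewards[t] <- reward
  }
  cumsum(rewards)                               # cumulative reward over time
}
```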
CONTEXTUAL:
UML DIAGRAMS
CLEAN CODE
Keep It Simple, Stupid
You Aren't Gonna Need It
Don't Repeat Yourself!
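A small sketch of "Don't Repeat Yourself" in R (all names illustrative):

```r
# Before: the same expression copy-pasted for every column.
#   df$a <- (df$a - mean(df$a)) / sd(df$a)
#   df$b <- (df$b - mean(df$b)) / sd(df$b)

# After: one definition, reused everywhere.
standardize <- function(x) (x - mean(x)) / sd(x)

df <- data.frame(a = rnorm(10), b = rnorm(10))
df[] <- lapply(df, standardize)   # apply to every column, keep the data.frame
```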
ADDING LI BANDIT: EASY!
PSEUDO CODE:
YAY!
ADDING LI BANDIT: EASY!
• Choices were offered fully at random in a real-life setting
• People made choices with known context
• That data is used by the bandit
• Check whether the policy makes the same choice the person originally made
• If so, this row (context included) can be used to test the policy
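A sketch of that replay check, assuming a logged data frame with `context`, `action`, and `reward` columns and a policy object as before (all names illustrative):

```r
replay_evaluate <- function(log, policy) {
  total   <- 0
  matched <- 0
  for (i in seq_len(nrow(log))) {
    chosen <- policy$get_action(log$context[[i]])
    if (chosen == log$action[i]) {          # same choice as the person made?
      matched <- matched + 1
      total   <- total + log$reward[i]      # only then may the reward be used
      policy$set_reward(log$context[[i]], chosen, log$reward[i])
    }
  }
  total / matched                           # average reward on matched rows
}
```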
RSTUDIO: REAL USEFUL
PROFVIS PROFILING
PRE-ALLOCATE DATA STRUCTURES
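A sketch of what profiling tends to reveal; `profvis::profvis()` wraps an expression and shows where the time goes:

```r
library(profvis)

profvis({
  # Slow: growing a vector copies it on every iteration.
  grown <- c()
  for (i in 1:50000) grown <- c(grown, i^2)

  # Fast: pre-allocate once, then fill in place.
  filled <- numeric(50000)
  for (i in 1:50000) filled[i] <- i^2
})
```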
VERSION CONTROL / GITHUB
your safety net
(and makes collaboration easy)
ALSO HELPS AUTOMATE
DEVELOPMENT-RELATED
PROCESSES
ZENODO
Research Data Repository
On releases: automatic DOI generation (commit → release → DOI)
CONTINUOUS INTEGRATION:
DOES YOUR CODEBASE STILL WORK?
CODECOV.IO INTEGRATION:
DO TESTS COVER ALL OF YOUR CODE?
CODE COVERAGE
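A sketch using the covr package, which is what typically feeds codecov.io for R packages (token setup on the CI side is assumed):

```r
library(covr)

cov <- package_coverage()    # run the package tests and track executed lines
report(cov)                  # browse a per-file, per-line coverage report
codecov(coverage = cov)      # upload to codecov.io (CODECOV_TOKEN must be set)
```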
PARALLEL PROCESSING ON AWS
THE ART OF PARALLEL PROCESSING
Are 120 cores faster than 58 cores? Not always: it is a balance of overhead and network against more processing power.
k3 × d3 × 5 policies × 300 × 10,000:
58 cores: 132 seconds; 120 cores: 390 seconds
k3 × d3 × 5 policies × 3,000 × 10,000:
58 cores: 930 seconds; 120 cores: 691 seconds
More cores only pay off once the workload is large enough to amortize the distribution overhead.
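A sketch of one common way to spread simulations over cores in R, using foreach with doParallel; `run_one_simulation()` is a hypothetical worker function:

```r
library(foreach)
library(doParallel)

cl <- makeCluster(detectCores() - 1)     # leave one core for the OS
registerDoParallel(cl)

results <- foreach(sim = 1:300, .combine = rbind) %dopar% {
  run_one_simulation(sim)                # hypothetical per-simulation function
}

stopCluster(cl)                          # always release the workers
```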
• More documentation, cleaner result printouts
• More paper writing (!)
• Implement famous papers, reproduce their results?
• Refactor again: focus less on optimization, more on readability, particularly SyntheticBandit (although…)
WHAT IS NEXT?
R BEGINNERS
R in Action - Robert Kabacoff
MORE ADVANCED
R Packages - Hadley Wickham
Advanced R - Hadley Wickham
CLEAN CODE
Code Complete - Steve McConnell
Clean Code - Robert C. Martin
ALSO INTERESTING
The R Inferno - Patrick Burns
LITERATURE