Black Magic
How to apply ML to real-world problems
It is a great tool for some purposes
ML is a (Magic) Hammer

If all you have is a hammer, everything
looks like a nail
Data By The Bay 2016 - Black Magic: How to apply Machine Learning to real-world problems
I am Evion Kim
Lead Machine Learning Engineer @ Mattermark
Senior Software Engineer / Data Scientist @ LinkedIn
M.S., Computer Science @ Stanford University
B.S., Computer Science @ KAIST
Hello!
Today's Talk
 Machine Learning - the concept
 Mattermark?
 Funding Extraction Problem @ Mattermark
& Some Magic Spells
Not about
# Deep academic technical
knowledge about ML
algorithms
# Data Infrastructure
About
# How to transform a real-world
problem into an ML problem
# Tips and tricks on Machine
Learning based problem
solving
This talk is...
Machine Learning
The powerful hammer
1
def traditional(x):
    return x*(x+1)
Traditional way
y = x * (x+1)
x = 2 → y = 6
x = 3 → y = 12
x = 4 → y = 20
x = 5 → y = 30
x = 6 → y = 42
ML Way y = x * (x+1)
Model
Data
DEEP LEARNING?
Trained
Model
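As a rough illustration of the contrast between the two ways, the sketch below recovers y = x * (x+1) from the five example pairs alone instead of hard-coding the formula. It assumes the model family is a quadratic and fits it by ordinary least squares in plain Python. The names fit_quadratic and predict are made up for this example and do not come from the talk.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c; returns [a, b, c]."""
    # Feature rows [x^2, x, 1] and the normal equations (A^T A) w = A^T y.
    rows = [[x * x, x, 1.0] for x in xs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]

    # Solve the 3x3 system by Gauss-Jordan elimination with partial pivoting.
    m = [ata[i] + [aty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def predict(coeffs, x):
    a, b, c = coeffs
    return a * x * x + b * x + c


# The training data from the slide: inputs and observed outputs only.
xs = [2, 3, 4, 5, 6]
ys = [6, 12, 20, 30, 42]

coeffs = fit_quadratic(xs, ys)   # the "trained model"
print(predict(coeffs, 7))        # close to 7 * 8 = 56
```

The traditional function encodes the rule directly; the fitted model only ever sees examples, so its quality depends on the data and on the assumed model family.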
It is not SKYNET
 at least not yet.
It is a tool
that can be used for some problems
What is Mattermark?
(just a quick advertisement)
2
Case Study: Funding
Extraction
-And Dark Magic spells we learned
3
Small(er) company
# Far fewer training data points
# Very high precision requirement.
Big(ger) company
# Millions upon millions of training
data points
# Precision requirement: not
that high
ML @ Big(ger) vs. Small(er)
What's the bottleneck?
# Scalability or Accuracy?
# Precision or Recall?
# Engineering or Machine Learning?
Spell 1: Know your Enemy
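To make the "Precision or Recall?" question concrete, here is a minimal sketch of how the two metrics are computed from counts of true positives, false positives, and false negatives. The counts are invented for illustration and are not Mattermark numbers.

```python
def precision(tp, fp):
    """Of everything the system extracted, what fraction was correct?"""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Of everything that should have been extracted, what fraction was found?"""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts: 8 correct extractions, 2 wrong ones, 8 missed events.
tp, fp, fn = 8, 2, 8
print("precision:", precision(tp, fp))
print("recall:", recall(tp, fn))
```

A system with a very high precision requirement can accept lower recall from the model if something else (for example, human review) covers the missed cases.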
~$156 Billion: Total VC funding in year 2015
~8,532: VC funding events in year 2015
Problem to solve
Divide a big, chunky problem into smaller,
ML-solvable problems.
Spell 2: Slice and Dice
Smaller Problems
Classify Funding
Articles
Classify
Funding
Sentences
Extract
Funding
Entities
Confidence
Scorer
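The four smaller problems can be chained into one pipeline. The sketch below shows only the wiring: each stage is passed in as a function, and the toy lambdas are placeholder stand-ins, not the models used in the talk. All names here (run_pipeline, toy_stages, and the stage parameters) are assumptions for illustration.

```python
def run_pipeline(article, is_funding_article, split_sentences,
                 is_funding_sentence, extract_entities, score_confidence):
    """Run the four stages in order; return one record per funding sentence."""
    # Stage 1: drop articles that are not about funding at all.
    if not is_funding_article(article):
        return []
    results = []
    for sentence in split_sentences(article):
        # Stage 2: keep only sentences that describe a funding event.
        if not is_funding_sentence(sentence):
            continue
        # Stage 3: pull out the entities (amount, series, investors).
        entities = extract_entities(sentence)
        # Stage 4: attach a confidence score to the extraction.
        results.append({
            "sentence": sentence,
            "entities": entities,
            "confidence": score_confidence(sentence, entities),
        })
    return results


# Toy stand-ins so the wiring can be exercised end to end.
toy_stages = dict(
    is_funding_article=lambda text: "funding" in text.lower(),
    split_sentences=lambda text: [s.strip() for s in text.split(". ") if s.strip()],
    is_funding_sentence=lambda s: "raised" in s.lower(),
    extract_entities=lambda s: {"raw": s},
    score_confidence=lambda s, entities: 0.5,
)

article = "Acme has raised $3.5 million in Series A funding. The company makes anvils."
results = run_pipeline(article, **toy_stages)
print(results)
```

Keeping the stages separate means each one can be evaluated and improved on its own.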
Classify Funding Article
TF-IDF + SVM Classifier
NO
YES
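A minimal, dependency-free sketch of the TF-IDF + SVM idea follows. It assumes whitespace tokenization, a smoothed IDF, and a linear SVM trained by subgradient descent on the hinge loss. The four training texts, the labels, and the hyperparameters are invented for illustration; a real system would use a proper library and far more data.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency per token."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(doc, idf):
    """L2-normalized sparse TF-IDF vector; unknown tokens are ignored."""
    counts = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def train_linear_svm(vectors, labels, epochs=50, lr=0.5, lam=0.01):
    """Hinge loss plus L2 penalty, optimized by plain subgradient steps."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * sum(w.get(t, 0.0) * v for t, v in x.items())
            for t in w:                      # L2 shrinkage
                w[t] *= 1.0 - lr * lam
            if margin < 1.0:                 # inside the margin: push away
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
    return w


def score(doc, idf, w):
    """Positive score means YES (funding article), negative means NO."""
    return sum(w.get(t, 0.0) * v for t, v in tfidf(doc, idf).items())


docs = [
    "acme raises 5 million in series a funding",
    "startup closes seed funding round led by investors",
    "acme launches new mobile app today",
    "startup hires new chief executive",
]
labels = [1, 1, -1, -1]   # +1 = funding article, -1 = not

idf = fit_idf(docs)
w = train_linear_svm([tfidf(d, idf) for d in docs], labels)
print(score("beta raises series b funding", idf, w) > 0)
print(score("gamma launches new app", idf, w) > 0)
```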
Analyze and understand the problem
space you are working on.
Spell 3: Understand
your Domain
Amount/Series/Investors
...has closed a $3.5m
Series A funding round
led by Inter Capital, ...
Investors
Intel Capital led the
round with
participation from other
investors that included
Horizons Ventures
Amount/ Series
...has raised $3.5
million in Series A
Funding
Funding Sentences
Patterns
Classify Funding Sentences
Word2Vec
+ Semantic Role
Labeling (SRL)
+ Gradient
Boosting
Classifier
Regex Parsing
+ Named Entity
Recognition
Extract Funding Entities
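The regex half of the extraction step might look like the sketch below, applied to the example sentences from the slides. The patterns are assumptions written for this illustration, and the capitalized-words rule for the lead investor is only a crude stand-in for real Named Entity Recognition, which is not shown here.

```python
import re

# "$3.5m", "$3.5 million", "$2bn", ... (unit is optional)
AMOUNT_RE = re.compile(r"\$(\d+(?:\.\d+)?)\s*(million|billion|m|bn|b|k)?\b",
                       re.IGNORECASE)
# "Series A", "Series B", ...
SERIES_RE = re.compile(r"\bSeries\s+([A-Z])\b")
# Run of capitalized words after "led by" (stand-in for NER).
LEAD_RE = re.compile(r"\bled by ((?:[A-Z][\w&]*)(?: [A-Z][\w&]*)*)")

MULTIPLIERS = {"k": 1e3, "m": 1e6, "million": 1e6,
               "b": 1e9, "bn": 1e9, "billion": 1e9}


def extract_funding(sentence):
    """Return amount (USD), series letter, and lead investor, or None each."""
    amount = series = lead = None
    m = AMOUNT_RE.search(sentence)
    if m:
        unit = (m.group(2) or "").lower()
        amount = round(float(m.group(1)) * MULTIPLIERS.get(unit, 1))
    m = SERIES_RE.search(sentence)
    if m:
        series = m.group(1)
    m = LEAD_RE.search(sentence)
    if m:
        lead = m.group(1)
    return {"amount_usd": amount, "series": series, "lead_investor": lead}


print(extract_funding(
    "...has closed a $3.5m Series A funding round led by Inter Capital, ..."))
print(extract_funding("...has raised $3.5 million in Series A Funding"))
```

Patterns like these are brittle on their own, which is one reason to pair them with a confidence score.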
Spell 4: Probabilistic
Training and using probabilistic models
sometimes helps a lot.
What's the probability that the
extracted information is
correct?
Confidence Scoring
0~1
probability
score
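One simple way to get a 0~1 score is a logistic model over signals from the earlier stages. The sketch below is an assumption about how such a scorer could look, not the scorer from the talk: the feature names, weights, and bias are hypothetical and would normally be learned from labeled extractions.

```python
import math

# Hypothetical weights; in practice these would be fit on labeled data.
WEIGHTS = {
    "article_score": 2.0,    # output of the article classifier, 0..1
    "sentence_score": 2.0,   # output of the sentence classifier, 0..1
    "amount_found": 1.0,     # 1 if an amount was extracted, else 0
    "series_found": 0.5,     # 1 if a series was extracted, else 0
}
BIAS = -3.0


def confidence(features):
    """Map feature values to a probability-like score in (0, 1)."""
    z = BIAS + sum(w * float(features.get(name, 0.0))
                   for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


strong = confidence({"article_score": 1, "sentence_score": 1,
                     "amount_found": 1, "series_found": 1})
weak = confidence({})
print(strong, weak)
```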
Spell 5: Human + Machine
Let mighty human beings help with
part of the job
Human Administration
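A common way to combine the two is to route each extraction by its confidence score: accept the confident ones automatically and queue the rest for a person. The threshold and the names below are assumptions for illustration, not values from the talk.

```python
def route(confidence, auto_accept_at=0.95):
    """Decide who handles an extraction, based on its confidence score."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return "auto_accept" if confidence >= auto_accept_at else "human_review"


# Hypothetical extractions paired with their confidence scores.
extractions = [("Acme, $3.5M, Series A", 0.98),
               ("Beta, $10M, Series ?", 0.61)]

review_queue = [item for item, score in extractions
                if route(score) == "human_review"]
print(review_queue)
```

Raising the threshold trades more human work for higher precision, which fits a setting where the precision requirement is very high.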
~$156 Billion: Total VC funding in year 2015
~8,532: VC funding events in year 2015
ML is a powerful hammer
Spell 1: Know your Enemy
Spell 2: Slice and Dice
Spell 3: Understand your Domain
Spell 4: Probabilistic
Spell 5: Human + Machine
Summary
We are hiring!
Thanks!
Any questions?
You can find me at:
 in/evionkim
 twitter@evion12
 evion12@gmail.com
