LessSketchy
by Shih-Ho Cheng
Many posts seem to be great deals ...
Cheap! Great Location
No Credit Check
... but many are really scams!
How can I prevent this?
LessSketchy uses machine learning to warn you about scams
Where to get a training set?
1. Web-scrape a large sample
2. Wait...
3. Revisit each post and check whether it has been removed
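The three steps above can be sketched as a small labeling routine. This is only an illustration, assuming that a post removed by the time of the revisit is treated as a scam (label 1) and a surviving post as legit (label 0), which the slides imply but do not state. The names `Post`, `label_by_removal`, and `is_removed` are hypothetical, and the example URLs and the removal check are toy stand-ins for real scraping.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Post:
    url: str   # where the post was scraped from
    text: str  # post body captured at scrape time


def label_by_removal(posts: List[Post],
                     is_removed: Callable[[str], bool]) -> Dict[str, int]:
    """Label each scraped post: 1 if it is gone on revisit, else 0.

    Assumption: removal is used as a proxy for "scam".
    """
    return {p.url: int(is_removed(p.url)) for p in posts}


# Step 1: posts captured by the scraper (toy data, not real listings).
posts = [
    Post("https://example.org/post/1", "Sunny 2BR near the park"),
    Post("https://example.org/post/2", "Cheap! Great Location. No Credit Check"),
]

# Steps 2 and 3: after waiting, revisit each URL. A real check would
# re-fetch the page; here a fixed set stands in for the removed posts.
removed_urls = {"https://example.org/post/2"}
labels = label_by_removal(posts, lambda url: url in removed_urls)
```

Passing the removal check in as a function keeps the labeling logic separate from whatever HTTP client the scraper uses.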
Balanced Random Forest
- Used to train a scam vs. legit classifier
- Deals with an imbalanced sample (1 : 99)
- Each tree is trained by bootstrapping a balanced training sample
- Aggregate the classifications of all trees
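A minimal sketch of the idea in the bullets above: each learner sees a bootstrap sample with equal numbers of scam and legit rows, and the final label is a majority vote. To stay short and dependency-free it uses one-feature decision stumps as a stand-in for full decision trees, so it is not the talk's actual model. All function names, the two features, and the toy 1 : 99 data are assumptions made for illustration.

```python
import random
from collections import Counter


def balanced_bootstrap(X, y, rng):
    """Draw equal numbers of scam and legit rows, with replacement."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    n = min(len(pos), len(neg))  # size of the minority class
    idx = [rng.choice(pos) for _ in range(n)] + [rng.choice(neg) for _ in range(n)]
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_stump(X, y, rng):
    """Stand-in for a tree: best threshold on one randomly chosen feature."""
    f = rng.randrange(len(X[0]))
    best = None
    for t in sorted({row[f] for row in X}):
        for sign in (1, -1):
            preds = [int(sign * (row[f] - t) >= 0) for row in X]
            acc = sum(p == label for p, label in zip(preds, y))
            if best is None or acc > best[0]:
                best = (acc, f, t, sign)
    return best[1:]  # (feature index, threshold, direction)


def predict_stump(stump, row):
    f, t, sign = stump
    return int(sign * (row[f] - t) >= 0)


def fit_forest(X, y, n_trees=25, seed=0):
    """Train each stump on its own balanced bootstrap sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        Xb, yb = balanced_bootstrap(X, y, rng)
        forest.append(fit_stump(Xb, yb, rng))
    return forest


def predict_forest(forest, row):
    """Aggregate the individual classifications by majority vote."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]


# Toy data with a 1 : 99 imbalance: 2 scams among 200 posts.
data_rng = random.Random(42)
X = [[0.90, 0.80], [0.95, 0.85]]  # scams: both features high
y = [1, 1]
for _ in range(198):              # legit posts: both features low
    X.append([data_rng.uniform(0.0, 0.5), data_rng.uniform(0.0, 0.5)])
    y.append(0)

forest = fit_forest(X, y)
scam_guess = predict_forest(forest, [0.95, 0.85])
legit_guess = predict_forest(forest, [0.10, 0.20])
```

Without the balancing step, a learner trained on the raw 1 : 99 sample could reach 99% accuracy by always answering "legit". Sampling the majority class down to the minority size for each learner removes that shortcut while still using most legit posts across the whole ensemble.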
