Personal Information
Organization / Workplace
San Francisco Bay Area United States
Occupation
Senior Research Engineer at Netflix
About
Big Data Machine Learning Engineer with a strong background in computer science, theoretical physics, and mathematics. I have a deep understanding of how to implement data mining algorithms in scalable ways, not just how to use them as a consumer.
I'm a big fan of Scala, and have been using it to develop scalable, distributed data mining algorithms with Apache Spark. I've been involved in open source Apache Spark development as a contributor. Apache Spark is a fast and general engine for large-scale data processing, and it fits into the Hadoop open-source ecosystem.
Specialties:
• Machine Learning and Data Mining
• Distributed/Parallel Computing and Big Data Processing
• Expert in Apache Hadoop
Contact Details
Tags
machine learning
spark
mapreduce
hadoop
mllib
logistic regression
alpine data labs
big data
data mining
apache spark
l-bfgs
multinomial
netflix
svd
k-means
unsupervised learning
internet of things
iot
large scale
recommendation
pipeline
kernel methods
linear models
polynomial mapping
feature engineering
linear regression
ml
spark summit
elastic-net
batch layer
serving layer
speed layer
spark streaming
pig
lambda architecture
real time
storm
stream
Users following DB Tsai