A General Overview of Machine Learning
Boise Data Science Meetup -- September 18, 2018
Ashish Sharma
• Software Systems Engineer -- HomeCU, LLC. (2017 - present)
• Founder -- AI Developers, Boise
• City Ambassador -- AI Saturdays (global initiative of nurture.ai)
• Alumnus -- Boise State University (MS in Computer Science, 2015-2017)

Overview
• AI and Applications
• Intro to Machine Learning
• Types of Machine Learning
• Which algorithm should I use?
• Effective Machine Learning
Image Source: Cousins of Artificial Intelligence -- Towards Data Science

AI Resurgence
• Computational power (GPUs, cloud computing, distributed systems)
• Availability of large amounts of data (e.g., ImageNet)
• Better theoretical understanding of the underlying techniques/algorithms
• Open and easily accessible research culture in academia and industry (NIPS, ICML, arXiv.org)

AI Resurgence (contd.)
• Netflix Prize (awarded 2009): $1 million prize for predicting user ratings of films
• Kaggle (founded 2010; more than a million users today)
• Fei-Fei Li and team at Stanford open sourced ImageNet (2008-2010)
  ◦ ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
• Geoffrey Hinton's deep learning team wins ILSVRC 2012 with AlexNet

Common Applications
• Speech recognition (virtual assistants)
• Advanced machine translation and natural language intelligence
• Strategic gaming algorithms (AlphaGo, chess)
• Computer vision (image classification and object detection)
• Autonomous vehicles
• Manufacturing companies (landing.ai)
• Healthcare (Google's research on diabetic retinopathy -- with an F-score of 0.95, surpassing the accuracy of 8 expert ophthalmologists)

Machine Learning
• A form of applied statistics that emphasizes the use of computers to learn complex mathematical functions.
• More formally: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." (Tom Mitchell, 1997)
Image Source: xkcd

Types of Machine Learning
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning

Supervised Learning
Terminology:
• Input variable(s)
  ◦ Independent variable(s)
  ◦ Feature(s)/characteristic(s) of a single input object
  ◦ Numerical -- continuous (height, area of house), discrete (grades, age)
  ◦ Categorical (race, sex) -- nominal, ordinal
• Target variable(s)
  ◦ Dependent variable(s), a number or vector (e.g., price of house, whether a patient is diabetic, etc.)

Supervised Learning
• Function approximation
  ◦ Mathematically: solve for the coefficient(s) of a function
  ◦ Search for the best-performing model in a hypothesis space
  ◦ Make predictions based on historical (labeled) data
• Regression (predict a continuous target variable)
  ◦ Univariate Regression (1 input variable, 1 output variable)
  ◦ Multiple Regression (>=2 input variables, 1 output variable)
  ◦ Multivariate Regression (>=2 output variables)
• Classification (predict a discrete/categorical target variable)
  ◦ Email: spam or not?
  ◦ Is this image a dog or a cat?
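
A minimal sketch of the "solve for coefficient(s) of a function" idea above, assuming the simplest case: one input variable, one output variable, and ordinary least squares. The helper name fit_line and the house-price numbers are hypothetical, made up for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ~ slope * x + intercept (one input variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution: slope = covariance(x, y) / variance(x)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: house area (sq ft) -> price (thousands of dollars)
areas = [1000, 1500, 2000, 2500]
prices = [200, 300, 400, 500]

slope, intercept = fit_line(areas, prices)

# Prediction for an unseen input, using the learned coefficients
predicted = slope * 1800 + intercept
```

Here the "hypothesis space" is the set of all straight lines, and the fitted slope and intercept pick out the line with the smallest squared error on the labeled data.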

Unsupervised Learning
• Unsupervised Learning
  ◦ Find hidden patterns and draw inferences from (unlabeled) data
  ◦ Essential for preliminary data analysis and visualization
• Clustering (grouping of similar data points)
  ◦ K-Means, DBSCAN
• Dimensionality Reduction
  ◦ Principal Component Analysis
  ◦ Autoencoders
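
To make the clustering bullet concrete, here is a small sketch of K-Means (Lloyd's algorithm) on 2-D points. It is an illustration under assumptions, not a production implementation: the function name kmeans, the sample points, and the choice to start the centroids at the first k points (so the run is deterministic) are all made up for this example.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on 2-D points; centroids start at the first k points."""
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments


# Hypothetical unlabeled data: two visually obvious groups
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
```

No labels are supplied anywhere; the grouping comes only from distances between points.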

Reinforcement Learning
• Draws on AI, animal psychology, and control theory
• Key elements: agents, actions, environment, change in state, reward/punishment
• E.g., Deep Atari:
  ◦ Input: snapshots of Atari screen images (state and actions)
  ◦ Algorithm: convolutional NNs with no pooling
  ◦ Output layer: tailored for a regression score (maximize reward)
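
The agent/action/environment/reward loop above can be sketched with tabular Q-learning on a toy problem. This is a deliberately simpler technique than the deep convolutional approach used for Atari: the table replaces the neural network. The five-cell corridor environment, the step and train helpers, and all hyperparameter values are assumptions invented for this illustration.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)  # move left or move right


def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a purely random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(len(ACTIONS))  # agent explores at random
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward plus discounted best future value
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()

# Greedy policy for the non-terminal cells: pick the action with the highest Q-value
greedy_policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
```

The only feedback the agent ever receives is the reward at the goal cell; the preference for moving right in earlier cells is learned by propagating that reward backwards through the table.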

Beginner's Question!
• (Q)* Which algorithm should I use?
• (A) The answer varies depending on many factors, including:
  ◦ The size, quality, and nature of the data;
  ◦ The available computational time;
  ◦ The urgency of the task; and
  ◦ What you want to do with the data (the problem).
* towardsdatascience.com

Which algorithm should I use?
• No one algorithm works best for every problem (Yes, not even neural networks!)

Important Concepts
• Model Selection:
  ◦ K-fold cross-validation
  ◦ Training/validation/test datasets
• Loss functions
• Convex Optimization
• Gradient Descent
• Model Complexity, Overfitting and Underfitting
• Regularization
• Training and Generalization Errors
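
Several of the concepts listed above fit into one short sketch: a squared-error loss function, gradient descent to minimize it, an L2 regularization penalty, and k-fold cross-validation to estimate generalization error for model selection. Everything here is illustrative: the helper names, the data, the learning rate, and the candidate penalty strengths are assumptions, not values from the talk.

```python
def fit_gd(xs, ys, l2=0.0, lr=0.05, steps=2000):
    """Gradient descent on mean squared error plus an L2 penalty on the slope."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of (1/n) * sum(error^2) + l2 * w^2
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n + 2 * l2 * w
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the line y = w * x + b on the given data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_mse(xs, ys, k, l2):
    """Average held-out MSE over k folds (fold i holds out every k-th point)."""
    scores = []
    pairs = list(zip(xs, ys))
    for fold in range(k):
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        held = [p for i, p in enumerate(pairs) if i % k == fold]
        w, b = fit_gd([x for x, _ in train], [y for _, y in train], l2=l2)
        scores.append(mse([x for x, _ in held], [y for _, y in held], w, b))
    return sum(scores) / k


# Hypothetical data: roughly y = 2x + 1 with small fixed perturbations
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 5.9, 7.2, 7.9]

# Model selection: compare candidate regularization strengths by held-out error
cv_error = {l2: k_fold_mse(xs, ys, k=4, l2=l2) for l2 in (0.0, 0.1, 1.0)}
best_l2 = min(cv_error, key=cv_error.get)

# Final fit on all the data without a penalty, for reference
w_full, b_full = fit_gd(xs, ys)
```

On data this close to a straight line, a heavy penalty shrinks the slope too far and the model underfits, which shows up as a larger cross-validated error. On noisier or higher-dimensional data the comparison can go the other way.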

Questions to ask when working on an ML project!
• How much data do I have? What type/nature of data?
• How skilled and knowledgeable am I in this domain?
  ◦ Will I be able to create more useful features from what I already have?
• How good am I at error analysis?

Questions to ask when working on an ML project!
• Assumptions, Limitations and Adoption (ALA rule) of the algorithm.
  ◦ Linear Regression (linear relationship, little or no multicollinearity, etc.)
  ◦ Why does this particular loss function make sense?
• How good am I at debugging the chosen learning algorithm?
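
One of the linear regression assumptions named above, little or no multicollinearity, can be screened for with pairwise correlations between input features. The sketch below is a rough first check only (it will not catch collinearity that involves three or more features at once). The feature names, the values, and the 0.9 threshold are hypothetical choices for illustration.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length feature columns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical house features; the two area columns carry the same information
area_sqft = [1000, 1500, 2000, 2500, 3000]
features = {
    "area_sqft": area_sqft,
    "area_sqm": [x * 0.0929 for x in area_sqft],
    "age_years": [30, 5, 22, 12, 18],
}

# Flag feature pairs whose absolute correlation exceeds the chosen threshold
THRESHOLD = 0.9
flagged = [
    (name_a, name_b)
    for name_a, name_b in combinations(features, 2)
    if abs(pearson(features[name_a], features[name_b])) > THRESHOLD
]
```

A flagged pair suggests dropping or combining one of the two features before trusting the fitted coefficients.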

Effective Machine Learning
• Reduce time spent programming (more experiments in a short time)
  ◦ Use off-the-shelf tools
• Customize and Scale Products
  ◦ Start simple, scale as needed (again, choice of relevant toolsets)
• Think like a Scientist
  ◦ Use statistics, not logic, to make decisions from real-world observations
* Slide content adapted from Google's Machine Learning Crash Course

Thank You
Ashish Sharma
Email: accssharma@gmail.com
/in/accssharma
@accssharma
AI Developers, Boise: https://github.com/aidevelopersboise/ai6-boise-materials
HomeCU is hiring Software Engineers and Mobile Developers.
https://www.homecu.net/company-jobs.html

Visual Demonstrations
• K-nearest neighbors: http://vision.stanford.edu/teaching/cs231n-demos/knn/
• CIFAR-10 image classification: https://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html
