際際滷

際際滷Share a Scribd company logo
Lunch	and	Learn	
At	
Data	Science	for	Social	Good
NOVA-SBE	and	University	of	Chicago
By
Manas	Gaur
Singular Value Decomposition
first right
singular vector
 Singular	Value	Decomposition	(SVD)	is	also	called	
Spectral	Decomposition
 Instead	of	using	two	coordinates	(, ) to	describe	point	
locations,	lets	use	only	one	coordinate	 
 Points	position	is	its	location	along	vector	 
 How	to	choose	 ?	Minimize	reconstruction	error
J.	Leskovec,	A.	Rajaraman,	J.	Ullman:	Mining	of	Massive	Datasets,	http://www.mmds.org
Singular Value Decomposition
 Goal:	Minimize	the	sum
of	reconstruction	errors:	
) ) +,  +,
/
0
,12
3
+12
 where	  are	the	old	and	 are	the	
new	coordinates
 SVD	gives	best	axis	to	project	on:
 best	=	minimizing	the	reconstruction	errors
 In	other	words,	minimum	reconstruction	error
J.	Leskovec,	A.	Rajaraman,	J.	Ullman:	Mining	of	Massive	Datasets,	http://www.mmds.org
Singular Value Decomposition
A	=	U	裡 VT	- example:
 V:	movie-to-concept	matrix
 U:	user-to-concept	matrix
= x x
1 1 1 0 0
3 3 3 0 0
4 4 4 0 0
5 5 5 0 0
0 2 0 4 4
0 0 0 5 5
0 1 0 2 2
0.13 0.02 -0.01
0.41 0.07 -0.03
0.55 0.09 -0.04
0.68 0.11 -0.05
0.15 -0.59 0.65
0.07 -0.73 -0.67
0.07 -0.29 0.32
12.4 0 0
0 9.5 0
0 0 1.3
0.56 0.59 0.56 0.09 0.09
0.12 -0.02 0.12 -0.69 -0.69
0.40 -0.80 0.40 0.09 0.09
variance (spread)
on the v1 axis
Movie 1 rating
Movie2rating
J.	Leskovec,	A.	Rajaraman,	J.	Ullman:	Mining	of	Massive	Datasets,	http://www.mmds.org
Singular Value Decomposition
A U
Sigma
VT
=
B U
Sigma
VT
=
B is best approximation of A
How Many Singular Values Should
We Retain?
 A useful rule of thumb is to
retain enough singular values
to make up 90% of the energy
in 裡.
 Sum of the squares of the
retained singular values should
be at least 90% of the sum of the
squares of all the singular values.
 Example: the total energy is
(12.4)2 + (9.5)2 + (1.3)2 =
245.70, while the retained
energy is (12.4)2 + (9.5)2 =
244.01.
 We have retained over 99% of the
energy. However, were we to
eliminate the second singular
value, 9.5, the retained energy
would be only (12.4)2/245.70 or
about 63%.J.	Leskovec,	A.	Rajaraman,	J.	Ullman:	Mining	of	Massive	Datasets,	http://www.mmds.org
Relation to Eigen-decomposition
 SVD	gives	us:
 A = U 裡 VT
 Eigen-decomposition:
 A = X  XT
 A	is	symmetric
 U,	V,	X	are	orthonormal	(UTU=I),
 , 裡 are	diagonal
 Now	lets	calculate:
 AAT= U裡 VT(U裡 VT)T = U裡 VT(V裡TUT) = U裡裡T UT
 ATA = V 裡T UT (U裡 VT) = V 裡裡T VT
X 2 XT
X 2 XT
Shows how to compute
SVD using eigenvalue
decomposition!
J.	Leskovec,	A.	Rajaraman,	J.	Ullman:	Mining	of	Massive	Datasets,	http://www.mmds.org
Non-Linear Dimensionality
Reduction
Brainstorming
 What	is	dimensionality	of	data	?
 What	is	degree	of	freedoms	of	data	?
 Is	the	data	always	exist	in	high-dimensional	space	?
 What	is	the	rank	of	a	matrix	?
 What	motivates	us	for	non-linear	dimensionality	reduction	?
 Can	the	deep	learnings	popular	MNIST	dataset	problem,	solvable	by	
simple	machine	learning	model	?
Why do we need dimensionality reduction?
 You	need	to	visualize	it	to	some	non-technical	board	members	which	
are	probably	not	familiar	with	:	terms	like	cosine	similarity	etc.
 Based	on	the	constraint,	such	as	preserve	80%	of	the	data.
 You	need	to	reduce	the	data	you	have	and	any	new	data	as	it	comes,	
which	method	would	you	choose?
Non-Linear Dimensionality Reduction
 Given a low dimensional surface embedded non-linearly in high dimensional space.
Such a surfaceiscalledManifold.
 Agood wayto representdatapointsisbytheir low-dimensionalcoordinates.
 The low-dimensional representation of the data should capture information about
high- dimensionalpairwisedistances.
 Non-lineardimensionalityreductionisalsocalledManifoldlearning.
 Idea :- Torecoverthe lowdimensionalsurface
ISOMAP
Stochastic	
Nearest	
Embedding
T-Stochastic
Nearest	
Embedding
NLDR over PCA
NLDR PCA
https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction
Isomap
 Isomapusesthesame core ideasastheMDS algorithm:
 Obtainamatrixofproximities(distancesbetweenpointsinadataset).
 Thisdistancematrixisamatrixofinnerproducts.
 AnEigendecompositionofthismatrixgivesusthelowerdimensionembedding.
Stochastic Neighbor Embedding (SNE)
,|+ =	

;(<=;<>)
/?=
@
@
 
;(<=;<B)
/?=
@
@
CD+
,|+ =
 F=;F>
@
  F=;F>
@
CD+
 =	
1
2
情(|  =	 )    	log	
  
  
	

F=,F>
情(||)
High	dimensional	space
Minimization	function
Low	dimensional	space	(2-D)
1.Large 倹| is modeled as Low  | 
High Cost
2.Small 倹| is modeled as High  | 
Low Cost
1.SNE is not Symmetric whereas
t-SNE is Symmetric.
2.Symmetricity makes t-SNE
fast.
T-Stochastic Neighbor Embedding (t-SNE)
,+ =	
(1 + ( +  ,
/
);2
 (1 + ( +  ,
/
);2
CD+
,+ =	

;(<=;<>)
/?@
@
 
;(<=;<B)
/?@
@
CD+
1. t-distribution has longer tails, embeds
more points in higher dimension to low
dimension.
2. There are some heuristics underlying t-
SNE.
3. Develops an intuition for whats going
on in the high dimensional data
4. Find structure where other
dimensionality-reduction algorithms
cannot
High	dimensional	space
Low	dimensional	space	(2-D)

More Related Content

Similar to Dimensionality Reduction : Above PCA (20)

DimensionalityReduction.pptx
DimensionalityReduction.pptxDimensionalityReduction.pptx
DimensionalityReduction.pptx
36rajneekant
Data Mining Lecture_9.pptx
Data Mining Lecture_9.pptxData Mining Lecture_9.pptx
Data Mining Lecture_9.pptx
Subrata Kumer Paul
Lect4 principal component analysis-I
Lect4 principal component analysis-ILect4 principal component analysis-I
Lect4 principal component analysis-I
hktripathy
Explainable algorithm evaluation from lessons in education
Explainable algorithm evaluation from lessons in educationExplainable algorithm evaluation from lessons in education
Explainable algorithm evaluation from lessons in education
CSIRO
Predictive Modelling
Predictive ModellingPredictive Modelling
Predictive Modelling
Rajiv Advani
Distributed Architecture of Subspace Clustering and Related
Distributed Architecture of Subspace Clustering and RelatedDistributed Architecture of Subspace Clustering and Related
Distributed Architecture of Subspace Clustering and Related
Pei-Che Chang
Data Science and Machine Learning with Tensorflow
 Data Science and Machine Learning with Tensorflow Data Science and Machine Learning with Tensorflow
Data Science and Machine Learning with Tensorflow
Shubham Sharma
cnn.pptx
cnn.pptxcnn.pptx
cnn.pptx
sghorai
Elhabian lda09
Elhabian lda09Elhabian lda09
Elhabian lda09
Mr. Neelamegam D
Paper study: Learning to solve circuit sat
Paper study: Learning to solve circuit satPaper study: Learning to solve circuit sat
Paper study: Learning to solve circuit sat
ChenYiHuang5
Aaa ped-17-Unsupervised Learning: Dimensionality reduction
Aaa ped-17-Unsupervised Learning: Dimensionality reductionAaa ped-17-Unsupervised Learning: Dimensionality reduction
Aaa ped-17-Unsupervised Learning: Dimensionality reduction
AminaRepo
Part2
Part2Part2
Part2
khawarbashir
Deep learning paper review ppt sourece -Direct clr
Deep learning paper review ppt sourece -Direct clr Deep learning paper review ppt sourece -Direct clr
Deep learning paper review ppt sourece -Direct clr
taeseon ryu
惆悽 悒 惠惺 悋悛悸
惆悽 悒 惠惺 悋悛悸惆悽 悒 惠惺 悋悛悸
惆悽 悒 惠惺 悋悛悸
Fares Al-Qunaieer
TPDM Presentation 際際滷 (ICCV23)
TPDM Presentation 際際滷 (ICCV23)TPDM Presentation 際際滷 (ICCV23)
TPDM Presentation 際際滷 (ICCV23)
Suhyeon Lee
clustering unsupervised learning and machine learning.pdf
clustering unsupervised learning and machine learning.pdfclustering unsupervised learning and machine learning.pdf
clustering unsupervised learning and machine learning.pdf
SameerAhmed721974
Unsupervised learning and clustering.pdf
Unsupervised learning and clustering.pdfUnsupervised learning and clustering.pdf
Unsupervised learning and clustering.pdf
officialnovice7
Declarative data analysis
Declarative data analysisDeclarative data analysis
Declarative data analysis
South West Data Meetup
Understanding Basics of Machine Learning
Understanding Basics of Machine LearningUnderstanding Basics of Machine Learning
Understanding Basics of Machine Learning
Pranav Ainavolu
Machine Learning on Azure - AzureConf
Machine Learning on Azure - AzureConfMachine Learning on Azure - AzureConf
Machine Learning on Azure - AzureConf
Seth Juarez
DimensionalityReduction.pptx
DimensionalityReduction.pptxDimensionalityReduction.pptx
DimensionalityReduction.pptx
36rajneekant
Lect4 principal component analysis-I
Lect4 principal component analysis-ILect4 principal component analysis-I
Lect4 principal component analysis-I
hktripathy
Explainable algorithm evaluation from lessons in education
Explainable algorithm evaluation from lessons in educationExplainable algorithm evaluation from lessons in education
Explainable algorithm evaluation from lessons in education
CSIRO
Predictive Modelling
Predictive ModellingPredictive Modelling
Predictive Modelling
Rajiv Advani
Distributed Architecture of Subspace Clustering and Related
Distributed Architecture of Subspace Clustering and RelatedDistributed Architecture of Subspace Clustering and Related
Distributed Architecture of Subspace Clustering and Related
Pei-Che Chang
Data Science and Machine Learning with Tensorflow
 Data Science and Machine Learning with Tensorflow Data Science and Machine Learning with Tensorflow
Data Science and Machine Learning with Tensorflow
Shubham Sharma
cnn.pptx
cnn.pptxcnn.pptx
cnn.pptx
sghorai
Paper study: Learning to solve circuit sat
Paper study: Learning to solve circuit satPaper study: Learning to solve circuit sat
Paper study: Learning to solve circuit sat
ChenYiHuang5
Aaa ped-17-Unsupervised Learning: Dimensionality reduction
Aaa ped-17-Unsupervised Learning: Dimensionality reductionAaa ped-17-Unsupervised Learning: Dimensionality reduction
Aaa ped-17-Unsupervised Learning: Dimensionality reduction
AminaRepo
Deep learning paper review ppt sourece -Direct clr
Deep learning paper review ppt sourece -Direct clr Deep learning paper review ppt sourece -Direct clr
Deep learning paper review ppt sourece -Direct clr
taeseon ryu
惆悽 悒 惠惺 悋悛悸
惆悽 悒 惠惺 悋悛悸惆悽 悒 惠惺 悋悛悸
惆悽 悒 惠惺 悋悛悸
Fares Al-Qunaieer
TPDM Presentation 際際滷 (ICCV23)
TPDM Presentation 際際滷 (ICCV23)TPDM Presentation 際際滷 (ICCV23)
TPDM Presentation 際際滷 (ICCV23)
Suhyeon Lee
clustering unsupervised learning and machine learning.pdf
clustering unsupervised learning and machine learning.pdfclustering unsupervised learning and machine learning.pdf
clustering unsupervised learning and machine learning.pdf
SameerAhmed721974
Unsupervised learning and clustering.pdf
Unsupervised learning and clustering.pdfUnsupervised learning and clustering.pdf
Unsupervised learning and clustering.pdf
officialnovice7
Understanding Basics of Machine Learning
Understanding Basics of Machine LearningUnderstanding Basics of Machine Learning
Understanding Basics of Machine Learning
Pranav Ainavolu
Machine Learning on Azure - AzureConf
Machine Learning on Azure - AzureConfMachine Learning on Azure - AzureConf
Machine Learning on Azure - AzureConf
Seth Juarez

Recently uploaded (20)

Updated Willow 2025 Media Deck_270225.pdf
Updated Willow 2025 Media Deck_270225.pdfUpdated Willow 2025 Media Deck_270225.pdf
Updated Willow 2025 Media Deck_270225.pdf
tangramcommunication
Industry_Use_Cases.ppt Industry_Use_Cases.ppt
Industry_Use_Cases.ppt Industry_Use_Cases.pptIndustry_Use_Cases.ppt Industry_Use_Cases.ppt
Industry_Use_Cases.ppt Industry_Use_Cases.ppt
ssuser3e8ddb
Data_Collection_Methods_ in researchppt.pdf
Data_Collection_Methods_ in researchppt.pdfData_Collection_Methods_ in researchppt.pdf
Data_Collection_Methods_ in researchppt.pdf
nishaaggarwal46
HIRE MUYERN TRUST HACKER FOR AUTHENTIC CYBER SERVICES
HIRE MUYERN TRUST HACKER FOR AUTHENTIC CYBER SERVICESHIRE MUYERN TRUST HACKER FOR AUTHENTIC CYBER SERVICES
HIRE MUYERN TRUST HACKER FOR AUTHENTIC CYBER SERVICES
anastasiapenova16
Monitoring Imam Ririn di Pilkada Kota Depok 2024
Monitoring Imam Ririn di Pilkada Kota Depok 2024Monitoring Imam Ririn di Pilkada Kota Depok 2024
Monitoring Imam Ririn di Pilkada Kota Depok 2024
Deddy Rahman
New Income Tax Bill - Capital Gains .pdf
New Income Tax Bill - Capital Gains .pdfNew Income Tax Bill - Capital Gains .pdf
New Income Tax Bill - Capital Gains .pdf
HarshilShah134194
International Journal on Cloud Computing: Services and Architecture (IJCCSA)
International Journal on Cloud Computing: Services and Architecture (IJCCSA)International Journal on Cloud Computing: Services and Architecture (IJCCSA)
International Journal on Cloud Computing: Services and Architecture (IJCCSA)
ijccsa
Pink Yellow Purple Lifestyle Vision Board Scribbles Doodles Whiteboard Pres_2...
Pink Yellow Purple Lifestyle Vision Board Scribbles Doodles Whiteboard Pres_2...Pink Yellow Purple Lifestyle Vision Board Scribbles Doodles Whiteboard Pres_2...
Pink Yellow Purple Lifestyle Vision Board Scribbles Doodles Whiteboard Pres_2...
jarreldelacruz31
2025 Trends: What Really Works in SEO and Content Marketing
2025 Trends: What Really Works in SEO and Content Marketing2025 Trends: What Really Works in SEO and Content Marketing
2025 Trends: What Really Works in SEO and Content Marketing
Dr. Sasidharan Murugan
G33this is the presentaion fo smart desing.pdf
G33this is the presentaion fo smart desing.pdfG33this is the presentaion fo smart desing.pdf
G33this is the presentaion fo smart desing.pdf
Li0nSinEscanor
Kaggle & Datathons: A Practical Guide to AI Competitions
Kaggle & Datathons: A Practical Guide to AI CompetitionsKaggle & Datathons: A Practical Guide to AI Competitions
Kaggle & Datathons: A Practical Guide to AI Competitions
rasheedsrq
9th Edition of International Research Awards
9th Edition of International Research Awards9th Edition of International Research Awards
9th Edition of International Research Awards
sciencereviewerview
Leveraging-Virtual-Reality-(VR)-and-Augmented-Reality-(AR)-for-Enhanced-Touri...
Leveraging-Virtual-Reality-(VR)-and-Augmented-Reality-(AR)-for-Enhanced-Touri...Leveraging-Virtual-Reality-(VR)-and-Augmented-Reality-(AR)-for-Enhanced-Touri...
Leveraging-Virtual-Reality-(VR)-and-Augmented-Reality-(AR)-for-Enhanced-Touri...
Krishna Khanal
Key Benefits of Implementing Contify's M&CI Platform
Key Benefits of Implementing Contify's M&CI PlatformKey Benefits of Implementing Contify's M&CI Platform
Key Benefits of Implementing Contify's M&CI Platform
Contify
Cybersecurity_Management_Presentation.pptx
Cybersecurity_Management_Presentation.pptxCybersecurity_Management_Presentation.pptx
Cybersecurity_Management_Presentation.pptx
rajkumarrch23
02 a movie weekend 2001 a space odyssey.pptx
02 a movie weekend 2001 a space odyssey.pptx02 a movie weekend 2001 a space odyssey.pptx
02 a movie weekend 2001 a space odyssey.pptx
kasmirsyariati
Herding behavior Experimental Studies --AlisonLo
Herding behavior Experimental Studies --AlisonLoHerding behavior Experimental Studies --AlisonLo
Herding behavior Experimental Studies --AlisonLo
AlisonKL
Plant Disease Prediction with Image Classification using CNN.pdf
Plant Disease Prediction with Image Classification using CNN.pdfPlant Disease Prediction with Image Classification using CNN.pdf
Plant Disease Prediction with Image Classification using CNN.pdf
Theekshana Wanniarachchi
vnptloveeeeeeeeeeeeeeeeeeeeeeeeeeee.pptx
vnptloveeeeeeeeeeeeeeeeeeeeeeeeeeee.pptxvnptloveeeeeeeeeeeeeeeeeeeeeeeeeeee.pptx
vnptloveeeeeeeeeeeeeeeeeeeeeeeeeeee.pptx
deomom129
Analyzing Consumer Spending Trends and Purchasing Behavior
Analyzing Consumer Spending Trends and Purchasing BehaviorAnalyzing Consumer Spending Trends and Purchasing Behavior
Analyzing Consumer Spending Trends and Purchasing Behavior
omololaokeowo1
Updated Willow 2025 Media Deck_270225.pdf
Updated Willow 2025 Media Deck_270225.pdfUpdated Willow 2025 Media Deck_270225.pdf
Updated Willow 2025 Media Deck_270225.pdf
tangramcommunication
Industry_Use_Cases.ppt Industry_Use_Cases.ppt
Industry_Use_Cases.ppt Industry_Use_Cases.pptIndustry_Use_Cases.ppt Industry_Use_Cases.ppt
Industry_Use_Cases.ppt Industry_Use_Cases.ppt
ssuser3e8ddb
Data_Collection_Methods_ in researchppt.pdf
Data_Collection_Methods_ in researchppt.pdfData_Collection_Methods_ in researchppt.pdf
Data_Collection_Methods_ in researchppt.pdf
nishaaggarwal46
HIRE MUYERN TRUST HACKER FOR AUTHENTIC CYBER SERVICES
HIRE MUYERN TRUST HACKER FOR AUTHENTIC CYBER SERVICESHIRE MUYERN TRUST HACKER FOR AUTHENTIC CYBER SERVICES
HIRE MUYERN TRUST HACKER FOR AUTHENTIC CYBER SERVICES
anastasiapenova16
Monitoring Imam Ririn di Pilkada Kota Depok 2024
Monitoring Imam Ririn di Pilkada Kota Depok 2024Monitoring Imam Ririn di Pilkada Kota Depok 2024
Monitoring Imam Ririn di Pilkada Kota Depok 2024
Deddy Rahman
New Income Tax Bill - Capital Gains .pdf
New Income Tax Bill - Capital Gains .pdfNew Income Tax Bill - Capital Gains .pdf
New Income Tax Bill - Capital Gains .pdf
HarshilShah134194
International Journal on Cloud Computing: Services and Architecture (IJCCSA)
International Journal on Cloud Computing: Services and Architecture (IJCCSA)International Journal on Cloud Computing: Services and Architecture (IJCCSA)
International Journal on Cloud Computing: Services and Architecture (IJCCSA)
ijccsa
Pink Yellow Purple Lifestyle Vision Board Scribbles Doodles Whiteboard Pres_2...
Pink Yellow Purple Lifestyle Vision Board Scribbles Doodles Whiteboard Pres_2...Pink Yellow Purple Lifestyle Vision Board Scribbles Doodles Whiteboard Pres_2...
Pink Yellow Purple Lifestyle Vision Board Scribbles Doodles Whiteboard Pres_2...
jarreldelacruz31
2025 Trends: What Really Works in SEO and Content Marketing
2025 Trends: What Really Works in SEO and Content Marketing2025 Trends: What Really Works in SEO and Content Marketing
2025 Trends: What Really Works in SEO and Content Marketing
Dr. Sasidharan Murugan
G33this is the presentaion fo smart desing.pdf
G33this is the presentaion fo smart desing.pdfG33this is the presentaion fo smart desing.pdf
G33this is the presentaion fo smart desing.pdf
Li0nSinEscanor
Kaggle & Datathons: A Practical Guide to AI Competitions
Kaggle & Datathons: A Practical Guide to AI CompetitionsKaggle & Datathons: A Practical Guide to AI Competitions
Kaggle & Datathons: A Practical Guide to AI Competitions
rasheedsrq
9th Edition of International Research Awards
9th Edition of International Research Awards9th Edition of International Research Awards
9th Edition of International Research Awards
sciencereviewerview
Leveraging-Virtual-Reality-(VR)-and-Augmented-Reality-(AR)-for-Enhanced-Touri...
Leveraging-Virtual-Reality-(VR)-and-Augmented-Reality-(AR)-for-Enhanced-Touri...Leveraging-Virtual-Reality-(VR)-and-Augmented-Reality-(AR)-for-Enhanced-Touri...
Leveraging-Virtual-Reality-(VR)-and-Augmented-Reality-(AR)-for-Enhanced-Touri...
Krishna Khanal
Key Benefits of Implementing Contify's M&CI Platform
Key Benefits of Implementing Contify's M&CI PlatformKey Benefits of Implementing Contify's M&CI Platform
Key Benefits of Implementing Contify's M&CI Platform
Contify
Cybersecurity_Management_Presentation.pptx
Cybersecurity_Management_Presentation.pptxCybersecurity_Management_Presentation.pptx
Cybersecurity_Management_Presentation.pptx
rajkumarrch23
02 a movie weekend 2001 a space odyssey.pptx
02 a movie weekend 2001 a space odyssey.pptx02 a movie weekend 2001 a space odyssey.pptx
02 a movie weekend 2001 a space odyssey.pptx
kasmirsyariati
Herding behavior Experimental Studies --AlisonLo
Herding behavior Experimental Studies --AlisonLoHerding behavior Experimental Studies --AlisonLo
Herding behavior Experimental Studies --AlisonLo
AlisonKL
Plant Disease Prediction with Image Classification using CNN.pdf
Plant Disease Prediction with Image Classification using CNN.pdfPlant Disease Prediction with Image Classification using CNN.pdf
Plant Disease Prediction with Image Classification using CNN.pdf
Theekshana Wanniarachchi
vnptloveeeeeeeeeeeeeeeeeeeeeeeeeeee.pptx
vnptloveeeeeeeeeeeeeeeeeeeeeeeeeeee.pptxvnptloveeeeeeeeeeeeeeeeeeeeeeeeeeee.pptx
vnptloveeeeeeeeeeeeeeeeeeeeeeeeeeee.pptx
deomom129
Analyzing Consumer Spending Trends and Purchasing Behavior
Analyzing Consumer Spending Trends and Purchasing BehaviorAnalyzing Consumer Spending Trends and Purchasing Behavior
Analyzing Consumer Spending Trends and Purchasing Behavior
omololaokeowo1

Dimensionality Reduction : Above PCA

  • 2. Singular Value Decomposition first right singular vector Singular Value Decomposition (SVD) is also called Spectral Decomposition Instead of using two coordinates (, ) to describe point locations, lets use only one coordinate Points position is its location along vector How to choose ? Minimize reconstruction error J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org
  • 3. Singular Value Decomposition Goal: Minimize the sum of reconstruction errors: ) ) +, +, / 0 ,12 3 +12 where are the old and are the new coordinates SVD gives best axis to project on: best = minimizing the reconstruction errors In other words, minimum reconstruction error J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org
  • 4. Singular Value Decomposition A = U 裡 VT - example: V: movie-to-concept matrix U: user-to-concept matrix = x x 1 1 1 0 0 3 3 3 0 0 4 4 4 0 0 5 5 5 0 0 0 2 0 4 4 0 0 0 5 5 0 1 0 2 2 0.13 0.02 -0.01 0.41 0.07 -0.03 0.55 0.09 -0.04 0.68 0.11 -0.05 0.15 -0.59 0.65 0.07 -0.73 -0.67 0.07 -0.29 0.32 12.4 0 0 0 9.5 0 0 0 1.3 0.56 0.59 0.56 0.09 0.09 0.12 -0.02 0.12 -0.69 -0.69 0.40 -0.80 0.40 0.09 0.09 variance (spread) on the v1 axis Movie 1 rating Movie2rating J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org
  • 5. Singular Value Decomposition A U Sigma VT = B U Sigma VT = B is best approximation of A How Many Singular Values Should We Retain? A useful rule of thumb is to retain enough singular values to make up 90% of the energy in 裡. Sum of the squares of the retained singular values should be at least 90% of the sum of the squares of all the singular values. Example: the total energy is (12.4)2 + (9.5)2 + (1.3)2 = 245.70, while the retained energy is (12.4)2 + (9.5)2 = 244.01. We have retained over 99% of the energy. However, were we to eliminate the second singular value, 9.5, the retained energy would be only (12.4)2/245.70 or about 63%.J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org
  • 6. Relation to Eigen-decomposition SVD gives us: A = U 裡 VT Eigen-decomposition: A = X XT A is symmetric U, V, X are orthonormal (UTU=I), , 裡 are diagonal Now lets calculate: AAT= U裡 VT(U裡 VT)T = U裡 VT(V裡TUT) = U裡裡T UT ATA = V 裡T UT (U裡 VT) = V 裡裡T VT X 2 XT X 2 XT Shows how to compute SVD using eigenvalue decomposition! J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org
  • 8. Brainstorming What is dimensionality of data ? What is degree of freedoms of data ? Is the data always exist in high-dimensional space ? What is the rank of a matrix ? What motivates us for non-linear dimensionality reduction ? Can the deep learnings popular MNIST dataset problem, solvable by simple machine learning model ?
  • 9. Why do we need dimensionality reduction? You need to visualize it to some non-technical board members which are probably not familiar with : terms like cosine similarity etc. Based on the constraint, such as preserve 80% of the data. You need to reduce the data you have and any new data as it comes, which method would you choose?
  • 10. Non-Linear Dimensionality Reduction Given a low dimensional surface embedded non-linearly in high dimensional space. Such a surfaceiscalledManifold. Agood wayto representdatapointsisbytheir low-dimensionalcoordinates. The low-dimensional representation of the data should capture information about high- dimensionalpairwisedistances. Non-lineardimensionalityreductionisalsocalledManifoldlearning. Idea :- Torecoverthe lowdimensionalsurface
  • 12. NLDR over PCA NLDR PCA https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction
  • 13. Isomap Isomapusesthesame core ideasastheMDS algorithm: Obtainamatrixofproximities(distancesbetweenpointsinadataset). Thisdistancematrixisamatrixofinnerproducts. AnEigendecompositionofthismatrixgivesusthelowerdimensionembedding.
  • 14. Stochastic Neighbor Embedding (SNE) ,|+ = ;(<=;<>) /?= @ @ ;(<=;<B) /?= @ @ CD+ ,|+ = F=;F> @ F=;F> @ CD+ = 1 2 情(| = ) log F=,F> 情(||) High dimensional space Minimization function Low dimensional space (2-D) 1.Large 倹| is modeled as Low | High Cost 2.Small 倹| is modeled as High | Low Cost 1.SNE is not Symmetric whereas t-SNE is Symmetric. 2.Symmetricity makes t-SNE fast.
  • 15. T-Stochastic Neighbor Embedding (t-SNE) ,+ = (1 + ( + , / );2 (1 + ( + , / );2 CD+ ,+ = ;(<=;<>) /?@ @ ;(<=;<B) /?@ @ CD+ 1. t-distribution has longer tails, embeds more points in higher dimension to low dimension. 2. There are some heuristics underlying t- SNE. 3. Develops an intuition for whats going on in the high dimensional data 4. Find structure where other dimensionality-reduction algorithms cannot High dimensional space Low dimensional space (2-D)