PCA (Principal Component Analysis) is a technique used to simplify complex data sets by reducing their dimensionality. It transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible. The document provides background on concepts like variance, covariance, and eigenvalues that are important to understanding PCA. It also includes an example of using PCA to analyze student data and identify the most important parameters to describe students.
1. PCA : Principal Component Analysis
Author : Nalini Yadav
Under Guidance of Prof. K. Rajeshwari
2. PCA
• A backbone of modern data analysis.
• A black box that is widely used but poorly understood.
• A mathematical tool from applied linear algebra.
• A simple, non-parametric method of extracting relevant information from confusing data sets.
• It provides a roadmap for how to reduce a complex data set to a lower dimension.
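As a minimal sketch of that reduction, assuming NumPy and an invented data array X (neither appears in the slides), PCA can be carried out by centering the data, taking the eigenvectors of its covariance matrix, and projecting onto the top few of them:

```python
import numpy as np

# Invented example data: 100 samples with 4 features, two of them correlated.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 1))
X = np.hstack([base,
               2 * base + rng.normal(scale=0.1, size=(100, 1)),
               rng.normal(size=(100, 2))])

k = 2  # number of principal components to keep

Xc = X - X.mean(axis=0)                 # 1. center each feature
C = np.cov(Xc, rowvar=False)            # 2. covariance matrix of the features
eigvals, eigvecs = np.linalg.eigh(C)    # 3. eigen-decomposition (C is symmetric)

order = np.argsort(eigvals)[::-1]       # 4. sort components by decreasing variance
W = eigvecs[:, order[:k]]

X_reduced = Xc @ W                      # 5. project onto the top k components
print(X_reduced.shape)                  # (100, 2)
```

eigh is used rather than eig because the covariance matrix is symmetric, which keeps the eigenvalues real.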
5. Background knowledge
• Finding the roots of the characteristic equation |A − λI| = 0 gives the eigenvalues, and for each eigenvalue there is a corresponding eigenvector.
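To make this concrete, here is a small check with NumPy on an arbitrary 2×2 matrix (the matrix A is an assumption, not taken from the slides): the roots of the characteristic polynomial coincide with the eigenvalues, and each eigenvector v satisfies A·v = λ·v.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])              # arbitrary symmetric example matrix

char_poly = np.poly(A)                  # coefficients of the characteristic polynomial
roots = np.roots(char_poly)             # its roots ...

eigvals, eigvecs = np.linalg.eig(A)     # ... equal the eigenvalues
print(np.sort(roots), np.sort(eigvals))

# Each eigenvector (column of eigvecs) satisfies A @ v = lambda * v.
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))  # True
```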
9. PCA : Example
• We collected m parameters about 100 students:
• Height
• Weight
• Hair color
• Average grade
• …
• We want to find the most important parameters that best describe a student.
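One way to sketch this example, assuming invented student values and the scikit-learn PCA API (neither is part of the slides): standardize the parameters, fit PCA, and read off how much variance each component explains and which parameters it loads on.

```python
import numpy as np
from sklearn.decomposition import PCA

# Invented values for 100 students: height_cm, weight_kg, hair_color_code, average_grade.
rng = np.random.default_rng(1)
height = rng.normal(170, 10, size=100)
weight = 0.9 * height - 80 + rng.normal(0, 5, size=100)  # correlated with height
hair = rng.integers(0, 4, size=100).astype(float)        # categorical code, little structure
grade = rng.normal(7, 1.5, size=100)
X = np.column_stack([height, weight, hair, grade])

# Put all parameters on a comparable scale before comparing their contributions.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

pca = PCA()
pca.fit(Xs)

print(pca.explained_variance_ratio_)  # variance captured by each component
print(pca.components_)                # how each original parameter contributes
```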
11. PCA : Example
• Which parameters can we ignore?
• Constant parameter (number of heads): 1, 1, ..., 1
• Constant parameter with some noise (thickness of hair): 0.003, 0.005, 0.002, ..., 0.0008 -> low variance
• Parameter that is linearly dependent on other parameters (head size and height): Z = aX + bY
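Each of these cases can be spotted numerically. A brief sketch on invented values (the arrays and specific numbers are assumptions, not from the slides): near-zero variance flags the constant and nearly constant parameters, and a correlation close to ±1 flags a parameter that is (almost) linearly dependent on another.

```python
import numpy as np

# Invented parameters for 100 students.
rng = np.random.default_rng(2)
heads = np.ones(100)                                # constant parameter
hair_thickness = 0.003 + rng.normal(0, 0.001, 100)  # constant plus small noise
height = rng.normal(170, 10, 100)
head_size = 0.1 * height + rng.normal(0, 0.2, 100)  # ~linearly dependent on height
X = np.column_stack([heads, hair_thickness, height, head_size])

# Zero or near-zero variance -> (nearly) constant parameter, safe to ignore.
print(X.var(axis=0))

# Correlation near +/-1 -> near-linear dependence between two parameters
# (the constant column is left out because its correlation is undefined).
print(np.corrcoef(X[:, 2:], rowvar=False))
```

Dependence on several parameters at once, as in Z = aX + bY, shows up as a near-zero eigenvalue of the covariance matrix, which is exactly the redundancy PCA removes.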