finding an affine transformation minimizing the reconstruction error
equivalently, finding an affine transformation maximizing the variance of the projections (derivation below)
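To see why these two formulations are equivalent, here is a short derivation (my notation, not from the notes: centered samples $\tilde{x}_i$, a single unit-norm direction $w$). By Pythagoras, each sample splits into its projection onto $w$ plus an orthogonal residual:

$$
\frac{1}{n}\sum_{i=1}^{n}\|\tilde{x}_i\|^2
= \underbrace{\frac{1}{n}\sum_{i=1}^{n}\left(w^{\top}\tilde{x}_i\right)^2}_{\text{variance of the projections}}
+ \underbrace{\frac{1}{n}\sum_{i=1}^{n}\left\|\tilde{x}_i - w\,w^{\top}\tilde{x}_i\right\|^2}_{\text{reconstruction error}}
$$

The left-hand side does not depend on $w$, so maximizing the projected variance is exactly the same as minimizing the reconstruction error.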
you do not necessarily lose information by reducing the number of dimensions: if the discarded directions carry no variance, you have only changed the representation of the data
we want the dimensions to be orthogonal to each other
the first principal component vector is a normalized eigenvector associated with the largest eigenvalue of the sample covariance matrix $\tilde{X}\tilde{X}^{T}$, where $\tilde{X}$ is the centered data matrix
fraction of variance kept when retaining the $r$ first (largest) eigenvalues: $\frac{\sum_{i=0}^{r-1}\lambda_i}{\sum_{i=0}^{N-1}\lambda_i}$, with $\lambda_0 \geq \lambda_1 \geq \cdots \geq \lambda_{N-1}$
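A minimal NumPy sketch of this criterion (the toy data and variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # toy data: 200 samples, 5 features

X_tilde = X - X.mean(axis=0)             # center each feature
cov = X_tilde.T @ X_tilde / len(X_tilde) # sample covariance matrix

# eigvalsh: eigenvalues of a symmetric matrix, ascending -> reverse to descending
eigvals = np.linalg.eigvalsh(cov)[::-1]

r = 2
kept = eigvals[:r].sum() / eigvals.sum() # fraction of variance kept by r components
print(f"keeping the {r} first components retains {kept:.1%} of the variance")
```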
Algorithm:
Principal Components are the eigenvectors of the covariance matrix of the original dataset
covariance matrix is symmetric => eigenvectors are orthogonal
big eigenvalue => big variance, we want to keep this dimension
standardize the data (zero mean, unit variance for each feature)
compute cov matrix
compute eigenvalues/vectors
sort eigenvalues and keep the $K$ highest eigenvalues and corresponding eigenvectors
normalize eigenvectors
project the data: compute $\tilde{X}(PC_1|PC_2|\ldots|PC_K)$, the product of the data with the matrix whose columns are the kept eigenvectors (see the sketch below)
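Putting the steps together, a minimal from-scratch sketch (NumPy; the function and variable names are mine, not from the notes):

```python
import numpy as np

def pca(X: np.ndarray, k: int) -> tuple[np.ndarray, np.ndarray]:
    """Project X onto its k first principal components.

    Returns (projected data, eigenvalues sorted in descending order).
    """
    # 1. standardize: zero mean, unit variance per feature
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)

    # 2. covariance matrix of the standardized data
    cov = np.cov(X_std, rowvar=False)

    # 3. eigenvalues/eigenvectors (eigh: for symmetric matrices)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # 4. sort descending and keep the k largest;
    #    eigh already returns unit-norm eigenvectors
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :k]  # columns are (PC1|PC2|...|PCk)

    # 5. project: X (PC1|PC2|...|PCk)
    return X_std @ W, eigvals

# usage on toy data
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
Z, eigvals = pca(X, k=2)
print(Z.shape)                            # (100, 2)
print(eigvals[:2].sum() / eigvals.sum())  # fraction of variance kept
```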
Questions
In what case is a PCA lossless?
If you losslessly reduce your data dimensionality from 3 to 2 with PCA, then one eigenvalue of the covariance matrix must have been zero.
Remember that a large eigenvalue means there's a lot of variance along that axis.
So a zero eigenvalue means there is no variance at all along that axis: the data never varies in that direction, so dropping it loses nothing.
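A quick numerical check of this claim (toy construction of mine): 3-D points that actually lie on a 2-D plane give a covariance matrix with one (numerically) zero eigenvalue, and projecting onto the two remaining components reconstructs the centered data exactly.

```python
import numpy as np

rng = np.random.default_rng(2)

# 3-D points constrained to a 2-D plane: z = x + y
xy = rng.normal(size=(100, 2))
X = np.column_stack([xy, xy.sum(axis=1)])

X_tilde = X - X.mean(axis=0)
cov = np.cov(X_tilde, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # ascending order
print(eigvals)                           # smallest eigenvalue is ~0

# keep the 2 components with nonzero eigenvalues, then reconstruct
W = eigvecs[:, 1:]                       # drop the zero-eigenvalue direction
X_rec = (X_tilde @ W) @ W.T              # back to 3-D
print(np.allclose(X_tilde, X_rec))       # True: the reduction was lossless
```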