Let’s say you want to recognise digits
MNIST: a very famous dataset of handwritten digits (can be loaded through scikit-learn)
Let’s say you want to use the large training set of example images (128×128 pixels each)
So that when I draw you a new digit, you can tell what it is!
© -Trenn, King’s College London 2
Problem: Each digit has 128 × 128 = 16,384 features/dimensions
Is there a nice way to reduce the number of features/dimensions?
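To make the feature count concrete, here is a minimal sketch, assuming each image is stored as a 128×128 NumPy array (the all-zero image is only a placeholder): flattening it gives one feature per pixel.

```python
import numpy as np

# Placeholder image: 128x128 pixels, all zero (illustration only)
image = np.zeros((128, 128))

# Flatten the grid into a single feature vector, one entry per pixel
features = image.reshape(-1)

print(features.shape)  # (16384,)
```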
A cool way of doing this
We try to find two components (each is a linear combination of the original features) that keep as much of the variation in the data as possible
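A minimal sketch of this idea with scikit-learn. As an assumption, it uses the small `load_digits` dataset bundled with scikit-learn (8×8 images, so 64 features) as a stand-in for the larger images on the slide; the variable names are illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Stand-in dataset (assumption): 1797 digit images of 8x8 pixels,
# already flattened to 64 features each
digits = load_digits()
X = digits.data                      # shape (1797, 64)

# Ask PCA for the two components that capture the most variance
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)          # shape (1797, 2)

# Each component is a vector of 64 weights, one per original feature
print(pca.components_.shape)         # (2, 64)

# Fraction of the total variance kept by each of the two components
print(pca.explained_variance_ratio_)
```

Every image is now described by two numbers instead of 64, so the whole dataset can be drawn as a 2D scatter plot.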
PCA
The red cross is a new input
Easy to see where it belongs
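One way to turn "see where it belongs" into code is a sketch like the following. The data is synthetic (two made-up classes in 10 dimensions), and the nearest-centroid rule is an assumption chosen for illustration, not necessarily the method behind the plot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the digit data: two classes in 10 dimensions
class_a = rng.normal(loc=0.0, scale=1.0, size=(100, 10))
class_b = rng.normal(loc=5.0, scale=1.0, size=(100, 10))
X = np.vstack([class_a, class_b])
y = np.array([0] * 100 + [1] * 100)

# Fit PCA by hand: centre the data, keep the top two right singular vectors
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:2]                       # shape (2, 10)

def project(points):
    """Map points from 10 dimensions onto the two components."""
    return (points - mean) @ components.T

# Centre of each class in the 2D projection
X_2d = project(X)
centroids = np.array([X_2d[y == k].mean(axis=0) for k in (0, 1)])

# The "red cross": a new point, projected with the same mean and components
new_input = np.full((1, 10), 5.0)
new_2d = project(new_input)

# Assign it to the class whose 2D centroid is closest
predicted = int(np.argmin(np.linalg.norm(centroids - new_2d, axis=1)))
print(predicted)
```

The important detail is that the new input is projected with the mean and components learned from the training data, not refitted.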
PCA / SVD
Advantages
State-of-the-art for many applications (supervised and unsupervised)
Incredibly efficient (often almost linear time)
Strong theoretical background
Can also be used to store data in a more efficient way (image compression)
Visual evaluation possible for a small number of components (say 2)
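The image compression point can be sketched with a truncated SVD. The image below is synthetic and the choice `k = 3` is an assumption; this toy image has rank at most 3, so the reconstruction is essentially exact, whereas a real photo would lose some detail.

```python
import numpy as np

# Synthetic 128x128 "image": a smooth gradient plus a bright square
rows, cols = np.mgrid[0:128, 0:128]
image = (rows + cols) / 254.0
image[40:80, 40:80] += 1.0

# Full SVD of the image matrix
U, s, Vt = np.linalg.svd(image, full_matrices=False)

# Keep only the k largest singular values and their vectors
k = 3
compressed = (U[:, :k] * s[:k]) @ Vt[:k]

# Storage: k columns of U, k rows of Vt and k singular values
original_numbers = image.size                 # 128 * 128 = 16384
stored_numbers = k * (128 + 128 + 1)          # 771

# Relative reconstruction error (Frobenius norm)
relative_error = np.linalg.norm(image - compressed) / np.linalg.norm(image)
print(original_numbers, stored_numbers, relative_error)
```

Increasing `k` trades storage for reconstruction quality.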
Small disclaimer: PCA and SVD (singular value decomposition) are slightly different but very similar; we’ll look at PCA (which is often computed using SVD)
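A small numerical check of how the two relate, on synthetic data (the column scales are arbitrary assumptions): PCA as an eigendecomposition of the covariance matrix gives the same variances and directions as an SVD of the centred data, up to the sign of each direction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 5 features with different spreads
X = rng.normal(size=(200, 5)) * np.array([5.0, 4.0, 3.0, 2.0, 1.0])
Xc = X - X.mean(axis=0)                  # PCA always works on centred data
n = Xc.shape[0]

# Route 1: eigendecomposition of the covariance matrix
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Route 2: SVD of the centred data matrix
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Variances along the components agree: eigenvalue = s^2 / (n - 1)
same_variances = np.allclose(eigvals, s**2 / (n - 1))

# Directions agree up to sign: the overlap matrix is diagonal with +-1 entries
same_directions = np.allclose(np.abs(eigvecs.T @ Vt.T), np.eye(5), atol=1e-6)

print(same_variances, same_directions)
```

The difference is mainly the centring step: SVD can be applied to any matrix, while PCA is the SVD of the data after subtracting the mean of each feature.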