LECTURE 4 TERM 2:
MSIN0097
Predictive Analytics
A P MOORE
SYSTEMS DESIGN
Original problem
DEALING WITH DIFFICULT PROBLEMS
— Improving bad solutions
– Start with a bad solution (weak learner) and improve it
– Build up a better solution by thinking about how partial solutions can support/correct each other's mistakes
— Make the problem simpler
– Divide and conquer
– Problem decomposition
— Building much better solutions
– Deep models
ENSEMBLES
IMPROVING BAD SOLUTIONS
— Start with a bad solution (weak learner) and improve it
— Build up a better solution by thinking about how partial solutions can support/correct each other's mistakes
ENSEMBLES
IMPROVING BAD SOLUTIONS
— Voting
– Majority voting
— Bagging and Pasting
– Out-of-bag evaluation
— Boosting
– Adaptive Boosting (AdaBoost)
– Gradient Boosting
– XGBoost
— Stacking
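Stacking, the last item above, trains a meta-learner on the predictions of the base learners. A minimal sketch assuming scikit-learn's StackingClassifier; the dataset and base models are illustrative choices, not from the slides.

```python
# Stacking: base learners' out-of-fold predictions become the inputs to a
# final meta-learner (illustrative dataset and models).
from sklearn.datasets import make_moons
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

stack_clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", SVC(probability=True, random_state=42)),
    ],
    final_estimator=LogisticRegression(),  # meta-learner trained on base predictions
    cv=5,                                   # out-of-fold predictions feed the meta-learner
)
stack_clf.fit(X_train, y_train)
print(stack_clf.score(X_test, y_test))
```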
MAJORITY VOTING
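A minimal sketch of hard majority voting, assuming scikit-learn's VotingClassifier and an illustrative toy dataset (not from the slides).

```python
# Hard majority voting: each classifier casts one vote, the majority class wins.
from sklearn.datasets import make_moons
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression()),
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", SVC(random_state=42)),
    ],
    voting="hard",  # count class votes; "soft" would average predicted probabilities
)
voting_clf.fit(X_train, y_train)
print(voting_clf.score(X_test, y_test))
```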
BAGGING
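A minimal sketch of bagging with out-of-bag evaluation (base learner and dataset are illustrative); setting bootstrap=False would give pasting instead.

```python
# Bagging: train many copies of a weak learner on bootstrap samples and
# aggregate their predictions; evaluate each learner on the instances it never saw.
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=500,
    bootstrap=True,      # sampling with replacement = bagging; False = pasting
    oob_score=True,      # score each tree on its out-of-bag instances
    random_state=42,
)
bag_clf.fit(X_train, y_train)
print(bag_clf.oob_score_)             # out-of-bag estimate of generalization accuracy
print(bag_clf.score(X_test, y_test))  # compare with the held-out test set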
GRADIENT BOOSTING: FITTING RESIDUAL ERRORS
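The idea on this slide, sketched by hand on toy data: each new regression tree is fitted to the residual errors left by the trees before it, and the ensemble's prediction is the sum of the trees' outputs (tree depths and data are illustrative).

```python
# Gradient boosting for regression, built manually: fit each tree to the
# residuals of the ensemble so far.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(42)
X = rng.rand(200, 1) - 0.5
y = 3 * X[:, 0] ** 2 + 0.05 * rng.randn(200)   # noisy quadratic, illustrative

tree1 = DecisionTreeRegressor(max_depth=2, random_state=42)
tree1.fit(X, y)

residuals1 = y - tree1.predict(X)              # what the first tree got wrong
tree2 = DecisionTreeRegressor(max_depth=2, random_state=42)
tree2.fit(X, residuals1)

residuals2 = residuals1 - tree2.predict(X)     # what the first two trees got wrong
tree3 = DecisionTreeRegressor(max_depth=2, random_state=42)
tree3.fit(X, residuals2)

X_new = np.array([[0.25]])
y_pred = sum(tree.predict(X_new) for tree in (tree1, tree2, tree3))
print(y_pred)  # prediction = sum of all the trees' corrections
```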
DECOMPOSITION
STARTING WITH EASIER PROBLEMS
— Start with a hard Problem
— Break the problem into a lot of easier sub-tasks
— Design each subtask so that it makes the analysis in the subsequent tasks easier
A – B – C – D ALGORITHMIC APPROACHES
A. Classification
B. Regression
Supervised – we know what the right answer is
C. Clustering
D. Decomposition
Unsupervised – we don't know what the right answer is, but we can recognize a good answer if we find it
MOTIVATING DECOMPOSITION
COMPRESSION
D. DECOMPOSITION 1. PROJECTION METHODS
Dimensionality reduction
D. DECOMPOSITION 2. KERNEL METHODS
D. DECOMPOSITION 3. MANIFOLD LEARNING
CURSE OF DIMENSIONALITY
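A small numerical illustration (a toy example of mine, not from the slides) of why high dimensions are hard: distances between random points concentrate, so "nearest" neighbours stop being meaningfully near.

```python
# Curse of dimensionality: as dimension grows, random points in the unit
# hypercube become almost equally far apart.
import numpy as np

rng = np.random.RandomState(42)
for d in (2, 10, 100, 1000):
    X = rng.rand(1000, d)
    # distances from the first point to all the others
    dists = np.linalg.norm(X[1:] - X[0], axis=1)
    spread = (dists.max() - dists.min()) / dists.mean()
    print(f"d={d:4d}  mean distance={dists.mean():6.2f}  relative spread={spread:.2f}")
```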
SUBSPACES
MOTIVATING DECOMPOSITION
LOW DIMENSIONAL SUBSPACES
DECOMPOSITION
THREE APPROACHES
— Dimensionality Reduction / Projection
— Kernel Methods
— Manifold Learning
B. REGRESSION: REAL-VALUED VARIABLE
MOTIVATING PROJECTION: INSTABILITY
FINDING THE RIGHT DIMENSION
SUBSPACES
PROJECTION IN MULTIPLE DIMENSIONS
REDUCTION TO A SINGLE DIMENSION
COMPRESSION
MNIST 95% VARIANCE PRESERVED
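A minimal sketch of the compression on this slide, assuming MNIST is fetched from OpenML (the slide does not specify the loading code): PCA keeps the smallest number of components that preserve 95% of the variance.

```python
# PCA compression of MNIST preserving 95% of the variance.
from sklearn.datasets import fetch_openml
from sklearn.decomposition import PCA

mnist = fetch_openml("mnist_784", as_frame=False)
X = mnist.data

pca = PCA(n_components=0.95)      # keep the smallest number of components
X_reduced = pca.fit_transform(X)  # explaining >= 95% of the variance

print(X.shape, "->", X_reduced.shape)       # (70000, 784) -> roughly (70000, 150)
print(pca.explained_variance_ratio_.sum())  # ~0.95

# the compressed data can be mapped back (lossily) to 784 dimensions
X_recovered = pca.inverse_transform(X_reduced)
```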
PROBLEMS WITH PROJECTION
KERNEL METHODS
Kernel spaces
KERNEL PCA
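A minimal sketch of Kernel PCA, using an RBF kernel to unroll a Swiss roll; the dataset and hyperparameters are illustrative choices.

```python
# Kernel PCA: the kernel trick lets PCA capture non-linear structure.
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import KernelPCA

X, t = make_swiss_roll(n_samples=1000, noise=0.2, random_state=42)

rbf_pca = KernelPCA(n_components=2, kernel="rbf", gamma=0.04)
X_reduced = rbf_pca.fit_transform(X)
print(X_reduced.shape)  # (1000, 2)
```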
MANIFOLD METHODS
Manifold learning
MANIFOLD LEARNING
OTHER TECHNIQUES
LOCALLY LINEAR EMBEDDING (LLE)
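A minimal sketch of Locally Linear Embedding with scikit-learn: each point is reconstructed from its nearest neighbours, and a low-dimensional embedding that preserves those local relationships is found (dataset and parameters are illustrative).

```python
# LLE: unroll a Swiss roll by preserving local neighbourhood relationships.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, t = make_swiss_roll(n_samples=1000, noise=0.2, random_state=42)

lle = LocallyLinearEmbedding(n_components=2, n_neighbors=10, random_state=42)
X_unrolled = lle.fit_transform(X)
print(X_unrolled.shape)  # (1000, 2)
```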
DECOMPOSITION METHODS
— Random Projections
— Multidimensional Scaling (MDS)
— Isomap
— Linear Discriminant Analysis (LDA)
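Brief, hedged sketches of the techniques listed above, using their scikit-learn implementations on an illustrative dataset (note that LDA is supervised, so it also needs the labels).

```python
# One-line uses of the other decomposition methods on the digits dataset.
from sklearn.datasets import load_digits
from sklearn.random_projection import GaussianRandomProjection
from sklearn.manifold import MDS, Isomap
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_digits(return_X_y=True)   # 1797 x 64, ten classes

X_rp  = GaussianRandomProjection(n_components=30, random_state=42).fit_transform(X)
X_mds = MDS(n_components=2, random_state=42).fit_transform(X[:500])     # MDS is O(n^2), so subsample
X_iso = Isomap(n_components=2).fit_transform(X)
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)  # supervised: uses labels

print(X_rp.shape, X_mds.shape, X_iso.shape, X_lda.shape)
```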
ADVANTAGES
The main motivations for dimensionality reduction are:
— To speed up a subsequent training algorithm (in some cases it may even remove noise and redundant features, making the training algorithm perform better).
— To visualize the data and gain insight into the most important features.
— Simply to save space (compression).
DISADVANTAGES
The main drawbacks are:
— Some information is lost, possibly degrading the performance of subsequent training algorithms.
— It can be computationally intensive.
— It adds some complexity to your Machine Learning pipelines.
— Transformed features are often hard to interpret.
WHEN IT DOESN’T WORK
IMPLICIT ASSUMPTION: IT MAKES THE PROBLEM EASIER
EMBEDDING PROJECTOR
GOOGLE BRAIN TEAM 2016