LECTURE 2 TERM 2:
MSIN0097
Predictive Analytics Video 5: Classification
A P MOORE
CLASSIFICATION
Supervised: A. Classification, B. Regression
Unsupervised: C. Clustering, D. Decomposition
A. CLASSIFICATION CATEGORICAL VARIABLE
GRID (PIXEL) DATA
VARIATION IN DIGITS
PARTITIONING DATA
K-FOLD VALIDATION
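A minimal sketch of how k-fold validation partitions the data, assuming samples are addressed by index and are not shuffled. The function name `k_fold_indices` is illustrative and does not come from the lecture.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Hypothetical example: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

Each sample lands in exactly one validation fold, so every sample is used for validation once and for training k - 1 times.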
CONFUSION MATRIX BINARY FORCED CHOICE
[Figure: confusion matrices for Model 1 and Model 2; axes: predicted vs. actual]
CONFUSION MATRIX
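A small sketch of how the four cells of a binary confusion matrix can be counted from paired labels. The label lists are made-up illustration data, and `binary_confusion_counts` is a hypothetical helper name.

```python
def binary_confusion_counts(actual, predicted, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification task."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


# Hypothetical labels, for illustration only.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
tp, fp, fn, tn = binary_confusion_counts(actual, predicted)
print(tp, fp, fn, tn)
```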
SINGLE METRICS
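\[ \text{Precision} = \frac{TP}{TP + FP} \qquad \text{Recall} = \frac{TP}{TP + FN} \]
\[ F_1 = \frac{2}{\frac{1}{\text{Precision}} + \frac{1}{\text{Recall}}} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{TP}{TP + \frac{FN + FP}{2}} \]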
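The single metrics above can be computed directly from confusion-matrix counts. This is a sketch with made-up counts; returning 0.0 when a denominator is zero is an assumption of this example, not something the slides specify.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
precision, recall, f1 = precision_recall_f1(tp=6, fp=2, fn=4)
print(precision, recall, f1)
```

F1 is the harmonic mean of precision and recall, so it is only high when both are high.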
SENSITIVITY, SPECIFICITY…
https://en.wikipedia.org/wiki/Confusion_matrix
Prevalence
Accuracy
Positive predictive value
False discovery rate
False omission rate
Diagnostic odds ratio…
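All of the rates listed here derive from the same four counts. A sketch follows, using their standard definitions and made-up counts; it assumes every denominator is non-zero. The function name `confusion_rates` is illustrative.

```python
def confusion_rates(tp, fp, fn, tn):
    """Derive common rates from binary confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "prevalence": (tp + fn) / total,
        "accuracy": (tp + tn) / total,
        "positive_predictive_value": tp / (tp + fp),  # precision
        "false_discovery_rate": fp / (tp + fp),
        "false_omission_rate": fn / (fn + tn),
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }


# Hypothetical counts, for illustration only.
rates = confusion_rates(tp=6, fp=2, fn=4, tn=8)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```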
PRECISION / RECALL TRADE OFF
Decision function
DECISION THRESHOLD
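A sketch of the precision/recall trade-off: the classifier produces a decision score per instance, and the threshold decides which scores count as positive. The scores and labels are invented for illustration. Reporting precision as 1.0 when nothing is predicted positive is a convention assumed here.

```python
def precision_recall_at(threshold, scores, actual):
    """Precision and recall when predicting positive for score >= threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    # Assumed convention: precision is 1.0 when there are no positive predictions.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical decision scores and true labels.
scores = [-2.0, -1.0, -0.5, 0.2, 0.8, 1.5, 2.5, 3.0]
actual = [0, 0, 1, 0, 1, 0, 1, 1]

for threshold in (-1.5, 0.0, 2.0):
    p, r = precision_recall_at(threshold, scores, actual)
    print(f"threshold={threshold:+.1f} precision={p:.2f} recall={r:.2f}")
```

On this data, raising the threshold increases precision and lowers recall. Plotting precision against recall over all thresholds gives the PR curve.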
PR CURVE
ROC CURVE & AUC
https://en.wikipedia.org/wiki/Confusion_matrix
AUC – MODEL SELECTION
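One way to compare classifiers by AUC, sketched without plotting. It uses the rank-based form of the area under the ROC curve: the fraction of (positive, negative) pairs in which the positive instance receives the higher score, with ties counted as half. The two score lists are hypothetical model outputs.

```python
def auc(scores, actual):
    """Area under the ROC curve via pairwise ranking of positives vs negatives."""
    positives = [s for s, a in zip(scores, actual) if a == 1]
    negatives = [s for s, a in zip(scores, actual) if a == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical true labels and scores from two models.
actual = [0, 0, 1, 0, 1, 0, 1, 1]
model_1_scores = [-2.0, -1.0, -0.5, 0.2, 0.8, 1.5, 2.5, 3.0]
model_2_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.6]

auc_1 = auc(model_1_scores, actual)
auc_2 = auc(model_2_scores, actual)
print("Model 1 AUC:", auc_1)
print("Model 2 AUC:", auc_2)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the model with the higher AUC ranks positives above negatives more often.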
BINARY CLASSIFICATION
MULTICLASS
— one-versus-the-rest (OvR) strategy (also called one-versus-all)
— one-versus-one (OvO) strategy
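The two strategies differ in how many binary classifiers they require. A sketch that only enumerates the binary tasks (no training), assuming the ten digit classes 0 to 9; the function names are illustrative.

```python
from itertools import combinations


def ovr_tasks(classes):
    """One binary task per class: that class versus all the others."""
    return [(c, "rest") for c in classes]


def ovo_tasks(classes):
    """One binary task per unordered pair of classes."""
    return list(combinations(classes, 2))


digits = list(range(10))  # assumed: the ten digit classes
print("OvR classifiers:", len(ovr_tasks(digits)))
print("OvO classifiers:", len(ovo_tasks(digits)))
```

For N classes, OvR needs N classifiers and OvO needs N(N - 1)/2, but each OvO classifier is trained only on the data for its two classes.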
CONFUSION MATRIX
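A sketch of a multiclass confusion matrix, with one row per actual class and one column per predicted class. The labels are invented and cover only three digit classes to keep the output small.

```python
def confusion_matrix(actual, predicted, classes):
    """Count instances by (actual class row, predicted class column)."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical labels, for illustration only.
actual = [3, 3, 3, 5, 5, 5, 5, 8]
predicted = [3, 5, 3, 5, 3, 5, 5, 8]
matrix = confusion_matrix(actual, predicted, classes=[3, 5, 8])
for row in matrix:
    print(row)
```

Correct predictions sit on the main diagonal; everything off the diagonal is a confusion between two classes.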
CONFUSION IMAGE
CONFUSION ERRORS
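To focus on errors rather than raw counts, one common approach is to divide each row of the confusion matrix by the number of instances in that actual class and zero out the diagonal. A sketch on a small hypothetical matrix (rows are actual classes, columns are predicted classes):

```python
def error_rates(matrix):
    """Row-normalise a confusion matrix and blank the diagonal."""
    rates = []
    for i, row in enumerate(matrix):
        total = sum(row)
        rates.append([
            0.0 if i == j or total == 0 else count / total
            for j, count in enumerate(row)
        ])
    return rates


# Hypothetical confusion matrix for the classes 3, 5 and 8.
matrix = [[2, 1, 0],
          [1, 3, 0],
          [0, 0, 1]]
errors = error_rates(matrix)
for row in errors:
    print([round(rate, 2) for rate in row])
```

Row normalisation stops classes with many instances from dominating the picture, so the largest remaining values mark the class pairs most worth inspecting.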
ANALYZING ERRORS 3 VS 5
            Predicted 1   Predicted 0
Actual 1    TP            FN
Actual 0    FP            TN
FAILURE MODES – GENERALIZATION
CLASSIFICATION
— Multilabel Classification
— Multioutput Classification
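A sketch of what the targets look like in each setting. In multilabel classification each instance gets several binary labels; in multioutput classification each of those labels can itself take more than two values. The two labels (large digit, odd digit) and the pixel-noise setup are hypothetical choices for illustration.

```python
import random


def multilabel_targets(digit):
    """Two hypothetical binary labels per digit: [is_large, is_odd]."""
    return [digit >= 7, digit % 2 == 1]


def noisy_copy(pixels, rng, max_noise=50):
    """Add non-negative integer noise to each pixel, capped at 255."""
    return [min(255, p + rng.randint(0, max_noise)) for p in pixels]


print(multilabel_targets(9))  # both labels apply
print(multilabel_targets(4))  # neither label applies

# Multioutput: the input is a noisy image and the target is the clean image,
# i.e. one label per pixel, each label taking a value from 0 to 255.
clean = [0, 128, 255, 64]  # hypothetical 2x2 image, flattened
rng = random.Random(0)     # seeded so the run is reproducible
noisy = noisy_copy(clean, rng)
print("input:", noisy, "target:", clean)
```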
DENOISING
REVIEW
— Select good metrics for classification tasks
— How to pick the appropriate precision/recall trade-off
— How to compare classifiers
— Different classification systems for a variety of tasks
— What business problems can you think of that are classification tasks?
— Can you think of some business problems that are multilabel and multioutput?
TEACHING TEAM
Dr Alastair Moore Senior Teaching Fellow
a.p.moore@ucl.ac.uk
@latticecut
Kamil Tylinski Teaching Assistant
kamil.tylinski.16@ucl.ac.uk
Jiangbo Shangguan Teaching Assistant
j.shangguan.17@ucl.ac.uk