
LECTURE 3 TERM 2:
MSIN0097
Predictive Analytics
A P MOORE

MSIN0097
Individual coursework

INDIVIDUAL COURSEWORK

MSIN0097
Group coursework

COURSEWORK / INDUSTRY REPORT

MACHINE LEARNING JARGON
— Model
— Interpolating / Extrapolating
— Data Bias
— Noise / Outliers
— Learning algorithm
— Inference algorithm
— Supervised learning
— Unsupervised learning
— Classification
— Regression
— Clustering
— Decomposition
— Parameters
— Optimisation
— Training data
— Testing data
— Error metric
— Linear model
— Parametric model
— Model variance
— Model bias
— Model generalization
— Overfitting
— Goodness-of-fit
— Hyper-parameters
— Failure modes
— Confusion matrix
— True Positive
— False Negative
— Partition
— Data density
— Hidden parameter
— High dimensional space
— Low dimensional space
— Separable data
— Manifold / Decision surface
— Hyper cube / volume / plane

MSIN0097
Homework…

LEARNING RATES
Hands-On ML: Chapter 4, Figure 4.8, page 116 (1st ed.)

MSIN0097
Reviewing notebooks

CODE

ALTERING CODE

DIFFS

LEARNING RATES

LEARNING RATES

LEARNING RATES
Data points (x, y): (0, 0), (1, 0), (8, −2), (10, 3)
θ₀ = c (intercept)
θ₁ = m (slope)
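
A minimal sketch of batch gradient descent fitting y = θ₁x + θ₀ to the four points above; the learning rates and iteration count are illustrative assumptions, not values from the slides.

import numpy as np

X = np.array([0.0, 1.0, 8.0, 10.0])   # the x values above
y = np.array([0.0, 0.0, -2.0, 3.0])   # the y values above

def fit(eta, n_iters=1000):
    """Batch gradient descent on mean squared error for y = theta1*x + theta0."""
    theta0, theta1 = 0.0, 0.0          # start both parameters at zero
    n = len(X)
    for _ in range(n_iters):
        error = theta0 + theta1 * X - y        # residuals at current params
        theta0 -= eta * (2.0 / n) * error.sum()        # d(MSE)/d(theta0)
        theta1 -= eta * (2.0 / n) * (error * X).sum()  # d(MSE)/d(theta1)
    return theta0, theta1

# Too small an eta converges slowly; too large an eta diverges to inf,
# which is the contrast Figure 4.8 illustrates.
for eta in (0.001, 0.02, 0.1):
    print(f"eta={eta}: theta0, theta1 = {fit(eta)}")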

LEARNING CURVES
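A learning curve plots training and validation error against training-set size: a persistent gap between the two suggests overfitting, while two curves that plateau at a high error suggest underfitting. A minimal sketch with scikit-learn's learning_curve; the quadratic data and linear model are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

# Synthetic quadratic data, so a linear model will underfit
rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + rng.normal(scale=1.0, size=200)

sizes, train_scores, val_scores = learning_curve(
    LinearRegression(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5,
    scoring="neg_mean_squared_error")

# Negate the scores back into errors and compare the two curves
for n, tr, va in zip(sizes, -train_scores.mean(axis=1), -val_scores.mean(axis=1)):
    print(f"n={n:3d}  train MSE={tr:.2f}  val MSE={va:.2f}")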

DECISION TREES

Source: http://www.r2d3.us/visual-intro-to-machine-learning-part-1/

TREE ALGORITHM
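
A minimal sketch of training and inspecting a decision tree in scikit-learn; the moons dataset and max_depth=3 are illustrative assumptions. Capping max_depth is the standard lever for trading model variance against model bias.

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A shallow tree: each internal node tests one feature against a threshold
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X_train, y_train)

print("train accuracy:", tree.score(X_train, y_train))
print("test accuracy: ", tree.score(X_test, y_test))
print(export_text(tree, feature_names=["x1", "x2"]))   # the learned splits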

MODEL VARIANCE / 模型方差 MÓXÍNG FĀNGCHĀ

MODEL GENERALIZATION / 模型泛化 MÓXÍNG FÀN HUÀ

OVERFITTING / 过拟合 GUÒ NǏ HÉ

FAILURE MODES / 失败模式 SHĪBÀI MÓSHÌ

CLASS WEIGHTING

QUESTIONS
— What is the benefit of out-of-bag evaluation?
— If a Decision Tree is overfitting the training set, is it a good idea to try decreasing max_depth?
— If a Decision Tree is underfitting the training set, is it a good idea to try scaling the input features?
— What problems might we have if we try to grow a tree with high class imbalance?
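
For the first question above, a minimal out-of-bag sketch (the dataset is an illustrative assumption). Each tree in a bagged ensemble sees only a bootstrap sample, so it can be scored on the roughly 37% of training rows it never saw, giving a validation-style estimate without sacrificing any data to a held-out set.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, random_state=42)

# oob_score=True scores each tree on the rows excluded from its bootstrap
forest = RandomForestClassifier(n_estimators=200, oob_score=True,
                                random_state=42)
forest.fit(X, y)
print("OOB accuracy:", forest.oob_score_)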

MSIN0097
Breakout 1

CLASS WEIGHTING
— Domain expertise
– determined by talking to subject-matter experts.
— Tuning
– determined by a hyperparameter search such as a grid search.
— Heuristic
– specified using a general best practice (see the sketch below).
Applications
• Fraud Detection
• Claim Prediction
• Churn Prediction
• Spam Detection
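
A minimal sketch of the three routes to a class weighting listed above, using scikit-learn; the dataset, the 10:1 expert weighting, and the search grid are illustrative assumptions.

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# 95/5 imbalance, as in fraud or churn settings
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05],
                           random_state=42)

# Heuristic: 'balanced' reweights classes inversely to their frequency
heuristic = DecisionTreeClassifier(class_weight="balanced", random_state=42)

# Domain expertise: e.g. experts judge a missed positive 10x as costly
expert = DecisionTreeClassifier(class_weight={0: 1, 1: 10}, random_state=42)

for name, clf in [("heuristic", heuristic), ("expert", expert)]:
    print(name, cross_val_score(clf, X, y, scoring="f1", cv=5).mean())

# Tuning: treat the minority weight as a hyperparameter and grid-search it
grid = GridSearchCV(DecisionTreeClassifier(random_state=42),
                    {"class_weight": [{0: 1, 1: w} for w in (1, 5, 10, 50)]},
                    scoring="f1", cv=5)
grid.fit(X, y)
print("tuned:", grid.best_params_)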

MSIN0097
Breakout 2

COMMON APPROACHES
— Performance Metrics
– F-measure
– G-mean
— Data Sampling
– SMOTE (Synthetic Minority Oversampling Technique)
– ENN (Edited Nearest Neighbours)
— Cost-Sensitive Algorithms
– Decision Trees
— Post-Processing
– Threshold Moving (sketched below)
– Calibration
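
As a concrete post-processing example, a minimal threshold-moving sketch: sweep the decision threshold over validation-set probabilities and keep the value that maximises the F-measure. The dataset, model, and threshold grid are illustrative assumptions.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1],
                           random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = clf.predict_proba(X_val)[:, 1]   # P(class = 1) on validation rows

# The default 0.5 threshold is rarely optimal under class imbalance
thresholds = np.linspace(0.05, 0.95, 19)
scores = [f1_score(y_val, (proba >= t).astype(int)) for t in thresholds]
best = thresholds[int(np.argmax(scores))]
print(f"best threshold={best:.2f}  F1={max(scores):.3f}")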

SYNTHETIC DATA SAMPLING
Source: https://github.com/minoue-xx/Oversampling-Imbalanced-Data
• SMOTE (Chawla, N.V. et al. 2002)
• Borderline-SMOTE (Han, H. et al. 2005)
• ADASYN (He, H. et al. 2008)
• Safe-Level-SMOTE (Bunkhumpornpat, C. et al. 2009)
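
A minimal oversampling sketch, assuming the imbalanced-learn package (imblearn) is installed; the dataset is illustrative. SMOTE synthesises new minority-class points by interpolating between a minority example and one of its nearest minority neighbours.

from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1],
                           random_state=42)
print("before:", Counter(y))

# fit_resample returns the original data plus synthetic minority points
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print("after: ", Counter(y_res))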

MSIN0097
Ensembles

KEY CONCEPTS
If you have trained five different models on the exact same training data, and they all achieve 95% precision, is there any chance that you can combine these models to get better results?
If so, how? If not, why?
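
A minimal hard-voting sketch of the idea; the three base models and the dataset are illustrative assumptions. The gain depends on the models making different mistakes: five models that err on exactly the same rows vote no better than any one of them, so diversity across model families (or across training subsets, as in bagging) is what makes the ensemble worth having.

from sklearn.datasets import make_moons
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

estimators = [
    ("lr", LogisticRegression()),
    ("tree", DecisionTreeClassifier(random_state=42)),
    ("svc", SVC(random_state=42)),
]
for name, clf in estimators:
    print(name, clf.fit(X_train, y_train).score(X_test, y_test))

# Hard voting: each model casts one vote; the majority class wins
voting = VotingClassifier(estimators, voting="hard").fit(X_train, y_train)
print("ensemble", voting.score(X_test, y_test))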

MULTIPLE MODELS

VOTING
Leipzig–Dresden Railway Company in 1852

MACHINE LEARNING SYSTEMS
Why is ML hard?

WHY IS ML HARD?
https://ai.stanford.edu/~zayd/why-is-machine-learning-hard.html

DEBUGGING
# Base case returns directly; otherwise recurse on the transformed input.
# (endCase and transform are schematic placeholders.)
def recursion(input):
    if input is endCase:
        return transform(input)
    else:
        return recursion(transform(input))

WHY IS ML HARD? ALGORITHM, IMPLEMENTATION, DATA, MODEL

WHY IS ML HARD?

ML AS HACKING

TWITTER
@chipro @random_forests @zacharylipton @yudapearl

HAVE YOUR SAY!
“The questionnaires are very short and will take less than a minute for them to complete.”

VOTING IN ACTION!

TEACHING TEAM
Dr Alastair Moore Senior Teaching Fellow
a.p.moore@ucl.ac.uk
@latticecut
Kamil Tylinski Teaching Assistant
kamil.tylinski.16@ucl.ac.uk
Jiangbo Shangguan Teaching Assistant
j.shangguan.17@ucl.ac.uk
