WEEK 2 TERM 2:
MSIN0097
Predictive Analytics Lecture 2
A P MOORE

PREDICTIVE ANALYTICS
Review

MACHINE LEARNING JARGON
— Model
— Interpolating / Extrapolating
— Data Bias
— Noise / Outliers
— Learning algorithm
— Inference algorithm
— Supervised learning
— Unsupervised learning
— Classification
— Regression
— Clustering
— Decomposition
— Parameters
— Optimisation
— Training data
— Testing data
— Error metric
— Linear model
— Parametric model
— Model variance
— Model bias
— Model generalization
— Overfitting
— Goodness-of-fit
— Hyper-parameters
— Failure modes
— Confusion matrix
— True Positive
— False Negative
— Data density
— Partition
— Hidden parameter
— High dimensional space
— Low dimensional space
— Separable data
— Manifold / Decision surface
— Hyper cube / volume / plane

机器学习行话
— 模型
— 内插 / 外推
— 数据偏差
— 噪声 / 离群值
— 学习算法
— 推断算法
— 监督学习
— 无监督学习
— 分类
— 回归
— 聚类
— 分解
— 参数
— 优化
— 训练数据
— 测试数据
— 误差指标
— 线性模型
— 参数模型
— 模型方差
— 模型偏差
— 模型泛化
— 过拟合
— 拟合优度
— 超参数
— 失败模式
— 混淆矩阵
— 真正例
— 假反例
— 数据密度
— 划分
— 隐藏参数
— 高维空间
— 低维空间
— 可分数据
— 流形 / 决策面
— 超立方体 / 超体积 / 超平面

OPTIMISATION / 优化 YŌUHUÀ

SCALING

ERROR METRIC / 误差指标 WÙCHĀ ZHǏBIĀO

HYPER-PARAMETERS / 超参数 CHĀO CĀNSHÙ

PARAMETRIC MODEL / 参数模型 CĀNSHÙ MÓXÍNG

MODEL BIAS / 模型偏差 MÓXÍNG PIĀNCHĀ

MODEL VARIANCE / 模型方差 MÓXÍNG FĀNGCHĀ

MODEL GENERALIZATION / 模型泛化 MÓXÍNG FÀN HUÀ

OVERFITTING / 过拟合 GUÒ NǏ HÉ

FAILURE MODES / 失败模式 SHĪBÀI MÓSHÌ

COMMON CLASSIFICATION METRICS
— Accuracy
— Precision (P)
— Recall (R)
— F1 score (F1)
— Area under the ROC (Receiver Operating Characteristic) curve, or simply AUC
— Log loss
— Precision at k (P@k)
— Average precision at k (AP@k)
— Mean average precision at k (MAP@k)
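The first four metrics in this list can be sketched from raw label counts. This is a minimal illustration for the binary case only; the function name `classification_metrics` and the toy labels are assumptions made for the example, not part of the lecture material.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels: 3 true positives, 1 false positive,
# 1 false negative and 3 true negatives.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In practice a library implementation would normally be used; the point here is only to show how each metric follows from the same four counts.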

COMMON REGRESSION METRICS
— Mean absolute error (MAE)
— Mean squared error (MSE)
— Root mean squared error (RMSE)
— Root mean squared logarithmic error (RMSLE)
— Mean percentage error (MPE)
— Mean absolute percentage error (MAPE)
— R²
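The regression metrics listed above can be sketched in plain Python as follows. The function name `regression_metrics` and the toy values are assumptions for illustration. The percentage errors are returned as fractions and use the (actual − predicted) / actual convention, which is one of several in use.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return common regression metrics for two equal-length lists."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # RMSLE assumes targets and predictions are non-negative.
    rmsle = math.sqrt(
        sum((math.log1p(p) - math.log1p(t)) ** 2
            for t, p in zip(y_true, y_pred)) / n
    )

    # Percentage errors assume no true value is zero.
    mpe = sum((t - p) / t for t, p in zip(y_true, y_pred)) / n
    mape = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n

    # R² compares the residual error with the error of always predicting the mean.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "mse": mse, "rmse": rmse, "rmsle": rmsle,
            "mpe": mpe, "mape": mape, "r2": r2}


# Hypothetical toy values.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
reg = regression_metrics(y_true, y_pred)
print(reg)
```

MAE and RMSE are in the units of the target, whereas MAPE and R² are unitless, which is one reason they are often reported together.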

PREDICTIVE ANALYTICS
Measuring performance

CONFUSION MATRIX / 混淆矩阵 HÙNXIÁO JǓZHÈN
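As a minimal sketch, a confusion matrix can be built by counting (true label, predicted label) pairs. The helper name `confusion_matrix`, the spam/ham labels and the toy data are assumptions for illustration. Rows are true classes and columns are predicted classes, which is a common convention but not the only one.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Hypothetical toy data, with "spam" treated as the positive class.
labels = ["spam", "ham"]
y_true = ["spam", "spam", "ham", "ham", "spam", "ham"]
y_pred = ["spam", "ham", "ham", "spam", "spam", "ham"]

matrix = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(label, row)

# With "spam" as positive:
#   matrix[0][0] = true positives,  matrix[0][1] = false negatives
#   matrix[1][0] = false positives, matrix[1][1] = true negatives
```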

PRACTICAL TOOLS: ML CANVAS

A. CLASSIFICATION: CATEGORICAL VARIABLE

PREDICTIVE ANALYTICS
Logistic Regression

LOGISTIC REGRESSION (CLASSIFICATION !!!!!)
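Despite its name, logistic regression is used as a classifier: a linear score is passed through the sigmoid function to give a probability, and the probability is thresholded to give a class. The sketch below assumes a model that has already been fitted; the weights, bias and function names are hypothetical values chosen for illustration.

```python
import math


def sigmoid(z):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Estimated probability of the positive class for one input."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)


def predict(x, weights, bias, threshold=0.5):
    """Turn the probability into a class label by thresholding."""
    return int(predict_proba(x, weights, bias) >= threshold)


# Hypothetical fitted parameters for a two-feature problem.
weights = [1.5, -1.0]
bias = -0.5

# The decision boundary is the line where the score is zero:
# 1.5 * x1 - 1.0 * x2 - 0.5 = 0. Points on it get probability 0.5.
for point in ([2.0, 1.0], [0.0, 1.0], [1.0, 1.0]):
    print(point, round(predict_proba(point, weights, bias), 3),
          predict(point, weights, bias))
```

With a threshold of 0.5 the boundary is linear in the features; moving the threshold shifts the boundary but keeps it parallel to the original line.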

DECISION BOUNDARIES

PREDICTIVE ANALYTICS
Problem 1
~15 mins group work
~15 mins discussion

REVIEW
— Select good metrics for classification tasks
— How to pick the appropriate precision/recall trade-off
— How to compare classifiers
— Different classification systems for a variety of tasks
— What business problems can you think of that are classification tasks?
— Can you think of some business problems that are multilabel and multioutput?
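One way to explore the precision/recall trade-off mentioned in the review points is to sweep the decision threshold over a classifier's scores. This is a minimal sketch; the function name `precision_recall_at`, the scores and the labels are made up for illustration.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when predicting positive for score >= threshold."""
    predicted = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    # Assumed convention: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical classifier scores and true labels.
scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

trade_off = {t: precision_recall_at(scores, labels, t)
             for t in (0.25, 0.5, 0.8)}
for threshold, (precision, recall) in trade_off.items():
    print(threshold, round(precision, 3), round(recall, 3))
```

On this toy data, raising the threshold increases precision and lowers recall. Which threshold is appropriate depends on the relative cost of false positives and false negatives in the business problem.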

The Machine Learning Canvas (v0.4)
Designed for: Designed by: Date: Iteration:
Decisions
How are predictions used to make decisions that provide the proposed value to the end-user?
ML task
Input, output to predict, type of problem.
Value Propositions
What are we trying to do for the end-user(s) of the predictive system? What objectives are we serving?
Data Sources
Which raw data sources can we use (internal and external)?
Collecting Data
How do we get new data to learn from (inputs and outputs)?
Making Predictions
When do we make predictions on new inputs? How long do we have to featurize a new input and make a prediction?
Offline Evaluation
Methods and metrics to evaluate the system before deployment.
Features
Input representations extracted from raw data sources.
Building Models
When do we create/update models with new training data? How long do we have to featurize training inputs and create a model?
Live Evaluation and Monitoring
Methods and metrics to evaluate the system after deployment, and to quantify value creation.
machinelearningcanvas.com by Louis Dorard, Ph.D. Licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

TEXT CATEGORIZATION

FILM GENRE CLASSIFICATION
https://towardsdatascience.com/journey-to-the-center-of-multi-label-classification-384c40229bff
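In a multi-label problem such as film genre classification, each example can carry several labels at once. A common representation is a binary indicator matrix with one column per label, and one simple error measure is the Hamming loss (the fraction of individual label decisions that are wrong). The sketch below uses made-up genres and predictions; the function names are assumptions for illustration.

```python
def binarize_labels(label_sets, classes):
    """Encode each set of labels as a 0/1 vector, one position per class."""
    return [[1 if c in labels else 0 for c in classes]
            for labels in label_sets]


def hamming_loss(y_true, y_pred):
    """Fraction of individual label decisions that are wrong."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    return wrong / total


# Hypothetical genres for three films, and hypothetical predictions.
classes = ["action", "comedy", "drama"]
true_genres = [{"action", "comedy"}, {"drama"}, {"comedy", "drama"}]
pred_genres = [{"action"}, {"drama"}, {"comedy", "action"}]

Y_true = binarize_labels(true_genres, classes)
Y_pred = binarize_labels(pred_genres, classes)
loss = hamming_loss(Y_true, Y_pred)
print(Y_true, Y_pred, loss)
```

Each column can then be treated as its own binary classification task, which is one straightforward way to reduce a multi-label problem to problems already covered.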

MULTI-OUTPUT LEARNING
https://arxiv.org/pdf/1901.00248.pdf

OUTPUT STRUCTURES
https://arxiv.org/pdf/1901.00248.pdf

PREDICTIVE ANALYTICS
Decision boundaries

LEARNING CURVES

DECISION BOUNDARIES
Decision Boundaries Animations by Ryan Holbrook is licensed under a Creative Commons Attribution 4.0 International License. Based on a work at https://github.com/ryanholbrook/decision-boundaries-animations.

IRIS DECISION TREE

DECISION TREES

Source: http://www.r2d3.us/visual-intro-to-machine-learning-part-1/
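A decision tree is grown by repeatedly choosing the feature threshold that makes the resulting groups as pure as possible. The sketch below shows a single such split on one feature using Gini impurity, which is one common purity criterion (entropy is another). The function names and the handful of iris-like measurements are assumptions for illustration, not taken from the actual iris data set.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Try midpoints between sorted unique values; keep the purest split."""
    best_threshold, best_impurity = None, float("inf")
    ordered = sorted(set(values))
    for lo, hi in zip(ordered, ordered[1:]):
        threshold = (lo + hi) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        # Weight each side's impurity by its share of the samples.
        weighted = (len(left) * gini(left)
                    + len(right) * gini(right)) / len(labels)
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity


# Hypothetical petal lengths (cm) for two species.
petal_length = [1.4, 1.3, 1.5, 4.7, 4.5, 5.1]
species = ["setosa"] * 3 + ["versicolor"] * 3

threshold, impurity = best_split(petal_length, species)
print(threshold, impurity)
```

A full tree would apply the same search recursively to each side of the split, across all features, until a stopping rule such as a maximum depth is reached.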

PREDICTIVE ANALYTICS
Individual Assessment

LECTURE 1 TERM 2:
MSIN0097
Predictive Analytics
A P MOORE