MFIN 290 Application of Machine Learning in Finance: Lecture 9

Edward Sheng, Yujie He

8/21/2021

Agenda

Review of Homework 3

Review of Mid-term

Review of Lecture 1 – 4 and 7 – 8

Mock interview on key concepts

Section 1: Review of Homework 3

Section 2: Review of Mid-term

Probability threshold and Precision/Recall Trade-Off

Predicted   0.1  0.2  0.4  0.5  0.6  0.7  0.8  0.8
Actual      0    0    1    0    1    1    1    1

An observation is classified as 1 when its predicted probability is >= the threshold.

Threshold = 0.5

Precision: 4/5

Recall: 4/5

Accuracy: 6/8

Threshold = 0.8

Precision: 2/2

Recall: 2/5

Accuracy: 5/8
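
The threshold arithmetic can be checked with a short sketch. It assumes an observation is classified as 1 when its predicted probability is greater than or equal to the threshold, which is the convention that reproduces the figures on this slide. The function name `threshold_metrics` is illustrative, not part of the course material.

```python
def threshold_metrics(probs, actual, threshold):
    """Return (precision, recall, accuracy), predicting 1 when prob >= threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    pairs = list(zip(preds, actual))
    tp = sum(1 for yhat, y in pairs if yhat == 1 and y == 1)  # true positives
    fp = sum(1 for yhat, y in pairs if yhat == 1 and y == 0)  # false positives
    fn = sum(1 for yhat, y in pairs if yhat == 0 and y == 1)  # false negatives
    tn = sum(1 for yhat, y in pairs if yhat == 0 and y == 0)  # true negatives
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    accuracy = (tp + tn) / len(actual)
    return precision, recall, accuracy


# Data from the slide
probs = [0.1, 0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.8]
actual = [0, 0, 1, 0, 1, 1, 1, 1]

for t in (0.5, 0.8):
    p, r, a = threshold_metrics(probs, actual, t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}, accuracy={a:.3f}")
```

Raising the threshold from 0.5 to 0.8 lifts precision from 4/5 to 2/2 but cuts recall from 4/5 to 2/5, which is the trade-off this slide illustrates.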

Neural Network

X1                 0     1     1     0
X2                 1     0     1     0
output           -10   -10    10   -30
Sigmoid(output)   ≈0    ≈0    ≈1    ≈0

• X: [n, 2]; W: [2, 1] => output = x1*w1 + x2*w2 + 1*b (bias b added separately)

• X: [n, 3]; W: [3, 1] => add a “1” column to X and append b to W, so the bias is folded into the matrix product
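
A minimal sketch of the two equivalent formulations follows. The weights w1 = w2 = 20 and bias b = -30 are assumptions: the slide does not state them, but they reproduce the output row (-10, -10, 10, -30). The helper `matmul` is a hypothetical plain-Python stand-in for a matrix product.

```python
import math


def matmul(A, B):
    """Multiply an [n, k] matrix by a [k, m] matrix, both given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


X = [[0, 1], [1, 0], [1, 1], [0, 0]]  # rows are (X1, X2) pairs from the slide
W = [[20.0], [20.0]]                  # assumed weights, shape [2, 1]
b = -30.0                             # assumed bias

# Formulation 1: X [n, 2] times W [2, 1], then add the bias separately.
out_separate = [[row[0] + b] for row in matmul(X, W)]

# Formulation 2: append a column of ones to X and append b to W.
X_aug = [row + [1] for row in X]      # shape [n, 3]
W_aug = W + [[b]]                     # shape [3, 1]
out_augmented = matmul(X_aug, W_aug)

print(out_separate)
print(out_augmented)
print([round(sigmoid(row[0])) for row in out_augmented])
```

Both formulations give the same outputs, and rounding the sigmoid yields 0, 0, 1, 0, so under these assumed weights the single neuron behaves like a logical AND.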

Section 3: Review of Lecture 1 – 4 and 7 – 8

Lecture 1.1 Introduction
Three key components of machine learning

Supervised learning

Unsupervised learning

Regression

Classification

No free lunch theorem

Lecture 1.2 Machine learning workflow – an example with linear regression (OLS)

OLS

Machine learning work flow

Data preparation
Imputation
Winsorizing/winsorization
Standardization/normalization
Lookahead bias (data leakage)
Survivorship bias

Feature selection
Curse of dimensionality
Stepwise
Shrinkage/regularization
Ridge regression (L2 regularization)
Lasso regression (L1 regularization)
Elastic net
PCA

Model assessment
Collinearity/multicollinearity
Heteroskedasticity
Loss function
Training error and test error
Overfitting
Adjusted R²

Cross validation and k-fold cross validation

Lecture 1.3 Logistic regression
Logistic regression

Logit

Generalized linear model (GLM)

Maximum likelihood estimation (MLE)

Likelihood function

Type I (false positive, α) error and Type II (false negative, β) error

Confusion matrix

Recall, precision, and F1 score

ROC curve and AUC

Lecture 2.1 Basic decision tree
Flexibility-interpretability trade-off

Bias-variance trade-off

Decision tree (leaf, root, branch, node)

Recursive binary splitting

Pruning

Weak learner

Ensemble methods

Lecture 2.2 Bagging and boosting tree
Bagging

Bootstrap

Random forest

Variable importance

Boosting

Difference between bagging and boosting

AdaBoost, Gradient boosting, and XGBoost

Lecture 2.3 Support vector machine (SVM)
Hyperplane, separating hyperplane, maximal margin hyperplane

Margin

Support vectors

Kernel

Soft margin

One-versus-all (OVA) and one-versus-one (OVO)

Hyperparameter and hyperparameter tuning

Lecture 3 Classification
Basic Python

Basic data structures and functions (syntax, basic data structures, list comprehension etc.)
Numpy, pandas

Classification (Supervised approach)
K Nearest Neighbor
Logistic Regression

Properties of logistic function
Regularization
Loss function and training

Evaluations
Precision, recall, ROC, PR-curve, AUC, impact of threshold
Modeling highly unbalanced classes

Gradient Descent

General optimization problem
Convex functions
Step size/learning rate and its impact (too big/small?)
Advanced optimizers

Momentum based optimizers
Adam (pros and cons)
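
The impact of the step size can be seen in a small sketch. It assumes plain gradient descent on the convex function f(x) = x^2, chosen here only for illustration; the function name `gradient_descent` and the three learning rates are not from the lecture.

```python
def gradient_descent(lr, x0=1.0, steps=20):
    """Run plain gradient descent on f(x) = x**2 and return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2.0 * x      # derivative of x**2
        x = x - lr * grad   # update: x <- x - learning_rate * gradient
    return x


for lr in (0.01, 0.4, 1.1):
    print(f"lr={lr}: |x| after 20 steps = {abs(gradient_descent(lr)):.6g}")
```

Each update multiplies x by (1 - 2*lr). With lr = 0.01 the iterate shrinks slowly, with lr = 0.4 it converges quickly, and with lr = 1.1 the factor has magnitude above 1, so the iterates oscillate and diverge.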

Learning Theory
Bias-variance trade-off
Impact of adding variables/features
How to identify overfitting vs. underfitting
What a learning curve is

Lecture 4 Unsupervised Learning
Unsupervised Learning

Dimension Reduction
Use cases: visualization, curse of dimensionality

PCA (theory and applications)

Auto-encoder (theory and applications)

Clustering
Conceptual understanding of common clustering methods and pros and cons

K-means clustering

Evaluation

Applications of clustering

Neural Network

Basic structures
Be able to calculate each layer’s output given weight matrix
Linear vs non-linear components
Activation function and its impact on models
Normalization/regularization
Backpropagation (i.e., use the chain rule to derive the derivative of the loss with respect to the model parameters)
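
The chain-rule step can be sketched for the smallest possible network. The setup is an assumption made for illustration: one sigmoid neuron, a = sigmoid(w*x + b), with squared-error loss 0.5*(a - y)^2. The analytic gradient is compared against a finite-difference estimate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, x, y):
    """Forward pass: pre-activation z, activation a, and squared-error loss."""
    z = w * x + b
    a = sigmoid(z)
    loss = 0.5 * (a - y) ** 2
    return z, a, loss


def backward(w, b, x, y):
    """Backward pass: dLoss/dw and dLoss/db via the chain rule."""
    _, a, _ = forward(w, b, x, y)
    dloss_da = a - y        # d(0.5*(a - y)^2) / da
    da_dz = a * (1.0 - a)   # derivative of the sigmoid
    dz_dw = x               # d(w*x + b) / dw
    dz_db = 1.0             # d(w*x + b) / db
    return dloss_da * da_dz * dz_dw, dloss_da * da_dz * dz_db


# Illustrative values, not taken from the lecture
w, b, x, y = 0.5, -0.2, 1.5, 1.0
grad_w, grad_b = backward(w, b, x, y)

# Central finite differences as an independent check
eps = 1e-6
numeric_w = (forward(w + eps, b, x, y)[2] - forward(w - eps, b, x, y)[2]) / (2 * eps)
numeric_b = (forward(w, b + eps, x, y)[2] - forward(w, b - eps, x, y)[2]) / (2 * eps)

print(grad_w, numeric_w)
print(grad_b, numeric_b)
```

In a multi-layer network the same product of local derivatives is extended backward, layer by layer, from the loss to each weight matrix.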

Lecture 7.1 Basic time series analysis, part I
Stationarity and ADF test

Autocorrelation and Ljung-Box test

ACF, PACF

White noise, random walk, Markov property, Martingale property

AR model and constraints on coefficients

Unit root, characteristic root

AR model behavior in ACF, PACF, and mean reversion in forecast

How to select order p for AR model

How to check residual

Lecture 7.2 Basic time series analysis, part II
Conversion from AR to MA and vice versa

Difference between AR and MA in ACF, PACF, stationarity, forecast

How to select order q for MA model

ARMA model and its advantage

ARMA model constraints on coefficients, behavior in ACF and PACF

How to select order p, q for ARMA model

ARIMA and SARIMA

Cross validation in time series, sliding window and forward chaining
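
The two time series validation schemes can be sketched as index generators. This assumes the common reading of the terms: forward chaining grows the training set at each fold (expanding window), while a sliding window keeps the training set at a fixed length. Function and parameter names are illustrative.

```python
def forward_chaining_splits(n_obs, n_folds, test_size):
    """Expanding training window: each fold trains on everything before its test block."""
    splits = []
    for k in range(n_folds):
        test_start = n_obs - (n_folds - k) * test_size
        train = list(range(0, test_start))
        test = list(range(test_start, test_start + test_size))
        splits.append((train, test))
    return splits


def sliding_window_splits(n_obs, train_size, test_size):
    """Fixed-length training window that moves forward by one test block per fold."""
    splits = []
    start = 0
    while start + train_size + test_size <= n_obs:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        splits.append((train, test))
        start += test_size
    return splits


for train, test in forward_chaining_splits(n_obs=10, n_folds=3, test_size=2):
    print("forward chaining:", train, "->", test)

for train, test in sliding_window_splits(n_obs=10, train_size=4, test_size=2):
    print("sliding window:  ", train, "->", test)
```

In both schemes every test index comes after every training index of the same fold, which avoids the lookahead bias that ordinary shuffled k-fold would introduce.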

Lecture 7.3 Advanced time series analysis – State Space Model and Kalman Filter

State space model and its common structure

Kalman filter: components (model and measurement) and the general logic of its optimization

Kalman filter iteration process (prediction and update); no need to memorize formulas, just a general understanding

Kalman Gain and its relationship with source of uncertainty
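
The prediction and update steps can be sketched in one dimension. This assumes a local level model, which is an illustrative choice rather than the lecture's specification: the state follows a random walk with process noise variance Q, and each measurement equals the state plus noise with variance R.

```python
def kalman_step(x, P, z, Q, R):
    """One predict/update cycle of a scalar Kalman filter (local level model).

    x, P: previous state estimate and its variance
    z:    new measurement
    Q, R: process noise variance and measurement noise variance
    """
    # Prediction: a random-walk state keeps its mean, but uncertainty grows by Q.
    x_pred = x
    P_pred = P + Q
    # Kalman gain: share of total uncertainty that comes from the model prediction.
    K = P_pred / (P_pred + R)
    # Update: move toward the measurement in proportion to the gain.
    x_new = x_pred + K * (z - x_pred)
    P_new = (1.0 - K) * P_pred
    return x_new, P_new, K


# Same prior and measurement, different measurement noise
for R in (0.1, 1.0, 9.0):
    x_new, P_new, K = kalman_step(x=0.0, P=1.0, z=1.0, Q=0.0, R=R)
    print(f"R={R}: gain={K:.3f}, updated state={x_new:.3f}, updated variance={P_new:.3f}")
```

A noisier measurement (larger R) lowers the gain, so the filter trusts its own prediction more. Larger model uncertainty (larger P or Q) raises the gain, so the filter leans on the measurement.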

Lecture 8.1 NLP – Basics
Semantic vs. Syntactic Analysis

Tokenization
Different types (whitespace, punctuation, sub-word etc.)

Stemming/Lemmatization
Meaning, difference

POS Tagging

Lecture 8.2 NLP – Embeddings
Why use embedding

Representation of text
One-hot encoding vs. distributed representation

Word2vec: skip-gram, CBOW
Core concept: use context words to predict center word or vice versa
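
How the two Word2vec variants frame their training examples can be sketched without any model. The code only builds the (input, target) pairs and trains no embeddings. The function names, the window size, and the sample sentence are illustrative assumptions.

```python
def skipgram_pairs(tokens, window=2):
    """Skip-gram: the center word is the input, each context word is a target."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """CBOW: the surrounding context words are the input, the center word is the target."""
    examples = []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        examples.append((context, center))
    return examples


tokens = ["rates", "rise", "bonds", "fall"]
print(skipgram_pairs(tokens, window=1))
print(cbow_examples(tokens, window=1))
```

The two functions cover the same word windows and differ only in the direction of prediction, which is the "or vice versa" in the core concept above.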

Lecture 8.3 NLP – Applications
Sentiment analysis

Features: pre-defined score card; embedding
Classification task

Named Entity Recognition
Features: POS, pre-/post-words, etc.; embedding
Token classification task

Sentence/document classification
Naïve Bayes classification

Deep learning
Contextual embedding
Transfer learning: unsupervised pretraining -> task specific fine tuning

Section 4: Mock Interview on Key Concepts

Supervised learning

Unsupervised learning

Regression vs. classification

No free lunch theorem

OLS

Imputation

Winsorization

Standardization

Lookahead bias (data leakage)

Survivorship bias

Feature

Feature selection

Curse of dimensionality

Stepwise

Loss function

Regularization/shrinkage

Ridge regression

L2 regularization

Lasso regression

L1 regularization

Elastic net

PCA

Training error

Test error

Overfitting

AIC

Adjusted R²

Cross validation

k-fold cross validation

Collinearity/multicollinearity

Heteroskedasticity

Logistic regression

Logit

Dummy variable

Likelihood function

MLE

Confusion matrix

Type I error

Type II error

False positive

False negative

Recall

Precision

F1 score

ROC curve

AUC

Flexibility-interpretability trade-off

Bias-variance trade off

Decision tree

Leaf

Root

Branch

Node

Recursive binary splitting

Pruning

Ensemble methods

Bagging

Bootstrap

Random forest

Variable importance

Boosting

Hyperplane

Separating hyperplane

Maximal margin hyperplane

Support vectors

Kernel

Soft margin

One-versus-all (OVA)

One-versus-one (OVO)

Hyperparameter

Hyperparameter tuning

k-NN

Dimension reduction

Clustering

k-means

Imbalanced dataset/SMOTE

Backpropagation

Gradient descent

Time series

Seasonality

Stationarity

Autocorrelation

ACF

Ljung-Box test

White noise

Random walk

Markov property

Martingale property

Unit root

Dickey-Fuller test/ADF

PACF

ARMA

Parsimony

ARIMA

SARIMA

Sliding window

Forward chaining

State Space Model

Kalman Filter

Kalman Gain

Embeddings

Word2vec

Tokenization

Lemmatization/stemming

Sentiment Analysis

NER (named entity recognition)

Naïve Bayes (spam classification)
