MFIN 290 Application of Machine Learning in Finance: Lecture 9
Edward Sheng, Yujie He
8/21/2021
Agenda
1. Review of Homework 3
2. Review of Mid-term
3. Review of Lecture 1 – 4 and 7 – 8
4. Mock interview on key concepts
Section 1: Review of Homework 3
Section 2: Review of Mid-term
Probability threshold and Precision/Recall Trade-Off
Rule: predict 1 when the predicted probability >= threshold
Predicted        0.1  0.2  0.4  0.5  0.6  0.7  0.8  0.8
Actual           0    0    1    0    1    1    1    1
Label at 0.5     0    0    0    1    1    1    1    1
Label at 0.8     0    0    0    0    0    0    1    1
Threshold = 0.5
Precision: 4/5
Recall: 4/5
Accuracy: 6/8
Threshold = 0.8
Precision: 2/2
Recall: 2/5
Accuracy: 5/8
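The figures on this slide can be reproduced with a short script. This is a minimal sketch using only the standard library; the function name `classification_metrics` is illustrative, and the rule "label 1 when score >= threshold" is inferred from the slide's numbers rather than stated on it.

```python
def classification_metrics(predicted, actual, threshold):
    """Return (precision, recall, accuracy) when scores >= threshold are labeled 1."""
    labels = [1 if p >= threshold else 0 for p in predicted]
    pairs = list(zip(labels, actual))
    tp = sum(1 for y_hat, y in pairs if y_hat == 1 and y == 1)
    fp = sum(1 for y_hat, y in pairs if y_hat == 1 and y == 0)
    fn = sum(1 for y_hat, y in pairs if y_hat == 0 and y == 1)
    tn = sum(1 for y_hat, y in pairs if y_hat == 0 and y == 0)
    # Guard against division by zero when nothing is predicted (or is) positive.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    accuracy = (tp + tn) / len(actual)
    return precision, recall, accuracy


# Data from the slide.
predicted = [0.1, 0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.8]
actual = [0, 0, 1, 0, 1, 1, 1, 1]

for threshold in (0.5, 0.8):
    precision, recall, accuracy = classification_metrics(predicted, actual, threshold)
    print(f"threshold={threshold}: precision={precision:.3f}, "
          f"recall={recall:.3f}, accuracy={accuracy:.3f}")
```

Raising the threshold from 0.5 to 0.8 removes the one false positive (precision rises from 4/5 to 2/2) but misses two more true positives (recall falls from 4/5 to 2/5).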
Neural Network
X1               0    1    1    0
X2               1    0    1    0
output           -10  -10  10   -30
Sigmoid(output)  ≈0   ≈0   ≈1   ≈0
• X: [n, 2]; W: [2, 1] => output = x1*w1 + x2*w2 + 1*b (bias b added separately)
• X: [n, 3]; W: [3, 1] => add a column of 1s to X and append b to W, so output = X·W
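The two bullet points describe the same computation written two ways. The sketch below checks that numerically with plain Python lists. The weights `w1 = w2 = 20` and bias `b = -30` are an assumption: the slide does not state them, but they reproduce its "output" row exactly. The helper `matmul` is written here for illustration only.

```python
import math


def matmul(a, b):
    """Multiply an [n, k] matrix by a [k, m] matrix, both given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Inputs from the slide, one row per observation: X has shape [4, 2].
X = [[0, 1], [1, 0], [1, 1], [0, 0]]

# Assumed parameters (not given on the slide); chosen to match its "output" row.
W = [[20], [20]]  # shape [2, 1]
b = -30

# Form 1: X [n, 2] times W [2, 1], then add the bias.
out_separate = [[row[0] + b] for row in matmul(X, W)]

# Form 2: append a column of 1s to X and append b to W, then a single product.
X_aug = [row + [1] for row in X]  # shape [4, 3]
W_aug = W + [[b]]                 # shape [3, 1]
out_augmented = matmul(X_aug, W_aug)

print(out_separate)                                    # [[-10], [-10], [10], [-30]]
print([round(sigmoid(z[0])) for z in out_augmented])   # [0, 0, 1, 0]
```

With these assumed weights the unit behaves like a logical AND of X1 and X2, since only the row with both inputs equal to 1 gives a positive output.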
Section 3: Review of Lecture 1 – 4 and 7 – 8
Lecture 1.1 Introduction
Three key components of machine learning
Supervised learning
Unsupervised learning
Regression
Classification
No free lunch theorem
Lecture 1.2 Machine learning work flow – an example with linear regression (OLS)
OLS
Machine learning work flow
Data preparation
Imputation
Winsorizing/winsorization
Standardization/normalization
Lookahead bias (data leakage)
Survivorship bias
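The data preparation items above interact with lookahead bias: any statistic used for imputation, winsorization, or standardization should be estimated on the training sample only and then applied unchanged to later data. The sketch below illustrates that pattern. The function names, the toy data, and the choice of winsorization bounds (second-smallest and second-largest observed training value) are all hypothetical, not taken from the lecture.

```python
import statistics


def fit_preprocessor(train_values):
    """Learn imputation, winsorization, and scaling parameters from training data only."""
    observed = sorted(v for v in train_values if v is not None)
    median = statistics.median(observed)
    # Hypothetical bounds: second-smallest and second-largest observed value.
    lower, upper = observed[1], observed[-2]
    clipped = [min(max(median if v is None else v, lower), upper) for v in train_values]
    return {
        "median": median,
        "lower": lower,
        "upper": upper,
        "mean": statistics.mean(clipped),
        "std": statistics.pstdev(clipped),
    }


def transform(values, params):
    """Impute, winsorize, and standardize using previously fitted parameters."""
    out = []
    for v in values:
        v = params["median"] if v is None else v           # imputation
        v = min(max(v, params["lower"]), params["upper"])  # winsorization
        out.append((v - params["mean"]) / params["std"])   # standardization
    return out


train = [1.0, 2.0, None, 4.0, 100.0]  # toy training sample with a gap and an outlier
test = [None, 250.0]                  # later data, never used for fitting

params = fit_preprocessor(train)
train_z = transform(train, params)
test_z = transform(test, params)
print(params)
print(test_z)
```

Fitting the parameters on the combined train and test data would let the 250.0 observation influence the bounds and the scaling, which is the leakage the slide warns about.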
Lecture 1.2 Machine learning work flow – an example with linear regression (OLS)
Feature selection
Curse of dimensionality
Stepwise
Shrinkage/regularization
Ridge regression (L2 regularization)
Lasso regression (L1 regularization)
Elastic net
PCA
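For reference, the shrinkage methods listed above differ only in the penalty added to the OLS objective. The notation below (n observations, p features, penalty weights λ, λ₁, λ₂) follows the standard textbook form and is an assumption about the notation used in Lecture 1.2.

```latex
\begin{aligned}
\text{OLS:} \quad
  & \min_{\beta}\ \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^2 \\
\text{Ridge (L2):} \quad
  & \min_{\beta}\ \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^2
    + \lambda \sum_{j=1}^{p} \beta_j^2 \\
\text{Lasso (L1):} \quad
  & \min_{\beta}\ \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^2
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \\
\text{Elastic net:} \quad
  & \min_{\beta}\ \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^2
    + \lambda_1 \sum_{j=1}^{p} \lvert \beta_j \rvert
    + \lambda_2 \sum_{j=1}^{p} \beta_j^2
\end{aligned}
```

The L1 penalty can set coefficients exactly to zero, so Lasso also performs feature selection; the L2 penalty shrinks coefficients toward zero without eliminating them.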
Lecture 1.2 Machine learning work flow – an example with linear regression (OLS)
Model assessment
Collinearity/multicollinearity
Heteroskedasticity
Loss function
Training error and test error
Overfitting
Adjusted R2
Cross validation and k-fold cross validation
Lecture 1.3 Logistic regression
Logistic regression
Logit
Generalized linear model (GLM)
Maximum likelihood estimation (MLE)
Likelihood function
Type I (false positive, α) error and Type II (false negative, β) error
Confusion matrix
Recall, precision, and F1 score
ROC curve and AUC
Lecture 2.1 Basic decision tree
Flexibility-interpretation trade-off
Bias-variance trade-off
Decision tree (leaf, root, branch, node)
Recursive binary splitting
Pruning
Weak learner
Ensemble methods
Lecture 2.2 Bagging and boosting tree
Bagging
Bootstrap
Random forest
Variable importance
Boosting
Difference between bagging and boosting
AdaBoost, Gradient boosting, and XGBoost
Lecture 2.3 Support vector machine (SVM)
Hyperplane, separating hyperplane, maximal margin hyperplane
Margin
Support vectors
Kernel
Soft margin
One-versus-all (OVA) and one-versus-one (OVO)
Hyperparameter and hyperparameter tuning
Lecture 3 Classification
Basic Python
Basic data structures and functions (syntax, basic data structures, list comprehension etc.)
Numpy, pandas
Classification (Supervised approach)
K Nearest Neighbor
Logistic Regression
Properties of logistic function
Regularization
Loss function and training
Evaluations
Precision, recall, ROC, PR-curve, AUC, impact of threshold
Modeling highly unbalanced classes
Lecture 3 Classification
Gradient Descent
General optimization problem
Convex functions
Step size/learning rate and its impact (too big/small?)
Advanced optimizers
Momentum based optimizers
Adam (pros and cons)
Learning Theory
Bias-variance trade-off
Impact of adding variables/features
How to identify overfitting vs. underfitting
What is a learning curve
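The step size item under Gradient Descent above can be made concrete with a one-dimensional example. This is a minimal sketch on the convex function f(x) = x², chosen here for illustration; the three learning rates are arbitrary picks meant to show "too small", "reasonable", and "too big".

```python
def gradient_descent(grad, x0, learning_rate, n_steps):
    """Plain gradient descent; returns the full path of iterates."""
    x = x0
    path = [x]
    for _ in range(n_steps):
        x = x - learning_rate * grad(x)
        path.append(x)
    return path


def grad_square(x):
    """Gradient of f(x) = x**2."""
    return 2 * x


results = {}
for lr in (0.01, 0.4, 1.1):
    results[lr] = gradient_descent(grad_square, x0=5.0, learning_rate=lr, n_steps=20)[-1]
    print(f"learning_rate={lr}: x after 20 steps = {results[lr]:.4g}")
```

For this function each step multiplies x by (1 - 2·learning_rate), so 0.01 creeps toward the minimum at 0, 0.4 converges quickly, and 1.1 overshoots by more each step and diverges.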
Lecture 4 Unsupervised Learning
Unsupervised Learning
Dimension Reduction
Use cases: visualization, curse of dimensionality
PCA (theory and applications)
Auto-encoder (theory and applications)
Clustering
Conceptual understanding of common clustering methods and pros and cons
K-means clustering
Evaluation
Applications of clustering
Lecture 4 Unsupervised Learning
Neural Network
Basic structures
Be able to calculate each layer’s output given weight matrix
Linear vs non-linear components
Activation function and its impact on models
Normalization/regularization
Backpropagation (i.e., use the chain rule to derive the derivative of the loss with respect to the model parameters)
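As a worked instance of the forward pass and the chain rule, the sketch below uses the smallest possible network: one input, one sigmoid hidden unit, one sigmoid output unit, and a squared-error loss. The architecture, parameter values, and loss are assumptions made for illustration, not the network used in the lecture. The analytic gradient is compared against a finite-difference estimate as a sanity check.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, b1, w2, b2):
    """Forward pass; returns every intermediate value so backward() can reuse them."""
    z1 = w1 * x + b1   # linear component of the hidden layer
    a1 = sigmoid(z1)   # non-linear activation
    z2 = w2 * a1 + b2  # linear component of the output layer
    y_hat = sigmoid(z2)
    return z1, a1, z2, y_hat


def loss(y_hat, y):
    return 0.5 * (y_hat - y) ** 2


def backward(x, y, w1, b1, w2, b2):
    """Gradients of the loss with respect to each parameter via the chain rule."""
    _, a1, _, y_hat = forward(x, w1, b1, w2, b2)
    dL_dz2 = (y_hat - y) * y_hat * (1 - y_hat)  # dL/dy_hat * dy_hat/dz2
    dL_dz1 = dL_dz2 * w2 * a1 * (1 - a1)        # back through z2 -> a1 -> z1
    return {"w1": dL_dz1 * x, "b1": dL_dz1, "w2": dL_dz2 * a1, "b2": dL_dz2}


# Hypothetical parameter values and a single training example.
params = {"w1": 0.5, "b1": -0.2, "w2": 1.5, "b2": 0.1}
x, y = 2.0, 1.0

grads = backward(x, y, **params)

# Finite-difference check on w1.
eps = 1e-6
bumped = dict(params, w1=params["w1"] + eps)
numeric = (loss(forward(x, **bumped)[-1], y) - loss(forward(x, **params)[-1], y)) / eps
print(grads["w1"], numeric)
```

The factors `y_hat * (1 - y_hat)` and `a1 * (1 - a1)` are the sigmoid derivatives; a different activation function changes only those terms.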
Lecture 7.1 Basic time series analysis, part I
Stationarity and ADF test
Autocorrelation and Ljung-Box test
ACF, PACF
White noise, random walk, Markov property, Martingale property
AR model and constraints on coefficients
Unit root, characteristic root
AR model behavior in ACF, PACF, and mean reversion in forecast
How to select order p for AR model
How to check residuals
Lecture 7.2 Basic time series analysis, part II
Conversion from AR to MA and vice versa
Difference between AR and MA in ACF, PACF, stationarity, forecast
How to select order q for MA model
ARMA model and its advantage
ARMA model constraints on coefficients, behavior in ACF and PACF
How to select order p, q for ARMA model
ARIMA and SARIMA
Cross validation in time series, sliding window and forward chaining
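The two time series cross validation schemes can be sketched as index generators. This assumes the common reading of the terms: forward chaining uses an expanding training window, while a sliding window keeps the training length fixed. In both, validation data always comes after the training data. Function names and parameters are illustrative.

```python
def forward_chaining_splits(n_obs, initial_train, horizon):
    """Expanding window: train on [0, end), validate on the next `horizon` points."""
    splits = []
    end = initial_train
    while end + horizon <= n_obs:
        splits.append((list(range(0, end)), list(range(end, end + horizon))))
        end += horizon
    return splits


def sliding_window_splits(n_obs, window, horizon):
    """Fixed-length training window that rolls forward by `horizon` each step."""
    splits = []
    start = 0
    while start + window + horizon <= n_obs:
        train = list(range(start, start + window))
        val = list(range(start + window, start + window + horizon))
        splits.append((train, val))
        start += horizon
    return splits


chained = forward_chaining_splits(n_obs=8, initial_train=4, horizon=2)
sliding = sliding_window_splits(n_obs=8, window=4, horizon=2)

for train, val in chained:
    print("forward chaining  train:", train, "validate:", val)
for train, val in sliding:
    print("sliding window    train:", train, "validate:", val)
```

Unlike ordinary k-fold cross validation, neither scheme shuffles observations, so no fold trains on data from after its validation period.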
Lecture 7.3 Advanced time series analysis – State Space Model and Kalman Filter
State space model and its common structure
Kalman filter, components (model and measurement) and common logic of its optimization
Kalman filter iteration process (prediction and update), no need to memorize formulas, just general understanding
Kalman Gain and its relationship with source of uncertainty
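The prediction and update steps, and the way the Kalman gain responds to the two sources of uncertainty, can be seen in a scalar example. The sketch below assumes a local-level (random walk plus noise) state space model: state x_t = x_{t-1} + w_t with Var(w) = q, and measurement z_t = x_t + v_t with Var(v) = r. The model choice, the measurements, and all numeric values are illustrative assumptions.

```python
def kalman_step(x_est, p_est, z, q, r):
    """One prediction + update cycle of a scalar Kalman filter."""
    # Prediction: the state follows a random walk, so the mean carries over
    # and the variance grows by the process noise q.
    x_pred = x_est
    p_pred = p_est + q
    # Update: the gain weighs model uncertainty against measurement noise r.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1 - gain) * p_pred
    return x_new, p_new, gain


measurements = [1.2, 0.9, 1.1, 1.0]  # made-up observations

for r in (0.01, 10.0):
    x, p = 0.0, 1.0  # initial state estimate and its variance
    for z in measurements:
        x, p, gain = kalman_step(x, p, z, q=0.1, r=r)
    print(f"r={r}: final estimate={x:.3f}, final gain={gain:.3f}")
```

When measurement noise r is small relative to the predicted state variance, the gain is close to 1 and the filter trusts the new measurement; when r is large, the gain is close to 0 and the filter stays near its model prediction.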
Lecture 8.1 NLP – Basics
Semantic vs. Syntactic Analysis
Tokenization
Different types (whitespace, punctuation, sub-word etc.)
Stemming/Lemmatization
Meaning, difference
POS Tagging
Lecture 8.2 NLP – Embeddings
Why use embedding
Representation of text
One-hot encoding vs. distributed representation
Word2vec: skip-gram, CBOW
Core concept: use context words to predict center word or vice versa
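The core concept above determines what the training examples look like. The sketch below builds them for both variants from a tokenized sentence: skip-gram uses the center word to predict each context word, and CBOW uses the context words to predict the center word. The example sentence, the window size, and the function names are illustrative; this covers only example construction, not the embedding training itself.

```python
def skipgram_pairs(tokens, window):
    """(center, context) pairs: skip-gram predicts each context word from the center."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window):
    """(context_words, center) examples: CBOW predicts the center from its context."""
    examples = []
    for i, center in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, i - window), min(len(tokens), i + window + 1))
                   if j != i]
        examples.append((context, center))
    return examples


tokens = "the fed raised rates".split()  # whitespace tokenization

for pair in skipgram_pairs(tokens, window=1):
    print("skip-gram:", pair)
for example in cbow_examples(tokens, window=1):
    print("CBOW:", example)
```

Words that appear in similar contexts produce similar training examples, which is why their learned vectors end up close together.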
Lecture 8.3 NLP – Applications
Sentiment analysis
Features: pre-defined score card; embedding
Classification task
Named Entity Recognition
Features: POS, pre-/post-words, etc.; embedding
Token classification task
Sentence/document classification
Naïve Bayes classification
Deep learning
Contextual embedding
Transfer learning: unsupervised pretraining -> task specific fine tuning
Section 4: Mock Interview on Key Concepts
Supervised learning
Unsupervised learning
Regression vs. classification
No free lunch theorem
OLS
Imputation
Winsorization
Standardization
Lookahead bias (data leakage)
Survivorship bias
Feature
Feature selection
Curse of dimensionality
Stepwise
Loss function
Regularization/shrinkage
Ridge regression
L2 regularization
Lasso regression
L1 regularization
Elastic net
PCA
Training error
Test error
Overfitting
AIC
Adjusted R2
Cross validation
k-fold cross validation
Collinearity/multicollinearity
Heteroskedasticity
Logistic regression
Logit
Dummy variable
Likelihood function
MLE
Confusion matrix
Type I error
Type II error
False positive
False negative
Recall
Precision
F1 score
ROC curve
AUC
Flexibility-interpretation trade-off
Bias-variance trade-off
Decision tree
Leaf
Root
Branch
Node
Recursive binary splitting
Pruning
Ensemble methods
Bagging
Bootstrap
Random forest
Variable importance
Boosting
Hyperplane
Separating hyperplane
Maximal margin hyperplane
Support vectors
Kernel
Soft margin
One-versus-all (OVA)
One-versus-one (OVO)
Hyperparameter
Hyperparameter tuning
k-NN
Dimension reduction
Clustering
k-means
Imbalanced dataset/SMOTE
Backpropagation
Gradient descent
Time series
Seasonality
Stationarity
Autocorrelation
ACF
Ljung-Box test
White noise
Random walk
Markov property
Martingale property
AR
Unit root
Dickey-Fuller test/ADF
PACF
MA
ARMA
Parsimony
ARIMA
SARIMA
Sliding window
Forward chaining
State Space Model
Kalman Filter
Kalman Gain
Embeddings
Word2vec
Tokenization
Lemmatization/stemming
Sentiment Analysis
NER (named entity recognition)
Naïve Bayes (spam classification)