
MFIN 290 Application of Machine Learning in Finance: Lecture 5


Edward Sheng

7/24/2021

Agenda

1. Review of Homework 1

2. Release of Final Project

3. Review of Lectures 1–4

4. Midterm Exam (1 hour)

Section 1: Review of Homework 1


Section 2: Release of Final Project


Section 3: Review of Lectures 1–4


Lecture 1.1 Introduction

Three key components of machine learning

Supervised learning

Unsupervised learning

Regression

Classification

No free lunch theorem


Lecture 1.2 Machine learning workflow – an example with linear regression (OLS)

OLS

Machine learning workflow

Data preparation

Imputation

Winsorizing/winsorization

Standardization/normalization

Lookahead bias (data leakage)

Survivorship bias
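As a refresher, the preparation steps above (imputation, winsorizing, standardization) can be sketched in NumPy; the function names below are illustrative, not from the lecture code:

```python
import numpy as np

def impute_mean(x):
    """Replace NaNs with the mean of the observed values (simple imputation)."""
    x = np.asarray(x, dtype=float)
    return np.where(np.isnan(x), np.nanmean(x), x)

def winsorize(x, lower=0.01, upper=0.99):
    """Clip values at the given lower/upper quantiles to tame outliers."""
    lo, hi = np.quantile(x, [lower, upper])
    return np.clip(x, lo, hi)

def standardize(x):
    """Z-score: zero mean, unit standard deviation."""
    return (x - x.mean()) / x.std()

x = np.array([1.0, 2.0, np.nan, 4.0, 100.0])
x = impute_mean(x)          # NaN replaced by the mean of observed values
x = winsorize(x, 0.0, 0.8)  # the outlier at 100 is clipped down
z = standardize(x)
```

Doing imputation and winsorization with statistics computed on the full sample, rather than the training window only, is exactly the kind of lookahead bias (data leakage) flagged above.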


Lecture 1.2 Machine learning workflow – an example with linear regression (OLS)

Feature selection

Curse of dimensionality

Stepwise

Shrinkage/regularization

Ridge regression (L2 regularization)

Lasso regression (L1 regularization)

Elastic net

PCA
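For review, ridge regression has a closed-form solution, which makes the shrinkage effect easy to see in a short NumPy sketch (data and variable names are illustrative):

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge (L2) solution: (X'X + lam*I)^-1 X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
beta_true = np.array([1.0, 0.0, -2.0])
y = X @ beta_true + 0.1 * rng.normal(size=100)

beta_ols = ridge(X, y, 0.0)   # lam = 0 recovers plain OLS
beta_l2 = ridge(X, y, 10.0)   # larger lam shrinks coefficients toward 0
```

Lasso has no such closed form (the L1 penalty is not differentiable at zero), which is why it is usually fit by coordinate descent and why it can set coefficients exactly to zero.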


Lecture 1.2 Machine learning workflow – an example with linear regression (OLS)

Model assessment

Collinearity/multicollinearity

Heteroskedasticity

Loss function

Training error and test error

Overfitting

Adjusted R²

Cross-validation and k-fold cross-validation
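A minimal sketch of k-fold cross-validation, here wrapped around OLS fit by least squares (helper names and toy data are illustrative, not the lecture's code):

```python
import numpy as np

def kfold_mse(X, y, fit, predict, k=5):
    """k-fold cross-validation: average test MSE over k held-out folds."""
    idx = np.arange(len(y))
    errors = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)          # everything not in this fold
        model = fit(X[train], y[train])          # fit on k-1 folds
        pred = predict(model, X[fold])           # predict the held-out fold
        errors.append(np.mean((y[fold] - pred) ** 2))
    return np.mean(errors)

# OLS via least squares as the model being assessed
fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
predict = lambda beta, X: X @ beta

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(60), rng.normal(size=60)])
y = X @ np.array([0.5, 2.0]) + 0.2 * rng.normal(size=60)
cv_mse = kfold_mse(X, y, fit, predict, k=5)
```

Because every error is computed on data the model never saw, the CV estimate tracks test error rather than training error, which is what makes it useful for spotting overfitting.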


Lecture 1.3 Logistic regression

Logistic regression

Logit

Generalized linear model (GLM)

Maximum likelihood estimation (MLE)

Likelihood function

Type I (false positive, α) error and Type II (false negative, β) error

Confusion matrix

Recall, precision, and F1 score

ROC curve and AUC
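The confusion-matrix counts and the metrics derived from them can be recomputed by hand; a small NumPy sketch with made-up labels:

```python
import numpy as np

def confusion(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels in {0, 1}."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))  # Type I (false positive) count
    fn = np.sum((y_true == 1) & (y_pred == 0))  # Type II (false negative) count
    tn = np.sum((y_true == 0) & (y_pred == 0))
    return tp, fp, fn, tn

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_pred = np.array([1, 1, 0, 0, 1, 0, 1, 0])
tp, fp, fn, tn = confusion(y_true, y_pred)
precision = tp / (tp + fp)  # of predicted positives, how many are real
recall = tp / (tp + fn)     # of real positives, how many were caught
f1 = 2 * precision * recall / (precision + recall)
```

Sweeping the classification threshold trades recall against the false-positive rate; plotting that trade-off point by point is exactly the ROC curve, and AUC is the area under it.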


Lecture 2.1 Basic decision tree

Flexibility-interpretability trade-off

Bias-variance trade-off

Decision tree (leaf, root, branch, node)

Recursive binary splitting

Pruning

Weak learner

Ensemble methods
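One step of recursive binary splitting (choosing the threshold that minimizes the weighted Gini impurity of the two children) can be sketched as follows, on illustrative toy data:

```python
import numpy as np

def gini(y):
    """Gini impurity of a binary label vector."""
    if len(y) == 0:
        return 0.0
    p = np.mean(y)
    return 2 * p * (1 - p)

def best_split(x, y):
    """One step of recursive binary splitting on a single feature:
    pick the threshold minimizing weighted Gini impurity of the children."""
    best_t, best_score = None, np.inf
    for t in np.unique(x):
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([0, 0, 0, 1, 1, 1])
t, score = best_split(x, y)  # a clean split exists at x <= 3
```

A full tree just applies this step recursively to each child node; pruning then trims branches whose splits do not pay for their added complexity on held-out data.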


Lecture 2.2 Bagging and boosting tree

Bagging

Bootstrap

Random forest

Variable importance

Boosting

Difference between bagging and boosting

AdaBoost, Gradient boosting, and XGBoost
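The bootstrap idea behind bagging, resampling with replacement and averaging the resulting estimates, can be sketched with a simple statistic (the mean) standing in for a weak learner:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=200)

def bootstrap_means(y, n_boot, rng):
    """Bootstrap: resample with replacement, recompute the statistic each time."""
    n = len(y)
    return np.array([y[rng.integers(0, n, n)].mean() for _ in range(n_boot)])

means = bootstrap_means(y, 500, rng)
# Bagging applies the same idea to models: fit one weak learner per
# bootstrap sample and average their predictions, which lowers variance.
bagged_estimate = means.mean()
```

This is the key contrast with boosting: bagging fits learners independently and in parallel to reduce variance, while boosting fits them sequentially, each one focusing on the errors of the previous ones.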


Lecture 2.3 Support vector machine (SVM)

Hyperplane, separating hyperplane, maximal margin hyperplane

Margin

Support vectors

Kernel

Soft margin

One-versus-all (OVA) and one-versus-one (OVO)

Hyperparameter and hyperparameter tuning
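For review, the linear-SVM decision rule and the maximal-margin width 2/||w|| can be illustrated with a hand-picked hyperplane (the weights and points below are made up for illustration, not fitted):

```python
import numpy as np

# Linear decision rule for a hyperplane w.x + b = 0.
w = np.array([2.0, 0.0])
b = -2.0  # hyperplane: x1 = 1

def predict(X, w, b):
    """Classify points by which side of the hyperplane they fall on."""
    return np.sign(X @ w + b)

X = np.array([[0.0, 0.0], [0.5, 1.0], [2.0, 0.0], [3.0, -1.0]])
labels = predict(X, w, b)
margin_width = 2 / np.linalg.norm(w)  # maximal-margin width is 2 / ||w||
```

Maximizing the margin is equivalent to minimizing ||w|| subject to the classification constraints; the soft-margin version relaxes those constraints with slack variables, controlled by the hyperparameter C.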


Lecture 3 Classification

Basic Python

Basic data structures and functions (syntax, list comprehensions, etc.)

NumPy, pandas

Classification (Supervised approach)

K-nearest neighbors (KNN)

Logistic Regression

Properties of logistic function

Regularization

Loss function and training

Evaluations

Precision, recall, ROC, PR-curve, AUC, impact of threshold

Modeling highly unbalanced classes
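Of the classifiers above, KNN is simple enough to sketch directly; the helper name and toy data below are illustrative:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """K-nearest neighbors: majority vote among the k closest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distance to each point
    nearest = y_train[np.argsort(dists)[:k]]      # labels of the k closest
    return int(np.round(nearest.mean()))          # majority vote for 0/1 labels

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [1.1, 0.9], [0.9, 1.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
pred = knn_predict(X_train, y_train, np.array([0.95, 1.0]), k=3)
```

Note the link to the unbalanced-classes bullet: with a rare positive class, a majority vote among neighbors is biased toward the majority label, one reason reweighting or resampling matters.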


Lecture 3 Classification

Gradient Descent

General optimization problem

Convex functions

Step size/learning rate and its impact (too big vs. too small)

Advanced optimizers

Momentum based optimizers

Adam (pros and cons)

Learning Theory

Bias-variance trade-off

Impact of adding variables/features

How to identify overfitting vs. underfitting

What is a learning curve
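Plain gradient descent, and the effect of a learning rate that is too large, can be sketched on a simple convex function:

```python
import numpy as np

def gradient_descent(grad, x0, lr, steps):
    """Plain gradient descent: x <- x - lr * grad(x)."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Minimize the convex function f(x) = (x - 3)^2, whose gradient is 2*(x - 3).
grad = lambda x: 2 * (x - 3.0)
x_small_lr = gradient_descent(grad, x0=0.0, lr=0.1, steps=100)  # converges to 3
x_big_lr = gradient_descent(grad, x0=0.0, lr=1.1, steps=20)     # overshoots and diverges
```

Momentum-based optimizers and Adam modify the update rule (accumulating past gradients and adapting per-parameter step sizes, respectively) but keep this same iterate-and-step structure.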


Lecture 4 Unsupervised Learning

Unsupervised Learning

Dimension Reduction

Use cases

PCA (theory and applications)

Auto-encoder (theory and applications)

Clustering

Conceptual understanding of common clustering methods and their pros and cons

Evaluation

Applications of clustering
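PCA can be sketched via the SVD of the centered data matrix; the toy data below is stretched so that a single component dominates (function and variable names are illustrative):

```python
import numpy as np

def pca(X, k):
    """PCA via SVD: project centered data onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                     # top-k directions of maximal variance
    scores = Xc @ components.T              # low-dimensional representation
    explained = S[:k] ** 2 / np.sum(S ** 2) # fraction of variance explained
    return scores, components, explained

rng = np.random.default_rng(0)
# 2-D data stretched along one axis: one component captures most of the variance
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [0.0, 0.3]])
scores, comps, explained = pca(X, k=1)
```

An auto-encoder generalizes this picture: with linear activations and squared-error loss its bottleneck spans the same subspace as PCA, while non-linear activations allow curved low-dimensional structure.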


Lecture 4 Unsupervised Learning

Neural Network

Basic structures

Be able to calculate each layer’s output given its weight matrix

Linear vs non-linear components

Activation function and its impact on models

Normalization/regularization

Backpropagation (i.e., using the chain rule to derive the derivative of the loss with respect to the model parameters)
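A forward pass through a tiny one-hidden-layer network, plus one chain-rule step of backpropagation, can be worked out by hand; all weights and inputs below are made up for illustration:

```python
import numpy as np

def relu(z):
    """ReLU activation: the non-linear component of each layer."""
    return np.maximum(0.0, z)

# Tiny network: input (2,) -> hidden (3,) -> output (1,)
W1 = np.array([[1.0, -1.0],
               [0.5, 0.5],
               [-1.0, 2.0]])    # hidden-layer weights, shape (3, 2)
b1 = np.zeros(3)
W2 = np.array([[1.0, 1.0, 1.0]])  # output-layer weights, shape (1, 3)
b2 = np.zeros(1)

x = np.array([1.0, 2.0])
z1 = W1 @ x + b1  # linear pre-activation: [-1.0, 1.5, 3.0]
h = relu(z1)      # non-linear activation: [ 0.0, 1.5, 3.0]
y = W2 @ h + b2   # network output:        [4.5]

# One backprop step for the loss L = 0.5 * y^2:
# the chain rule gives dL/dW2 = (dL/dy) * (dy/dW2) = y * h.
dL_dy = y
dL_dW2 = dL_dy[:, None] * h[None, :]
```

Continuing the chain rule through h and z1 (where the ReLU zeroes out gradients for inactive units) yields dL/dW1, which is exactly the derivation the bullet above asks for.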


Section 4: Midterm Exam (1 Hour)
