Motivation
Introduction
Course information
Module scope
ST4060/ST6015/ST6040 aims to provide a broad understanding
of the methodological and implementation aspects of
current techniques used for statistical learning.
This includes reviewing various statistical concepts and techniques
used in data exploration and analysis, methods for simulating
statistical frameworks, and basic concepts of machine learning.
The objective is not so much to cover these techniques in depth
(this will be done in the follow-on course), but rather to develop a
sensitivity to different aspects and issues related to statistical
analysis, from a practical angle.
The course starts at a basic level, and aims for you to become
self-sufficient in R-based development of statistical analyses.
Importance of implementation skills
Why should you be interested?
Employers really like applied experience (quant roles, …)
First day at your new job: hit the ground running!
Exploratory analysis
Performance analysis
Benchmarking
How is this course useful?
Stochastic modelling
Regression & GLM
Survival Analysis
Time Series Analysis
Projects (simulation work)…
Predicting stuff
[Figure, two panels: "linear regression" (y against x) and "time series" (AirPassengers against Time, 1950 to 1960)]
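As a rough illustration of the two "predicting stuff" settings named above, here is a minimal R sketch. The simulated data, the seed and the object names (`fit`, `pred`, `ap`) are assumptions made for this example only; the `AirPassengers` series ships with base R and is restricted here to 1950 to 1960.

```r
# Linear regression on simulated data (values are made up for illustration)
set.seed(4060)
x <- 1:10
y <- 5 + 2.5 * x + rnorm(length(x), sd = 2)
fit <- lm(y ~ x)                      # least-squares straight line
pred <- predict(fit, newdata = data.frame(x = c(11, 12)))  # predict at new x

# Time series: monthly airline passenger counts from the datasets package
ap <- window(AirPassengers, start = c(1950, 1), end = c(1960, 12))

# Draw both panels when working interactively (e.g. in RStudio)
if (interactive()) {
  par(mfrow = c(1, 2))
  plot(x, y, main = "linear regression")
  abline(fit, col = "blue")
  plot(ap, xlab = "Time", ylab = "AirPassengers", main = "time series")
}
```

Typing `coef(fit)` or `pred` at the console shows the fitted intercept and slope and the two predicted values.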
Course outline
1 Stochastic modelling
Why and how we model random stuff
2 Resampling
How to simulate data and estimation processes
3 Regression & Parametric models
A battery of tools to describe typical patterns
4 Smoothing & nonparametric modelling
Other tools to describe patterns without assuming their shape
5 Fundamentals of Statistical Learning
Overview of some of the main problems met in Statistical (and Machine) Learning
Example: nonparametric curve estimation
[Figure: "Nonparametric curve estimation (mtcars)", consumption (mpg) against engine displacement (cu.in.)]
How do we come up with these models (i.e. curves)?
Why is the blue model better than the other one?
How do we measure this performance?
And then what?
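One possible way to approach the questions above is sketched below. It is not necessarily how the curves in the figure were produced: the choice of a straight line versus a `loess` smoother, the span of 0.75, and leave-one-out cross-validation as the performance measure are all assumptions made for illustration.

```r
# Compare two candidate models for mpg against disp on mtcars
# using leave-one-out cross-validation (an assumed choice of criterion)
data(mtcars)
n <- nrow(mtcars)
err_lm <- numeric(n)   # prediction errors of the straight-line model
err_lo <- numeric(n)   # prediction errors of the loess smoother

for (i in seq_len(n)) {
  train <- mtcars[-i, ]
  test  <- mtcars[i, , drop = FALSE]
  fit_lm <- lm(mpg ~ disp, data = train)
  # surface = "direct" lets loess predict outside the training range
  fit_lo <- loess(mpg ~ disp, data = train, span = 0.75,
                  control = loess.control(surface = "direct"))
  err_lm[i] <- test$mpg - predict(fit_lm, newdata = test)
  err_lo[i] <- test$mpg - predict(fit_lo, newdata = test)
}

# Mean squared prediction error for each model: smaller is better
mse <- c(linear = mean(err_lm^2), loess = mean(err_lo^2))
print(mse)
```

Each observation is predicted by a model that never saw it, so the comparison reflects predictive performance rather than how closely each curve follows the points it was fitted to.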
Using R…
… or RStudio…
Interactive LearnR pages
A number of interactive online pages have been designed to help
you practice specific aspects of R for each of the course sections:
Basic R LearnR page: https://eric-wolsztynski.shinyapps.io/MachineLearningWorkout_learnr/
Modelling basics LearnR page: https://eric-wolsztynski.shinyapps.io/learnr1_modelling
Resampling LearnR page: https://eric-wolsztynski.shinyapps.io/learnr2_resampling/
Regression LearnR page: https://eric-wolsztynski.shinyapps.io/learnr3_regression
Smoothing LearnR page (link TBA)
Machine Learning LearnR page (link TBA)
Comments/feedback?
For any comments or queries about this course, please contact
eric.