Introduction
Data and Knowledge
What have we done?
• How to quantify beliefs
• How to integrate beliefs with data
• How to reach updated beliefs
What have we not done?
• How to extract and formulate domain-specific knowledge mathematically
• How to iteratively improve our models
• How to interpret our results and make decisions
• How to perform statistical inference when we have intractable computations
Approximate Posterior Inference
• When we have non-conjugate models the marginal likelihood cannot be computed in closed form
• We have to approximate the computation
• Deterministic approximations (sketched below)
• Stochastic approximations (sketched below)
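A minimal sketch of both flavours on a non-conjugate model, in Python; the 1-D logistic-regression setup, data, prior, proposal width and iteration counts are all assumptions made purely for illustration:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)                                                      # assumed inputs
y = (rng.uniform(size=50) < 1.0 / (1.0 + np.exp(-1.5 * x))).astype(float)   # assumed labels

def log_joint(theta):
    # log p(y, theta) = log-likelihood + log-prior, i.e. the unnormalised posterior
    logits = theta * x
    log_lik = np.sum(y * logits - np.log1p(np.exp(logits)))   # Bernoulli likelihood
    return log_lik - 0.5 * theta ** 2                          # N(0, 1) prior on theta

# Stochastic approximation: random-walk Metropolis-Hastings sampling
samples, theta = [], 0.0
for _ in range(5000):
    prop = theta + 0.3 * rng.normal()                          # symmetric proposal
    if np.log(rng.uniform()) < log_joint(prop) - log_joint(theta):
        theta = prop
    samples.append(theta)

# Deterministic approximation: Laplace, a Gaussian centred at the posterior mode
grid = np.linspace(-3.0, 5.0, 2001)
mode = grid[np.argmax([log_joint(t) for t in grid])]           # crude mode search
h = 1e-4
curv = (log_joint(mode + h) - 2 * log_joint(mode) + log_joint(mode - h)) / h ** 2
print("MCMC mean: %.3f   Laplace: N(%.3f, %.3f)" % (np.mean(samples[1000:]), mode, -1.0 / curv))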
Point Estimates
• Bayesian Inference
  p(θ | D) = p(D | θ) p(θ) / p(D)
• Maximum Likelihood (ML)
  θ̂ = argmax_θ p(D | θ)
• Maximum-a-Posteriori (MAP)
  θ̂ = argmax_θ p(D | θ) p(θ)
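A minimal numerical sketch of the three estimates on an assumed coin-flip example; the data and the Beta(2, 2) prior are made up purely for illustration:

import numpy as np
from scipy import stats

data = np.array([1, 1, 1, 0, 1, 1, 0, 1])     # assumed D: 6 heads, 2 tails
a, b = 2.0, 2.0                               # assumed Beta(a, b) prior on theta
n, k = len(data), int(data.sum())

theta_ml = k / n                              # argmax_theta p(D | theta)
theta_map = (k + a - 1) / (n + a + b - 2)     # argmax_theta p(D | theta) p(theta)
posterior = stats.beta(a + k, b + n - k)      # the full posterior p(theta | D)

print("ML: %.3f   MAP: %.3f   posterior mean: %.3f"
      % (theta_ml, theta_map, posterior.mean()))

The point estimates are single numbers, while the Bayesian answer is a whole distribution over θ.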
Model Selection
p(θ | y) = p(y | θ) p(θ) / ∫ p(y | θ) p(θ) dθ
Likelihood: how much evidence is there in the data for a specific hypothesis?
Prior: what are my beliefs about the different hypotheses?
Posterior: what is my updated belief after having seen the data?
Evidence: what is my belief about the data?
The Computation: the Evidence
p(y) = ∫ p(y | θ) p(θ) dθ
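For the coin-flip example above the evidence integral can be computed directly by numerical quadrature; a minimal sketch, with the counts and prior again assumed only for illustration:

from scipy import stats, integrate

n, k = 8, 6                     # assumed data: 8 flips, 6 heads
a, b = 2.0, 2.0                 # assumed Beta(a, b) prior on theta

def integrand(theta):
    likelihood = theta ** k * (1.0 - theta) ** (n - k)   # p(y | theta)
    prior = stats.beta(a, b).pdf(theta)                  # p(theta)
    return likelihood * prior

evidence, _ = integrate.quad(integrand, 0.0, 1.0)        # p(y) = ∫ p(y | theta) p(theta) dtheta
print("evidence p(y) =", evidence)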
Regression Model
Which Parametrisation
• Should I use a line, polynomial, quadratic basis function? • How many basis functions should I use?
• Likelihood won’t help me
• How do we proceed?
Regression Models
Linear model:
p(yi | xi, w) = N(w0 + w1 · xi, β⁻¹)
Basis-function model:
p(yi | xi, w) = N(Σj wj φj(xi), β⁻¹)
Evidence:
p(Y) = ∫ p(Y | W) p(W) dW
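For this model, with a Gaussian prior on the weights and Gaussian noise, the evidence integral has a closed form, so it can be used to compare parametrisations directly. The sketch below uses the standard closed-form log evidence for Bayesian linear regression; the data, the prior precision α and the noise precision β are assumptions chosen only for illustration:

import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 30)
y = 0.5 * x - 1.2 * x ** 2 + 0.05 * rng.normal(size=x.size)   # assumed quadratic data

alpha, beta = 1.0, 1.0 / 0.05 ** 2     # prior precision and noise precision (assumed known)

def log_evidence(order):
    Phi = np.vander(x, order + 1, increasing=True)       # polynomial basis functions
    N, M = Phi.shape
    A = alpha * np.eye(M) + beta * Phi.T @ Phi           # posterior precision of the weights
    m = beta * np.linalg.solve(A, Phi.T @ y)             # posterior mean of the weights
    E = 0.5 * beta * np.sum((y - Phi @ m) ** 2) + 0.5 * alpha * m @ m
    return (0.5 * M * np.log(alpha) + 0.5 * N * np.log(beta) - E
            - 0.5 * np.linalg.slogdet(A)[1] - 0.5 * N * np.log(2 * np.pi))

for order in range(5):
    print("polynomial order", order, "log evidence %.1f" % log_evidence(order))

With data generated from a quadratic, the evidence should favour the quadratic model: more flexible polynomials fit the data at least as well, but are penalised for spreading their prior probability over many possible datasets. This is exactly the question the likelihood alone cannot answer.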
Probabilities are a zero-sum game
Model Selection¹
¹ PhD Thesis (MacKay, 1991)
Occam's Razor
Definition (Occam's Razor)
"All things being equal, the simplest solution tends to be the best one" – William of Ockham
What is Simple?²
² https://www.imdb.com/title/tt8132700/
MacKay, 1991
Hypothesis Spaces
Composite Functions
f(x) = fL ∘ fL−1 ∘ ··· ∘ f0(x)
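A minimal Python sketch of this structure; the particular layer functions below are arbitrary choices made only for illustration:

from functools import reduce
import numpy as np

def compose(*fs):
    # compose(f2, f1, f0)(x) == f2(f1(f0(x)))
    return reduce(lambda f, g: lambda x: f(g(x)), fs)

layers = [
    lambda x: np.tanh(x),          # f0: a simple non-linearity
    lambda x: 2.0 * x - 1.0,       # f1: an affine map
    lambda x: np.maximum(x, 0.0),  # f2: a rectifier
]
f = compose(*reversed(layers))     # f = f2 ∘ f1 ∘ f0

print(f(np.array([-1.0, 0.0, 1.0])))

Each layer on its own is simple; the composite f is already a small neural network.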
What Do Compositions Do?
Im(f)[X] = {f(x) | x ∈ X}
Kern(f)[X] = {(x, x′) | f(x) = f(x′), (x, x′) ∈ X × X}
What Do Compositions Do?
Kern(f1) ⊆ Kern(fk−1 ∘ ··· ∘ f2 ∘ f1) ⊆ Kern(fk ∘ fk−1 ∘ ··· ∘ f2 ∘ f1)
Im(fk ∘ fk−1 ∘ ··· ∘ f2 ∘ f1) ⊆ Im(fk ∘ fk−1 ∘ ··· ∘ f2) ⊆ ··· ⊆ Im(fk)
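These inclusions are easy to verify numerically on a finite set; a small sketch with two made-up functions:

def image(f, X):
    return {f(x) for x in X}

def kern(f, X):
    return {(x, xp) for x in X for xp in X if f(x) == f(xp)}

X = range(10)
f1 = lambda x: x // 2            # merges neighbouring points
f2 = lambda x: min(x, 3)         # clips large values
g = lambda x: f2(f1(x))          # g = f2 ∘ f1

print(kern(f1, X) <= kern(g, X))            # True: Kern(f1) ⊆ Kern(f2 ∘ f1)
print(image(g, X) <= image(f2, range(5)))   # True: Im(f2 ∘ f1) ⊆ Im(f2)

Composing can only merge inputs further (the kernel grows) and can only throw outputs away (the image shrinks).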
Why would you ever want this?
y1 = {x1,x2,x3,x4} y2 = {x5,x6,x7,x8,x9} y3 = {x10} ···
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
Why would you ever want this?
y1 = {x1,x2,x5,x6,x7} y2 = {x3,x4} y3 = {x8,x9,x10} ···
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
Why would you ever want this?
y1 = {x1,x2,x3,x5,x8,x10} y2 = {x4,x9} y3 = {x6,x7} ···
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
Composite Functions
• Simple functions in composition give rise to complicated composite behaviour
• Small changes in early compositions give rise to large changes in composite behaviour
• Over-parametrisation gives rise to symmetries (a tiny sketch follows below)
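A tiny sketch of such a symmetry, using an assumed two-factor linear model: rescaling one factor and inverse-rescaling the other leaves the composite function, and therefore the training objective, unchanged.

a, b, c = 3.0, -2.0, 10.0
f  = lambda x: a * (b * x)                 # original parametrisation
fc = lambda x: (a / c) * ((b * c) * x)     # rescaled parameters, same function

print(f(1.7), fc(1.7))                     # identical outputs for every input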
Why would you ever want this?
y1 = {x1,x2,x3,x5,x8,x10}  y2 = {x4,x9}  y3 = {x6,x7}  ···
x1 x2 x3 x4? x5 x6 x7 x8 x9 x10?
Why Compositional Functions?
• Compositional functions cannot do more
• Compositional functions introduce symmetries in the objective
• These symmetries turn out to be excellent for optimisation
• We are not really sure why
Is this useful?
“A theory that explains everything, explains nothing” – Karl Popper, The Logic of Scientific Discovery
Data and Knowledge
• Machine Learning might look like it is changing very quickly
• it is not
• Remember to attribute the advances to the right thing
• we are mainly using statistical methods that are decades if not centuries old
• however, we have access to vast amounts of data
• it turns out that with very large volumes of data a lot of problems are a lot easier than we thought
What to do next?
• Define a project; do not do ML in isolation; find something that you want to solve that has data
• Understand the problem in depth
• Do not decide on methods, do not start writing code, but extract as much knowledge as you have
• Formulate the knowledge mathematically
• What models exist that can use this data?