Introduction

Data and Knowledge

What have we done?
• How to quantify beliefs
• How to integrate beliefs with data
• How to reach updated beliefs

What have we not done?
• How to extract and formulate domain specific knowledge mathematically
• How to iteratively improve our models
• How to interpret our results and make decisions
• How to perform statistical inference when we have intractable computations

Approximate Posterior Inference
• When we have non-conjugate models we still have to compute the marginal likelihood
• We have to approximate this computation
• Deterministic approximations
• Stochastic approximations

Point Estimates
• Bayesian Inference
p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}
• Maximum Likelihood (ML)
\hat{\theta} = \mathrm{argmax}_{\theta}\, p(D \mid \theta)
• Maximum-a-Posteriori (MAP)
\hat{\theta} = \mathrm{argmax}_{\theta}\, p(D \mid \theta)\, p(\theta)
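The sketch below contrasts the three approaches on a Beta–Bernoulli coin-flip model; the data and the prior hyperparameters are assumptions chosen only for illustration.

```python
# Sketch: ML, MAP, and the full posterior for a Beta–Bernoulli coin-flip model.
# Data and prior hyperparameters are illustrative assumptions.
import numpy as np
from scipy import stats

flips = np.array([1, 1, 1, 0, 1, 1, 0, 1])   # 1 = heads
n_heads = flips.sum()
n_tails = len(flips) - n_heads

# Prior: Beta(a, b) expressing a mild belief that the coin is roughly fair.
a, b = 5.0, 5.0

# Maximum Likelihood: argmax_θ p(D | θ) has the closed form heads / total.
theta_ml = n_heads / len(flips)

# Maximum-a-Posteriori: argmax_θ p(D | θ) p(θ), the mode of the Beta posterior.
theta_map = (n_heads + a - 1) / (len(flips) + a + b - 2)

# Bayesian inference: the full posterior is Beta(a + heads, b + tails).
posterior = stats.beta(a + n_heads, b + n_tails)

print(f"ML estimate : {theta_ml:.3f}")
print(f"MAP estimate: {theta_map:.3f}")
print(f"Posterior mean ± sd: {posterior.mean():.3f} ± {posterior.std():.3f}")
```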

Model Selection

p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta)\, p(\theta)\, d\theta}
Likelihood: How much evidence is there in the data for a specific hypothesis?
Prior: What are my beliefs about different hypotheses?
Posterior: What is my updated belief after having seen the data?
Evidence: What is my belief about the data?

The Computation: Evidence
p(y) = \int p(y \mid \theta)\, p(\theta)\, d\theta

Regression Model

Which Parametrisation
• Should I use a line, a polynomial, or quadratic basis functions?
• How many basis functions should I use?
• Likelihood won’t help me
• How do we proceed?

Regression Models
Linear model:         p(y_i \mid x_i, w) = \mathcal{N}(w_0 + w_1 x_i,\ \beta^{-1})
Basis function model: p(y_i \mid x_i, w) = \mathcal{N}\big(\sum_j w_j \phi_j(x_i),\ \beta^{-1}\big)
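To make the two parametrisations concrete, the sketch below fits both models by least squares, which is the maximum-likelihood solution under the Gaussian likelihoods above; the synthetic data and the Gaussian basis functions are illustrative assumptions.

```python
# Sketch: straight-line model vs a basis-function model, fitted by least squares
# (the ML solution under the Gaussian likelihoods above). Data are synthetic.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)   # assumed toy data

# Linear model: design matrix [1, x], so the mean is w0 + w1 * x.
Phi_lin = np.column_stack([np.ones_like(x), x])

# Basis-function model: Gaussian bumps φ_j(x) = exp(-(x - c_j)^2 / (2 s^2)).
centres = np.linspace(0, 1, 9)
s = 0.1
Phi_rbf = np.exp(-(x[:, None] - centres[None, :]) ** 2 / (2 * s ** 2))

w_lin, *_ = np.linalg.lstsq(Phi_lin, y, rcond=None)
w_rbf, *_ = np.linalg.lstsq(Phi_rbf, y, rcond=None)

print("linear model RMSE  :", np.sqrt(np.mean((Phi_lin @ w_lin - y) ** 2)))
print("basis fn model RMSE:", np.sqrt(np.mean((Phi_rbf @ w_rbf - y) ** 2)))
```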

p(Y) = \int p(Y \mid W)\, p(W)\, dW
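For linear-in-the-parameters models with a Gaussian prior on the weights this integral has a closed form, which is what makes it usable for model selection. The sketch below computes the log evidence for polynomial models of increasing degree; the prior precision, the noise precision, and the data are illustrative assumptions.

```python
# Sketch: model selection by the evidence ∫ p(Y | W) p(W) dW for Bayesian
# polynomial regression. With prior W ~ N(0, α⁻¹ I) and noise precision β,
# the marginal is Y ~ N(0, β⁻¹ I + α⁻¹ Φ Φᵀ). All numbers are assumptions.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(2)
x = np.linspace(-1, 1, 30)
y = 0.5 * x - x ** 3 + rng.normal(scale=0.1, size=x.shape)   # cubic toy data

alpha, beta = 1.0, 100.0   # assumed prior and noise precisions

def log_evidence(degree):
    Phi = np.vander(x, degree + 1, increasing=True)        # polynomial features
    cov = np.eye(len(x)) / beta + (Phi @ Phi.T) / alpha    # marginal covariance of Y
    return multivariate_normal(mean=np.zeros(len(x)), cov=cov).logpdf(y)

for d in range(6):
    print(f"degree {d}: log evidence = {log_evidence(d):.1f}")
# The evidence typically peaks at a moderate degree: Occam's razor comes for free.
```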

Probabilities are a zero-sum game

Model Selection¹
¹ PhD Thesis

Occam's Razor
Definition (Occam's Razor)
“All things being equal, the simplest solution tends to be the best one” – William of Ockham

What is Simple?²
² https://www.imdb.com/title/tt8132700/

The MacKay Figure (MacKay, 1991)

Hypothesis Spaces

Composite Functions
f(x) = f_L \circ f_{L-1} \circ \cdots \circ f_0(x)
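As a small illustration, the sketch below builds such a composition out of deliberately simple one-dimensional layers; the particular layer functions are arbitrary assumptions.

```python
# Sketch: composing simple one-dimensional functions f(x) = f_L ∘ ... ∘ f_0(x).
# The individual layers are deliberately simple; which layers to use is arbitrary.
import numpy as np

def compose(*fs):
    """Return x -> fs[-1](... fs[1](fs[0](x)) ...), i.e. apply left to right."""
    def composed(x):
        for f in fs:
            x = f(x)
        return x
    return composed

layers = [
    lambda x: 2.0 * x - 1.0,     # affine
    np.tanh,                     # simple squashing nonlinearity
    lambda x: 3.0 * x + 0.5,     # affine
    np.sin,                      # simple periodic nonlinearity
]

f = compose(*layers)
x = np.linspace(-2, 2, 5)
print(f(x))   # each layer is simple, but the composite is already quite wiggly
```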

What Do Compositions Do?
\mathrm{Im}(f)[X] = \{\, f(x) \mid x \in X \,\}
\mathrm{Kern}(f)[X] = \{\, (x, x') \mid f(x) = f(x'),\ (x, x') \in X \times X \,\}

What Do Compositions Do?
\mathrm{Kern}(f_1) \subseteq \mathrm{Kern}(f_{k-1} \circ \cdots \circ f_2 \circ f_1) \subseteq \mathrm{Kern}(f_k \circ f_{k-1} \circ \cdots \circ f_2 \circ f_1)
\mathrm{Im}(f_k \circ f_{k-1} \circ \cdots \circ f_2 \circ f_1) \subseteq \mathrm{Im}(f_k \circ f_{k-1} \circ \cdots \circ f_2) \subseteq \cdots \subseteq \mathrm{Im}(f_k)
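These inclusions say that later layers can only coarsen the distinctions already made (the kernel grows) and can only shrink the set of reachable outputs (the image contracts). A brute-force check on a small finite set makes this concrete; the maps f1 and f2 below are arbitrary illustrative choices.

```python
# Sketch: Im(f)[X] and Kern(f)[X] for compositions over a small finite set,
# checking that the kernel can only grow and the image can only shrink.
# The maps f1, f2 are arbitrary illustrative choices.
def image(f, X):
    return {f(x) for x in X}

def kern(f, X):
    return {(x, xp) for x in X for xp in X if f(x) == f(xp)}

X = {0, 1, 2, 3, 4, 5}
f1 = lambda x: x // 2          # merges {0,1}, {2,3}, {4,5}
f2 = lambda x: min(x, 1)       # merges everything above 1
f21 = lambda x: f2(f1(x))      # the composition f2 ∘ f1

print("Kern(f1) ⊆ Kern(f2∘f1):", kern(f1, X) <= kern(f21, X))    # True
print("Im(f2∘f1) ⊆ Im(f2):    ", image(f21, X) <= image(f2, X))  # True
print("Im(f2∘f1):", image(f21, X), " Im(f2):", image(f2, X))
```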

Why would you ever want this?
y1 = {x1, x2, x3, x4}   y2 = {x5, x6, x7, x8, x9}   y3 = {x10}   ···
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10

Why would you ever want this?
y1 = {x1, x2, x5, x6, x7}   y2 = {x3, x4}   y3 = {x8, x9, x10}   ···
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10

Why would you ever want this?
y1 = {x1, x2, x3, x5, x8, x10}   y2 = {x4, x9}   y3 = {x6, x7}   ···
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10

Why would you ever want this?
y1 = {x1, x2, x3, x5, x8, x10}   y2 = {x4, x9}   y3 = {x6, x7}   ···
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10

Composite Functions
• Simple functions in composition give rise to complicated composite behaviour
• Small changes in early compositions give rise to large changes in composite behaviour
• Over-parametrisation gives rise to symmetries
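Both points can be checked numerically. The sketch below uses an assumed two-layer linear map: rescaling one layer up and the next down leaves the composite unchanged, which is the kind of symmetry over-parametrisation introduces, while a small perturbation of the early layer propagates through the later layer into a change of the composite output.

```python
# Sketch: symmetries and sensitivity in a composed (two-layer linear) map.
# The weight matrices are random toy values; this is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(3)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=(3,))

f = lambda A, B: B @ (A @ x)           # composite f2 ∘ f1 applied to x

# Symmetry: scaling W1 by c and W2 by 1/c gives exactly the same composite map.
c = 7.3
print(np.allclose(f(W1, W2), f(c * W1, W2 / c)))        # True

# Sensitivity: a small change in the early layer propagates through the later layer.
eps = 1e-3 * rng.normal(size=W1.shape)
delta = np.linalg.norm(f(W1 + eps, W2) - f(W1, W2))
print(f"size of early perturbation: {np.linalg.norm(eps @ x):.4f}, "
      f"change in composite output: {delta:.4f}")
```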

Why would you ever want this?
y1 = {x1, x2, x3, x5, x8, x10}   y2 = {x4, x9}   y3 = {x6, x7}   ···
x1 x2 x3 x4? x5 x6 x7 x8 x9 x10?

Why Compositional Functions?
• Compositional functions cannot do more
• Compositional functions introduce symmetries in the objective
• These symmetries turn out to be excellent for optimisation
• We are not really sure why

Is this useful?
“A theory that explains everything, explains nothing” – Karl Popper, The Logic of Scientific Discovery

Data and Knowledge

• Machine Learning might look like it is changing very quickly
• it is not
• Remember to attribute the advances to the right thing
• we are mainly using statistical methods that are decades if not centuries old
• however we have access to vast amounts of data
• it turns out that with very large volumes of data a lot of problems are a lot easier than we thought

What to do next?
• Define a project; do not do ML in isolation; find something that you want to solve that has data
• Understand the problem in depth
• Do not decide on methods, do not start writing code, but extract as much knowledge as you have
• Formulate the knowledge mathematically
• What model exists that can use this data?
