
Introduction to information system

Popular Distributions (2/2)

Bowei Chen

School of Computer Science

University of Lincoln

CMP3036M/CMP9063M Data Science

Today’s Objectives

• Univariate Distributions
  – Discrete Distributions
    • Uniform
    • Bernoulli
    • Binomial
    • Poisson
  – Continuous Distributions
    • Uniform
    • Exponential
    • Normal/Gaussian
• Multivariate Distributions
  – Multivariate Normal Distribution

Quick Recap on Discrete Distributions!

Discrete Uniform Distribution

When to Use it?

• The experiment has finitely many outcomes

• Each outcome is equally likely

Example:

• Flipping a fair coin

• Rolling a fair die

Bernoulli and Binomial Distributions

When to Use it?

• The experiment has two outcomes

– If the experiment is performed once, the outcome follows the Bernoulli distribution

– If the experiment is repeated 𝑛 times independently, the number of times one outcome occurs follows the Binomial distribution

Example: Flipping a fair or unfair coin

Heads occurs with probability 𝑝. A single flip follows Ber(𝑝); the number of heads in 𝑛 flips follows Bin(𝑛, 𝑝).
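The relationship between Ber(𝑝) and Bin(𝑛, 𝑝) can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: it uses the standard Binomial PMF, C(n, k) p^k (1 − p)^(n − k), which is not restated on this slide, and the function names and the values n = 10, p = 0.5 are chosen only for the example.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for K ~ Bin(n, p): k heads in n independent flips."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def bernoulli_pmf(k: int, p: float) -> float:
    """Ber(p) is the special case Bin(1, p); k is 0 (tail) or 1 (head)."""
    return binomial_pmf(k, 1, p)


# Illustrative values (assumed): 10 flips of a fair coin.
n, p = 10, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(pmf)  # a valid PMF sums to 1

if __name__ == "__main__":
    for k, prob in enumerate(pmf):
        print(f"P(K = {k}) = {prob:.4f}")
    print("sum of PMF:", total)
```

Setting n = 1 recovers the Bernoulli case, which is why the two distributions share one slide.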

Think Deeper

A die can be fair or unfair. With face probabilities 𝑝1, 𝑝2, 𝑝3, 𝑝4, 𝑝5, 𝑝6, the counts of each face over 𝑛 tosses follow the Multinomial distribution. It is not required for this course but can be an interesting advanced topic for your own study.

Poisson Distribution

When to Use it?

• Each trial has two outcomes (the event occurs or it does not)

• The trial is repeated infinitely many times within a period

• The average number of occurrences of the event in the period is finite

Example:

• Number of text message arrivals in a period

What if we look at the seconds, milliseconds, microseconds?
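One way to read the question above is as a limit: split the period into n ever smaller slots, let a message arrive in each slot with probability λ/n, and compare the resulting Binomial probability with the Poisson one. The sketch below assumes this reading; the rate λ = 5, the count k = 3 and the slot counts are illustrative values, and the function names are hypothetical.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for K ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for Y ~ Poisson(lam): exp(-lam) * lam^k / k!."""
    return exp(-lam) * lam**k / factorial(k)


lam, k = 5.0, 3  # assumed: 5 messages per hour on average, ask for exactly 3

# Slots per hour: minutes, seconds, milliseconds.
gaps = {}
for n in (60, 3_600, 3_600_000):
    gaps[n] = abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))

if __name__ == "__main__":
    for n, gap in gaps.items():
        print(f"n = {n:>9}: |Bin - Poisson| = {gap:.2e}")
```

The gap shrinks as the slots get finer, which is the sense in which the Poisson distribution describes infinitely many trials with a finite average rate.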

Continuous Uniform Distribution

• Notation

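$X \sim \mathrm{U}(a, b)$

• PDF

$$f(x; a, b) = \begin{cases} \dfrac{1}{b - a}, & \text{if } x \in [a, b], \\ 0, & \text{otherwise.} \end{cases}$$

• Expectation and variance

$$\mathbb{E}(X) = \frac{a + b}{2}, \qquad \mathbb{V}(X) = \frac{(b - a)^2}{12}.$$

$$\text{Area} = \int_a^b f(x; a, b)\, dx = 1$$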
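The area, expectation and variance above can be checked numerically. The following is a rough sketch under stated assumptions: `integrate` is a hypothetical midpoint-rule helper written for this example, and the interval a = 2, b = 5 is arbitrary.

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """PDF of U(a, b): 1 / (b - a) on [a, b], 0 elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def integrate(g, lo: float, hi: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * h) for i in range(steps)) * h


a, b = 2.0, 5.0  # assumed example interval

area = integrate(lambda x: uniform_pdf(x, a, b), a, b)
mean = integrate(lambda x: x * uniform_pdf(x, a, b), a, b)
var = integrate(lambda x: (x - mean) ** 2 * uniform_pdf(x, a, b), a, b)

if __name__ == "__main__":
    print("area    :", area, "(expected 1)")
    print("mean    :", mean, "(expected (a + b) / 2 =", (a + b) / 2, ")")
    print("variance:", var, "(expected (b - a)^2 / 12 =", (b - a) ** 2 / 12, ")")
```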

You usually receive 5 text messages per hour.

Event counts! Waiting time!

𝑋: the waiting time (in hours) for the first message
𝑌: the number of messages received in the next hour

The PMF of the Poisson distribution is

$$\mathbb{P}(Y = y) = \frac{e^{-\lambda} \lambda^{y}}{y!}, \qquad \text{here with } \lambda = 5.$$

You receive 0 messages in the next hour ($y = 0$) exactly when your waiting time for the first message is more than 1 hour ($X > 1$). The complementary event is that your waiting time for the first message is less than or equal to 1 hour ($X \le 1$).

$$\mathbb{P}(X \le 1) = 1 - \mathbb{P}(X > 1) = 1 - \mathbb{P}(Y = 0) = 1 - \frac{e^{-5}\, 5^{0}}{0!} = 1 - e^{-5}.$$

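General Solution

Over a window of $x$ hours, the number of messages $Y$ follows the Poisson distribution with mean $\lambda x$, so

$$\mathbb{P}(X \le x) = 1 - \mathbb{P}(X > x) = 1 - \mathbb{P}(Y = 0) = 1 - \frac{e^{-\lambda x} (\lambda x)^{0}}{0!} = 1 - e^{-\lambda x}.$$

Since $\mathbb{P}(X \le x) = F(x)$ is the CDF of $X$, the PDF is its derivative:

$$f(x) = \frac{dF(x)}{dx} = \lambda e^{-\lambda x}, \qquad x \ge 0.$$

Then, $X$ follows the Exponential distribution, denoted by $X \sim \mathrm{Exp}(\lambda)$.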
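The derivation can be traced numerically. This is a small sketch, assuming the rate λ = 5 from the example: it evaluates the waiting-time CDF at one hour and checks, by a central finite difference, that the slope of the CDF matches λe^(−λx). The function names, the test point x = 0.3 and the step size are choices made for the illustration.

```python
from math import exp


def waiting_time_cdf(x: float, lam: float) -> float:
    """P(X <= x) = 1 - P(no message in x hours) = 1 - exp(-lam * x)."""
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0


def exponential_pdf(x: float, lam: float) -> float:
    """PDF of Exp(lam): lam * exp(-lam * x) for x >= 0, else 0."""
    return lam * exp(-lam * x) if x >= 0 else 0.0


lam = 5.0  # 5 messages per hour, as in the example

# Probability that the first message arrives within one hour: 1 - e^{-5}.
p_within_hour = waiting_time_cdf(1.0, lam)

# The PDF is the derivative of the CDF; compare with a finite difference.
x, h = 0.3, 1e-6  # assumed test point and step size
slope = (waiting_time_cdf(x + h, lam) - waiting_time_cdf(x - h, lam)) / (2 * h)

if __name__ == "__main__":
    print("P(X <= 1)            :", p_within_hour)
    print("finite-difference F' :", slope)
    print("lam * exp(-lam * x)  :", exponential_pdf(x, lam))
```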

Exponential Distribution

• Notation

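$X \sim \mathrm{Exp}(\lambda)$

• PDF

$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \ge 0, \\ 0, & \text{otherwise.} \end{cases}$$

• Expectation and variance

$$\mathbb{E}(X) = \frac{1}{\lambda}, \qquad \mathbb{V}(X) = \frac{1}{\lambda^{2}}.$$

$$\text{Area} = \int_0^{\infty} f(x; \lambda)\, dx = 1$$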

Memoryless Property

Let 𝑋 be exponentially distributed with parameter 𝜆. Suppose we know 𝑋 > 𝑥1.
What is the probability that 𝑋 is also greater than some value 𝑥1 + 𝑥2?

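$$\mathbb{P}(X > x_1 + x_2 \mid X > x_1) = \frac{\mathbb{P}(X > x_1 + x_2 \text{ and } X > x_1)}{\mathbb{P}(X > x_1)}$$

If $X > x_1 + x_2$, then $X > x_1$. Therefore, using $\mathbb{P}(X > x) = e^{-\lambda x}$,

$$\mathbb{P}(X > x_1 + x_2 \mid X > x_1) = \frac{\mathbb{P}(X > x_1 + x_2)}{\mathbb{P}(X > x_1)} = \frac{e^{-\lambda (x_1 + x_2)}}{e^{-\lambda x_1}} = e^{-\lambda x_2} = \mathbb{P}(X > x_2)$$

The memoryless property means that the future is independent of the past.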
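A quick numerical check of the memoryless property, as a sketch only: the rate λ = 5 and the grid of values for x1 and x2 are arbitrary, and `survival` is a hypothetical helper name for ℙ(X > x) = e^(−λx).

```python
from math import exp, isclose


def survival(x: float, lam: float) -> float:
    """P(X > x) for X ~ Exp(lam)."""
    return exp(-lam * x)


def conditional_survival(x1: float, x2: float, lam: float) -> float:
    """P(X > x1 + x2 | X > x1) = P(X > x1 + x2) / P(X > x1)."""
    return survival(x1 + x2, lam) / survival(x1, lam)


lam = 5.0  # assumed rate

# Each entry: (x1, x2, conditional probability, unconditional P(X > x2)).
checks = [
    (x1, x2, conditional_survival(x1, x2, lam), survival(x2, lam))
    for x1 in (0.1, 1.0, 3.0)
    for x2 in (0.2, 0.5)
]

if __name__ == "__main__":
    for x1, x2, cond, uncond in checks:
        print(f"x1 = {x1}, x2 = {x2}: conditional = {cond:.6f}, P(X > x2) = {uncond:.6f}")
```

However long we have already waited (x1), the conditional probability depends only on the extra time x2.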

Why is the Gambler Wrong?

Suppose you flip a fair coin 8 times and do not observe a single head. On the 9th flip, would you bet on heads or tails?

[Figure: a sequence of nine coin flips, labelled 1st flipping to 9th flipping]
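The gambler's question can be settled by brute force. The sketch below, an illustration rather than the lecture's own method, enumerates all 2^9 equally likely sequences of nine fair flips and conditions on seeing no head in the first eight; the variable names are chosen for this example.

```python
from fractions import Fraction
from itertools import product

# All 2**9 = 512 sequences of nine fair flips are equally likely.
sequences = list(product("HT", repeat=9))

# Condition on the observed streak: no head in the first eight flips.
no_head_in_8 = [s for s in sequences if "H" not in s[:8]]

# Among those sequences, count the ones with a head on the 9th flip.
heads_on_9th = [s for s in no_head_in_8 if s[8] == "H"]

p_head_given_streak = Fraction(len(heads_on_9th), len(no_head_in_8))

if __name__ == "__main__":
    print("sequences consistent with the streak:", len(no_head_in_8))
    print("P(head on 9th | 8 tails) =", p_head_given_streak)
```

The streak leaves the odds of the 9th flip unchanged, so neither bet is favoured.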

[Figure: number of students (vertical axis) against marks (horizontal axis)]

Normal/Gaussian Distribution

• Notation

$X \sim \mathcal{N}(\mu, \sigma^2)$

• PDF

$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right),$$

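where $-\infty < x < \infty$.

• Expectation and variance

$$\mathbb{E}(X) = \mu, \qquad \mathbb{V}(X) = \sigma^2.$$

$$\text{Area} = \int_{-\infty}^{\infty} f(x; \mu, \sigma^2)\, dx = 1$$

Standard Normal Distribution

$$\int_{-1}^{1} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^2}{2} \right) dx = 0.68269$$

$$\int_{-2}^{2} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^2}{2} \right) dx = 0.95450$$

$$\int_{-3}^{3} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^2}{2} \right) dx = 0.9973$$

Bivariate Normal Distribution

$$f(x_1, x_2; \mu_1, \sigma_1^2, \mu_2, \sigma_2^2, \rho) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}} \exp\left( -\frac{1}{2(1 - \rho^2)}\, Q \right),$$

where

$$Q = \left( \frac{x_1 - \mu_1}{\sigma_1} \right)^2 + \left( \frac{x_2 - \mu_2}{\sigma_2} \right)^2 - 2\rho \left( \frac{x_1 - \mu_1}{\sigma_1} \right) \left( \frac{x_2 - \mu_2}{\sigma_2} \right).$$

$$\text{Volume} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x_1, x_2)\, dx_1\, dx_2 = 1$$

[Figure: surface plot of $f(\boldsymbol{x})$ with $\mu_1 = 0$, $\sigma_1^2 = 10$, $\mu_2 = 0$, $\sigma_2^2 = 10$, $\rho = 0.5$]

Multivariate Normal Distribution

• Notation

$$\boldsymbol{X} \sim \mathcal{N}_d(\boldsymbol{\mu}, \boldsymbol{\Sigma}), \qquad \boldsymbol{X} = \begin{pmatrix} X_1 \\ \vdots \\ X_d \end{pmatrix}, \quad \boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_d \end{pmatrix}, \quad \boldsymbol{\Sigma} = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1d} \\ \vdots & \ddots & \vdots \\ \sigma_{d1} & \cdots & \sigma_{dd} \end{pmatrix}$$

• PDF

$$f(\boldsymbol{x}; \boldsymbol{\mu}, \boldsymbol{\Sigma}) = (2\pi)^{-\frac{d}{2}}\, |\boldsymbol{\Sigma}|^{-\frac{1}{2}} \exp\left( -\frac{1}{2} (\boldsymbol{x} - \boldsymbol{\mu})^{T} \boldsymbol{\Sigma}^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \right).$$

• Expectation and variance

$$\mathbb{E}(\boldsymbol{X}) = \boldsymbol{\mu}, \qquad \mathrm{Cov}(\boldsymbol{X}) = \boldsymbol{\Sigma}$$

References

• G. Casella, R. Berger (2002) Statistical Inference. Chapter 3

• K. Murphy (2012) Machine Learning: A Probabilistic Perspective. Chapter 2

Thank You!

bchen@Lincoln.ac.uk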