Basic Probability Definitions
COMPSCI 753: Algorithms for Massive Data
Ninh Pham
University of Auckland
Parts of this material are modifications of the lecture slides of Kevin Wayne (https://www.cs.princeton.edu/~wayne/kleinberg-tardos/), designed for the textbook Algorithm Design by Jon Kleinberg and Eva Tardos.
Auckland, Aug 3, 2020
Randomized algorithms
The output and running time of a randomized algorithm are functions of both the input and the random bits it chooses.
Examples of random bits: a coin flip, i.e. Bernoulli(0.5); a random integer, i.e. randint(n); …
Diagram: the input and the random bits are fed into the algorithm, which produces the output.
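A minimal Python sketch of the two primitives named above (illustrative, not part of the original slides; randint(n) is read here as a uniform integer in {1, …, n}, which is an assumption about the notation):

import random

def bernoulli(p=0.5):
    # Return 1 with probability p and 0 otherwise (one biased coin flip).
    return 1 if random.random() < p else 0

def randint(n):
    # Return a uniformly random integer in {1, ..., n}.
    return random.randrange(1, n + 1)

print(bernoulli(0.5))  # fair coin: 0 or 1
print(randint(6))      # fair die: 1..6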
Outline
Basic probability definitions
Warm-up exercises
Finite probability space
Simple probability statements:
S1: “If a fair coin is flipped, the probability of heads is 1/2.”
S2: “If a fair die is rolled, the probability of rolling a 6 is 1/6.”
Finite probability space: the set Ω of all possible outcomes
S1: Ω = {‘heads’, ‘tails’}
S2: Ω = {1, 2, 3, 4, 5, 6}
Compute the probability of a particular outcome:
Pr [coin = ‘heads’] = 1/2
Pr [die = 6] = 1/6
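As a quick illustration (not from the slides), a uniform finite probability space can be represented as a map from each outcome to 1/|Ω|:

from fractions import Fraction

# Uniform probability space for a fair die: each outcome gets probability 1/|Ω|.
omega = [1, 2, 3, 4, 5, 6]
pr = {outcome: Fraction(1, len(omega)) for outcome in omega}

print(pr[6])             # Pr[die = 6] = 1/6
print(sum(pr.values()))  # the probabilities sum to 1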
Event definition
Event:
A specific event E is defined by a subset of outcomes from the finite probability space.
We are interested in Pr [E], i.e. the probability that the event E occurs.
Complement event Ē: the event that E does not occur; Pr [Ē] = 1 – Pr [E].
Example:
Rolling two fair dice, what is the probability of the event that the two dice show the same value?
Ω = {(1,1), … , (1,6), … , (6,1), … , (6,6)}
Pr [two dice are the same] = Pr [(1,1)] + Pr [(2,2)] + … + Pr [(6,6)] = 6/36 = 1/6
Pr [two dice are different] = 1 – Pr [two dice are the same] = 5/6
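A brute-force check of this example by enumerating Ω (an illustrative sketch, not from the slides):

from fractions import Fraction
from itertools import product

# Ω = all ordered pairs of die values, each outcome with probability 1/36.
omega = list(product(range(1, 7), repeat=2))
same = [(a, b) for a, b in omega if a == b]

pr_same = Fraction(len(same), len(omega))
print(pr_same)      # 1/6
print(1 - pr_same)  # 5/6, the complement event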
Independent events
Two events A and B are independent if their outcomes do not affect each other.
A: The event that flipping the 1st fair coin returns ‘heads’.
B: The event that flipping the 2nd fair coin returns ‘heads’.
Pr [A ∩ B] = Pr [A] · Pr [B].
Pr [1st coin = ‘heads’ ∩ 2nd coin = ‘heads’] = 1/2 · 1/2 = 1/4
If E1, …, En are independent events, then
Pr [E1 ∩ … ∩ En] = Pr [E1] · … · Pr [En]
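A quick Monte Carlo check of the two-coin example (a sketch; the trial count is an arbitrary choice):

import random

trials = 100_000
both_heads = sum(
    1 for _ in range(trials)
    if random.random() < 0.5 and random.random() < 0.5  # two independent fair coins
)
print(both_heads / trials)  # ≈ 1/4 = Pr[A] · Pr[B]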
Monte Carlo algorithm
Problem:
Given an array of n elements A[1, … , n] where n is even. Half of the elements are 0s, the other half are 1s.
Goal:
Find an index containing a 1.
Monte Carlo solution:
1) Repeat 100 times:
   1.1) k = randint(n)
   1.2) If A[k] == 1, return k
2) Return “Failed”
Probability of failure in the 1st try: 1/2
Probability of failure of the whole algorithm: (1/2)^100
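A runnable Python version of this Monte Carlo procedure (a sketch under the slide's assumptions; note it uses 0-based indexing, and the input array below is only an example):

import random

def find_one_monte_carlo(A, tries=100):
    # Return an index k with A[k] == 1, or None after `tries` failed samples.
    for _ in range(tries):
        k = random.randrange(len(A))  # uniform random index
        if A[k] == 1:
            return k
    return None  # happens with probability (1/2)**tries when half the entries are 1

A = [0, 1] * 8  # n = 16, half 0s and half 1s
print(find_one_monte_carlo(A))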
Random variables
Informally, X is a random variable if X is the outcome of a random process.
X represents a value about which we are uncertain.
Examples:
Random process: roll a fair die; define X as the value shown on the die.
X = 1/2/3/4/5/6, each with probability 1/6 (i.e. randint(6)).
Random process: flip a fair coin; if it lands ‘heads’, we return 1; otherwise, we return 0. Define X as the random variable for the outcome of this process.
X = 0/1, each with probability 1/2 (i.e. Bernoulli(0.5)).
Expectation of a random variable
Expectation: Given a discrete random variable X taking values in {0, 1, … , ∞}, its expectation E [X] (i.e. the average value of X) is defined by:
E [X] = Σ_{j ≥ 0} j · Pr [X = j]
Example:
Rolling a fair die, the expectation of its value is:
1 · 1/6 + 2 · 1/6 + 3 · 1/6 + 4 · 1/6 + 5 · 1/6 + 6 · 1/6 = 7/2.
Given a random variable X generated by Bernoulli(0.5):
Pr [X = 1] = 0.5 and Pr [X = 0] = 0.5.
E [X] = 1 · 0.5 + 0 · 0.5 = 0.5.
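The die computation as a short script over the probability table (illustrative only):

from fractions import Fraction

# E[X] = Σ x · Pr[X = x] for a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
expectation = sum(x * p for x, p in pmf.items())
print(expectation)  # 7/2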
Roulette game in casino
Game rule:
If you bet on Reds and the ball lands on red, you double your bet. Otherwise, you lose your bet.
Problem:
P1: Given that you bet $1B on Reds, what is the expectation of your outcome?
P2: Which game should we play?
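One way to set up P1 in code, a sketch that leaves the probability of red as a parameter, since the slide does not specify the wheel; the example call assumes an American wheel with 18 red pockets out of 38:

from fractions import Fraction

def expected_outcome(bet, p_red):
    # Expected net outcome of a bet on Reds: win `bet` with probability p_red, lose it otherwise.
    return bet * p_red + (-bet) * (1 - p_red)

# Assumption: American wheel, 18 red pockets out of 38.
print(expected_outcome(Fraction(10**9), Fraction(18, 38)))  # -1000000000/19, about -$52.6M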
Expectation properties
If X is a 0/1 random variable, then E [X] = Pr [X = 1]:
E [X] = 1 · Pr [X = 1] + 0 · Pr [X = 0] = Pr [X = 1]
Given two random variables X and Y defined over the same finite probability space, we have E [X + Y] = E [X] + E [Y].
Given n random variables X1, … , Xn defined over the same finite probability space, we have (linearity of expectation):
E [X1 + … + Xn] = E [X1] + … + E [Xn]
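A quick empirical check of linearity with a die X and a coin Y (a sketch; the sample size is arbitrary):

import random

trials = 100_000
samples = [(random.randint(1, 6), random.randint(0, 1)) for _ in range(trials)]

mean_x = sum(x for x, _ in samples) / trials        # ≈ 3.5
mean_y = sum(y for _, y in samples) / trials        # ≈ 0.5
mean_sum = sum(x + y for x, y in samples) / trials  # ≈ 4.0 = E[X] + E[Y]
print(mean_x, mean_y, mean_sum)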
Guessing cards
Game: Shuffle n different cards and turn them over one at a time. Try to guess each card without remembering the previous cards (sampling with replacement).
Question: What is the expected number of correct guesses if you guess n times?
Proof:
Define Xi = 1 if we guess correctly at the ith time and 0 otherwise.
We have: E [Xi] = Pr [Xi = 1] = 1/n, since each guess is uniform over the n possible cards.
Define X = X1 + … + Xn as the number of correct guesses.
E [X] = E [X1 + … + Xn] = E [X1] + … + E [Xn] = n · 1/n = 1
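A simulation of the memoryless guessing game (illustrative; n = 52 and the trial count are arbitrary choices):

import random

def correct_guesses_with_replacement(n):
    # Guess uniformly at random each time, ignoring the cards already seen.
    deck = random.sample(range(n), n)  # a shuffled deck of n distinct cards
    return sum(1 for card in deck if random.randrange(n) == card)

n, trials = 52, 10_000
avg = sum(correct_guesses_with_replacement(n) for _ in range(trials)) / trials
print(avg)  # ≈ 1, independent of n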
Outline
Basic probability definitions
Warm-up exercises
Guessing cards
Game: Shuffle n different cards and turn them over one at a time. Try to guess each card, remembering the previous cards (sampling without replacement).
Question: What is the expected number of correct guesses if you guess n times?
Proof:
Define Xi = 1 if we guess correctly at the ith time and 0 otherwise.
We have: E [Xi] = 1/(n – i + 1), since n – i + 1 cards remain unseen at the ith guess.
Define X = X1 + … + Xn as the number of correct guesses.
E [X] = E [X1 + … + Xn] = E [X1] + … + E [Xn]
E [X] = 1/n + 1/(n – 1) + … + 1/1 = H(n) = O(log n).
Harmonic series: H(n) ≈ ln(n) + 0.577
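A simulation of guessing with memory, compared against H(n) (a sketch; n = 52 and the trial count are arbitrary choices):

import random

def correct_guesses_with_memory(n):
    # At step i, guess uniformly among the n - i + 1 cards not yet seen.
    deck = random.sample(range(n), n)
    remaining = set(range(n))
    correct = 0
    for card in deck:
        if random.choice(list(remaining)) == card:
            correct += 1
        remaining.remove(card)
    return correct

n, trials = 52, 10_000
avg = sum(correct_guesses_with_memory(n) for _ in range(trials)) / trials
print(avg, sum(1 / k for k in range(1, n + 1)))  # both ≈ H(52) ≈ 4.54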
Las Vegas algorithm
Problem:
Given an array of n elements A[1, … , n] where n is even. Half of the elements are 0s, the other half are 1s.
Goal:
Find an index containing a 1.
Las Vegas solution:
Repeat:
   1) k = randint(n)
   2) If A[k] == 1, return k
Exercise: show that the expected running time is O(1) (2 iterations in expectation).
Define X as the number of iterations.
E [X] = 1 · (1/2) + 2 · (1/2)^2 + … + n · (1/2)^n → 2 as n → ∞
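The Las Vegas variant in runnable form: unlike the Monte Carlo version it never reports failure, but its running time is random (a sketch; 0-based indexing and an example input):

import random

def find_one_las_vegas(A):
    # Always returns an index k with A[k] == 1; only the number of iterations is random.
    while True:
        k = random.randrange(len(A))
        if A[k] == 1:
            return k  # expected number of iterations is 2 when half the entries are 1

A = [0, 1] * 8  # n = 16, half 0s and half 1s
print(find_one_las_vegas(A))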
Waiting for a first success
Game: Flip a biased coin that comes up heads with probability p and tails with probability 1 – p.
Question: What is the expected number of independent flips X until the first heads?
Proof:
E [X] = Σ_{j ≥ 1} j · Pr [X = j]
      = Σ_{j ≥ 1} j (1 – p)^(j–1) p      (j – 1 tails, then 1 head)
      = (p / (1 – p)) · Σ_{j ≥ 1} j (1 – p)^j
      = (p / (1 – p)) · (1 – p) / p^2
      = 1/p
Hard trick: the Taylor series of f(x) = 1/x^2 around a = 1 gives Σ_{j ≥ 1} j (1 – x)^(j–1) = 1/x^2; taking x = p yields Σ_{j ≥ 1} j (1 – p)^j = (1 – p)/p^2.
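A simulation of this waiting time, checking the sample mean against 1/p (a sketch; p = 0.2 and the trial count are arbitrary choices):

import random

def flips_until_first_heads(p):
    # Flip a biased coin (heads with probability p) until the first heads appears.
    flips = 1
    while random.random() >= p:  # tails: keep flipping
        flips += 1
    return flips

p, trials = 0.2, 100_000
avg = sum(flips_until_first_heads(p) for _ in range(trials)) / trials
print(avg, 1 / p)  # both ≈ 5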
Coupon collector
Game: Each box of cereal contains a coupon. There are n different types of coupons. Assuming all boxes are equally likely to contain each coupon, how many boxes before you have ≥ 1 coupon of each type?
Proof:
Phase j = time between j and j + 1 distinct coupons.
Let Xj = number of boxes you open in phase j.
Let X = X0 + X1 + … + Xn–1 = the total number of boxes.
In phase j, the probability of success (drawing a new coupon type) is (n – j)/n, so the expected waiting time is n/(n – j).
E [X] = Σ_{j=0}^{n–1} E [Xj] = Σ_{j=0}^{n–1} n/(n – j) = n · Σ_{i=1}^{n} 1/i = n · H(n)
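A coupon collector simulation, compared against n · H(n) (a sketch; n = 50 and the trial count are arbitrary choices):

import random

def boxes_until_all_coupons(n):
    # Open boxes with uniformly random coupon types until all n types have appeared.
    seen, boxes = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        boxes += 1
    return boxes

n, trials = 50, 2_000
avg = sum(boxes_until_all_coupons(n) for _ in range(trials)) / trials
print(avg, n * sum(1 / i for i in range(1, n + 1)))  # both ≈ n·H(n) ≈ 225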