Algorithms & Data Structures (Winter 2022) Extras – Randomization and Probabilistic Analysis
• Amortized Analysis.
• Randomized Algorithms.
• Probabilistic Analysis.
• Review Final Exam.
Deterministic VS Randomized
• Deterministic Algorithm : Identical behavior for different runs for a given input.
• Randomized Algorithm : Behavior is generally different for different runs for a given input.
[Diagram contrasting deterministic and randomized algorithms: worst-case analysis → worst-case running time; probabilistic analysis → average running time.]
2 different views
To consider the world as behaving randomly:
One can consider traditional algorithms that confront randomly generated input.
• Average-case analysis => to study the behavior of an algorithm on an “average” input (subject to some underlying random process), rather than a worst-case input.
To consider algorithms that behave randomly:
Same worst-case input as always, but we allow our algorithm to make random decisions as it processes the input.
• The role of randomization in this approach is purely internal to the algorithm and does not require new assumptions about the nature of the input.
2 different views
When we say quicksort is O(n log n) in the “average case”, we could mean two different things.
• Randomized input: a quicksort algorithm that chooses pivots in some deterministic way has average performance O(n log n), where we average over all possible inputs, each equally likely to occur.
• Randomized algorithm: for any given input, a quicksort algorithm that chooses pivots randomly runs in expected time O(n log n).
Probabilistic Analysis
• Definition: Sample Space.
• A set of possible outcomes of some “experiment” (loose definition)
• Flip a coin: {H, T}
• Flip a coin 3 times: {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
• Roll a die once: {1, 2, 3, 4, 5, 6}
• Roll a die twice: {(1,1), (1,2), …, (5,6), (6,6)}
• Take Comp 251: {A, A-, B+, B, …, C, D, F}
• When is your birthday? {1, 2, 3, …, 365}
• Input instance of a sorting algorithm of size 3: {(1,2,3), (1,3,2), …, (3,2,1)}
Probabilistic Analysis
• Definition: Event.
• A subset of a sample space.
• You flip a coin 3 times and get exactly 2 heads: {THH, HTH, HHT}
• You roll a die twice and the sum is less than 5: {(1,1), (1,2), (2,1), (1,3), (2,2), (3,1)}
• You pass Comp 251: {A, A-, B+, B, …, C, D}
• Your birthday is on or after February 5th: {36, 37, …, 365}
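To make these definitions concrete, here is a small Python sketch (my own illustration, not part of the slides) that enumerates the sample space of two die rolls and the event "the sum is less than 5":

from itertools import product

# Sample space for rolling a die twice: all 36 ordered pairs of faces.
S = list(product(range(1, 7), repeat=2))

# Event: the sum of the two rolls is less than 5 (a subset of S).
A = [outcome for outcome in S if sum(outcome) < 5]

print(len(S))      # 36
print(sorted(A))   # the same six outcomes listed above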
Probabilistic Analysis
• Definition: Probability distribution on a sample space S.
• A mapping from events of S to real numbers satisfying the following probability axioms:
  • Pr{A} ≥ 0 for any event A
  • Pr{S} = 1
  • Pr{A ∪ B} = Pr{A} + Pr{B} for any two mutually exclusive events A and B
• It gives the probabilities of occurrence of the different events of an experiment.
Probabilistic Analysis
• Definition: Discrete Probability Distribution.
• A probability distribution is discrete if it is defined over a finite or countably infinite sample space.
• If S is finite and every elementary event s ∈ S has probability Pr{s} = 1/|S|, then we have the uniform probability distribution on S.
• In such a case the experiment is often described as “picking an element of S at random”.
• As an example, consider flipping a fair coin: the sample space is {H, T} and each outcome has probability 1/2.
Probabilistic Analysis
• Definition: Random Variable.
• Is a function from the underlying sample space to the real numbers.
• It associates a real number with each possible outcome of an experiment, which allows us to work with the probability distribution induced on the resulting set of numbers.
• Although random variables are formally not variables at all, we typically
describe and manipulate them as if they were variables.
Probabilistic Analysis
• Definition: Random Variable.
• An ordinary function f is a mapping (e.g., from R to R); f itself is not a variable.
• However, when we write y = f(x), we call y a “variable”. In the case of a random variable, the x values (and hence the y values) occur with some probability.
Probabilistic Analysis
• Definition: Random Variable.
• You flip a coin 3 times and get X = x0 heads, e.g., x0 = 2: {THH, HTH, HHT}
• You roll a die twice and the sum is X = x0, e.g., x0 = 5: {(1,4), (2,3), (3,2), (4,1)}
Probabilistic Analysis
• Definition: Distribution on the random variable X.
• Think of it as probabilities on the values of the random variable X.
Probabilistic Analysis
• Definition: Distribution on the random variable X.
• You flip a coin 3 times. What is the distribution of the number of heads?
Probabilistic Analysis
• Definition: Distribution on the random variable X.
• For 3 flips of a fair coin, X = number of heads takes the values 0, 1, 2, 3 with probabilities 1/8, 3/8, 3/8, 1/8.
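A short Python sketch (mine, assuming a fair coin) that tabulates this distribution by enumerating all 2^3 = 8 equally likely outcomes:

from itertools import product
from collections import Counter
from fractions import Fraction

# All 8 equally likely outcomes of three fair coin flips.
outcomes = list(product("HT", repeat=3))

# Random variable X = number of heads; count how often each value occurs.
counts = Counter(flips.count("H") for flips in outcomes)

# Distribution of X: Pr{X = x} for each value x.
for x, c in sorted(counts.items()):
    print(x, Fraction(c, len(outcomes)))   # 0 1/8, 1 3/8, 2 3/8, 3 1/8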
Probabilistic Analysis
• Definition: Expected value of a random variable: E[X] = Σx x·Pr{X=x}
• The “average” of the values taken by the random variable.
• What is the expected value of a roll of a die?
  • Σx x·Pr{X=x} = 1·(1/6) + 2·(1/6) + 3·(1/6) + 4·(1/6) + 5·(1/6) + 6·(1/6) = 3.5
  • E[X] does not have to be one of the values taken by X.
• What is the expected value of the sum of two rolls of a die?
  • Σx x·Pr{X=x} = 2·(1/36) + 3·(2/36) + 4·(3/36) + … + 12·(1/36) = 7
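These two expectations can be checked by direct enumeration; the following Python sketch (my own, using exact fractions) does exactly that:

from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Expected value of a single die roll: the average of the 6 equally likely faces.
E_one = Fraction(sum(faces), 6)
print(E_one)   # 7/2, i.e. 3.5

# Expected value of the sum of two rolls: the average over all 36 ordered pairs.
pairs = list(product(faces, repeat=2))
E_two = Fraction(sum(a + b for a, b in pairs), len(pairs))
print(E_two)   # 7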
Probabilistic Analysis
• Definition: Linearity of Expectation. E[X+Y] = E[X]+E[Y], for all X, Y
• The expected value of the sum is the sum of the expected values.
• Suppose we roll four dice. What is the expected value of the sum?
• The sample space S has 6^4 = 1296 outcomes, each with probability 1/1296.
• E[X1+X2+X3+X4] = E[X1]+E[X2]+E[X3]+E[X4]
  = 3.5 + 3.5 + 3.5 + 3.5 = 14
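The same enumeration idea confirms the four-dice example; a minimal sketch of mine (6^4 = 1296 outcomes, so brute force is cheap):

from fractions import Fraction
from itertools import product

# Enumerate all 6^4 = 1296 equally likely outcomes of four die rolls.
rolls = list(product(range(1, 7), repeat=4))

# E[X1 + X2 + X3 + X4] computed directly over the whole sample space.
E_sum = Fraction(sum(sum(r) for r in rolls), len(rolls))
print(E_sum)   # 14, matching 4 * 3.5 from linearity of expectation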
Probabilistic Analysis
• Definition: Indicator Random Variables.
• Indicator Random Variable for an event A of a sample space is
defined as:
• A simple yet powerful technique for computing the expected value of a random variable.
• Convenient method for converting between probabilities and expectations.
• Helpful in situations in which there may be dependence.
• Takes only 2 values, 1 and 0.
I{A} = 1 if A occurs,
       0 if A does not occur.
Probabilistic Analysis
• Definition: Indicator Random Variables.
Given a sample space S and an event A in the sample space S, let XA= I{A}. Then E[XA] = Pr{A}.
• the expected value of an indicator random variable associated with an event A is equal to the probability that A occurs.
Proof: Let Ā = S - A (the complement of A). Then
E[XA] = E[I{A}] = 1·Pr{A} + 0·Pr{Ā} = Pr{A}.
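A tiny Python sketch (mine) that illustrates the lemma on a concrete event, A = "the sum of two die rolls is less than 5":

from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two die rolls.
S = list(product(range(1, 7), repeat=2))

def indicator_A(outcome):
    # I{A}: 1 if the sum of the two rolls is less than 5, else 0.
    return 1 if sum(outcome) < 5 else 0

# E[XA] over the uniform distribution on S.
E_XA = Fraction(sum(indicator_A(s) for s in S), len(S))
print(E_XA)   # 6/36 = 1/6, which is exactly Pr{A}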
Probabilistic Analysis
• Definition: Indicator Random Variables.
Problem: Determine the expected number of heads in n coin flips.
Method 1: Without indicator random variables.
Let X be the random variable for the number of heads in n flips. Then E[X] = Σk=0..n k·Pr{X=k}. We can solve this with a lot of math.
Probabilistic Analysis
• Definition: Indicator Random Variables.
Problem: Determine the expected number of heads in n coin flips.
• Method 2 : Use Indicator Random Variables
• Define n indicator random variables Xi, 1 ≤ i ≤ n.
• Let Xi be the indicator random variable for the event that the i-th flip results in a head.
• Xi = I{the i-th flip results in H}
• Then X = X1 + X2 + … + Xn = Σi=1..n Xi.
• By Lemma 5.1 (E[XA] = Pr{A}), E[Xi] = Pr{H} = 1/2, 1 ≤ i ≤ n.
• Expected number of heads: E[X] = E[Σi=1..n Xi].
• By linearity of expectation, E[Σi=1..n Xi] = Σi=1..n E[Xi].
• E[X] = Σi=1..n E[Xi] = Σi=1..n 1/2 = n/2.
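Both methods can be checked numerically; in this short sketch of mine (assuming a fair coin, here with n = 10), Method 1 evaluates the binomial sum directly and Method 2 is simply n/2:

from fractions import Fraction
from math import comb

n = 10

# Method 1: E[X] = sum_k k * Pr{X = k}, with Pr{X = k} = C(n, k) / 2^n.
method1 = sum(Fraction(k * comb(n, k), 2**n) for k in range(n + 1))

# Method 2: n indicator variables, each with expectation 1/2.
method2 = Fraction(n, 2)

print(method1, method2)   # both print 5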
2 different views
To consider the world as behaving randomly:
One can consider traditional algorithms that confront randomly generated input.
• Average-case analysis => to study the behavior of an algorithm on an “average” input (subject to some underlying random process), rather than a worst-case input.
To consider algorithms that behave randomly:
Same worst-case input as always, but we allow our algorithm to make random decisions as it processes the input.
• The role of randomization in this approach is purely internal to the algorithm and does not require new assumptions about the nature of the input.
QuickSort: Review
Quicksort(A, p, r)
  if p < r then
    q := Partition(A, p, r);
    Quicksort(A, p, q - 1);
    Quicksort(A, q + 1, r)

Partition(A, p, r)
  x, i := A[r], p - 1;
  for j := p to r - 1 do
    if A[j] ≤ x then
      i := i + 1;
      A[i] ↔ A[j]
    fi
  A[i + 1] ↔ A[r];
  return i + 1
After partitioning: every element of A[p..q - 1] is ≤ A[q], and A[q] is ≤ every element of A[q+1..r].
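For reference, here is a direct Python translation of the pseudocode above, as a hedged sketch (in place, last element as pivot, 0-based indices instead of the slides' 1-based ones):

def partition(A, p, r):
    # Partition A[p..r] around the pivot x = A[r]; return the pivot's final index.
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort(A, p, r):
    # Sort A[p..r] in place.
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)

data = [5, 2, 9, 1, 7, 3]
quicksort(data, 0, len(data) - 1)
print(data)   # [1, 2, 3, 5, 7, 9]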
QuickSort: Worst-case Partition
Recursion tree for worst-case partition
Split off a single element at each level:
T(n) = T(n - 1) + T(0) + PartitionTime(n)
     = T(n - 1) + Θ(n)
     = Σk=1..n Θ(k)
     = Θ(Σk=1..n k)
     = Θ(n²)
The Θ(n²) running time occurs when the input array is already completely sorted, a common situation in which insertion sort runs in O(n) time.
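To see this concretely, the following self-contained experiment (mine, not from the slides) counts the key comparisons made by last-element-pivot quicksort on an already sorted input versus a shuffled one; for n = 200 the sorted input costs exactly n(n - 1)/2 = 19900 comparisons, while a shuffled input typically costs on the order of a couple of thousand:

import random

def quicksort_count(A, p, r):
    # Return the number of "A[j] <= pivot" comparisons made while sorting A[p..r].
    if p >= r:
        return 0
    x, i, comparisons = A[r], p - 1, 0
    for j in range(p, r):
        comparisons += 1
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    q = i + 1
    return comparisons + quicksort_count(A, p, q - 1) + quicksort_count(A, q + 1, r)

n = 200
already_sorted = list(range(n))
shuffled = list(range(n))
random.shuffle(shuffled)

print(quicksort_count(already_sorted, 0, n - 1))   # 19900: the Theta(n^2) behaviour
print(quicksort_count(shuffled, 0, n - 1))         # typically ~2000: closer to n lg n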
QuickSort: Best-case Partition
[Recursion tree for best-case partition: two subproblems of cost cn/2, then four of cost cn/4, …; each level sums to about cn.]
• Each subproblem has size ≤ n/2.
• Recurrence for the running time:
  T(n) ≤ 2T(n/2) + PartitionTime(n) = 2T(n/2) + Θ(n)
• T(n) = Θ(n lg n)
QuickSort: Unbalanced Partition Analysis
What happens if we get poorly balanced partitions, e.g., something like T(n) ≤ T(9n/10) + T(n/10) + Θ(n)?
We still get Θ(n lg n)! (As long as the split has constant proportionality.)
Intuition: n can be divided by a constant c > 1 only Θ(lg n) times before reaching 1.
[Recursion tree: subproblem sizes n, n/c, n/c², …, 1 = n / c^(log_c n); roughly log_c n levels, and the cost of each level is ≤ n, i.e., O(n) per level.]
(Remember: Different base logs are related by a constant.)
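A quick numeric sanity check of this claim (my own sketch): solve T(n) = T(⌊9n/10⌋) + T(⌊n/10⌋) + n with T(n) = 0 for n ≤ 1, and compare it to n·lg n. The ratio settles down to a constant, as Θ(n lg n) predicts:

from functools import lru_cache
from math import log2

@lru_cache(maxsize=None)
def T(n):
    # Cost of the 9-to-1 unbalanced-split recurrence with a linear partition term.
    if n <= 1:
        return 0
    return T(9 * n // 10) + T(n // 10) + n

for n in (10**3, 10**4, 10**5, 10**6):
    # The ratio stays roughly constant as n grows, consistent with Theta(n lg n).
    print(n, round(T(n) / (n * log2(n)), 2))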
QuickSort: Average-case Analysis
• The behavior of quicksort is determined by the relative order of the values in the input array, not by the particular values themselves.
• On a random input array, the partitioning is unlikely to happen in the same way at every level.
• Split ratio is different for different levels.
(Contrary to our assumption in the previous slide.)
• Partition produces a mix of “good” and “bad” splits, distributed randomly in the recursion tree.
• What is the running time likely to be in such a case?
QuickSort: Average-case Analysis
Case 1 – bad split followed by a good split:
  The bad split produces subarrays of sizes 0 and n - 1 at cost Θ(n); the good split of the size-(n - 1) subarray then produces subarrays of sizes (n - 1)/2 - 1 and (n - 1)/2 at cost Θ(n - 1).
  Total cost of partitioning: Θ(n) + Θ(n - 1) = Θ(n).
Case 2 – good split at the first level:
  Produces two subarrays of size (n - 1)/2.
  Cost of partitioning: Θ(n).
The situation at the end of case 1 is no worse than at the end of case 2. When splits alternate between good and bad, the cost of a bad split can be absorbed into the cost of the neighbouring good split.
Thus the running time is O(n lg n), though with a larger hidden constant.
QuickSort: Randomized
• We want to make the running time independent of the input ordering.
• How can we do that?
  • Make the algorithm randomized.
  • Make every possible input equally likely.
    • We could randomly shuffle (permute) the entire input array.
    • For quicksort, it is sufficient to ensure that every element is equally likely to be chosen as the pivot.
    • So we choose an element of A[p..r] uniformly at random and exchange it with A[r] before partitioning.
• Because the pivot is chosen randomly, we expect the partitioning to be well balanced on average.
QuickSort: Randomized
Want to make running time independent of input ordering.
Randomized-Partition(A, p, r)
  i := Random(p, r);
  A[r] ↔ A[i];
  return Partition(A, p, r)

Randomized-Quicksort(A, p, r)
  if p < r then
    q := Randomized-Partition(A, p, r);
    Randomized-Quicksort(A, p, q - 1);
    Randomized-Quicksort(A, q + 1, r)
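A Python sketch of the randomized variant (mine, following the same 0-based conventions as the earlier quicksort sketch, with random.randint playing the role of Random(p, r)):

import random

def randomized_partition(A, p, r):
    # Pick a pivot index uniformly at random from p..r, move it to A[r], then partition.
    k = random.randint(p, r)
    A[r], A[k] = A[k], A[r]
    x, i = A[r], p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def randomized_quicksort(A, p, r):
    # Sort A[p..r] in place; expected O(n lg n) time for any input ordering.
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)

data = list(range(1000))                      # already sorted: worst case for a fixed pivot
randomized_quicksort(data, 0, len(data) - 1)
print(data == sorted(data))                   # True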
Randomized QuickSort: Average-case Analysis
Random Variable X = # comparisons over all calls to Partition.
Q: Why is it a good measure? (The running time of quicksort is dominated by the time spent in Partition, and each call to Partition does Θ(1) work per comparison of an element with the pivot.)
• Let z1, z2, …, zn denote the list items, in sorted order.
• Let the set Zij = {zi, zi+1, …, zj}.
• Let the indicator random variable Xij = I{zi is compared to zj}, i.e., Xij = 1 if zi is compared to zj and 0 otherwise.
• Thus X = Σi=1..n-1 Σj=i+1..n Xij.
Randomized QuickSort: Average-case Analysis
E[X] = E[ Σi=1..n-1 Σj=i+1..n Xij ]
     = Σi=1..n-1 Σj=i+1..n E[Xij]          (linearity of expectation)
     = Σi=1..n-1 Σj=i+1..n Pr[zi is compared to zj]
since E[Xij] = 0·Pr[Xij=0] + 1·Pr[Xij=1] = Pr[Xij=1].
So, all we need to do is compute Pr[zi is compared to zj].
Randomized QuickSort: Average-case Analysis
Claim: zi and zj are compared iff the first element chosen as a pivot from Zij is either zi or zj.
Since the pivot is chosen uniformly at random, each of the j - i + 1 elements of Zij is equally likely to be the first pivot chosen from Zij. Therefore,
Pr[zi is compared to zj] = Pr[zi or zj is the first pivot from Zij]
                         = Pr[zi is the first pivot from Zij] + Pr[zj is the first pivot from Zij]
                         = 1/(j - i + 1) + 1/(j - i + 1)
                         = 2/(j - i + 1).
Randomized QuickSort: Average-case Analysis
Therefore,
E[X] = Σi=1..n-1 Σj=i+1..n 2/(j - i + 1)
     = Σi=1..n-1 Σk=1..n-i 2/(k + 1)          (substituting k = j - i)
     < Σi=1..n-1 Σk=1..n 2/k
     = Σi=1..n-1 O(lg n)
     = O(n lg n).
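As a numeric check of this bound (my own sketch): grouping the pairs by the gap d = j - i, the exact sum equals Σd=1..n-1 (n - d)·2/(d + 1), and it indeed grows like 2n ln n:

from math import log

def expected_comparisons(n):
    # Exact value of the sum over i < j of 2/(j - i + 1), grouped by the gap d = j - i.
    return sum((n - d) * 2.0 / (d + 1) for d in range(1, n))

for n in (10**2, 10**3, 10**4):
    # The ratio slowly approaches 1 from below, consistent with E[X] ~ 2 n ln n.
    print(n, round(expected_comparisons(n) / (2 * n * log(n)), 3))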