CS计算机代考程序代写 algorithm Probability theory and average-case complexity

Probability theory and average-case complexity

Review of probability theory

Review of probability theory:
outcome
• Examples:
– Rolling a die and getting 1
– Rolling a die and getting 6
– Flipping three coins and getting H, H, T
– Drawing two cards and getting 7 of hearts, 9 of clubs
• NOTexamples:
– Rolling a 6-sided die and getting an even number
(this is more than one outcome—3 to be exact!) – Drawing a card and getting an ace (4 outcomes!)

Review of probability theory:
event
• Defn: one or more possible outcomes
• Examples:
– Rolling a die and getting 1
– Rolling a die and getting an even number
– Flipping three coins and getting at least 2 “heads” – Drawing five cards and getting one of each suit

Review of probability theory:
• Defn: A set of events
• Examples:
Flipping 3 coins
Drawing a card
sample space
Rolling 2 dice

Review of probability theory:
probability distribution
• Idea:takeasamplespaceSandaddprobabilities for each event
• Defn: mapping from events of S to real numbers –Pr 􏰅 ≥0foranyeventAinS
–∑􏰆∈􏰇Pr􏰅 =1
• Example:
0.75*0.75
0.753
Flipping 3 biased coins 75% heads, 25% tails
0.75*0.25 0.25*0.75
0.75*0.252
0.75
0.752*0.25 0.752*0.25
0.25 0.25*0.25
0.752*0.25
0.75*0.252 0.75*0.252
0.253

Review of probability theory:
probability of an event A
• Defn: Pr 􏰅 = ∑􏰎􏰏􏰐􏰑􏰎􏰒􏰓∈􏰆 Pr 􏰈􏰉􏰊􏰋􏰈􏰌􏰍 • Example:
– Pr(roll a die and get even number) = Pr(roll a 2) + Pr(roll a 4) + Pr(roll a 6)

Review of probability theory:
random variable
• Idea:turneventsintonumbers
• Let S be a sample space
• Defn: mapping from events to real numbers
• Example:
X = the number on a die after a roll
event “rolling a 1” -> 1 event “rolling a 2” -> 2 …
event “rolling a 6” -> 6
Technically, X is a function, so we can write: X(rolling a 1) = 1 X(rolling a 2) = 2

X(rolling a 6) = 6

Review of probability theory:
expected value of a random variable • Idea:“average”valueoftherandomvariable
• Remember:randomvariableXisamappingfrom events in a sample space S to numbers
• Defn:ExpectedvalueofX=E X = 􏰔 􏰕 􏰍􏰖􏰍􏰗􏰊 Pr 􏰕=􏰕 􏰍􏰖􏰍􏰗􏰊
􏰓􏰘􏰓􏰙􏰐∈􏰇
• Short form: E X = ∑􏰛 􏰚 Pr 􏰕 = 􏰚
Here, x = X(event)
• Example: X = number on die after rolling
􏰜􏰕 = 􏰔 􏰚Pr􏰕=􏰚 =11+21+⋯+61=7 􏰝􏰞􏰛􏰞􏰟 66 62

Expected running time of an algorithm

Expected running time of an algorithm
• Let A be an algorithm.
• Let Sn be the sample space of all inputs of size n.
• To talk about expected (/average) running time, we must specify how we measure running time.
– We want to turn each input into a number (runtime). – Random variables do that…
• We must also specify how likely each input is.
– We do this by specifying a probability distribution over Sn.

Expected running time of an algorithm • Recall: algorithm A, sample space Sn
• We define a random variable
tn(I) = number of steps taken by A on input I
• We then obtain:
􏰜􏰊􏰙 =􏰔􏰊􏰙􏰠Pr􏰠 􏰡∈􏰇􏰢
• In this equation, I is an input in Sn, and
Pr(I) is the probability of input I according to the probability distribution we defined over Sn
• 􏰣 􏰤􏰥 is the average running time of A, given Sn

Example time!

Example time: searching an array
• Let L be an array containing 8 distinct keys Search(k, L[1..8]):
for i = 1..8
if L[i].key == k then return true
return false
• What should our sample space S9 of inputs be? • Hard to reason about all possible inputs.
– (In fact, there are uncountably infinitely many!)
• Cangroupinputsbyhowmanystepstheytake!

Grouping inputs by how long they take
Search(k, L[1..8]):
for i = 1..8
if L[i].key == k then return true
return false
• What causes us to return in loop iteration 1? • How about iteration 2? 3? … 8? After loop? • S9 = { L[1]=k, L[2]=k, …, L[8]=k, k not in L }
• NowweneedarandomvariableforS9!

Using a random variable to capture running time
Search(k, L[1..8]):
For simplicity: assume each iteration takes 2 steps. Do we have enough information to
for i = 1..8
if L[i].key == k then return true
compute an answer? return false 1 step
• S9 = { L[1]=k, L[2]=k, …, L[8]=k, k not in L }
• Let T(e) = running time for event e in S9
• T(L[1]=k) = 2, T(L[2]=k) = 4, …, T(L[i]=k) = 2i • T(knotinL)=2*8+1=17 •Wethenobtain:􏰣􏰦 =∑􏰧∈􏰪􏰫􏰦􏰧􏰨􏰩(􏰧)

What about a probability distribution?
• We have a sample space and a random variable. • Now,weneedaprobabilitydistribution.
• Thisisgiventousintheproblemstatement.
–Foreachi,Pr􏰬􏰭 =􏰮 =􏰝 –Pr 􏰮􏰗􏰈􏰊􏰭􏰗􏰯􏰭􏰰􏰊 =􏰝 􏰝􏰟
􏰱
• If you don’t get a probability distribution from the problem statement, you have to figure out how likely each input is, and come up with your own.

Computing the average running time
•Wenowknow:􏰣􏰦 =∑􏰧∈􏰪􏰫􏰦􏰧􏰨􏰩(􏰧)
• T(e) = running time for event e in S9
• T(L[i]=k)=2i T(knotinL)=17
• S9 = { L[1]=k, L[2]=k, …, L[8]=k, k not in L} • Probability distribution:
– For each i, Pr 􏰬 􏰭 = 􏰮 = 􏰝 –Pr 􏰮􏰗􏰈􏰊􏰭􏰗􏰯􏰭􏰰􏰊 =􏰝 􏰝􏰟
􏰱
•Therefore:􏰣􏰦 =􏰲􏰬1 =􏰮􏰳􏰴􏰬1 =􏰮 +
⋯+􏰲􏰬8 =􏰮􏰳􏰴􏰬8 =􏰮 + 􏰲 􏰮􏰗􏰈􏰊􏰭􏰗􏰬 􏰳􏰴(􏰮􏰗􏰈􏰊􏰭􏰗􏰬)

The final answer
• Recall: T(L[i]=k) = 2i T(k not in L) = 17
– For each i, Pr 􏰬 􏰭 = 􏰮 = 􏰝 –Pr 􏰮􏰗􏰈􏰊􏰭􏰗􏰯􏰭􏰰􏰊 =􏰝 􏰝􏰟
􏰱
• 􏰣 􏰦 = 􏰲 􏰬 1 = 􏰮 􏰳􏰴 􏰬 1 = 􏰮 + ⋯ +
􏰲 􏰬 8 = 􏰮 􏰳􏰴 􏰬 8 = 􏰮 + 􏰱 􏰵
􏰲 􏰮􏰗􏰈􏰊􏰭􏰗􏰬 􏰳􏰴 􏰮􏰗􏰈􏰊􏰭􏰗􏰬 =􏰝􏰟+􏰝􏰟+
⋯ + 􏰝􏰟 + 􏰝􏰶 = 13 􏰝􏰟 􏰱
• Thus,theaveragerunningtimeis13.

Slightly harder problem: L[1..n]
Search(k, L[1..n]):
for i = 1..n
if L[i].key == k then return true
return false
• Problem: what is the average running time of Search, given the following probabilities?
– Pr 􏰬 􏰭 = 􏰮 = 􏰝 􏰱􏰙
–Pr􏰮􏰗􏰈􏰊􏰭􏰗􏰬 =􏰝􏰱

Computing E[T]: part 1
Search(k, L[1..n]):
for i = 1..n
if L[i].key == k then return true
return false
• What is our sample space?
– Sn+1 = { L[1]=k, L[2]=k, …, L[n]=k, k not in L }
• What is our random variable?
– Let T(e) = running time for event e in Sn+1
• What is the running time of each event? – T(L[i]=k) = 2i, T(k not in L)=2n+1

Computing E[T]: part 2
• Whatweknow:
– Sn+1 = { L[1]=k, L[2]=k, …, L[n]=k, k not in L } – T(L[i]=k) = 2i, T(k not in L)=2n+1
–Pr􏰮􏰗􏰈􏰊􏰭􏰗􏰬=􏰝 andPr􏰬􏰭=􏰮=􏰝 􏰱 􏰱􏰙
• Nowwecancompute􏰣􏰦 =∑􏰧∈􏰪􏰥􏰷􏰸􏰦􏰧􏰨􏰩􏰧.
–􏰣􏰦=
T L 1 = k Pr L 1 = k +
T L 2 = k Pr L 2 = k + ⋯ + T L n = k Pr L n = k +
T 􏰮􏰗􏰈􏰊􏰭􏰗􏰬 Pr(􏰮􏰗􏰈􏰊􏰭􏰗􏰬)
–􏰣􏰦 =2􏰝 +4􏰝 +⋯+2n􏰝 + 2n+1 􏰝 􏰱􏰙 􏰱􏰹 􏰱􏰹 􏰱

Computing E[T]: part 3 –􏰣􏰦 =2􏰝 +4􏰝 +⋯+2n􏰝 + 2n+1 􏰝
􏰝􏰱􏰙􏰱􏰹 􏰱􏰹􏰝􏰱
–􏰣􏰦 =􏰙 1+2+⋯+􏰗 + 2􏰗+1 􏰱
–􏰣􏰦 =􏰝􏰙(􏰙􏰺􏰝)+􏰱􏰙􏰺􏰝=􏰙􏰺􏰝+􏰱􏰙􏰺􏰝=􏰻􏰙+1 􏰙􏰱􏰱􏰱􏰱􏰱
– Thus, Search(k, L[1..n]) has expected (or average) running time 3n/2+1 for the given probabilities.