
COMP30024 Artificial Intelligence
Week 10 Problem Sheet Solutions
• Tutors: These are not unique, complete (and definitely not optimally pedagogical) solutions. There are other ways of deriving these results, and the way you prefer is to some extent an aesthetic judgement.
• Students: If you want to attain any benefit from the problems you should make a genuine attempt at them before you see the solutions.


Comments/corrections are as always greatly appreciated by the teaching team.
1. Under oath [T] You are a witness to a night-time hit-and-run accident involving a taxi in Athens. All taxis in Athens are blue or green. You swear under oath that the taxi was blue. Extensive testing shows that under the dim lighting conditions, discrimination between blue and green is 75% reliable.
(a) Is it possible to calculate the most likely colour for the taxi? (Hint: distinguish carefully between the proposition that the taxi is blue and the proposition that it appears blue.)
• Probability that the actual colour is blue, given you observed blue:
p(Actual = Blue | Observed = Blue) = p(Observed = Blue | Actual = Blue) p(Actual = Blue) / p(Observed = Blue)
• Probability that the actual colour is green, given you observed blue:
p(Actual = Green | Observed = Blue) = p(Observed = Blue | Actual = Green) p(Actual = Green) / p(Observed = Blue)
• We want the ratio of the two posteriors, the posterior odds (the common evidence term cancels):
O = p(Observed = Blue | Actual = Blue) p(Actual = Blue) / [p(Observed = Blue | Actual = Green) p(Actual = Green)]
• Using the fact that p(Observed = Blue | Actual = Blue) = 0.75 and p(Observed = Blue | Actual = Green) = 0.25:
O = 3 p(Actual = Blue) / p(Actual = Green)
Without the priors p(Actual = Blue) and p(Actual = Green), the posterior odds cannot be evaluated, so the most likely colour cannot be determined from the observation alone.

(b) What if you are given the extra information that 9 out of 10 Athenian taxis are green?
If we know the prior probabilities, p(Actual = Green) = 9 p(Actual = Blue), then we can incorporate this into the posterior odds: O = 3 × (1/9) = 1/3. So although you swear that the taxi was blue, being struck by a green taxi is still 3 times more likely.
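As a numerical check, the posterior odds can be evaluated directly. A quick Python sketch, using only the figures quoted in the problem:

```python
# Posterior odds for the taxi's colour, using the 75% reliability figure
# and, for part (b), the 9-to-1 green-to-blue prior from the problem.
p_obs_blue_given_blue = 0.75   # reliability: blue taxi seen as blue
p_obs_blue_given_green = 0.25  # green taxi mistakenly seen as blue
p_blue, p_green = 0.1, 0.9     # part (b) priors

odds = (p_obs_blue_given_blue * p_blue) / (p_obs_blue_given_green * p_green)
p_blue_given_obs = odds / (1 + odds)
print(odds, p_blue_given_obs)  # 1/3 and 0.25: green is 3 times more likely
```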
2. Counterespionage [T] During the Cold War, the United States developed a speech-to-text algorithm to detect hidden dialects of Russian sleeper agents, who were trained to speak English with an American accent in Russia, and sent to the US to gather intelligence.
The method was based on the acoustic properties of Russian pronunciation of the word voksal (Russian for railway station), borrowed from the English Vauxhall. The Americans postulated that it was impossible for any Russian to completely hide their accent – when a Russian would say "Vauxhall", the speech-to-text algorithm would detect this as "voksal".
This method was tested at a diplomatic gala where 20% of the guests were Russian sleeper agents and the remainder Americans. A counterespionage officer uniformly randomly chooses a guest and asks them to say ”Vauxhall”. A single letter is chosen uniformly randomly from the word generated by the speech-to-text algorithm, which is observed to be ”l”. Under these assumptions, what is the probability that the person is a Russian sleeper agent?
Let R be the event that the person tested is a Russian sleeper agent, and L be the event that an “l” is observed in the output of the speech-to-text algorithm, then using the posterior ratio:
r = p(R|L) / p(¬R|L) = p(L|R) p(R) / [p(L|¬R) p(¬R)]
The word "voksal" contains one "l" in six letters, so p(L|R) = 1/6; "Vauxhall" contains two "l"s in eight letters, so p(L|¬R) = 2/8 = 1/4. With p(R) = 0.2 and p(¬R) = 0.8:
r = (1/6 × 0.2) / (1/4 × 0.8) = 1/6 ⟹ p(R|L) = r/(1 + r) = 1/7
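The same computation in a few lines, reading the letter counts directly off the two words:

```python
# Posterior that the guest is a sleeper agent, given an observed "l".
p_l_given_R = "voksal".count("l") / len("voksal")        # 1/6
p_l_given_notR = "vauxhall".count("l") / len("vauxhall") # 2/8
p_R, p_notR = 0.2, 0.8

r = (p_l_given_R * p_R) / (p_l_given_notR * p_notR)  # posterior odds
p_R_given_l = r / (1 + r)
print(r, p_R_given_l)  # 1/6 and 1/7
```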
3. [T] (Based on Mackay 3.8/3.9 (Mackay, 2003)) On a game show1, the rules are as follows:
There are three doors, labelled 1, 2, 3. A single prize is hidden behind one of them, and goats behind the other two. You get to select one door. Initially your chosen door will not be opened. Instead the gameshow host will open one of the other two doors, and he will do so in such a way as not to reveal the prize. For example, if you first choose door 1, the host will open either door 2 or door 3, depending on which one does not have the prize behind it.
At this point you will be given a new choice of door – you may either stick with your original choice, or you can switch to the other closed door. All doors will then be opened and you will receive whatever is behind your final choice of door.
1This classic problem derives its name from a game played on Let’s make a deal, hosted by .

(a) Imagine you choose door 1 first, then the host opens door 3, revealing nothing behind the door, as promised. Assuming you want the prize and not the goat, should you (a) stick with door 1, (b) switch to door 2, or (c) does it make no difference?
The key here is to write down probabilities of useful random variables. Let Hi denote the hypothesis that the prize is behind door i, and Dj the event that the host opens door j to reveal whatever is behind it. Without loss of generality assume i = 1, j = 3, since the problem is symmetric under permutation. In the following we implicitly condition on the fact that we selected door 1:
p(H1|D3) = p(D3|H1) p(H1) / [p(D3|H1) p(H1) + p(D3|H2) p(H2) + p(D3|H3) p(H3)]
The last term in the evidence disappears, as the host cannot open a door to reveal a prize. p(D3|H1) = 1/2, as the host could have chosen between doors 2 and 3 to open. p(D3|H2) = 1 as the host has no choice in this scenario. Hence, assuming equal prior probabilities:
p(H1|D3) = 1/3, and hence p(H2|D3) = 2/3
So you should switch to the alternate door to maximize your chance of winning the prize. On the surface, this appears paradoxical – but we must condition on the extra information contained in the statement that the host will always open a door that does not contain a prize, which breaks the symmetry between the chosen and unchosen doors.
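The 1/3 versus 2/3 split can also be checked empirically. Below is a minimal Monte Carlo sketch of the game; the trial count and seed are arbitrary choices:

```python
import random

def play(switch, trials=100_000, seed=0):
    """Simulate the standard Monty Hall game and return the win rate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)
        choice = 0  # without loss of generality, always pick door 1 (index 0)
        # The host opens a uniformly random door that is neither the chosen
        # door nor the prize door.
        candidates = [d for d in range(3) if d != choice and d != prize]
        opened = rng.choice(candidates)
        if switch:
            choice = next(d for d in range(3) if d not in (choice, opened))
        wins += (choice == prize)
    return wins / trials

print(play(switch=False))  # ≈ 1/3
print(play(switch=True))   # ≈ 2/3
```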
(b) You are playing the same game as above, but just as you have made your choice and the host is about to open one of the doors, a rather violent earthquake rattles the building and one of the three doors flies open. It happens to be door 3, and it happens not to have the prize behind it. You had initially chosen door 1.
Repositioning his toupée, the host suggests, ‘OK, since you chose door 1 initially, door 3 is a valid door for me to open, according to the rules of the game; I’ll let door 3 stay open. Let’s carry on as if nothing happened.’
Should you stick with door 1, switch to door 2, or does it make no difference? Assume that the prize was placed randomly, that the host does not know where it is, and that the door flew open because its latch was broken by the earthquake. Hint: consider carefully whether your optimal decision is based on how the other door to be revealed is selected.
Proceed as before. Let D = 3 be the data that door 3 was opened, and the prize was not revealed. Then p(D|H3) = 0, because p(D ∩ H3) = 0. But now, assuming the earthquake has no preference between doors 1 and 2, the likelihoods p(D = 3|H1) = p(D = 3|H2) = 1/2 are symmetric. Assuming equal prior probabilities, the same is true for the posterior probabilities. Note that the conscious decision of the host to open a non-randomly chosen door in part (a) subtly breaks the symmetry that part (b) possesses. Distinguishing between these closely related scenarios requires careful consideration of the events involved in Bayes’ Theorem.
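A simulation makes the contrast with part (a) concrete. This sketch assumes the earthquake opens a uniformly random door (one way to model the broken latch), then conditions on the observed data: door 3 open, no prize revealed.

```python
import random

def earthquake_win_rates(trials=300_000, seed=1):
    """Part (b) sketch: a uniformly random door flies open. Condition on it
    being door 3 (index 2) with no prize behind it, then compare sticking
    with door 1 (index 0) against switching to door 2 (index 1)."""
    rng = random.Random(seed)
    stick = switch = valid = 0
    for _ in range(trials):
        prize = rng.randrange(3)
        opened = rng.randrange(3)  # the earthquake has no preferences
        if opened != 2 or prize == 2:
            continue  # keep only trials matching the observed data
        valid += 1
        stick += (prize == 0)
        switch += (prize == 1)
    return stick / valid, switch / valid

print(earthquake_win_rates())  # ≈ (0.5, 0.5): switching makes no difference
```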
(c) You are again playing the same game as above, but now with N = 1000 doors instead of three. When you pick your initial choice of door, the host opens N − 2 = 998 of the

other N − 1 = 999 doors in such a way as to not reveal the prize. Assuming you want the prize, should you stick with your first choice, or switch to the other unopened door?
It is easiest to reason about Monty opening k of the other doors. Without loss of generality, say you pick door 1. Let H1 be the hypothesis the prize lies behind door 1. Denote by Di1,...,ik the event that doors i1, . . . , ik are opened without revealing the prize, and let S denote the set of all doors, labelled by their indices. Writing C(n, k) for the binomial coefficient:
p(Di1,...,ik | H1) = 1/C(N−1, k), since Monty may open any k of the other N − 1 doors;
p(DS′ | Hj) = 0 if j ∈ S′ ⊆ S, since Monty never reveals the prize;
p(Di1,...,ik | Ht) = 1/C(N−2, k) for t ∉ {1, i1, . . . , ik}, since Monty must choose k doors from the N − 2 doors excluding door 1 and door t.
Using the fact that each of the hypotheses has prior probability p(Ht) = 1/N and that 1 ∉ {i1, . . . , ik}, the evidence is
p(Di1,...,ik) = Σ_{t=1}^{N} p(Di1,...,ik | Ht) p(Ht)
= Σ_{t ∉ {1, i1,...,ik}} p(Di1,...,ik | Ht) p(Ht) + p(Di1,...,ik | H1) p(H1)
= (N − k − 1) · 1/C(N−2, k) · 1/N + 1/C(N−1, k) · 1/N
Putting these facts together, for each unopened door t ∉ {1, i1, . . . , ik}:
p(Ht | Di1,...,ik) = p(Di1,...,ik | Ht) p(Ht) / Σ_{j=1}^{N} p(Di1,...,ik | Hj) p(Hj) = (N − 1) / (N(N − k − 1)),
while p(H1 | Di1,...,ik) = 1/N. With k = N − 2 there is a single alternative unopened door, whose posterior probability is (N − 1)/N = 999/1000, so you should switch.
Note we can also arrive at the answer by noticing that if Monty opens k of the N − 1 other doors without revealing the prize, the probability of success by switching from your original choice is
(N − 1)/N × 1/(N − k − 1)
(the probability of not having picked the prize originally, times the probability of picking the prize out of the remaining N − k − 1 unopened doors, as your originally selected door is excluded).
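The closed-form posteriors can be checked exactly with a few lines of Python (`math.comb` supplies the binomial coefficients):

```python
from math import comb

def stick_and_switch(N, k):
    """Posteriors when Monty knowingly opens k goat doors: returns
    (p(prize behind your door), p(prize behind one unopened other door))."""
    p_D_given_H1 = 1 / comb(N - 1, k)  # Monty opens any k of the other N-1
    p_D_given_Ht = 1 / comb(N - 2, k)  # k doors from N-2, excluding 1 and t
    evidence = ((N - k - 1) * p_D_given_Ht + p_D_given_H1) / N
    return (p_D_given_H1 / N) / evidence, (p_D_given_Ht / N) / evidence

print(stick_and_switch(3, 1))       # the classic game: (1/3, 2/3)
print(stick_and_switch(1000, 998))  # (0.001, 0.999): always switch
```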

(d) You are again(!) playing the N doors variant, but just as the host is about to open one of the doors a violent earthquake rattles the building and N − 2 of the other N − 1 doors fly open to reveal N − 2 goats – i.e. the prize is behind either your initial choice of door or the other unopened door. This time, should you stick with your first choice, or switch to the other unopened door? Find an expression for the probability that the prize is behind your initial choice of door as N → ∞. Assume that the earthquake opens doors randomly, without regard to goats or prizes.
Assume again WLOG that you choose door 1 (recall this is distinct from the event H1, that the prize is behind door 1), and let y denote the other door that remains closed. We can use the previous approach to show that the answer is now 1/2. In this scenario,
p(DS\{1,y} | ¬H1) = (N−2)/(N−1) × (N−3)/(N−2) × · · · × 1/2 = 1/(N−1),
the probability that the N − 2 randomly opened doors all avoid the prize, while p(DS\{1,y} | H1) = 1, because under H1 every door the earthquake opened has a goat behind it.
So, by Bayes’ Theorem, using prior probability p(H1) = 1/N:
p(H1 | DS\{1,y}) = p(DS\{1,y} | H1) p(H1) / [p(DS\{1,y} | ¬H1) p(¬H1) + p(DS\{1,y} | H1) p(H1)] = (1/N) / ((1/(N−1)) × ((N−1)/N) + 1/N) = 1/2
Follow-up: can you apply this reasoning to part (c)? Do you get the same answer?2
So it makes no difference to switch in this case where N − 2 of the remaining doors are randomly opened and the prize is not revealed, but you should switch if N − 2 of the remaining doors are consciously opened such that the prize is not revealed.
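The 1/2 answer can be confirmed for any N from the quantities derived above, using exact rational arithmetic to avoid rounding:

```python
from fractions import Fraction

def p_stick_wins(N):
    """Part (d): the earthquake opens N-2 of the other N-1 doors at random
    and no prize is revealed. Exact posterior that the prize is behind
    your initially chosen door."""
    p_H1 = Fraction(1, N)
    p_D_given_H1 = Fraction(1)             # other doors hide only goats
    p_D_given_not_H1 = Fraction(1, N - 1)  # telescoping product from the text
    num = p_D_given_H1 * p_H1
    return num / (num + p_D_given_not_H1 * (1 - p_H1))

print([p_stick_wins(N) for N in (3, 10, 1000)])  # all equal 1/2
```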
4. The Prosecutor’s Fallacy [T] In 1998, Clark was tried for murder after two of her sons died shortly after birth. During the trial, an expert witness for the prosecution testified that the probability of a newborn dying of sudden infant death syndrome (SIDS) was 1/8500, so the probability of two deaths due to SIDS in one family was (1/8500)², or about one in 73 million.
Therefore, he continued, the probability of Clark’s innocence was one in 73 million. Identify the two major problems with this line of reasoning (supposing the figures are accurate).
The first (and arguably less sinful) mistake is in the calculation of the probability of the intersection of the event C1, "first child dies of SIDS", and the event C2, "second child dies of SIDS", by assuming that the two newborn deaths are independent events and multiplying their probabilities. This is not valid, as newborn deaths may depend on latent genetic factors G. To properly evaluate a sensible statistic we would either have to condition on G directly to get p(C1, C2|G), or obtain p(C1, C2) by marginalizing out G.
The second, more egregious mistake is the confusion of the direction of conditioning: the likelihood p(evidence|innocence) is very different from the posterior p(innocence|evidence). They are related by Bayes’ Theorem but otherwise tell very different stories. The likelihood captures the probability of the observed evidence given an assumed value of the unobserved hypothesis, and vice-versa for the posterior. Bayes’ Theorem says:
2Spoiler: yes, but it’s not very satisfying.

p(innocence|evidence) = p(evidence|innocence) p(innocence) / p(evidence)
= p(evidence|innocence) p(innocence) / [p(evidence|innocence) p(innocence) + p(evidence|guilt) p(guilt)]
The relevant posterior probability depends on the prior probabilities p(innocence) and p(¬innocence) = p(guilt). Based on the fact that double infanticides are quite rare, we may reasonably assume that p(innocence) is very high, and correspondingly p(guilt) is very low, making the posterior p(innocence|evidence) ≈ 1 a fair assumption – a statistic which is very different from the (1/8500)² figure put forward by the prosecution, which was literally only part of the equation.
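To see how sharply the posterior differs from the likelihood, here is an illustrative computation. Both the double-infanticide prior and p(evidence|guilt) = 1 below are hypothetical numbers chosen for the sketch, not figures from the case:

```python
# Illustration only: the prior is an assumption, not a figure from the case.
# p(evidence|innocence) is the prosecution's one-in-73-million statistic.
p_e_given_innocent = 1 / 73e6
p_guilty = 1e-9               # assumed prior probability of double infanticide
p_innocent = 1 - p_guilty

p_innocent_given_e = (p_e_given_innocent * p_innocent) / (
    p_e_given_innocent * p_innocent + 1.0 * p_guilty)
print(p_innocent_given_e)  # ≈ 0.93: nothing like one in 73 million
```

The point is not the particular output, but that the posterior is driven by the prior, which the prosecution's figure ignored entirely.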
Sadly, in a miscarriage of justice, Clark was convicted of murder based on statistically illiterate testimony by the prosecution witness, and spent three years in prison before her conviction was overturned. Her case led to the review of hundreds of other cases in the English justice system.
5. The Defense Attorney’s Fallacy [T] (Based on Mackay 3.11 (Mackay, 2003)) A woman has been murdered, and her husband is put on trial for this crime. Evidence comes to light that the defendant had a history of abusing his wife. The defense attorney argues that the evidence of abuse should be excluded on grounds of irrelevance – since, statistically, only 1 in 10,000 men with wives they abuse subsequently murder their wives. Supposing the figures given are accurate3, should the judge grant the motion to bar this evidence from trial? Or is the defense attorney a slimy trickster? If the latter, what is wrong with their argument?
The statistic quoted by the lawyer indicates the probability a randomly selected wife-beater will murder his wife. Let H be the hypothesis the husband is guilty and A the event the husband commits abuse against his wife, then the defense attorney’s argument is that p(H|A) = 1/10000. However, this fails to condition on a crucial piece of evidence – the sus- pect’s wife has been murdered. The probability that the suspect is guilty, given his wife has been murdered is a completely different quantity from the quoted figure.
Let M now be the event that the man’s wife was murdered. The court should be interested in p(H|A,M), not p(H|A) – when performing inference of an unobserved quantity, it is necessary to condition on all the observed data at hand. One may invoke Bayes’ Theorem to compute this – there are two possible options to decompose the joint probability:
p(H|A,M) = p(M|A,H) p(H|A) / p(M|A)   or   p(A|M,H) p(H|M) / p(A|M)
Both options are equally valid and the final choice is usually a matter of choosing whichever direction is easier to model. Here we are given both p(H|A) = 1/10000 and p(H|M) = 0.28, but it looks easier to estimate p(A|M,H) – given that a man murders his wife, how likely is
3In the USA, it is estimated that 2 million women are abused each year by their partners. In 1994, 4739 women were victims of homicide; of those, 1326 (28%) were perpetrated by male partners.

it to have happened without a prior history of abuse? That should be quite unlikely, so we might expect p(A|M,H) to be 0.9 or higher. We can further assume that, given the husband of a murdered wife is not guilty of the murder, the probability p(A|M,¬H) is just the unconditional probability of abuse, about 2% (assuming 100 million women in the US in 1994). Hence the posterior becomes:
p(H|A,M) = p(A|M,H) p(H|M) / p(A|M)
= p(A|M,H) p(H|M) / [p(A|M,H) p(H|M) + p(A|M,¬H) p(¬H|M)] ≈ 0.95
So the posterior probability of guilt, assuming some plausible statistics, is almost 10000 times as large as the quoted figure by the defence attorney. When conducting inference it is impor- tant to condition on all observed data.
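The arithmetic behind the ≈ 0.95 figure, using the two assumed probabilities from the text:

```python
# Reproducing the ≈ 0.95 posterior with the (assumed) figures from the text.
p_H_given_M = 0.28       # fraction of murdered wives killed by their partner
p_A_given_M_H = 0.9      # assumed: guilty husbands usually have abuse history
p_A_given_M_notH = 0.02  # assumed: base rate of abuse

posterior = (p_A_given_M_H * p_H_given_M) / (
    p_A_given_M_H * p_H_given_M + p_A_given_M_notH * (1 - p_H_given_M))
print(posterior)  # ≈ 0.946
```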
6. Particle physics Every fundamental particle in the universe has a property called "spin". This can be either integer, in which case the particle is a boson, or half-integer, in which case the particle is a fermion. Assume that bosons and fermions occur with equal probability in our universe. A physicist is observing two particles, at least one of which has half-integer spin. Given this information, what is the probability that both particles are fermions?
This is purely mechanical. Let F represent the number of half-integer spin particles. The two spins are independent and uniform, so p(F = 2) = 1/4 and p(F ≥ 1) = 3/4:
p(F = 2 | F ≥ 1) = p(F ≥ 1 | F = 2) p(F = 2) / p(F ≥ 1) = (1 × 1/4) / (3/4) = 1/3
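The 1/3 can also be obtained by brute-force enumeration of the four equally likely outcomes:

```python
from itertools import product

# Enumerate the equally likely spin assignments for two particles and
# condition on at least one fermion.
outcomes = list(product(["boson", "fermion"], repeat=2))
at_least_one = [o for o in outcomes if "fermion" in o]
both_fermions = [o for o in at_least_one if o == ("fermion", "fermion")]
print(len(both_fermions), len(at_least_one))  # 1 of 3 -> p = 1/3
```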
7. Phone numbers You move into a new house; the phone is connected, and you’re 50% sure that the phone number is 740511, but not as sure as you would like to be. As an experiment, you pick up the phone and dial 740511; you obtain a ‘busy’ signal. Are you now more sure of your phone number? If so, how much? Report the posterior probability ratio of the two hypotheses. (Hint: assume that there are 100000 valid landline numbers in Melbourne, all of which are six digits.)
This requires some further assumptions, which you should state. Let H be your hypothesis that your phone number is 740511, and D be the data that you observed a busy signal when the number was dialled. We wish to find the posterior via the ratio:
p(H|D) / p(¬H|D) = p(D|H) p(H) / [p(D|¬H) p(¬H)]
If you ring yourself, you're guaranteed to get a busy signal, so p(D|H) = 1. We need to estimate the probability that ringing the number returns a busy signal if it is not your phone number, p(D|¬H). Assuming Melbourne has 100000 valid 6-digit numbers among the 10^6 possible 6-digit strings, and around 1% of phones are in use during the day, then p(D|¬H) ≈ (1/10) × (1/100) = 10^−3: a one-in-a-thousand chance of getting a busy signal when dialling a random 6-digit number. Using p(H) = p(¬H) = 1/2, the posterior odds are r = p(D|H)/p(D|¬H) = 1000, giving:
p(H|D) = r/(1 + r) = 1000/1001 ≈ 0.999
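The same posterior in a few lines, with the assumptions stated above:

```python
# Posterior for the phone-number hypothesis, under the stated assumptions:
# 10**5 valid numbers among 10**6 six-digit strings, ~1% of phones busy.
p_D_given_H = 1.0                        # ringing yourself is always busy
p_D_given_notH = (10**5 / 10**6) * 0.01  # random dial is valid AND busy
posterior_odds = (p_D_given_H * 0.5) / (p_D_given_notH * 0.5)
p_H_given_D = posterior_odds / (1 + posterior_odds)
print(posterior_odds, p_H_given_D)  # 1000 and ≈ 0.999
```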
8. Maybe random Monty [T], [∗] Monty decides to change how he selects the door to be opened in response to the contestant’s initial choice of door. Before each show, Monty secretly flips a coin with probability p of Heads. If the coin lands Heads, Monty resolves to open a door with a goat (with equal probabilities if there is a choice). Otherwise, Monty will open a random door, with equal probabilities. Monty will never open the door that the contestant initially chooses. The contestant knows p but does not know the outcome of the coin flip.
When the show starts, the contestant chooses a door. Monty (who knows where the prize is) then opens a door. If the prize is revealed, the game is over; if a goat is revealed, the contestant is offered the option of switching. Without loss of generality, assume that the contestant chooses door 1 and then Monty opens door 2, revealing a goat. Derive an expression in terms of p for the contestant’s probability of success if they switch to door 3.
Again, let D2 be the event that Monty opens door 2 to reveal a goat, let Hi be the hypothesis the prize lies behind door i, and implicitly condition on the fact that you selected door 1 originally. You want to find:
p(H1|D2) = p(D2|H1) p(H1) / [p(D2|H1) p(H1) + p(D2|H3) p(H3)]
using the fact that p(D2|H2) = 0. However, you are unsure about the outcome of the coin flip, and need to average over this uncertainty. Let C denote the outcome of the coin flip with possible values {H, T}:
p(D2|Hi) = p · p(D2|Hi, C = H) + (1 − p) · p(D2|Hi, C = T)
Monty acts as in the standard game if the coin lands Heads, so p(D2|H1, C = H) = p(D2|H1, C = T) = 1/2, but p(D2|H3, C = H) = 1 and p(D2|H3, C = T) = 1/2. Hence p(D2|H1) = 1/2 and p(D2|H3) = (1 + p)/2, and with equal priors this brings us to the final result:
p(H1|D2) = (1/2) / (1/2 + (1 + p)/2) = 1/(2 + p)
so the contestant's probability of success if they switch to door 3 is p(H3|D2) = 1 − 1/(2 + p) = (1 + p)/(2 + p).
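A simulation is a useful cross-check of the (1 + p)/(2 + p) formula. This sketch conditions on the same data as the derivation: Monty opens door 2, revealing a goat.

```python
import random

def switch_win_rate(p, trials=300_000, seed=2):
    """Simulate maybe-random Monty with Heads probability p. Condition on
    Monty opening door 2 (index 1) to reveal a goat; return the empirical
    success rate of switching to door 3 (index 2)."""
    rng = random.Random(seed)
    wins = valid = 0
    for _ in range(trials):
        prize = rng.randrange(3)       # we always choose door 1 (index 0)
        if rng.random() < p:           # Heads: Monty opens a goat door
            opened = rng.choice([d for d in (1, 2) if d != prize])
        else:                          # Tails: Monty opens door 2 or 3 at random
            opened = rng.choice([1, 2])
        if opened != 1 or prize == 1:
            continue                   # keep trials matching the observed data
        valid += 1
        wins += (prize == 2)
    return wins / valid

p = 0.5
print(switch_win_rate(p), (1 + p) / (2 + p))  # both ≈ 0.6
```

Note the two edge cases agree with earlier parts: p = 1 recovers the standard game (2/3), and p = 0 recovers the earthquake-style symmetry (1/2).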
References
David J. C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003.
