
F71SM STATISTICAL METHODS

2 PROBABILITY

2.1 Introduction

A random experiment is an experiment which is repeatable under identical conditions, and
for which, at each repetition, the outcome is uncertain but is one of a known and describable
set of possible outcomes.

The sample space S is the set of possible outcomes.

An event A is a subset of S (but see below).

Event A occurs if the outcome of the experiment is an element of the set A.

The union of two events A and B, denoted A ∪ B, is the event which occurs ⇔ at least
one of events A,B occurs.

The intersection of two events A and B, denoted A∩B, is the event which occurs⇔ both
A,B occur.

The complement of event A, denoted A′, occurs if and only if the event A does not occur.

The empty set ∅, considered as a subset of S, contains none of the set of possible outcomes
of the experiment and so corresponds to the impossible event.

A ∪ A′ = S , A ∩ A′ = ∅

Events A and B are mutually exclusive ⇔ A ∩ B = ∅ (that is, the events cannot occur
simultaneously).

A set function is a function whose domain is a collection of sets.

Venn diagram


2.2 Probability

Probability is a set function P, also called a probability measure, on the collection of subsets of S.

The domain of the function P is the collection of subsets of S. For an event A ⊆ S, P(A) is a real number which gives the probability that the outcome of the experiment is in A; it is the ‘probability that event A occurs’. We require that P behave like relative frequency: it is a function which is non-negative, bounded above (by 1), and additive. This leads us to the following definition, in which we declare probability to be a function subject to the corresponding three axioms.

Axioms of Probability

Given a sample space S, the probability set function P : S → R is such that

A1 P (A) ≥ 0 for all A ∈ S

A2 P (S) = 1

A3 For A1, A2, A3, . . . ∈ S with Ai ∩ Aj = ∅ for i ≠ j, P(∪i Ai) = Σi P(Ai)

2.3 Basic results for probabilities

(i) P (∅) = 0
Proof: S = S ∪∅ and S ∩∅ = ∅ ⇒ P (S) = P (S ∪∅) = P (S) +P (∅) by A3; result follows.

(ii) For any event A, P (A) ≤ 1
Proof: 1 = P (S) = P (A ∪ A′) = P (A) + P (A′) by A2, A3 and result follows by A1.

(iii) For any event A, P (A′) = 1− P (A)
Proof: from proof of (ii) above.

(iv) A ⊆ B ⇒ P (A) ≤ P (B)
Proof: A ⊆ B ⇒ B = A ∪ (B ∩ A′) ⇒ P (B) = P (A) + P (B ∩ A′) by A3, since
A ∩ (B ∩ A′) = ∅, and hence P (B) ≥ P (A) by A1.


(v) Addition rule: P (A ∪B) = P (A) + P (B)− P (A ∩B)
Proof: A ∪B = A ∪ (B ∩ A′) and A ∩ (B ∩ A′) = ∅ ⇒ P (A ∪B) = P (A) + P (B ∩ A′)
B = B ∩ S = B ∩ (A ∪ A′) = (B ∩ A) ∪ (B ∩ A′) and (B ∩ A) ∩ (B ∩ A′) = ∅
⇒ P (B) = P (B ∩ A) + P (B ∩ A′)
Together these results ⇒ P (A ∪B) = P (A) + P (B)− P (A ∩B)
This generalises the result that for A and B mutually exclusive events, P (A ∪ B) =
P (A) + P (B).

2.4 Evaluating probabilities in practice

(a) Long term relative frequency [e.g. P (drawing-pin lands with its pin sticking up)]

We observe that relative frequency tends to settle down as the number of trials increases.
Formally, we define the probability of the event occurring as the limit of the relative
frequency, so

as number of trials →∞, relative frequency → probability .
The graph below shows the relative frequency of occurrence of an event with probability
0.4 after 1, 2, . . . , 200 trials (the outcomes of the trials were simulated in R).
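The original simulation was carried out in R; the following Python sketch (an illustrative stand-in, not the original code) reproduces the idea: simulate 200 trials of an event with probability 0.4 and track the running relative frequency.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

p = 0.4                      # true probability of the event
successes = 0
rel_freqs = []
for n in range(1, 201):      # trials 1, 2, ..., 200
    if random.random() < p:  # the event occurs with probability p
        successes += 1
    rel_freqs.append(successes / n)

# rel_freqs[n-1] is the relative frequency after n trials;
# it settles down near p as n grows
print(rel_freqs[0], rel_freqs[99], rel_freqs[199])
```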

(b) Symmetry [e.g. P (fair six-sided die lands showing a 6)]

Finite number of equally likely outcomes.

In this case P(A) = (# outcomes favourable to A) / (# outcomes possible)


The key to evaluating probabilities in this case is to define a sample space in a convenient
way and then to count the numbers of outcomes corresponding to various events.

For example, consider a throw of two fair six-sided dice, one red and the other blue. Let
A be the event ‘score = 7 or 8’. Let S = {(i, j) : i = 1, 2, 3, 4, 5, 6; j = 1, 2, 3, 4, 5, 6}
where i and j are the scores on the red and blue die respectively. S consists of 36 elements
(equally-likely outcomes).

A = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1), (2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}
A consists of 11 elements, so P (A) = 11/36 = 0.3056.
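The count can be confirmed by brute-force enumeration; a short Python sketch (illustrative, not from the notes):

```python
# Enumerate the 36 equally likely (red, blue) outcomes
S = [(i, j) for i in range(1, 7) for j in range(1, 7)]

# Event A: total score is 7 or 8
A = [(i, j) for (i, j) in S if i + j in (7, 8)]

print(len(A), len(S), len(A) / len(S))  # 11 outcomes out of 36
```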

(c) Uniform distribution over an interval

In the case that the sample space is an interval on the real line and the event occurs ‘at
a random point in the interval’ we adopt a uniform distribution of probability over the
interval so that the probabilities of the event occurring in sub-intervals of the same length
are equal.

For example suppose we ‘choose a time at random’ between 14:00 and 15:00, then

P (selected time is before 14:30) = 0.5

P (selected time is between 14:20 and 14:45) = 25/60 = 0.4167

P (selected time is before 14:10 or after 14:50) = 20/60 = 0.3333
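Under the uniform model each probability is just sub-interval length over total length; a quick Python check (illustrative, not from the notes), working in minutes after 14:00:

```python
# The full interval is (0, 60) minutes after 14:00
total = 60

p1 = (30 - 0) / total              # before 14:30
p2 = (45 - 20) / total             # between 14:20 and 14:45
p3 = (10 - 0 + 60 - 50) / total    # before 14:10 or after 14:50

print(p1, round(p2, 4), round(p3, 4))  # 0.5 0.4167 0.3333
```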

2.5 Conditional probability

We introduce the concept of the probability that an event occurs, conditional on another
specified event occurring (or, in other language, given that another specified event occurs).

For example, consider the event that in a throw of a fair six-sided die we score 6, conditional
on scoring more than 2. The event ‘scoring more than 2’ corresponds to the 4 equally-likely
outcomes {3, 4, 5, 6} and of these only 1 outcome corresponds to ‘score of 6’, so the probability
required is 1/4. Imposing the condition has effectively reduced/restricted the sample space
from {1, 2, 3, 4, 5, 6} to {3, 4, 5, 6}.

Note that the conditional probability can be expressed as the ratio of two unconditional probabilities of events defined in terms of the original sample space of size 6: 1/4 = (1/6)/(4/6).

Again, consider a throw of two fair six-sided dice, one red and the other blue. Let A be the
event ‘score = 7 or 8’ and let B be the event ‘score = 8, 9 or 10’.

Let S = {(i, j) : i = 1, 2, 3, 4, 5, 6; j = 1, 2, 3, 4, 5, 6} where i and j are the scores on the red
and blue die respectively.

A = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1), (2, 6), (3, 5), (4, 4), (5, 3), (6, 2)} 11 elements

B = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2), (3, 6), (4, 5), (5, 4), (6, 3), (4, 6), (5, 5), (6, 4)} 12 elements

P (A) = 11/36, P (B) = 12/36

A ∩B is the event ‘score of 8’ and P (A ∩B) = 5/36


Consider the event A conditional on B, that is ‘a score of 7 or 8 given that the score is 8,
9, or 10’.

The outcomes in B favourable to A are (2, 6), (3, 5), (4, 4), (5, 3), (6, 2) so the probability of

event A conditional on B is 5/12. This probability is (5/36)/(12/36) = P(A ∩ B)/P(B)
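The two routes to 5/12 — counting within the restricted sample space B, and taking the ratio P(A ∩ B)/P(B) — can both be checked by enumeration (a Python sketch, not part of the notes):

```python
from fractions import Fraction

S = [(i, j) for i in range(1, 7) for j in range(1, 7)]
A = {s for s in S if sum(s) in (7, 8)}       # score 7 or 8
B = {s for s in S if sum(s) in (8, 9, 10)}   # score 8, 9 or 10

# Counting outcomes of A inside the restricted sample space B
direct = Fraction(len(A & B), len(B))

# Ratio of the two unconditional probabilities
ratio = Fraction(len(A & B), len(S)) / Fraction(len(B), len(S))

print(direct, ratio)  # both equal 5/12
```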

This motivates the general definition:

The probability of event A conditional on event B is denoted P(A | B) and is defined as

P(A | B) = P(A ∩ B)/P(B) for P(B) ≠ 0.

The multiplication rule for probabilities follows, namely P (A ∩B) = P (A)P (B | A).

For example, suppose we draw two balls at random, one after the other and without re-
placement, from a bag containing 6 red and 4 blue balls. Let A = first ball drawn is red, and
let B = second ball drawn is blue. Then

P (1st ball drawn is red and 2nd ball drawn is blue) = P (A ∩B) = P (A)P (B | A)

= (6/10) × (4/9) = 4/15 = 0.2667.
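The multiplication-rule computation, kept in exact fractions (a Python check, not from the notes):

```python
from fractions import Fraction

p_A = Fraction(6, 10)            # P(1st ball red): 6 red out of 10
p_B_given_A = Fraction(4, 9)     # P(2nd blue | 1st red): 4 blue of 9 left

p_AB = p_A * p_B_given_A         # multiplication rule
print(p_AB, float(p_AB))         # 4/15, approximately 0.2667
```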

2.6 Independent events and independent trials

Events A and B are independent ⇔ P (A ∩B) = P (A)P (B)
[⇔ P (A | B) = P (A) and P (B | A) = P (B)]

So events A and B are independent if and only if the occurrence of one does not affect the
probability of occurrence of the other.

Events A1, A2, . . . , Ak are independent if and only if the probability of the intersection of
any 2, 3, . . . , k of the events equals the product of their respective probabilities. So, for three
events A,B,C to be independent, we require P (A∩B) = P (A)P (B), P (A∩C) = P (A)P (C),
P (B ∩ C) = P (B)P (C) and P (A ∩B ∩ C) = P (A)P (B)P (C).

A trial is a single repetition of a random experiment.

T1 and T2 are independent trials ⇔ all events defined on the outcome of T1 are inde-
pendent of all events defined on the outcome of T2.

2.7 Partitioning of an event

Let {E1, E2, . . . , Ek} be a partition of S and let A be an event.

Then A = A ∩ S = A ∩ (∪i Ei) = ∪i (A ∩ Ei) and P(A) = Σi P(A ∩ Ei) = Σi P(Ei)P(A | Ei)

The event A has been partitioned into events A ∩ Ei, i = 1, 2, . . . , k. For example, with
k = 4:


P (A) is the sum of the probabilities of the events which make up the partition of A.

2.8 Bayes’ theorem

Let {E1, E2, . . . , Ek} be a partition of S and let A be an event.

Then P(Ei | A) = P(Ei ∩ A)/P(A) = P(Ei)P(A | Ei)/P(A) = P(Ei)P(A | Ei) / Σj P(Ej)P(A | Ej), i = 1, 2, . . . , k, where the sum in the denominator runs over j = 1, 2, . . . , k.

The result is often written in proportional terms:

P (Ei | A) ∝ P (Ei)P (A | Ei) , i = 1, 2, . . . , k.

The probabilities P (Ei), i = 1, 2, . . . , k, are the prior probabilities;
the probabilities P (Ei | A), i = 1, 2, . . . , k, are the posterior probabilities.

For example, suppose a population is made up of 60% men and 40% women. The percentages
of men and women in the population who have an iPod are 30% and 40% respectively. A person
is selected at random from the population and is found to have an iPod.

tree diagram

P(male and has iPod) = 0.6 × 0.3 = 0.18 and P(female and has iPod) = 0.4 × 0.4 = 0.16, so

P(selected person is male | has iPod) = 0.18/(0.18 + 0.16) = 0.18/0.34 = 0.5294
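The same calculation written out as Bayes’ theorem (a Python sketch with assumed variable names, not part of the notes):

```python
# Priors and iPod-ownership rates from the example
p_male, p_female = 0.6, 0.4
p_ipod_given_male, p_ipod_given_female = 0.3, 0.4

# Denominator: total probability of owning an iPod
p_ipod = p_male * p_ipod_given_male + p_female * p_ipod_given_female

# Posterior probability that the iPod owner is male
p_male_given_ipod = p_male * p_ipod_given_male / p_ipod
print(round(p_ipod, 2), round(p_male_given_ipod, 4))  # 0.34 0.5294
```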


2.9 Worked examples

2.1 In a family with five children, what is the probability that all the children are of the same
sex? What is the probability that the three oldest children are boys and the two youngest
are girls? What is the probability that the three oldest children are boys?

Solution:

A suitable sample space is S = {(x1, x2, x3, x4, x5) : xi = M,F; i = 1, 2, 3, 4, 5} where xi is the sex of the ith oldest child. Size of sample space: n(S) = 2⁵ = 32.

Two outcomes are favourable to the event ‘all are same sex’, namely (M,M,M,M,M) and (F, F, F, F, F). One outcome is favourable to the event ‘three oldest are boys and two youngest are girls’, namely (M,M,M,F, F). 2² = 4 outcomes are favourable to the event ‘three oldest are boys’, namely any one of the form (M,M,M, ·, ·).
If we make the assumption that all 32 outcomes are equally likely (under what conditions
is this a reasonable assumption?) then

P(all same sex) = 2/32 = 0.0625

P(three oldest are boys and two youngest are girls) = 1/32 = 0.0313

P(three oldest are boys) = 4/32 = 0.125.
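These counts are easy to confirm by listing all 32 sequences (a Python sketch, not from the notes):

```python
from itertools import product

S = list(product('MF', repeat=5))   # 2^5 = 32 equally likely sequences

same_sex = [s for s in S if len(set(s)) == 1]
bbbgg = [s for s in S if s == ('M', 'M', 'M', 'F', 'F')]
oldest3_boys = [s for s in S if s[:3] == ('M', 'M', 'M')]

print(len(same_sex) / 32, len(bbbgg) / 32, len(oldest3_boys) / 32)
# 0.0625 0.03125 0.125
```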

2.2 In how many ways can we choose 6 numbers from a group of 59 for a line in the UK
National Lottery?

Solution:

C(59, 6) = 59!/(6! 53!) = (59 × 58 × 57 × 56 × 55 × 54)/(6 × 5 × 4 × 3 × 2 × 1) = 45,057,474

where C(n, r) denotes the binomial coefficient ‘n choose r’ = n!/(r!(n − r)!).

Note: This leads to (approximately) ‘a chance of 1 in 45 million’ of winning the jackpot.
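Python’s math.comb computes the binomial coefficient directly (a one-line check, not from the notes):

```python
from math import comb

n_lines = comb(59, 6)   # number of ways to choose 6 numbers from 59
print(n_lines)          # 45057474
```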

2.3 A fair die is thrown 4 times. Find P (total score is 4 or 24).

Solution:

S = {(x1, x2, x3, x4) : xi = 1, 2, 3, 4, 5, 6; i = 1, 2, 3, 4}, with n(S) = 6⁴ = 1296.
Two outcomes are favourable to the event ‘total score is 4 or 24’, namely (1, 1, 1, 1) and
(6, 6, 6, 6). So P (score 4 or 24) = 2/1296 = 0.0015.

[Note: We are effectively taking a random sample of size 4 with replacement from the
population {1, 2, 3, 4, 5, 6}.]

2.4 A committee consists of 7 men and 4 women and a sub-committee of 6 is to be chosen
at random. Find the probabilities that the sub-committee contains exactly k women,
k = 0, 1, 2, 3, 4.

Solution:

# ways of choosing the sub-committee = # different selections of 6 people from 11 = C(11, 6)

# different selections of k women from 4 = C(4, k)

# different selections of (6 − k) men from 7 = C(7, 6 − k)

# ways of choosing a sub-committee containing k women = C(4, k) × C(7, 6 − k)


∴ P(sub-committee contains exactly k women) = C(4, k) C(7, 6 − k) / C(11, 6)

Substituting in values of k, then

P(sub-committee contains exactly k women) =
  0.01515  k = 0
  0.18182  k = 1
  0.45455  k = 2
  0.30303  k = 3
  0.04545  k = 4
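These hypergeometric probabilities can be recomputed with math.comb (a Python sketch, not from the notes):

```python
from math import comb

total = comb(11, 6)   # all sub-committees of 6 chosen from 11 people
probs = {k: comb(4, k) * comb(7, 6 - k) / total for k in range(5)}

for k, p in probs.items():
    print(k, round(p, 5))
# as a check, the probabilities sum to 1 over k = 0..4
```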

2.5 A bag contains 4 white and 3 red balls. Two balls are drawn out at random without
replacement. What is the probability that they are white and red respectively? What is
the probability that the second ball drawn is white?

Solution:

Formally: let Wi be ‘ith ball drawn is white’, Ri be ‘ith ball drawn is red’

P (W1 ∩R2) = P (W1)P (R2|W1) = (4/7)× (3/6) = 2/7
Noting that W1, R1 are mutually exclusive and exhaustive, we partition W2 as
W2 = W2 ∩ (W1 ∪R1) = (W2 ∩W1) ∪ (W2 ∩R1)
Then P (W2) = P ((W2 ∩W1) ∪ (W2 ∩R1)) = P (W2 ∩W1) + P (W2 ∩R1)
= P (W1)P (W2|W1) + P (R1)P (W2|R1)
= (4/7)× (3/6) + (3/7)× (4/6) = 2/7 + 2/7 = 4/7 = 0.5714.
It is very easy to sort out all possibilities with a ‘tree diagram’:

Note that P (W2) = P (W1).
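The partition calculation for P(W2), kept in exact fractions (a Python sketch with assumed variable names, not from the notes):

```python
from fractions import Fraction

p_W1 = Fraction(4, 7)            # first ball white
p_R1 = Fraction(3, 7)            # first ball red
p_W2_given_W1 = Fraction(3, 6)   # 3 white among the 6 remaining
p_W2_given_R1 = Fraction(4, 6)   # 4 white among the 6 remaining

# Total probability over the partition {W1, R1}
p_W2 = p_W1 * p_W2_given_W1 + p_R1 * p_W2_given_R1
print(p_W2)   # 4/7, equal to P(W1)
```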


2.6 Andrew and Brian play a round of golf. The probability that Andrew (Brian) gets a 4 at
the first hole is 0.3(0.6). Assuming independence, find the probability that at least one
of them gets a 4 at the first hole.

Solution:

Let A(B) be the event ‘Andrew (Brian) gets a 4’

Method 1: P (A ∪ B) = P (A) + P (B) − P (A ∩ B) = P (A) + P (B) − P (A)P (B) (by
independence) = 0.3 + 0.6− 0.18 = 0.72.
Method 2: P (neither gets a 4) = P (A′ ∩ B′) = P (A′)P (B′) = 0.7 × 0.4 = 0.28, so
P (at least one gets a 4) = 1− 0.28 = 0.72.

2.7 Sampling inspection

Regular sampling of mass-produced items is carried out by taking random samples of 8
items from the production line. Each selected item is tested to find out if it is defective.
Assuming independence from item to item, and assuming that 10% of the production is
defective, we can adopt a model with i.i.d. trials and with P (selected item is defective) =
0.1.

∴ P(sample contains 2 defective items) = C(8, 2) × 0.1² × 0.9⁶ = 0.1488
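This is the binomial probability of 2 ‘successes’ (defectives) in 8 independent trials; a quick Python check (not from the notes):

```python
from math import comb

n, k, p = 8, 2, 0.1
prob = comb(n, k) * p**k * (1 - p)**(n - k)  # binomial probability
print(round(prob, 4))   # 0.1488
```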

Note: Our production of items is finite. Strictly speaking, as we sample items one after
another the successive trials are not independent: the outcome of one trial conditions the
probabilities associated with all later trials.

e.g. P(2nd item defective | 1st item OK) ≠ P(2nd item defective)
and P(1st item OK and 2nd item defective) ≠ P(1st item OK)P(2nd item defective)
However, if the population of items is large (as in the above illustration) and the sample
of moderate size, it is reasonable to adopt a model of independent trials with constant
probability of an item being defective i.e. a model of i.i.d. trials. We are in effect assuming
an unchanging population as an approximation to a population which is actually changing
slightly from trial to trial — we are using the theory of sampling with replacement to
approximate the real situation, which is sampling without replacement.

2.8 Players A and B throw a regular 6-sided die in turn. The first to throw a 6 wins the
game. A throws first. Find the probability that A wins the game.

Solution:

We can represent the sample space as the countable union of the events Ek, where

Ek = game ends on the kth throw of the die, k = 1, 2, 3, . . .

      A    B    A    B    A    · · ·   Probability
E1    6                                (1/6)
E2    6′   6                           (5/6)(1/6)
E3    6′   6′   6                      (5/6)²(1/6)
E4    6′   6′   6′   6                 (5/6)³(1/6)
E5    6′   6′   6′   6′   6            (5/6)⁴(1/6)
· · ·

where 6′ denotes ‘not a 6’.


P(A wins) = P(E1 ∪ E3 ∪ E5 ∪ . . .)
= P(E1) + P(E3) + P(E5) + · · ·
= 1/6 + (5/6)²(1/6) + (5/6)⁴(1/6) + · · ·
= (1/6)(1 + 25/36 + (25/36)² + · · ·)
= (1/6)(1 − 25/36)⁻¹ = (1/6) × (36/11) = 6/11 = 0.5455.

Note: the advantage, of course, lies with the player who throws first.

OR: Let p = P (A wins)

‘A wins’ = ‘A wins on first throw’ ∪ ‘die passes to B and B does not win the game’
After the die passes to B, B is then in exactly the same position as A was at the start of
the game, so p = 1/6 + (5/6)(1− p), which gives p = 6/11.
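Both routes to 6/11 — the geometric series and the recursion — can be checked in exact arithmetic (a Python sketch, not from the notes):

```python
from fractions import Fraction

p6, miss = Fraction(1, 6), Fraction(5, 6)

# Geometric series: P(A wins) = (1/6) / (1 - (5/6)^2)
p_series = p6 / (1 - miss**2)

# Recursion p = 1/6 + (5/6)(1 - p), solved for p
p_recursion = (p6 + miss) / (1 + miss)

print(p_series, p_recursion, float(p_series))  # 6/11 both ways
```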

2.9 A stick of length 12cm is broken into two pieces at a point chosen at random along its
length. What is the probability that the rectangle which can be constructed using the
pieces as two adjacent sides has an area less than 27cm2?

Solution:

Lay the stick down and let x be the distance of the break point from the left end of the
stick.

S = {x : 0 < x < 12}, and probabilities are proportional to length.

Area of rectangle = x(12 − x), which is less than 27 for x < 3 or x > 9.

The event ‘x < 3 or x > 9’ corresponds to a total length of 6, so its probability is 6/12 = 0.5.

2.10 Two students are each, independently of the other, equally likely to arrive for a 10:15
lecture at any time between 10:15 and 10:30. Find the probability that their arrivals are
separated by at least 10 minutes.

Solution:

Call the students A and B and let x, y be their respective arrival times in minutes after
10:15.

S = {(x, y) : 0 < x < 15, 0 < y < 15}, and probabilities are proportional to area.

The region in which the students arrive at least 10 minutes apart is {(x, y) : |x − y| ≥ 10}, which has area 25 (draw a diagram to see this); the area of S is 15² = 225.

Required probability = 25/225 = 1/9 = 0.1111.
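The answer 1/9 can also be sanity-checked by simulation (a Python Monte Carlo sketch, not part of the notes):

```python
import random

random.seed(0)  # fixed seed for reproducibility
n = 100_000

# Draw both arrival times uniformly on (0, 15) minutes after 10:15
hits = sum(
    abs(random.uniform(0, 15) - random.uniform(0, 15)) >= 10
    for _ in range(n)
)
print(hits / n)   # close to 1/9 = 0.1111
```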
