20th August 2019
Probability theory
https://bitbucket.org/mfumagal/
statistical_inference
Matteo Fumagalli
Intended Learning Outcomes
By the end of this session, you will be able to:
Describe the principles of set theory and set operations
Illustrate the axiomatic foundations of probability theory and appropriate counting methods
Identify dependence and indepedence of events
Show the utility of distribution functions for random variables Demonstrate how to implement basic probability calculus in R
Set theory
Draw conclusions about a population of objects after an experiment. What are the possible outcomes of the experiment?
Sample space
The set S of all possible outcomes of a particular experiment is called the sample space for the experiment.
Set theory
Examples of experiments:
tossing a con: S = {H,T}
GCSE scores of randomly selected pupils:
Set theory
Examples of experiments: tossing a con: S = {H,T}
GCSE scores of randomly selected pupils: S = {0,1,2,…,8,9}.
reaction time to a stimulus:
Set theory
Examples of experiments: tossing a con: S = {H,T}
GCSE scores of randomly selected pupils: S = {0,1,2,…,8,9}.
reaction time to a stimulus: S = (0, ∞).
Set theory
jupyter notebook:
Imagine that our experiment consists on observing the nucleotidic sequence of a particular gene of interest. What is the sample space SG =?
Now suppose that we are interested in making inference on the amino acidic sequence of a protein. What is the sample space SP =?
Suppose that our observations consist in the divergence between orthologous genes of arbitrary length. What is the sample space for such divergence SD =?
Sets and events
One the sample space has been defined (e.g. countable?), we consider collections of possible outcomes of an experiments.
Event
An event is any collection of possible outcomes of an experiment, that is, any subset of S, including S itself.
Let A ben an event, a subset of S. We say that the event A occurs if the outcome of the experiment is in the set A.
A⊂B ⇐⇒ x∈A =⇒ x∈B A = B ⇐⇒ A ⊂ B and B ⊂ A
Set operations
Given any two events A and B:
Union: the union of A and B is the set of elements that belong to either A or B or both:
A ∪ B = {x : x ∈ A or x ∈ B}
Set operations
Given any two events A and B:
Union: the union of A and B is the set of elements that belong to either A or B or both:
A ∪ B = {x : x ∈ A or x ∈ B} Intersection: the intersection of A and B is the set of
elements that belong to both A and B:
A ∩ B = {x : x ∈ A and x ∈ B}
Set operations
Given any two events A and B:
Union: the union of A and B is the set of elements that belong to either A or B or both:
A ∪ B = {x : x ∈ A or x ∈ B} Intersection: the intersection of A and B is the set of
elements that belong to both A and B:
A ∩ B = {x : x ∈ A and x ∈ B}
Complementation: the complement of A is the set of all elements that are not in A:
A c = { x : x ∈/ A }
Set operations
jupyter notebook:
Consider an experiment of drawing a nucleotidic base at random. The sample space is S = {A,C,G,T} and some possible events are A = {G,C,A} and B = {T,G}.
What is the union of A and B? What is the intersect between A and B? What is the complement of A? What is the complement of the union of A and B?
Use Venn diagrams and R!
Properties of Set operations
Commutativity
Associativity
A∪B=B∪A A∩B=B∩A
A ∪ (B ∪ C) = (A ∪ B) ∪ C A ∩ (B ∩ C) = (A ∩ B) ∩ C
Properties of Set operations
Distributive laws
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) DeMorgan’s laws
(A∪B)c =Ac ∩Bc (A∩B)c =Ac ∪Bc
Partition
Two events A and B are disjoint (or mutually exclusive) if
A ∩ B = ∅.
The events A1, A2, … are pairwise disjoint if Ai ∩ Aj = ∅ for all i ̸= j.
If A1, A2, … are pairwise disjoint and ∞i=0 Ai = S, then the collection A1, A2, … forms a partition of S.
Axiomatic foundations
For each event A in the sample space S we associate a number between 0 and 1 that we call the probability of A, denoted as P(A).
A collection of subsets of S is called a sigma algebra (or Borel field), denoted by B, if it satisfies the following properties:
1 ∅∈B
2 ifA∈B,thenAc ∈B
3 if A1,A2,… ∈ B, then ∞i=1 Ai ∈ B
Axiomatic foundations
jupyter notebook:
Is the collection of sets {∅,S} a sigma algebra with S?
If S has n elements, there are 2n sets in B. What is B if S = {1, 2, 3}?
Probability function
Given a sample space S and an associated sigma algebra B, a probability function is a function P with domain B that satisfies:
1 P(A) ≥ 0 for all A ∈ B
2 P(S) = 1
3 If A1, A2, … ∈ B are pairwise disjoint, then P(∞i=1) = ∞i=1 P(Ai )
Any function P that satisfies these Axioms of Probability is called a probability function.
Probability function
Let S = {s1, s2, …, sn} a finite and/or countable set. Let B be any sigma algebra of subsets of S.
Let p1, p2, …, pn nonnegative numbers that sum to 1. For any A ∈ B, define P(A) by
P(A)= pi {i :si ∈A}
Then P is a probability function on B.
Calculus
If P is a probability function and A is any set in B, then
1 P(∅) = 0
2 P(A) ≤ 1
3 P (Ac ) = 1 − P (A)
prove nr.3?
Calculus
If P is a probability function and A and B are any sets in B, then
1 P (B ∩ Ac ) = P (B ) − P (A ∩ B )
2 P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
3 if A ⊂ B, then P(A) ≤ P(B)
Calculus
Since P(A ∪ B) ≤ 1, then
P(A ∩ B) ≥ P(A) + P(B) − 1
This is a special case of the Bonferroni’s Inequality: to bound the probability of a simultaneous event from the probabilities of the individual events.
Calculus
If P is a probability function, then
1 P(A)=∞i=1P(A∩Ci)foranypartitionC1,C2,…ofS 2 P(∞i=1 Ai ) ≤ ∞i=1 P(Ai ) for any sets A1, A2, …
This property is called Boole’s Inequality, a general version of the Bonferroni’s Inequality.
Counting
Methods of counting are used to construct probability assignments on finite sample spaces.
Fundamental Theorem of Counting
If a job consists of k separate tasks, the i − th of which can be done in ni ways, i = 1, …, k, then the entire job can be done in n1 ×n2 ×…×nk ways.
Counting
Let’s assume that one protein domain is comprised of 6 amino acids. To be able to calculate the probability of having exactly one sequence we first must count how many different combinations of 6 amino acids can be observed.
Is that enough information to calculate such probabilities? What else do we need?
Counting
without replacement with replacement
ordered unordered
Ordered, without replacement
The first amino acid can be selected in … ways, the second in …
Ordered, without replacement
The first amino acid can be selected in … ways, the second in …
For a positive integer n, n! (read n factorial) is the product of all the positive integers less than or equal to n, that is, n!=n×(n−1)×(n−2)×…×2×1. Wedefine0!=1.
Ordered, with replacement
All amino acids can be selected in … ways …
Unordered, without replacement
Since we know the number of ways when the ordering must be taken into account, …, we need to … this number by all the possible ways that 6 amino acid can be ordered, which is …
Unordered, without replacement
Since we know the number of ways when the ordering must be taken into account, …, we need to … this number by all the possible ways that 6 amino acid can be ordered, which is …
For nonnegative integers n and r, where n ≥ r, we define the symbol n, read n choose r, as
r
n n!
r =r!(n−r)!
These numbers are also called binomial coefficients.
Unordered, with replacement
Imagine to assign 6 amino acids on the 20 possible values. We can reduce the ways of counting noting that we just need to keep track of the arrangement and that the two boundaries of the 22 interval bounds (since we can 20 bins) are not relevant. Therefore, we have
25! ways. 6!19!
n+r −1 r
jupyter notebook: fill in the table of counting methods.
Enumerating outcomes
Suppose S = {s1, …, sn} is a finite sample space. If all the outcomes are equally likely, then P({si}) = 1/N for every outcome si.
If so,
P (A) = P ({si }) = 1 = nr. elements in A si∈A si∈A N nr. elements in S
Conditional probability
If A and B are events in S, and P(B) > 0, then the conditional probability of A given B, written P(A|B), is
P(A|B) = P(A ∩ B) P(B)
Conditional probability
If A and B are events in S, and P(B) > 0, then the conditional probability of A given B, written P(A|B), is
P(A|B) = P(A ∩ B) P(B)
The original sample space S has been updated to B, so that P(B|B) = 1.
IfAandB aredisjoints,thenP(A∩B)=0and
P(A|B) = P(B|A) = 0.
Independence
Two events, A and B, are statistically independent if P(A ∩ B) = P(A)P(B)
If the occurrence of B has no effect on the probability of A, then P(A|B) = P(A).
Independence
If A and B are independent events, then the following pairs are also independent:
A and Bc Ac and B Ac and Bc
Independence
A collection of events A1, A2, …, An are mutually independent if for any subcollection Ai1 , …, Aik , we have
kk
P( Aij ) = P(Aij )
j=1 j=1
Simultaneous independence requires a strong definition.
Random variables
Assume we are interested in the GC content of a nucleotidic sequence {A,C,G,T} of 50 base pairs.
If we record ”1” for a GC base and ”0” otherwise (A or T), what is the sample space for this experiment?
Random variables
Assume we are interested in the GC content of a nucleotidic sequence {A,C,G,T} of 50 base pairs.
If we record ”1” for a GC base and ”0” otherwise (A or T), what is the sample space for this experiment?
S has 250 elements!
Random variables
Assume we are interested in the GC content of a nucleotidic sequence {A,C,G,T} of 50 base pairs. We record ”1” for a GC base and ”0” otherwise (A or T),
If we define X = number of 1s recorded out of 50, what is the sample space for X?
Random variables
Assume we are interested in the GC content of a nucleotidic sequence {A,C,G,T} of 50 base pairs. We record ”1” for a GC base and ”0” otherwise (A or T),
If we define X = number of 1s recorded out of 50, what is the sample space for X?
S for X is the set of integers {0, 1, 2, …, 50}!
Random variables
When we define X, we created a mapping (a function) from the original sample space to a new sample space.
A random variable is a function from a sample space S into the real numbers.
In defining a random variable, we also define a new sample space. Can our probability function on the original sample space be used for the random variable?
Random variables
Suppose we have a sample space S = {s1, …, sn} with a probability function P and we define a random variable X with range
X = {x1, …, xm}.
PX(X =xi)=P({sj ∈S :X(sj)=xi}) If X is uncountable?
Random variables
Suppose we have a sample space S = {s1, …, sn} with a probability function P and we define a random variable X with range
X = {x1, …, xm}.
PX(X =xi)=P({sj ∈S :X(sj)=xi}) If X is uncountable?
PX(X ∈A)=P({s ∈S :X(s)∈A})
Distribution functions
With every random variable X, we associate a function called the cumulative distribution function of X.
The cumulative distribution function (or cdf) of a random variable X, denoted by FX(x) is defined by
FX (x) = PX (X ≤ x), for all x
Experiment: roll a die, what is FX (x)?
Cumulative distribution function
jupyter notebook:
Consider the experiment of sampling three nucleotidic bases and let X be the number of GC bases observed. What is the cdf of X?
Cumulative distribution function
jupyter notebook:
Consider the experiment of sampling three nucleotidic bases and let X be the number of GC bases observed. What is the cdf of X?
FX(x)=0,if −∞