CS570 Biomedical Science & Health IT
CS544 D1
Foundations of Analytics
Lecture 4
Guanglan Zhang
Counting results for different sampling methods.
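Assume we have a set with n elements and want to draw k samples from it. The total number of ways to do this depends on the sampling method:
Ordered sampling with replacement: n^k
Ordered sampling without replacement: P(n,k) = n!/(n−k)!
Unordered sampling without replacement: C(n,k) = n!/(k!(n−k)!)
Unordered sampling with replacement: C(n+k−1,k)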
https://www.probabilitycourse.com/chapter2/2_1_4_unordered_with_replacement.php
C(n,k) is read “n choose k.”
n! = n⋅(n−1)⋅…⋅2⋅1 is read “n factorial.”
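The four standard counting cases (ordered or unordered, with or without replacement) can be checked numerically in R. The following is a minimal sketch using the base functions factorial and choose; the values n = 5 and k = 3 and the variable names are illustrative assumptions, not taken from the lecture.

```r
# Counting the ways to draw k samples from n elements (illustrative values).
n <- 5
k <- 3

ordered.with      <- n^k                              # ordered, with replacement
ordered.without   <- factorial(n) / factorial(n - k)  # ordered, without replacement
unordered.without <- choose(n, k)                     # unordered, without replacement ("n choose k")
unordered.with    <- choose(n + k - 1, k)             # unordered, with replacement

c(ordered.with, ordered.without, unordered.without, unordered.with)
# 125 60 10 35
```

Dropping the ordering shrinks the count, and allowing replacement grows it, so the smallest count (10) is the unordered case without replacement.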
Probability
P(A or B) = P(A∪B) = P(A) + P(B) when the events A and B are mutually exclusive
If the events A and B are not mutually exclusive, adding the probabilities of the two events will add the common outcomes (A&B) twice.
P(A or B) = P(A∪B) = P(A)+P(B)–P(A&B) = P(A)+P(B)–P(A∩B), where A and B need not be mutually exclusive.
For any event E, the complement of the event is represented by not E.
P(not E)=1–P(E)
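The addition rule and the complement rule can be verified on a small sample space. The sketch below assumes one roll of a fair die, with the events A (even number) and B (greater than 3) chosen purely for illustration; prob is a hypothetical helper that counts equally likely outcomes.

```r
# Sample space for one roll of a fair die (hypothetical example).
S <- 1:6
A <- c(2, 4, 6)   # the roll is even
B <- c(4, 5, 6)   # the roll is greater than 3

# Probability of an event when all outcomes in S are equally likely.
prob <- function(E) length(E) / length(S)

p.union    <- prob(union(A, B))                         # direct count of A or B
p.addition <- prob(A) + prob(B) - prob(intersect(A, B)) # general addition rule
p.not.A    <- prob(setdiff(S, A))                       # complement of A

all.equal(p.union, p.addition)    # TRUE
all.equal(p.not.A, 1 - prob(A))   # TRUE
```

Here A and B share the outcomes 4 and 6, so simply adding P(A) and P(B) would count those two outcomes twice.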
Conditional Probability
The conditional probability, P(B|A), is the probability that event B occurs given that event A occurs. It is read as the probability of B given A. Event A is called the given event.
If A and B are any two events, and P(A)>0, then the conditional probability rule applies:
P(B|A)=P(A∩B)/P(A)
The following properties apply to conditional probabilities as well. For the sample space S and any fixed event A, with P(A)>0,
P(B|A)≥0, for all events B⊂S
P(S|A)=1
If B1, B2, …, Bk are disjoint events, then P(B1 ∪ B2 ∪…∪ Bk |A)=P(B1 |A)+P(B2 |A)+…+P(Bk |A)
P(Bc|A)=1–P(B|A), where Bc is the complement of the event B
For any events A,B, and C, if B⊂C , then P(B|A)≤P(C|A)
P(B ∪ C|A)=P(B|A)+P(C|A)–P(B∩C|A) , where B and C are not disjoint.
The multiplication rule is derived from the conditional probability rule as follows:
P(A∩B)=P(A and B)=P(A)⋅P(B|A)
Conditional Probability Example – Card Deck
Consider the standard full deck of 52 playing cards. Suppose two cards are selected from the deck in sequence, without replacement, and define the events
A = {first card drawn is an ace} and
B = {second card drawn is an ace}.
There are C(52,2)=52⋅51/2=1326 unordered pairs of cards that can be drawn.
For the first card, there are 4 aces, hence P(A)=4/52
The probability for the second card being an ace depends on whether the first card is an ace or not.
If the first card is an ace, then the probability of the second card also being an ace is 3/51.
If the first card is not an ace, then the probability of the second card being an ace is 4/51.
P(B|A)=3/51, and P(B|Ac)=4/51
By the multiplication law, the probability of both cards being aces is
P(A∩B)=P(A)⋅P(B|A)=(4/52)⋅(3/51)≈0.00452
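The card-deck result can be cross-checked by enumerating every ordered draw of two distinct cards. This is a sketch under an arbitrary labeling assumption: the cards are numbered 1 to 52 and cards 1 to 4 are treated as the aces. The variable names are illustrative.

```r
# Label the 52 cards 1..52 and treat cards 1-4 as the aces (arbitrary labeling).
deck   <- 1:52
is.ace <- deck <= 4

# All ordered draws of two distinct cards: 52 * 51 = 2652 rows.
pairs <- expand.grid(first = deck, second = deck)
pairs <- pairs[pairs$first != pairs$second, ]

A <- is.ace[pairs$first]    # first card drawn is an ace
B <- is.ace[pairs$second]   # second card drawn is an ace

p.A         <- mean(A)               # P(A) = 4/52
p.B.given.A <- sum(A & B) / sum(A)   # P(B|A) = 3/51
p.both      <- mean(A & B)           # P(A and B)

round(p.both, 5)   # 0.00452
```

The enumeration uses ordered draws (2652 of them) rather than the 1326 unordered pairs because the events A and B refer to the first and second card specifically.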
Conditional Probability Example – Red and blue balls
Consider a box with 3 red balls and 2 blue balls inside it. Two balls are selected in succession without replacement, and we want to find the probability that both balls selected are red.
Let A be the event that the first ball is red, and B be the event that the second ball is also red.
A = {first ball is red}, B = {second ball is red}
P(A)=3/5, and P(B|A)=2/4
Probability that both the balls are red
P(A∩B)=P(A)⋅P(B|A)=(3/5)⋅(2/4)=3/10=0.3
Similarly, the probability that both balls are blue is (2/5)⋅(1/4)=0.1.
The probability that the first ball is red and the second ball is blue is (3/5)⋅(2/4)=0.3.
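The three probabilities above can be confirmed by listing all ordered ways to draw two different balls. The sketch below assumes the balls are distinguishable by position in a vector; the names balls, idx, and the p.* variables are illustrative.

```r
# Three red and two blue balls, identified by their position in the vector.
balls <- c("red", "red", "red", "blue", "blue")

# All ordered draws of two different balls: 5 * 4 = 20 equally likely rows.
idx <- expand.grid(first = 1:5, second = 1:5)
idx <- idx[idx$first != idx$second, ]

first  <- balls[idx$first]
second <- balls[idx$second]

p.red.red   <- mean(first == "red"  & second == "red")    # 6/20 = 0.3
p.blue.blue <- mean(first == "blue" & second == "blue")   # 2/20 = 0.1
p.red.blue  <- mean(first == "red"  & second == "blue")   # 6/20 = 0.3

c(p.red.red, p.blue.blue, p.red.blue)
```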
Independent Events
Two events A and B are said to be independent if P(A∩B)=P(A)⋅P(B)
If the events A and B are independent, the occurrence of the event A has no effect on the probability of the event B, i.e., P(B|A)=P(B) when P(A)>0.
If the events A and B are independent, then:
A and Bc are independent
Ac and B are independent
Ac and Bc are independent
Three events A, B, and C are mutually independent if and only if:
P(A∩B)=P(A)⋅P(B)
P(B∩C)=P(B)⋅P(C)
P(A∩C)=P(A)⋅P(C)
P(A∩B∩C)=P(A)⋅P(B)⋅P(C)
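The fourth condition is not implied by the first three. As an illustration (an example chosen here, not one from the lecture), take two tosses of a fair coin with A = first toss is a head, B = second toss is a head, and C = both tosses match. The sketch below checks each condition; prob and is.indep are hypothetical helpers.

```r
# Two tosses of a fair coin: four equally likely outcomes.
S <- c("HH", "HT", "TH", "TT")
prob <- function(E) length(E) / length(S)

A <- c("HH", "HT")   # first toss is a head
B <- c("HH", "TH")   # second toss is a head
C <- c("HH", "TT")   # both tosses show the same face

# TRUE when P(E and F) equals P(E) * P(F).
is.indep <- function(E, F) {
  isTRUE(all.equal(prob(intersect(E, F)), prob(E) * prob(F)))
}

pairwise <- c(is.indep(A, B), is.indep(B, C), is.indep(A, C))

# Triple condition: P(A and B and C) = 1/4, but P(A) * P(B) * P(C) = 1/8.
p.all  <- prob(Reduce(intersect, list(A, B, C)))
triple <- isTRUE(all.equal(p.all, prob(A) * prob(B) * prob(C)))

pairwise   # TRUE TRUE TRUE
triple     # FALSE
```

The events are pairwise independent but not mutually independent, which is why all four equalities must be checked.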
Independent Events Example – Coin Toss
Consider tossing a fair coin 5 times. Let A1,A2,…,A5 be the events that the corresponding coin toss is a head. In this case, P(A1)=1/2 and P(A1c)=1/2, etc.
Since the events are mutually independent, the probability of all tosses being tails is:
P(A1c∩A2c∩…∩A5c)=P(A1c)⋅P(A2c)⋅…⋅P(A5c)=(1/2)^5=1/32
The probability of having at least one head in the 5 coin tosses is then:
1–P(A1c∩A2c∩…∩A5c)=1–1/32=31/32
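Both quantities are quick to compute in R, and the complement result can be cross-checked by listing all 2^5 outcomes. This is a minimal sketch assuming a fair coin; the variable names are illustrative.

```r
n      <- 5     # number of tosses
p.tail <- 1/2   # probability of a tail on any single toss (fair coin)

# Independence lets the five tail probabilities be multiplied.
p.all.tails         <- p.tail^n         # 1/32 = 0.03125
p.at.least.one.head <- 1 - p.all.tails  # 31/32 = 0.96875

# Cross-check: enumerate all 32 equally likely outcomes.
tosses <- expand.grid(rep(list(c("H", "T")), n), stringsAsFactors = FALSE)
p.enumerated <- mean(rowSums(tosses == "H") >= 1)

c(p.all.tails, p.at.least.one.head, p.enumerated)
# 0.03125 0.96875 0.96875
```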
Bayes’ Rule
Bayes’ rule is used for revising probabilities with newly acquired information.
Suppose A1,A2,…,Ak are mutually exclusive and exhaustive events, i.e., exactly one of the events must occur. Then, for any event B, the events (A1∩B),(A2∩B),…,(Ak∩B) are mutually exclusive, and P(B)=P(A1∩B)+P(A2∩B)+…+P(Ak∩B)
Using the multiplication rule: P(B)=P(A1)⋅P(B|A1)+P(A2)⋅P(B|A2)+…+P(Ak)⋅P(B|Ak)
In other words, P(B) is the sum of P(Aj)⋅P(B|Aj) over j=1,2,…,k.
For Bayes’ rule, we assume that the probabilities P(A1), P(A2),…, P(Ak) are known from the given data, and that the conditional probabilities P(B|A1),P(B|A2),…,P(B|Ak) are also known. Bayes’ rule is concerned with the computation of the probabilities P(A1|B),P(A2|B),…,P(Ak|B).
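For any i, the probabilities P(Ai|B) are computed as follows:
P(Ai|B) = P(Ai∩B)/P(B) = P(Ai)⋅P(B|Ai)/P(B) = P(Ai)⋅P(B|Ai) / (P(A1)⋅P(B|A1)+P(A2)⋅P(B|A2)+…+P(Ak)⋅P(B|Ak))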
In Bayes’ rule, the probabilities P(Ai) are known as prior probabilities and the probabilities P(Ai|B) are known as posterior probabilities. The probabilities P(B|Ai) are the likelihood probabilities.
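Bayes’ rule translates directly into vectorized R code. The sketch below defines bayes.posterior, a hypothetical helper (it is not part of base R), and applies it to made-up priors and likelihoods for three mutually exclusive and exhaustive events.

```r
# bayes.posterior is a hypothetical helper, not a base R function.
# prior:      vector of P(A1), ..., P(Ak); must sum to 1
# likelihood: vector of P(B|A1), ..., P(B|Ak)
bayes.posterior <- function(prior, likelihood) {
  stopifnot(length(prior) == length(likelihood),
            isTRUE(all.equal(sum(prior), 1)))
  joint <- prior * likelihood   # P(Ai) * P(B|Ai) for each i
  joint / sum(joint)            # divide by P(B), the sum of the joint terms
}

# Illustrative (made-up) numbers for k = 3 events.
post <- bayes.posterior(prior      = c(0.5, 0.3, 0.2),
                        likelihood = c(0.1, 0.2, 0.4))
post
sum(post)   # posterior probabilities sum to 1
```

Because the denominator is the same for every i, the posteriors are just the joint terms rescaled to sum to 1.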
Bayes’ Rule Example
Suppose 7% of the population has lung disease. Among those having lung disease, 90% are smokers. Of those not having lung disease, 25% are smokers.
We want to use Bayes’ rule to determine the probability that a randomly selected smoker has lung disease.
Let A1 be the event that the person selected has lung disease and A2 be the event that the person selected has no lung disease. The two events A1 and A2 are mutually exclusive and exhaustive.
From the given data, the prior probabilities are P(A1)=0.07 and P(A2)=1−P(A1)=0.93
Let B be the event that the person selected is a smoker. From the given data, the likelihood probabilities are:
P(B|A1)=0.9 and P(B|A2)=0.25.
Bayes’ Rule Example (continued)
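The probability that a randomly selected smoker has lung disease is:
P(A1|B)=P(A1)⋅P(B|A1)/(P(A1)⋅P(B|A1)+P(A2)⋅P(B|A2))=(0.07⋅0.9)/(0.07⋅0.9+0.93⋅0.25)=0.063/0.2955≈0.2132
The probability that a randomly selected smoker does not have lung disease is:
P(A2|B)=P(A2)⋅P(B|A2)/(P(A1)⋅P(B|A1)+P(A2)⋅P(B|A2))=(0.93⋅0.25)/0.2955=0.2325/0.2955≈0.7868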
Note that the two posterior probabilities sum to 1.
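The lung disease example can be reproduced in a few lines of R. This sketch uses the prior and likelihood values given in the example; the vector names and element labels are illustrative.

```r
prior      <- c(disease = 0.07, no.disease = 0.93)   # P(A1), P(A2)
likelihood <- c(disease = 0.90, no.disease = 0.25)   # P(B|A1), P(B|A2)

# Total probability of selecting a smoker: P(B) = 0.2955
p.B <- sum(prior * likelihood)

# Posterior probabilities P(A1|B) and P(A2|B)
posterior <- prior * likelihood / p.B

round(posterior, 4)   # disease 0.2132, no.disease 0.7868
sum(posterior)        # 1
```

Although 90% of people with lung disease smoke, only about 21% of smokers have lung disease, because the prior probability of disease is small.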
Random Variables
A random variable is a function that associates a number with each outcome. So, it is a quantitative variable whose value depends on chance, i.e., which outcome is selected.
Here are some examples of random variables:
The sum of the dice when a pair of dice is rolled
The number of heads when a coin is tossed, say, three times
The number of siblings for the students in a class, etc.
Let S be the sample space for a random experiment E. A random variable X is a function:
X:S→R
that associates exactly one number, x, with each outcome w∈S: X(w)=x
Consider the experiment of flipping a fair coin twice. The set of possible outcomes for this experiment is the sample space S={HH,HT,TH,TT}.
Random Variables (continued)
Let the random variable X be the number of heads. Applied to the outcomes HH, HT, TH, and TT, this function yields the values 2, 1, 1, and 0, respectively.
For the above experiment, the set of all values that the random variable X can take is {0,1,2}, known as the support of X and denoted SX.
The random variable discussed above is an example of a discrete random variable: its support is finite (or, more generally, countable). If instead the support is an interval of real numbers, the random variable is called a continuous random variable.
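The two-flip example can be written out in R to make the idea of a random variable as a function concrete. This is a sketch; the function name X mirrors the notation above, and the string encoding of outcomes is an assumption made for illustration.

```r
# Sample space for flipping a coin twice.
S <- c("HH", "HT", "TH", "TT")

# X maps each outcome to its number of heads.
X <- function(w) {
  sapply(strsplit(w, ""), function(faces) sum(faces == "H"))
}

x <- X(S)
x                             # 2 1 1 0

# Support of X: the distinct values it can take.
support <- sort(unique(x))
support                       # 0 1 2
```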
Functions and Function Arguments
A function takes zero or more arguments and returns a value.
A function with a single argument can be written as:
# Returns its argument incremented by 1.
inc.1 <- function (x) {
  return (x + 1)
}
A function can have an explicit return statement. Otherwise, the value of the last evaluated expression is returned.
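# Two-argument function: returns x minus y.
inc.2 <- function (x, y) {
  return (x - y)
}
inc.2(10, 20)          # positional arguments: returns -10
inc.2(x = 10, y = 20)  # both arguments named: returns -10
inc.2(10, y = 20)      # positional x, named y: returns -10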
When the function is invoked, the parameter names can be explicitly assigned the input values.
If the input values are named, then the inputs can be provided in any order.
The function definitions above have no default values for their arguments: inc.1 expects one argument and inc.2 expects two. If one or more inputs are missing, the function invocation throws an error. The error message states that the argument is missing with no default, so a value must be provided.
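The points about named arguments and missing inputs can be tried out directly. The sketch below repeats the inc.2 definition so that it runs on its own, and adds inc.3, a hypothetical variant with a default value for y that is not part of the lecture code.

```r
inc.2 <- function (x, y) {
  return (x - y)
}

# Named arguments may be given in any order: still 10 - 20.
inc.2(y = 20, x = 10)   # -10

# Omitting y raises an error because y has no default value.
# tryCatch captures the error message instead of stopping the script.
result <- tryCatch(inc.2(10), error = function(e) conditionMessage(e))
result                  # message reporting that argument "y" is missing

# Hypothetical variant: y defaults to 1, so a one-argument call works.
inc.3 <- function (x, y = 1) {
  x - y                 # no explicit return: the last expression is returned
}
inc.3(10)               # 9
inc.3(10, y = 4)        # 6
```

Giving an argument a default is the usual way to make it optional; callers can still override it by position or by name.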