CMPUT 397 Reinforcement Learning:
Probabilities & Expectations
Rupam Mahmood January 10, 2020
R&L AI
Probabilities and intelligent systems
Probability is a measure of uncertainty
An intelligent system maximizes its “chances” of success
Intelligent systems create a favorable future
Probabilities and expectations are tools for reasoning about uncertain future events
Let’s take the example of rolling a die
We say the probability of observing 3 is 1/6
How do we express this mathematically?
Rolling a die is an experiment: a repeatable process with different possible results (outcomes)
One outcome is 3. Outcomes are mutually exclusive
The set of all outcomes is called the sample space: { 1, 2, 3, 4, 5, 6 }
An event is a set of outcomes. The event of observing 4 or more: {4, 5, 6}
Define P as a function mapping from events to probabilities: P(3) = 1/6
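The definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming a fair die (every outcome equally likely); the names `SAMPLE_SPACE` and `prob` are hypothetical and not from the slides. Note that `prob` takes a set, so P(3) is read as shorthand for the event {3}.

```python
from fractions import Fraction

# Sample space of a six-sided die. Fairness is an assumption of this sketch.
SAMPLE_SPACE = frozenset({1, 2, 3, 4, 5, 6})

def prob(event):
    """Map an event (a set of outcomes) to its probability."""
    event = frozenset(event)
    if not event <= SAMPLE_SPACE:
        raise ValueError("an event must be a subset of the sample space")
    # With equally likely outcomes, P(event) = |event| / |sample space|.
    return Fraction(len(event), len(SAMPLE_SPACE))

p_three = prob({3})                # event of observing 3
p_four_or_more = prob({4, 5, 6})   # event of observing 4 or more
print(p_three, p_four_or_more)
```

Using `Fraction` keeps the results exact, so they can be compared directly with the hand-computed values.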
Probability axioms
Non-negativity: A probability is always non-negative
0 ≤ P(A), for all A
Additivity: If A ∩ B = {}, then P(A ∪ B) = P(A) + P(B)
Unit measure: P(Ω) = 1, where Ω is the sample space
What is the probability of observing 4 or more?
P({4, 5, 6}) = P(4) + P(5) + P(6) = 3/6 = 1/2
what kind of object is this?
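As a sanity check, the three axioms can be verified by brute force for the fair-die model. The sketch below assumes equally likely outcomes; `OMEGA`, `prob`, and `all_events` are hypothetical helper names introduced only for illustration.

```python
from fractions import Fraction
from itertools import chain, combinations

OMEGA = frozenset(range(1, 7))  # sample space of a fair die (assumption)

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(frozenset(event) & OMEGA), len(OMEGA))

def all_events(omega):
    """Enumerate every subset of the sample space, i.e. every event."""
    items = sorted(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

events = all_events(OMEGA)

# Non-negativity: 0 <= P(A) for all A.
non_negative = all(prob(a) >= 0 for a in events)
# Additivity: P(A u B) = P(A) + P(B) whenever A and B are disjoint.
additive = all(
    prob(a | b) == prob(a) + prob(b)
    for a in events for b in events if not (a & b)
)
# Unit measure: P(Omega) = 1.
unit_measure = prob(OMEGA) == 1

print(non_negative, additive, unit_measure)
```

The additivity check is exactly what justifies splitting P({4, 5, 6}) into a sum over the single-outcome events {4}, {5}, and {6}.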
Random variables
Random variables are a convenient way to express events
A random variable is a function mapping from outcomes to real values
For the coin-tossing experiment: it can be X(head) = 1 and X(tail) = -1
For outcomes of dice-rolling experiment: X(a) = a
It allows succinct expressions for events such as [X ≥ 4]
which stands for { ω ∈ Ω: X(ω) ≥ 4 } = { 4, 5, 6 }
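The bracket notation can be illustrated directly: a random variable is an ordinary function on outcomes, and [X ≥ 4] is the set of outcomes that the function maps to a value of at least 4. In this sketch the helper `event` and the dictionary `X_COIN` are hypothetical names, not part of the slides.

```python
# Sample space of the dice-rolling experiment.
OMEGA = {1, 2, 3, 4, 5, 6}

def X(outcome):
    """Random variable for the die: X(a) = a."""
    return outcome

# Random variable for the coin-tossing experiment, written as a lookup table.
X_COIN = {"head": 1, "tail": -1}

def event(rv, predicate, omega):
    """Build the event { w in omega : predicate(rv(w)) }."""
    return {w for w in omega if predicate(rv(w))}

# [X >= 4] stands for { w in Omega : X(w) >= 4 }.
at_least_four = event(X, lambda value: value >= 4, OMEGA)
print(sorted(at_least_four))
```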
Random variables: example
If we roll two dice, what is the probability of the sum being more than 2?
Sample space: { (1,1), …, (1,6), (2,1), …, (2,6), …, (6,1), …, (6,6) }
We can define a random variable X standing for the sum
Then the event of “the sum being more than 2” can be written as [X > 2]
Then 1 = P(Ω) = P([X = 2] ∪ [X > 2]) = P(X = 2) + P(X > 2)
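The decomposition above can be checked by enumerating the 36 outcomes. This is a sketch assuming two fair, independent dice, so that every ordered pair is equally likely; `OMEGA`, `X`, and `prob` are hypothetical names.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first die, second die), assumed equally likely.
OMEGA = list(product(range(1, 7), repeat=2))

def X(outcome):
    """Random variable standing for the sum of the two dice."""
    return outcome[0] + outcome[1]

def prob(predicate):
    """P([predicate(X)]) under equally likely outcomes."""
    favourable = sum(1 for w in OMEGA if predicate(X(w)))
    return Fraction(favourable, len(OMEGA))

p_eq_2 = prob(lambda s: s == 2)   # only (1, 1) gives a sum of 2
p_gt_2 = prob(lambda s: s > 2)
print(p_eq_2, p_gt_2, p_eq_2 + p_gt_2)
```

Since the smallest possible sum is 2, the events [X = 2] and [X > 2] are disjoint and together cover Ω, which is why P(X > 2) = 1 − P(X = 2).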
Conditional probabilities
A conditional probability is a measure of an uncertain event when we know that another event has occurred
In the single die-rolling experiment, if the value is below 4, what is the probability that the value is more than 2?
Definition: P(A | B) = P(A ∩ B) / P(B), defined when P(B) > 0; in general P(A | B) ≠ P(A)
[Figure: Venn diagrams of events A and B within the sample space S]
Conditional probabilities: example
In the single die-rolling experiment, if the value is below 4, what is the probability that the value is more than 2?
P( [Z > 2] | [Z < 4] )
= P( [Z > 2] ∩ [Z < 4] ) / P( [Z < 4] ) = P( [Z = 3] ) / P( [Z < 4] )
= (1/6) / (1/2) = 1/3
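The same calculation can be reproduced numerically from the definition P(A | B) = P(A ∩ B) / P(B). The sketch assumes a fair die, with Z the value shown; `prob` and `cond_prob` are hypothetical helper names.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # fair die (assumption)

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_prob(a, b):
    """P(A | B) = P(A n B) / P(B); only defined when P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(a & b) / prob(b)

z_gt_2 = frozenset(w for w in OMEGA if w > 2)  # [Z > 2] = {3, 4, 5, 6}
z_lt_4 = frozenset(w for w in OMEGA if w < 4)  # [Z < 4] = {1, 2, 3}

answer = cond_prob(z_gt_2, z_lt_4)
print(answer)
```

Here P([Z > 2]) on its own is 2/3, so conditioning on [Z < 4] changes the probability.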
Law of total probability
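[Figure: event B overlapping a partition A1, A2, A3 of the sample space, with the regions B ∩ A1, B ∩ A2, and B ∩ A3 marked]
Ai ∩ Aj = ∅ for i ≠ j, and ∪i Ai = Ω
P(B) = ∑k P(B ∩ Ak)
= ∑k P(B | Ak) P(Ak)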
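Both forms of the law can be checked on the die example. The partition {1, 2}, {3, 4}, {5, 6} and the event B = {4, 5, 6} below are chosen only for illustration and are not from the slides; a fair die is assumed.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # fair die (assumption)

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_prob(a, b):
    """P(A | B) = P(A n B) / P(B)."""
    return prob(a & b) / prob(b)

# Hypothetical partition A1, A2, A3: pairwise disjoint, union is Omega.
partition = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]
B = frozenset({4, 5, 6})

# P(B) = sum_k P(B n A_k)
via_intersections = sum(prob(B & a) for a in partition)
# P(B) = sum_k P(B | A_k) P(A_k)
via_conditionals = sum(cond_prob(B, a) * prob(a) for a in partition)

print(prob(B), via_intersections, via_conditionals)
```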
Expectations & conditional expectations
An expected value of a random variable is a weighted average of possible outcomes, where the weights are the probabilities of those outcomes
E[X] = ∑_{x ∈ 𝓧} x P(X=x)
An expected value of a random variable conditional on another event is a weighted average of possible outcomes, where the weights are the conditional probabilities of those outcomes given the event
E[X | Y=y] = ∑_{x ∈ 𝓧} x P(X=x | Y=y)
The expectation conditional on a random variable, E[X | Y], is itself a random variable: it is a function of the random variable Y
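Both weighted averages can be computed by enumeration. In this sketch X is the value of a fair die and Y is a hypothetical indicator of whether that value is even; neither the choice of Y nor the helper names come from the slides.

```python
from fractions import Fraction

OMEGA = range(1, 7)  # fair die (assumption)

def X(w):
    """Value shown by the die."""
    return w

def Y(w):
    """Hypothetical second random variable: 1 if the value is even, else 0."""
    return 1 if w % 2 == 0 else 0

def prob(predicate):
    """Probability of { w : predicate(w) } under equally likely outcomes."""
    return Fraction(sum(1 for w in OMEGA if predicate(w)), len(OMEGA))

def expectation(rv):
    """E[rv] = sum over values x of x * P(rv = x)."""
    values = {rv(w) for w in OMEGA}
    return sum(x * prob(lambda w: rv(w) == x) for x in values)

def cond_expectation(rv, given, y):
    """E[rv | given = y] = sum over x of x * P(rv = x | given = y)."""
    p_y = prob(lambda w: given(w) == y)
    values = {rv(w) for w in OMEGA}
    return sum(
        x * prob(lambda w: rv(w) == x and given(w) == y) / p_y
        for x in values
    )

e_x = expectation(X)
e_x_given_even = cond_expectation(X, Y, 1)
e_x_given_odd = cond_expectation(X, Y, 0)
print(e_x, e_x_given_even, e_x_given_odd)
```

Because `cond_expectation` returns a different number for each value y, the map y ↦ E[X | Y=y] applied to Y is what makes E[X | Y] a random variable.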
Properties of expectations
Linearity: E[X + Y] = E[X] + E[Y]
Linearity: E[aX] = aE[X]
Non-multiplicativity: in general, E[XY] ≠ E[X] E[Y]
Law of the unconscious statistician: E[g(X)] = ∑_{x ∈ 𝓧} g(x) P(X=x)
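These properties can be checked on a small example. The sketch assumes X is the value of a fair die and, for the sum and product checks, takes Y = X² and Y = X respectively (illustrative choices, not from the slides). Every expectation is computed with the law of the unconscious statistician, i.e. as ∑ g(x) P(X=x).

```python
from fractions import Fraction

# Probability mass function of a fair die (assumption).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expect(g=lambda x: x):
    """E[g(X)] = sum over x of g(x) * P(X = x)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect()                          # E[X]
second_moment = expect(lambda x: x * x)  # E[X^2]

# Linearity: E[X + Y] = E[X] + E[Y], here with Y = X^2.
sum_ok = expect(lambda x: x + x * x) == mean + second_moment
# Linearity: E[aX] = a E[X], here with a = 3.
scale_ok = expect(lambda x: 3 * x) == 3 * mean
# Non-multiplicativity: with Y = X, E[XY] = E[X^2] differs from E[X] E[Y].
product_differs = second_moment != mean * mean

print(mean, second_moment, sum_ok, scale_ok, product_differs)
```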
Expectations: example
In the double dice-rolling experiment, what is the expected value of the sum of the two dice?
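One way to approach this question is to compute the expectation twice: by direct enumeration over the 36 outcomes, and by linearity as E[first die] + E[second die]. The sketch assumes two fair dice with equally likely ordered pairs; the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first die, second die), assumed equally likely.
OMEGA = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(OMEGA))

# Direct enumeration: sum over outcomes of (value of the sum) * probability.
expected_sum = sum((a + b) * p for a, b in OMEGA)

# Linearity: E[sum] = E[first] + E[second], each the mean of a single die.
expected_single = sum(Fraction(a, 6) for a in range(1, 7))
by_linearity = expected_single + expected_single

print(expected_sum, by_linearity)
```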
Expectations: example
Show that E[X ] = E[ E[X | Y ] ]
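One possible derivation is sketched below, assuming X and Y are discrete with finitely many values; the symbol 𝒴 for the set of values of Y is introduced here and does not appear in the slides. The first step treats E[X | Y] as a function of Y and applies the law of the unconscious statistician, the second expands the conditional expectation, the third uses P(X=x | Y=y) P(Y=y) = P(X=x, Y=y), and the last uses the law of total probability.

```latex
\begin{align*}
\mathbb{E}\big[\mathbb{E}[X \mid Y]\big]
  &= \sum_{y \in \mathcal{Y}} \mathbb{E}[X \mid Y = y]\, P(Y = y) \\
  &= \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} x\, P(X = x \mid Y = y)\, P(Y = y) \\
  &= \sum_{x \in \mathcal{X}} x \sum_{y \in \mathcal{Y}} P(X = x,\, Y = y) \\
  &= \sum_{x \in \mathcal{X}} x\, P(X = x) \;=\; \mathbb{E}[X]
\end{align*}
```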