CMPUT 397 Reinforcement Learning:

Probabilities & Expectations
Rupam Mahmood January 10, 2020
R&L AI

Probabilities and intelligent systems
Probability is a measure of uncertainty
An intelligent system maximizes its “chances” of success
Intelligent systems create a favorable future
Probabilities and expectations are tools for reasoning about uncertain future events



Let’s take the example of rolling a die
We say the probability of observing 3 is 1/6
How to express it mathematically?
Rolling a die is an experiment: a repeatable process with different possible results/outcomes
One outcome is 3. Outcomes are mutually exclusive
The set of all outcomes is called a sample space: { 1, 2, 3, 4, 5, 6 }
An event is a set of outcomes. The event of observing 4 or more: {4, 5, 6}
Define P as a function mapping from events to probabilities: P({3}) = 1/6, abbreviated P(3) = 1/6
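
The definitions above can be sketched in a few lines of Python. This is only an illustration: it assumes a fair die (equally likely outcomes), and the names `sample_space` and `prob` are made up for this sketch.

```python
from fractions import Fraction

# Sample space of a single die. Fairness (equally likely outcomes)
# is an assumption of this sketch.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of outcomes) when outcomes are equally likely."""
    event = set(event) & sample_space  # keep only outcomes in the sample space
    return Fraction(len(event), len(sample_space))

p_three = prob({3})               # event of observing 3
p_four_or_more = prob({4, 5, 6})  # event of observing 4 or more
print(p_three, p_four_or_more)
```

Note that `prob` takes a set, not a single outcome, which mirrors P being a function of events.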




Probability axioms
Non-negativity: A probability is always non-negative: 0 ≤ P(A), for all events A
Additivity: If A ∩ B = {}, then P(A ∪ B) = P(A) + P(B)
Unit measure: P(Ω) = 1, where Ω is the sample space
What is the probability of observing 4 or more?
P({4, 5, 6}) = P(4) + P(5) + P(6) = 3/6 = 1/2
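
As a sanity check, the three axioms can be verified by brute force over every event of a single die. This sketch assumes a fair die; `omega`, `prob`, and `events` are illustrative names.

```python
from fractions import Fraction
from itertools import combinations

omega = frozenset({1, 2, 3, 4, 5, 6})  # sample space of one die

def prob(event):
    """Uniform probability of an event (assumes a fair die)."""
    return Fraction(len(frozenset(event) & omega), len(omega))

# Every event is a subset of omega, so enumerate all 2^6 subsets.
events = [frozenset(c)
          for r in range(len(omega) + 1)
          for c in combinations(sorted(omega), r)]

non_negative = all(prob(a) >= 0 for a in events)
unit_measure = prob(omega) == 1
# Additivity is only required for disjoint events (empty intersection).
additive = all(prob(a | b) == prob(a) + prob(b)
               for a in events for b in events if not (a & b))

print(non_negative, unit_measure, additive)
```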

what kind of object is this?

Random variables
Random variables are a convenient way to express events
A random variable is a function mapping from outcomes to real values
For the coin-tossing experiment, it can be X(head) = 1 and X(tail) = -1
For outcomes of the die-rolling experiment: X(a) = a
It allows succinct expressions for events such as [X ≥ 4], which stands for { ω ∈ Ω: X(ω) ≥ 4 } = { 4, 5, 6 }
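
A random variable is just a function on outcomes, and a bracket such as [X ≥ 4] is the set of outcomes where a condition on that function holds. A minimal sketch, with the helper name `event_of` invented for illustration:

```python
die_space = {1, 2, 3, 4, 5, 6}
coin_space = {"head", "tail"}

def die_value(outcome):
    """X(a) = a for the die-rolling experiment."""
    return outcome

def coin_value(outcome):
    """X(head) = 1 and X(tail) = -1 for the coin-tossing experiment."""
    return 1 if outcome == "head" else -1

def event_of(space, rv, condition):
    """The event { w in space : condition(rv(w)) }, written [condition on rv]."""
    return {w for w in space if condition(rv(w))}

at_least_four = event_of(die_space, die_value, lambda v: v >= 4)  # [X >= 4]
coin_is_one = event_of(coin_space, coin_value, lambda v: v == 1)  # [X = 1]
print(sorted(at_least_four), coin_is_one)
```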




Random variables: example
If we roll two dice, what is the probability of the sum being more than 2?
Sample space: { (1,1), …, (1,6), (2,1), …, (2,6), …, (6,1), …, (6,6) }
We can define a random variable X standing for the sum
Then the event of “the sum being more than 2” can be written as [X > 2]
Then 1 = P(Ω) = P([X = 2] ∪ [X > 2]) = P(X = 2) + P(X > 2)
So P(X > 2) = 1 − P(X = 2) = 1 − 1/36 = 35/36
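
The two-dice question can also be answered by enumerating the 36 outcomes. This sketch assumes both dice are fair; `omega`, `X`, and `prob` are illustrative names.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered pairs (first die, second die), assumed equally likely.
omega = list(product(range(1, 7), repeat=2))

def X(outcome):
    """Random variable: the sum of the two dice."""
    return outcome[0] + outcome[1]

def prob(condition):
    """P([condition on X]) by counting outcomes."""
    favourable = sum(1 for w in omega if condition(X(w)))
    return Fraction(favourable, len(omega))

p_eq_2 = prob(lambda s: s == 2)
p_gt_2 = prob(lambda s: s > 2)
# [X = 2] and [X > 2] are disjoint and together cover the sample space.
print(p_eq_2, p_gt_2, p_eq_2 + p_gt_2)
```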

Conditional probabilities
A conditional probability is a measure of an uncertain event when we know that another event has occurred
In the single die-rolling experiment, if the value is below 4, what is the probability that the value is more than 2?
Definition: P(A | B) = P(A ∩ B) / P(B), defined when P(B) > 0
In general, P(A | B) ≠ P(A)
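
The definition translates directly into code. The events used here (A = "4 or more", B = "even") are chosen only for illustration and assume a fair die; they show a case where conditioning on B changes the probability of A.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # single die, assumed fair

def prob(event):
    return Fraction(len(set(event) & omega), len(omega))

def cond_prob(a, b):
    """P(A | B) = P(A ∩ B) / P(B); undefined when P(B) = 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) must be positive to condition on B")
    return prob(set(a) & set(b)) / p_b

A = {4, 5, 6}  # illustrative event: 4 or more
B = {2, 4, 6}  # illustrative event: even value
print(cond_prob(A, B), prob(A))
```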



[Figure: Venn diagrams of events A and B inside the sample space S]

Conditional probabilities: example
In the single die-rolling experiment, if the value is below 4, what is the probability that the value is more than 2?
Let Z be the value of the die. Then
P(Z > 2 | Z < 4) = P([Z > 2] ∩ [Z < 4]) / P([Z < 4]) = P([Z = 3]) / P([Z < 4]) = (1/6) / (1/2) = 1/3

Law of total probability
Let the events A1, A2, A3, … partition the sample space: Ai ∩ Aj = ∅ for i ≠ j, and ∪i Ai = Ω
Then, for any event B: P(B) = ∑k P(B ∩ Ak) = ∑k P(B | Ak) P(Ak)
[Figure: event B overlapping a partition A1, A2, A3 of Ω, split into the pieces B ∩ A1, B ∩ A2, B ∩ A3]
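
The law of total probability can be checked on a small case. The partition and the event B below are arbitrary choices made for this sketch, and the die is assumed fair.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # single die, assumed fair

def prob(event):
    return Fraction(len(set(event) & omega), len(omega))

def cond_prob(a, b):
    """P(A | B) = P(A ∩ B) / P(B), for P(B) > 0."""
    return prob(a & b) / prob(b)

# Illustrative partition: pairwise disjoint, union equals omega.
partition = [{1, 2}, {3, 4}, {5, 6}]
B = {1, 2, 3}  # illustrative event

p_direct = prob(B)
p_by_intersections = sum(prob(B & a_k) for a_k in partition)
p_by_conditionals = sum(cond_prob(B, a_k) * prob(a_k) for a_k in partition)
print(p_direct, p_by_intersections, p_by_conditionals)
```

All three quantities agree, as the formula predicts; one of the pieces, B ∩ A3, is empty and contributes 0.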
 
Expectations & conditional expectations
An expected value of a random variable is a weighted average of possible outcomes, where the weights are the probabilities of those outcomes:
E[X] = ∑_{x∈𝓧} x P(X=x), where 𝓧 is the set of values X can take
An expected value of a random variable conditional on another event is a weighted average of possible outcomes, where the weights are the conditional probabilities of those outcomes given the event:
E[X | Y=y] = ∑_{x∈𝓧} x P(X=x | Y=y)
The expectation conditional on a random variable, E[X | Y], is itself a random variable: it is a function of the random variable Y
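
Both weighted averages can be computed term by term. In this sketch X is the value of a fair die and Y is a hypothetical parity indicator (1 if the value is odd, 0 if even), introduced only to have something to condition on.

```python
from fractions import Fraction

omega = [1, 2, 3, 4, 5, 6]
p = {w: Fraction(1, 6) for w in omega}  # fair die (assumption)

def X(w):
    return w

def Y(w):
    return w % 2  # hypothetical indicator: 1 if odd, 0 if even

def prob(condition):
    """Probability of the event { w : condition(w) }."""
    return sum(p[w] for w in omega if condition(w))

def expectation(rv):
    """E[rv] = sum over values x of x * P(rv = x)."""
    values = {rv(w) for w in omega}
    return sum(x * prob(lambda w: rv(w) == x) for x in values)

def conditional_expectation(rv, given, y):
    """E[rv | given = y] = sum over x of x * P(rv = x | given = y)."""
    p_y = prob(lambda w: given(w) == y)
    values = {rv(w) for w in omega}
    return sum(x * prob(lambda w: rv(w) == x and given(w) == y) / p_y
               for x in values)

e_x = expectation(X)
e_x_even = conditional_expectation(X, Y, 0)
e_x_odd = conditional_expectation(X, Y, 1)
print(e_x, e_x_even, e_x_odd)
```

Because the result depends on which value of Y is plugged in, `conditional_expectation(X, Y, y)` viewed as a function of y is what makes E[X | Y] a random variable.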
Properties of expectations
Linearity: E[X + Y] = E[X] + E[Y]
Linearity: E[aX] = aE[X]
Non-multiplicativity: in general, E[XY] ≠ E[X] E[Y]
Law of the unconscious statistician: E[g(X)] = ∑_{x∈𝓧} g(x) P(X=x)

Expectations: example
In the double die-rolling experiment, what is the expected value of the sum of the two dice?

Expectations: example
Show that E[X] = E[E[X | Y]]
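
The two examples above can be checked numerically for the two-dice experiment. This is a numerical check, not a proof. It assumes fair dice and takes X to be the sum and Y the first die; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes
p = Fraction(1, len(omega))

def expectation(rv):
    # With equally likely outcomes, summing rv(w) * P(w) over outcomes
    # equals summing x * P(rv = x) over values.
    return sum(rv(w) * p for w in omega)

def first(w):
    return w[0]

def second(w):
    return w[1]

def total(w):
    return w[0] + w[1]

# Expected sum, directly and via linearity.
e_total = expectation(total)
e_by_linearity = expectation(first) + expectation(second)

def cond_exp_total(y):
    """E[total | first = y], averaging over outcomes with first die y."""
    matching = [w for w in omega if first(w) == y]
    return Fraction(sum(total(w) for w in matching), len(matching))

def prob_first(y):
    return Fraction(sum(1 for w in omega if first(w) == y), len(omega))

# E[E[X | Y]] = sum over y of E[X | Y = y] * P(Y = y)
e_tower = sum(cond_exp_total(y) * prob_first(y) for y in range(1, 7))

# Non-multiplicativity: first and total are dependent.
e_product = expectation(lambda w: first(w) * total(w))

print(e_total, e_by_linearity, e_tower)
```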