Making Simple Decisions
CSci 5512: Artificial Intelligence II
Instructor:
February 17, 2022
Announcements
HW2 due Feb 24
Preferences
A lottery is a situation with uncertain prizes
Lottery L = [p, A; (1 − p), B]
A ≻ B : A is preferred to B
A ∼ B : indifference between A and B
A ≽ B : B is not preferred to A
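As a concrete aside (not from the original slides), the lottery notation maps directly onto a small data structure. A minimal Python sketch, with all names invented for illustration:

    # A lottery [p, A; (1 - p), B] as a list of (probability, prize) pairs.
    # Prizes may themselves be lotteries, giving compound lotteries.
    def binary_lottery(p, a, b):
        return [(p, a), (1.0 - p, b)]

    L = binary_lottery(0.3, "A", "B")               # L = [0.3, A; 0.7, B]
    assert abs(sum(p for p, _ in L) - 1.0) < 1e-12  # probabilities sum to 1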
Rational Preferences
Preferences of a rational agent must obey the following constraints
Orderability: (A ≻ B) ∨ (B ≻ A) ∨ (A ∼ B)
Transitivity: (A ≻ B) ∧ (B ≻ C) =⇒ (A ≻ C)
Continuity: A ≻ B ≻ C =⇒ ∃p [p, A; 1 − p, C] ∼ B
Substitutability: A ∼ B =⇒ [p, A; 1 − p, C] ∼ [p, B; 1 − p, C]
Monotonicity: A ≻ B =⇒ (p ≥ q ⇐⇒ [p, A; 1 − p, B] ≽ [q, A; 1 − q, B])
Decomposability: [p, A; 1 − p, [q, B; 1 − q, C]] ∼ [p, A; (1 − p)q, B; (1 − p)(1 − q), C] (compound lotteries flatten; see the sketch below)
Violating the constraints leads to self-evident irrationality
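Decomposability is the one axiom that is purely mechanical: a compound lottery flattens into an equivalent simple lottery by multiplying probabilities. A minimal sketch, reusing the (probability, prize) list representation from the previous slide (function names are illustrative):

    def flatten(lottery):
        # Flatten nested lotteries into a single list of (prob, prize) pairs.
        flat = []
        for p, outcome in lottery:
            if isinstance(outcome, list):    # the prize is itself a lottery
                flat.extend((p * q, prize) for q, prize in flatten(outcome))
            else:
                flat.append((p, outcome))
        return flat

    # [p, A; 1-p, [q, B; 1-q, C]]  ~  [p, A; (1-p)q, B; (1-p)(1-q), C]
    p, q = 0.4, 0.7
    compound = [(p, "A"), (1 - p, [(q, "B"), (1 - q, "C")])]
    print(flatten(compound))  # [(0.4, 'A'), (0.42, 'B'), (0.18, 'C')], up to rounding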
Rational Preferences (cont.)
Example of irrationality when violating transitivity
Suppose agent has A ≻ B ≻ C ≻ A
If agent has A then offer to trade C for A + $1, which the agent will accept (since C ≻ A)
Then offer to trade B for C + $1, which the agent will again accept
Finally trade A for B + $1
We are back where we started, except the agent has paid $3
Continue until the agent has no money at all
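The money pump is easy to simulate. A minimal sketch, assuming the agent pays $1 per swap and always accepts a trade up its (cyclic) preference order:

    # Cyclic, intransitive preferences: A > B, B > C, C > A.
    prefers = {("A", "B"), ("B", "C"), ("C", "A")}   # (x, y): x preferred to y
    trade_up = {worse: better for better, worse in prefers}

    holding, cash = "A", 3
    while cash > 0:
        holding, cash = trade_up[holding], cash - 1  # pays $1 for the preferred item
        print(f"agent holds {holding}, ${cash} left")
    # Three trades later the agent holds A again but is $3 poorer.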
Maximizing Expected Utility
Theorem (Ramsey, 1931; von Neumann and Morgenstern, 1944): Given preferences satisfying the constraints, there exists a real-valued function U such that
U(A) ≥ U(B) ⇔ A ≽ B
U([p1, S1; … ; pn, Sn]) = ∑_i p_i U(S_i)
MEU principle
Choose the action that maximizes expected utility
A rational agent need not explicitly use utilities and probabilities
Its behavior can still be consistent with MEU
Example: A lookup table for perfect tic-tac-toe
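When utilities and outcome probabilities are explicit, the MEU choice is a one-liner. A minimal sketch with invented utility numbers:

    def expected_utility(lottery, U):
        # U([p1,S1; ...; pn,Sn]) = sum_i pi * U(Si)
        return sum(p * U[s] for p, s in lottery)

    U = {"win": 1.0, "draw": 0.5, "lose": 0.0}       # illustrative utilities
    actions = {
        "safe":  [(1.0, "draw")],
        "risky": [(0.6, "win"), (0.4, "lose")],
    }
    best = max(actions, key=lambda a: expected_utility(actions[a], U))
    print(best, expected_utility(actions[best], U))  # risky 0.6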
Utilities
Utilities map states to real numbers. Which numbers?
Standard approach to assessment of human utilities:
Compare a given state A to a standard lottery Lp that has
“best possible prize” ū with probability p
“worst possible catastrophe” u with probability 1 − p
Adjust lottery probability p until A ∼ Lp
Normalized utilities: ū = 1.0, u = 0.0
Behavior is invariant w.r.t. positive linear (affine) transformations
Ũ(x) = k1 U(x) + k2 where k1 > 0
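The affine-invariance claim can be checked numerically: rescaling U by any k1 > 0 and shifting by k2 never changes which action has maximum expected utility. A small self-contained sketch (numbers invented):

    def expected_utility(lottery, U):
        return sum(p * U[s] for p, s in lottery)

    U = {"win": 1.0, "draw": 0.5, "lose": 0.0}
    actions = {"safe": [(1.0, "draw")], "risky": [(0.6, "win"), (0.4, "lose")]}

    k1, k2 = 3.0, -7.0                                # arbitrary, k1 > 0
    U2 = {s: k1 * u + k2 for s, u in U.items()}       # U~(x) = k1 U(x) + k2

    pick = lambda u: max(actions, key=lambda a: expected_utility(actions[a], u))
    assert pick(U) == pick(U2)   # the MEU action is unchanged by the rescaling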
Multi-attribute Utility
How to handle utility functions of many variables X1, …, Xn?
What is U(Deaths, Noise, Cost)?
How to get (complex) utility functions from preferences?
Two key ideas:
Dominance structures: decisions can be made without knowing U(x1, …, xn)
Independence structures: independence in preferences yields canonical forms for U(x1, …, xn)
Strict Dominance
Typically define attributes such that U is monotonic in each
Strict dominance:
Choice B strictly dominates choice A iff
∀i Xi(B) ≥ Xi(A) (and hence U(B) ≥ U(A))
Strict dominance seldom holds in practice
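Checking strict dominance is an element-wise comparison once every attribute is oriented so that larger is better. A minimal sketch (the numbers are invented):

    def strictly_dominates(b, a):
        # B strictly dominates A iff Xi(B) >= Xi(A) for every attribute i.
        return all(x_b >= x_a for x_b, x_a in zip(b, a))

    # Attributes oriented so larger is better, e.g. (-cost, -deaths, comfort):
    A = (-4.6, -0.07, 5.0)
    B = (-4.2, -0.06, 5.0)
    print(strictly_dominates(B, A))   # True: B at least as good on every attribute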
Stochastic Dominance
Distribution p1 stochastically dominates distribution p2 iff
∀t ∫_{−∞}^{t} p1(x) dx ≤ ∫_{−∞}^{t} p2(x) dx
⇔ ∀t P1(X ≤ t) ≤ P2(X ≤ t)
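For discrete distributions on a common support, first-order stochastic dominance is a pointwise comparison of the CDFs. A minimal numpy sketch (distributions invented):

    import numpy as np

    support = np.array([1.0, 2.0, 3.0, 4.0])
    p1 = np.array([0.1, 0.2, 0.3, 0.4])   # mass shifted toward larger values
    p2 = np.array([0.4, 0.3, 0.2, 0.1])

    cdf1, cdf2 = np.cumsum(p1), np.cumsum(p2)
    print(np.all(cdf1 <= cdf2))   # True: P1(X <= t) <= P2(X <= t) for all t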
Stochastic Dominance (cont.)
Let U be monotonic in x
Action A1 has outcome distribution p1; action A2 has outcome distribution p2
If A1 stochastically dominates A2, then
∫ p1(x) U(x) dx ≥ ∫ p2(x) U(x) dx
⇔ E_{p1}[U] ≥ E_{p2}[U]
Multi-attribute case: stochastic dominance on all attributes
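The consequence is easy to confirm numerically: any increasing U gives the dominating distribution at least as large an expected utility. Continuing the invented example above:

    import numpy as np

    support = np.array([1.0, 2.0, 3.0, 4.0])
    p1 = np.array([0.1, 0.2, 0.3, 0.4])     # stochastically dominates p2
    p2 = np.array([0.4, 0.3, 0.2, 0.1])

    U = np.log(support) + 2.0               # any monotonically increasing U
    assert np.dot(p1, U) >= np.dot(p2, U)   # E_p1[U] >= E_p2[U]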
Preference structure: Deterministic
X1 and X2 are preferentially independent (PI) of X3 iff the preference between ⟨x1, x2, x3⟩ and ⟨x1′, x2′, x3⟩ does not depend on x3
Example: ⟨Noise, Cost, Safety⟩
⟨20,000 suffer, $4.6 billion, 0.06 deaths/mpm⟩ vs.
⟨70,000 suffer, $4.2 billion, 0.06 deaths/mpm⟩
The preference between these two options does not depend on the common safety value
Theorem (Leontief, 1947): If every pair of attributes is PI of its complement, then every subset of attributes is PI of its complement (mutual PI)
Theorem (Debreu, 1960): Mutual PI =⇒ there exists an additive value function:
V(S) = ∑_i Vi(Xi(S))
Assess n single-attribute functions
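Under Debreu's theorem the whole assessment burden is n one-dimensional functions. A minimal sketch with invented single-attribute value functions:

    # One single-attribute value function per attribute (illustrative shapes).
    V_i = {
        "noise":  lambda x: -0.5 * x,   # quieter is better
        "cost":   lambda x: -1.0 * x,   # cheaper is better
        "safety": lambda x:  3.0 * x,   # safer is better
    }

    def V(state):
        # Additive value function: V(S) = sum_i Vi(Xi(S))
        return sum(f(state[attr]) for attr, f in V_i.items())

    print(V({"noise": 2.0, "cost": 4.2, "safety": 0.9}))   # -2.5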
Preference structure: Stochastic
Need to consider preferences over lotteries
X is utility-independent (UI) of Y iff preferences over lotteries
in X do not depend on Y
Mutual UI: each subset is UI of its complement
Mutual UI =⇒ ∃ multiplicative utility function; for three attributes:
U = k1U1 + k2U2 + k3U3 + k1k2U1U2 + k2k3U2U3 + k3k1U3U1 + k1k2k3U1U2U3
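The expanded three-attribute form above is just the sum, over all nonempty subsets of attributes, of the products of the corresponding ki·Ui terms, which can be generated rather than typed out. A minimal sketch of that expansion (the k and U values are invented):

    from itertools import combinations

    def multiplicative_U(k, u):
        # Sum over all nonempty subsets {i, j, ...} of k_i u_i * k_j u_j * ...
        total = 0.0
        for size in (1, 2, 3):
            for idx in combinations(range(3), size):
                term = 1.0
                for i in idx:
                    term *= k[i] * u[i]
                total += term
        return total

    print(multiplicative_U([0.3, 0.5, 0.2], [0.8, 0.6, 0.9]))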
Decision Networks
Add decision (action) nodes and utility nodes to belief networks
Enable rational decision making
Algorithm:
For each value of the action node, compute the expected value of the utility node given (action, evidence)
Return the MEU action
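A minimal sketch of that evaluation loop, assuming the posteriors P(Si | E, a) have already been computed by ordinary belief-network inference (all tables invented, loosely themed on the drilling example coming up):

    # P(Si | E, a): posterior over outcomes for each action, from BN inference.
    P = {"drill": {"strike": 0.3, "dry": 0.7},
         "sell":  {"strike": 0.3, "dry": 0.7}}    # geology unchanged by choice
    # U(a, Si): utility of each (action, outcome) pair (illustrative payoffs).
    U = {("drill", "strike"): 10.0, ("drill", "dry"): -4.0,
         ("sell",  "strike"):  2.0, ("sell",  "dry"):  2.0}

    def EU(a):
        return sum(P[a][s] * U[(a, s)] for s in P[a])

    meu_action = max(P, key=EU)
    print(meu_action, EU(meu_action))   # drill: 0.2, sell: 2.0 -> 'sell'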
Value of Information
Compute the value of acquiring each possible piece of evidence
Can be done directly from the decision network
Example: Buying oil drilling rights
Solution: Compute expected value of information
General Formula
Current evidence E, current best action α
Possible action outcomes Si, potential new evidence Ej
Expected value of the best action without the new information:
EU(α | E) = max_a ∑_i U(Si) P(Si | E, a)
Suppose we knew Ej = ejk; then we would choose α_{ejk} such that
EU(α_{ejk} | E, Ej = ejk) = max_a ∑_i U(Si) P(Si | E, a, Ej = ejk)
Ej is a random variable whose value is currently unknown
Must compute the expected gain over all possible values
Value of perfect information:
VPI_E(Ej) = ( ∑_k P(Ej = ejk | E) EU(α_{ejk} | E, Ej = ejk) ) − EU(α | E)
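VPI can be computed by re-solving the decision problem once per possible evidence value and averaging. Continuing the invented drilling numbers from the decision-network sketch, suppose a survey Ej reveals "strike" vs. "dry" perfectly:

    prior = {"strike": 0.3, "dry": 0.7}
    U = {("drill", "strike"): 10.0, ("drill", "dry"): -4.0,
         ("sell",  "strike"):  2.0, ("sell",  "dry"):  2.0}
    actions = ["drill", "sell"]

    def best_eu(posterior):
        # max_a sum_i U(a, Si) P(Si | ...)
        return max(sum(posterior[s] * U[(a, s)] for s in posterior)
                   for a in actions)

    eu_now = best_eu(prior)                       # EU(alpha | E) = 2.0 (sell)
    eu_informed = sum(prior[e] * best_eu({s: float(s == e) for s in prior})
                      for e in prior)             # perfect evidence: point mass
    print(eu_informed - eu_now)                   # VPI = 4.4 - 2.0 = 2.4

On these numbers the survey is worth up to 2.4, and the result is non-negative, consistent with the first property on the next slide.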
Properties of VPI
Non-negative (in expectation, not post hoc):
∀j, E VPI_E(Ej) ≥ 0
Non-additive (consider, e.g., obtaining Ej twice):
VPI_E(Ej, Ek) ≠ VPI_E(Ej) + VPI_E(Ek) in general
Order-independent:
VPI_E(Ej, Ek) = VPI_E(Ej) + VPI_{E,Ej}(Ek) = VPI_E(Ek) + VPI_{E,Ek}(Ej)
When more than one piece of evidence can be gathered, greedily maximizing VPI to pick the next observation is not always optimal; evidence gathering becomes a sequential decision problem
Qualitative Behaviors
(a) Choice is obvious, information worth little
(b) Choice is non-obvious, information worth a lot
(c) Choice is non-obvious, information worth little