
comp4620/8620: Advanced Topics in AI Foundations of Artificial Intelligence
Marcus Hutter
Australian National University Canberra, ACT, 0200, Australia http://www.hutter1.net/
ANU

Foundations of Artificial Intelligence – 2 – Marcus Hutter
Abstract: Motivation
The dream of creating artificial devices that reach or outperform human intelligence is an old one; however, a computationally efficient theory of true intelligence has not yet been found, despite considerable effort over the last 50 years. Nowadays most research is more modest, focusing on narrower, specific problems associated with only some aspects of intelligence, like playing chess or natural language translation, either as a goal in itself or as a bottom-up approach. The dual, top-down approach is to find a mathematical (not computational) definition of general intelligence. Note that the AI problem remains non-trivial even when ignoring computational aspects.

Foundations of Artificial Intelligence – 3 – Marcus Hutter
Abstract: Contents
In this course we will develop such an elegant mathematical parameter-free theory of an optimal reinforcement learning agent embedded in an arbitrary unknown environment that possesses essentially all aspects of rational intelligence. Most of the course is devoted to giving an introduction to the key ingredients of this theory, which are important subjects in their own right: Occam’s razor; Turing machines; Kolmogorov complexity; probability theory; Solomonoff induction; Bayesian sequence prediction; minimum description length principle; agents; sequential decision theory; adaptive control theory; reinforcement learning; Levin search and extensions.

Foundations of Artificial Intelligence – 4 – Marcus Hutter
Background and Context
• Organizational
• Artificial General Intelligence
• Natural and Artificial Approaches
• On Elegant Theories of …
• What is (Artificial) Intelligence?
• What is Universal Artificial Intelligence?
• Relevant Research Fields
• Relation between ML & RL & (U)AI
• Course Highlights

Foundations of Artificial Intelligence – 5 – Marcus Hutter
Organizational – ANU Course COMP4620/8620
• Lecturer: Elliot Catt, Sultan J. Majeed
• Tutor: Sam Yang-Zhao, Tianyu Wang
• When: Semester 2, 2019. Lecture/Tutorials/Labs:
Generic timetable: http://timetabling.anu.edu.au/sws2019/ Detailed Schedule: See course homepage
• Where: Australian National University
• Register with ISIS or Wattle or Admin or Lecturer.
• Course is based on: book “Universal AI” (2005) by M.H.
• Literature: See course homepage
• Course Homepage: More/all information available at http://cs.anu.edu.au/courses/COMP4620/

Foundations of Artificial Intelligence – 6 – Marcus Hutter
Artificial General Intelligence
What is (not) the goal of AGI research?
• Is: Build general-purpose Super-Intelligences.
• Not: Create AI software solving specific problems.
• Might ignite a technological Singularity.
What is (Artificial) Intelligence?
What are we really doing and aiming at?
• Is it to build systems by trial & error, and to call it success whenever they do something we think is smarter than previous systems?
• Is it to try to mimic the behavior of biological organisms?
We need (and have!) theories which
can guide our search for intelligent algorithms.

Foundations of Artificial Intelligence – 7 – Marcus Hutter
“Natural” Approaches
copy and improve (human) nature
Biological Approaches to Super-Intelligence
• Brain Scan & Simulation
• Genetic Enhancement
• Brain Augmentation
Not the topic of this course

Foundations of Artificial Intelligence – 8 – Marcus Hutter
“Artificial” Approaches
Design from first principles. At best inspired by nature.
Artificial Intelligent Systems:
• Logic/language based: expert/reasoning/proving/cognitive systems.
• Economics inspired: utility, sequential decisions, game theory.
• Cybernetics: adaptive dynamic control.
• Machine Learning: reinforcement learning.
• Information processing: data compression ≈ intelligence.
Separately too limited for AGI, but jointly very powerful.
Topic of this course: Foundations of “artificial” approaches to AGI

Foundations of Artificial Intelligence – 9 – Marcus Hutter
There is an Elegant Theory of …
Cellular Automata ⇒ … Computing
Iterative maps ⇒ … Chaos and Order
QED ⇒ … Chemistry
Super-Strings ⇒ … the Universe
Universal AI ⇒ … Super Intelligence

Foundations of Artificial Intelligence – 10 – Marcus Hutter
What is (Artificial) Intelligence?
Intelligence can have many faces ⇒ formal definition difficult
What is AI?
             Thinking             Acting
humanly      Cognitive Science    Turing test, Behaviorism
rationally   Laws of Thought      Doing the Right Thing

Facets of intelligence: reasoning, creativity, association, generalization, pattern recognition, problem solving, memorization, planning, achieving goals, learning, optimization, self-preservation, vision, language processing, motor skills, classification, induction, deduction, …

Collection of 70+ definitions of intelligence:
http://www.vetta.org/definitions-of-intelligence/
Real world is nasty: partially unobservable, uncertain, unknown, non-ergodic, reactive, vast, but luckily structured, …

Foundations of Artificial Intelligence – 11 – Marcus Hutter
What is Universal Artificial Intelligence?
• Sequential Decision Theory solves the problem of rational agents in uncertain worlds if the environmental probability distribution is known.
• Solomonoff’s theory of Universal Induction solves the problem of sequence prediction for unknown prior distribution.
• Combining both ideas one arrives at
A Unified View of Artificial Intelligence
  =  Decision Theory      =  Probability + Utility Theory
  +  Universal Induction  =  Ockham + Bayes + Turing
Group project: Implement a Universal Agent able to learn by itself to
play TicTacToe/Pacman/Poker/… www.youtube.com/watch?v=yfsMHtmGDKE

Foundations of Artificial Intelligence – 12 – Marcus Hutter
Relevant Research Fields
(Universal) Artificial Intelligence has interconnections with (draws from and contributes to) many research fields:
• computer science (artificial intelligence, machine learning),
• engineering (information theory, adaptive control),
• economics (rational agents, game theory),
• mathematics (statistics, probability),
• psychology (behaviorism, motivation, incentives),
• philosophy (reasoning, induction, knowledge).

Foundations of Artificial Intelligence – 13 – Marcus Hutter
Relation between ML & RL & (U)AI
• Statistical Machine Learning: mostly i.i.d. data; classification, regression, clustering.
• Artificial Intelligence: traditionally deterministic, known world / planning problems.
• RL Problems & Algorithms: stochastic, unknown, non-i.i.d. environments.
• Universal Artificial Intelligence: covers all reinforcement learning problem types.

Foundations of Artificial Intelligence – 14 – Marcus Hutter
Course Highlights
• Formal definition of (general rational) Intelligence.
• Optimal rational agent for arbitrary problems.
• Philosophical, mathematical, and computational background.
• Some approximations, implementations, and applications. (learning TicTacToe, PacMan, simplified Poker from scratch)
• State-of-the-art artificial general intelligence.

Foundations of Artificial Intelligence – 15 – Marcus Hutter
Table of Contents
1. A SHORT TOUR THROUGH THE COURSE
2. INFORMATION THEORY & KOLMOGOROV COMPLEXITY
3. BAYESIAN PROBABILITY THEORY
4. ALGORITHMIC PROBABILITY & UNIVERSAL INDUCTION
5. MINIMUM DESCRIPTION LENGTH
6. THE UNIVERSAL SIMILARITY METRIC
7. BAYESIAN SEQUENCE PREDICTION
8. UNIVERSAL RATIONAL AGENTS
9. THEORY OF RATIONAL AGENTS
10. APPROXIMATIONS & APPLICATIONS
11. DISCUSSION

A Short Tour Through the Course – 16 – Marcus Hutter
1 A SHORT TOUR THROUGH THE COURSE

A Short Tour Through the Course – 17 – Marcus Hutter
Informal Definition of (Artificial) Intelligence
Intelligence measures an agent’s ability to achieve goals in a wide range of environments. [S. Legg and M. Hutter]
Emergent: Features such as the ability to learn and adapt, or to understand, are implicit in the above definition as these capacities enable an agent to succeed in a wide range of environments.
The science of Artificial Intelligence is concerned with the construction of intelligent systems/artifacts/agents and their analysis.
What next? Substantiate all terms above: agent, ability, utility, goal, success, learn, adapt, environment, …
Motto: Never trust a theory if it is not supported by an experiment, and never trust an experiment if it is not supported by a theory.

A Short Tour Through the Course – 18 – Marcus Hutter
Induction→Prediction→Decision→Action
Having or acquiring or learning or inducing a model of the environment an agent interacts with allows the agent to make predictions and utilize them in its decision process of finding a good next action.
Induction infers general models from specific observations/facts/data, usually exhibiting regularities or properties or relations in the latter.
Example
Induction: Find a model of the world economy.
Prediction: Use the model for predicting the future stock market.
Decision: Decide whether to invest assets in stocks or bonds.
Action: Trading large quantities of stocks influences the market.

A Short Tour Through the Course – 19 – Marcus Hutter
Science ≈ Induction ≈ Occam’s Razor
• Grue Emerald Paradox:
Hypothesis 1: All emeralds are green.
Hypothesis 2: All emeralds found before 2020 are green, thereafter all emeralds are blue.
• Which hypothesis is more plausible? H1! Justification?
• Occam’s razor: take the simplest hypothesis consistent with the data.
It is the most important principle in machine learning and science.
• Problem: How to quantify “simplicity”? Beauty? Elegance?
Description Length!
[The Grue problem goes much deeper. This is only half of the story]

A Short Tour Through the Course – 20 – Marcus Hutter
Information Theory & Kolmogorov Complexity
• Quantification/interpretation of Occam’s razor:
• Shortest description of object is best explanation.
• Shortest program for a string on a Turing machine T leads to best extrapolation = prediction:

    K_T(x) = min_p { l(p) : T(p) = x }

• Prediction is best for a universal Turing machine U:

    Kolmogorov complexity(x) = K(x) = K_U(x) ≤ K_T(x) + c_T
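K(x) is incomputable, but any real compressor gives a computable upper bound on it, up to the additive constant for the decompressor. A minimal sketch in Python, with zlib as a stand-in for the machine T (an illustration, not part of the formal theory):

    import os
    import zlib

    def compressed_length(x: bytes) -> int:
        """Length of a zlib encoding of x: a crude, computable
        upper bound on the Kolmogorov complexity K(x)."""
        return len(zlib.compress(x, 9))

    regular = b"ab" * 500          # highly regular: short description
    random_ = os.urandom(1000)     # incompressible with high probability
    print(compressed_length(regular), "vs raw length", len(regular))
    print(compressed_length(random_), "vs raw length", len(random_))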

A Short Tour Through the Course – 21 – Marcus Hutter
Bayesian Probability Theory
Given (1): Models P(D|H_i) for the probability of observing data D when H_i is true.
Given (2): Prior probability P(H_i) over hypotheses.
Goal: Posterior probability P(H_i|D) of H_i after having seen data D.
Solution, Bayes’ rule:

    P(H_i|D) = P(D|H_i) · P(H_i) / Σ_i P(D|H_i) · P(H_i)

(1) Models P(D|H_i) are usually easy to describe (objective probabilities).
(2) But Bayesian probability theory does not tell us how to choose the prior P(H_i) (subjective probabilities).
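Over a finite hypothesis class, Bayes’ rule is a few lines of code. A minimal sketch (the coin hypotheses and their parameters are illustrative, not from the course):

    def posterior(prior, likelihood, data):
        # P(H_i|D) = P(D|H_i) P(H_i) / sum_i P(D|H_i) P(H_i)
        joint = {h: likelihood[h](data) * p for h, p in prior.items()}
        z = sum(joint.values())                 # evidence P(D)
        return {h: j / z for h, j in joint.items()}

    # Is a coin fair (p = 0.5) or biased (p = 0.9)?  Data: 8 heads, 2 tails.
    prior = {"fair": 0.5, "biased": 0.5}
    lik = {"fair":   lambda d: 0.5**d[0] * 0.5**d[1],
           "biased": lambda d: 0.9**d[0] * 0.1**d[1]}
    print(posterior(prior, lik, (8, 2)))        # biased wins, roughly 0.82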

A Short Tour Through the Course – 22 – Marcus Hutter
Algorithmic Probability Theory
• Epicurus: If more than one theory is consistent with the observations, keep all theories.
• ⇒ uniform prior over all Hi?
• Refinement with Occam’s razor quantified
in terms of Kolmogorov complexity:
    P(H_i) := 2^(−K_{T/U}(H_i))
• Fixing T we have a complete theory for prediction. Problem: How to choose T.
• Choosing U we have a universal theory for prediction. Observation: Particular choice of U does not matter much. Problem: Incomputable.
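Such a complexity-weighted prior is easy to emulate once each hypothesis has a description. A minimal sketch, where the string encodings are hypothetical stand-ins for shortest programs:

    def occam_prior(hypotheses):
        # weight proportional to 2^(-description length), then normalize
        w = {h: 2.0 ** -len(code) for h, code in hypotheses.items()}
        z = sum(w.values())
        return {h: v / z for h, v in w.items()}

    # The Grue emeralds: the simpler hypothesis gets almost all the mass.
    print(occam_prior({"H1": "green",
                       "H2": "green-until-2020-then-blue"}))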

A Short Tour Through the Course – 23 – Marcus Hutter
Inductive Inference & Universal Forecasting
• Solomonoff combined Occam, Epicurus, Bayes, and Turing into one formal theory of sequential prediction.
• M(x) = probability that a universal Turing machine outputs x when provided with fair coin flips on the input tape.
• A posteriori probability of y given x: M(y|x) = M(xy)/M(x).
• Given ẋ_1, …, ẋ_{t−1}, the probability of x_t is M(x_t | ẋ_1 … ẋ_{t−1}).
• Immediate “applications”:
– Weather forecasting: xt ∈ {sun,rain}.
– Stock-market prediction: xt ∈ {bear,bull}.
– Continuing number sequences in an IQ test: xt ∈ N.
• Optimal universal inductive reasoning system!
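M itself is incomputable, but its Bayes-mixture structure can be illustrated on a finite class of environments. A sketch mixing three Bernoulli coins under an assumed prior (nothing here is Solomonoff’s actual M):

    # prior weights w(theta) over a tiny class of Bernoulli environments
    ENVS = {0.1: 0.25, 0.5: 0.5, 0.9: 0.25}

    def xi(seq):
        """Mixture xi(x_1..x_n) = sum_theta w(theta) * P_theta(seq)."""
        ones = sum(seq)
        return sum(w * th**ones * (1 - th)**(len(seq) - ones)
                   for th, w in ENVS.items())

    def predict_one(seq):
        """xi(x_t = 1 | x_<t) = xi(seq + [1]) / xi(seq), as with M(y|x)."""
        return xi(seq + [1]) / xi(seq)

    print(predict_one([1, 1, 1, 1, 0, 1, 1, 1]))  # pulled toward theta = 0.9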

A Short Tour Through the Course – 24 – Marcus Hutter
The Minimum Description Length Principle
• Approximation of Solomonoff, since M is incomputable:
• M(x) ≈ 2^(−K_U(x)) (quite good)
• K_U(x) ≈ K_T(x) (very crude)
• Predicting the y with highest M(y|x) is then approximately the same as
• MDL: predict the y with smallest K_T(xy).
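With a real compressor standing in for K_T, MDL prediction becomes runnable. A minimal sketch (the candidate continuations are illustrative):

    import zlib

    def mdl_predict(x: bytes, candidates):
        """Return the continuation y minimizing the approximate
        description length of xy (zlib size stands in for K_T)."""
        return min(candidates, key=lambda y: len(zlib.compress(x + y, 9)))

    x = b"abc" * 20
    print(mdl_predict(x, [b"abc", b"xyz"]))   # typically b"abc"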

A Short Tour Through the Course – 25 – Marcus Hutter
Application: Universal Clustering
• Question: When is object x similar to object y?
• Universal solution: x similar to y
⇔ x can be easily (re)constructed from y
⇔ K(x|y) := min{l(p) : U(p,y) = x} is small.
• Universal Similarity: Symmetrize&normalize K(x|y).
• Normalized compression distance: Approximate K ≡ KU by KT .
• Practice: for T choose a (de)compressor like LZW, gzip, or bzip2 (see the sketch below).
• Multiple objects ⇒ similarity matrix ⇒ similarity tree.
• Applications: Completely automatic reconstruction (a) of the evolutionary tree of 24 mammals based on complete mtDNA, and (b) of the classification tree of 52 languages based on the declaration of human rights and (c) many others. [Cilibrasi&Vitanyi’05]
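The normalized compression distance makes the bullets above concrete. A minimal sketch with bz2 approximating K (the toy documents are illustrative):

    import bz2

    def ncd(x: bytes, y: bytes) -> float:
        """NCD(x,y) = (C(xy) - min(C(x),C(y))) / max(C(x),C(y)),
        with C = compressed length [Cilibrasi&Vitanyi'05]."""
        cx, cy, cxy = (len(bz2.compress(s)) for s in (x, y, x + y))
        return (cxy - min(cx, cy)) / max(cx, cy)

    docs = {"en1": b"all human beings are born free and equal in dignity",
            "en2": b"everyone is entitled to all the rights and freedoms",
            "bin": bytes(range(256)) * 4}
    for a in docs:                 # pairwise distances form the similarity
        for b in docs:             # matrix that a clustering tree is built from
            print(a, b, round(ncd(docs[a], docs[b]), 3))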

A Short Tour Through the Course – 26 – Marcus Hutter
Sequential Decision Theory
Setup: For t = 1, 2, 3, 4, …
(1) given sequence x_1, x_2, …, x_{t−1}, predict/make decision y_t,
(2) observe x_t,
(3) suffer loss Loss(x_t, y_t),
(4) t → t+1, goto (1).
Goal: Minimize expected loss.
Greedy minimization of expected loss is optimal if:
• decision y_t does not influence the environment (future observations),
• the loss function is known.
Problem: Expectation w.r.t. what?
Solution: W.r.t. universal distribution M if true distr. is unknown.

A Short Tour Through the Course – 27 – Marcus Hutter
Example: Weather Forecasting
Observation xt ∈ X = {sunny, rainy} Decision yt ∈ Y = {umbrella, sunglasses}
Loss     umbrella   sunglasses
sunny      0.1         0.0
rainy      0.3         1.0

Taking umbrella/sunglasses does not influence future weather (ignoring the butterfly effect).
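Greedy expected-loss minimization on this table takes a few lines. A minimal sketch; in the full theory p_rain would come from the universal predictor M, here it is a free parameter:

    LOSS = {("sunny", "umbrella"): 0.1, ("sunny", "sunglasses"): 0.0,
            ("rainy", "umbrella"): 0.3, ("rainy", "sunglasses"): 1.0}

    def best_action(p_rain: float) -> str:
        def expected_loss(y):
            return (1 - p_rain) * LOSS[("sunny", y)] + p_rain * LOSS[("rainy", y)]
        return min(("umbrella", "sunglasses"), key=expected_loss)

    print(best_action(0.05))   # sunglasses (indifference point: p_rain = 0.125)
    print(best_action(0.50))   # umbrella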

A Short Tour Through the Course – 28 – Marcus Hutter
Agent Model with Reward
if actions/decisions a influence the environment q
[Figure: agent p and environment q as two interacting machines with work tapes; in each cycle the agent outputs action y_t to the environment, which returns reward and observation r_t | o_t to the agent: y_1, r_1|o_1, y_2, r_2|o_2, y_3, r_3|o_3, …]
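The perception-action cycle in the figure is just a loop. A toy sketch of the protocol (Env and Agent are hypothetical stand-ins for the machines q and p):

    class Env:
        def step(self, action):
            """Return (observation, reward) in response to the action."""
            return 0, (1.0 if action == 1 else 0.0)   # toy: action 1 rewarded

    class Agent:
        def act(self, history):
            """Choose action y_t from the interaction history so far."""
            return 1                                   # toy: constant policy

    env, agent, history = Env(), Agent(), []
    for t in range(5):
        y = agent.act(history)     # agent outputs y_t
        o, r = env.step(y)         # environment returns o_t, r_t
        history.append((y, o, r))  # both sides condition on the history
    print(history)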

A Short Tour Through the Course – 29 – Marcus Hutter
Rational Agents in Known Environment
• Setup: Known deterministic or probabilistic environment
• Fields: AI planning & sequential decision theory & control theory
• Greedy maximization of reward r (=−Loss) no longer optimal. Example: Chess
• Agent has to be farsighted.
• Optimal solution: Maximize future (expected) reward sum, called value (formalized below).
• Problem: Things drastically change if environment is unknown
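In symbols, with current cycle k, horizon m, policy p, and environment μ (notation as in [Hut05]):

    V^p_\mu(k) \;=\; \mathbf{E}^p_\mu\!\left[\sum_{t=k}^{m} r_t\right]

i.e. the μ-expected reward sum from cycle k to the horizon m when actions are chosen by policy p.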

A Short Tour Through the Course – 30 – Marcus Hutter
Rational Agents in Unknown Environment
Additional problem: (probabilistic) environment unknown.
Fields: reinforcement learning and adaptive control theory.
Big problem: exploration versus exploitation.
Bayesian approach: Mixture distribution ξ.
1. What performance does Bayes-optimal policy imply? It does not necessarily imply self-optimization (Heaven&Hell example).
2. Computationally very hard problem.
3. Choice of horizon? Immortal agents are lazy.
Universal Solomonoff mixture ⇒ universal agent AIXI.
Represents a formal (math., non-comp.) solution to the AI problem? Most (all AI?) problems are easily phrased within AIXI.
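The resulting agent can be written as a single expectimax expression over the Solomonoff mixture; in the notation of [Hut05] (U the universal monotone Turing machine, l(q) the length of program q):

    a_k \;:=\; \arg\max_{a_k}\sum_{o_k r_k}\cdots\max_{a_m}\sum_{o_m r_m}
      \big[r_k+\cdots+r_m\big]\sum_{q\,:\,U(q,a_1..a_m)=o_1 r_1..o_m r_m} 2^{-l(q)}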

A Short Tour Through the Course – 31 – Marcus Hutter
Computational Issues: Universal Search
• Levin search: Fastest algorithm for inversion and optimization problems.
• Theoretical application:
If somebody found a non-constructive proof of P=NP, then Levin search would be a polynomial-time algorithm for every NP(-complete) problem.
• Practical (OOPS) applications (J. Schmidhuber): mazes, Towers of Hanoi, robotics, …
• FastPrg: The asymptotically fastest and shortest algorithm for all well-defined problems.
• Computable Approximations of AIXI:
AIXItl and AIξ and MC-AIXI-CTW and ΦMDP.
• Human Knowledge Compression Prize: 50,000 €

A Short Tour Through the Course – 32 – Marcus Hutter
Monte-Carlo AIXI Applications
Without providing any domain knowledge, the same agent is able to self-adapt to a diverse range of interactive environments.
[VNH+11]
www.youtube.com/watch?v=yfsMHtmGDKE
[Figure: normalised average reward per cycle (0 to 1) versus experience (100 to 1,000,000 cycles, log scale) for the MC-AIXI agent on Cheese Maze, Tiger, 4×4 Grid, TicTacToe, Biased RPS, Kuhn Poker, and Pacman; the learning curves approach optimal performance in each domain.]

A Short Tour Through the Course – 33 – Marcus Hutter
Discussion at End of Course
• What has been achieved?
• Made assumptions.
• General and personal remarks.
• Open problems.
• Philosophical issues.

A Short Tour Through the Course – 34 – Marcus Hutter
Exercises
1. [C10] What could the probability p that the sun will rise tomorrow be? What might a philosopher, statistician, physicist, etc. say?
2. [C15] Justify Laplace’s rule (p = (n+1)/(n+2), where n = number of days the sun rose in the past); see the sketch after this list.
3. [C05] Predict the continuation of the sequences:
2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,?
3,1,4,1,5,9,2,6,5,3,?
1,2,3,4,?
4. [C10] Argue in (1) and (3) for different continuations.
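For exercise 2, the rule itself is trivial to evaluate; a minimal sketch using exact fractions:

    from fractions import Fraction

    def laplace(n, k=None):
        """Rule of succession: after n trials with k successes, predict
        success probability (k+1)/(n+2); with k = n (the sun rose every
        day so far) this is the (n+1)/(n+2) of exercise 2."""
        k = n if k is None else k
        return Fraction(k + 1, n + 2)

    print(laplace(0))      # 1/2: no data yet, uniform guess
    print(laplace(5000))   # 5001/5002: near-certain after 5000 sunrises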

A Short Tour Through the Course – 35 – Marcus Hutter
Introductory Literature
[HMU06] J. E. Hopcroft, R. Motwani, and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 3rd edition, 2006.
[RN10] S. J. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs, NJ, 3rd edition, 2010.
[LV08] M. Li and P. M. B. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications. Springer, Berlin, 3rd edition, 2008.
[SB98] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998.
[Leg08] S. Legg. Machine Super Intelligence. PhD thesis, Lugano, 2008.
[Hut05] M. Hutter. Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Springer, Berlin, 2005.
See http://www.hutter1.net/ai/introref.htm for more.