
Bayesian Sequence Prediction – 208 – Marcus Hutter
7 BAYESIAN SEQUENCE PREDICTION
• The Bayes-Mixture Distribution
• Relative Entropy and Bound
• Predictive Convergence
• Sequential Decisions and Loss Bounds
• Generalization: Continuous Probability Classes
• Summary

Bayesian Sequence Prediction: Abstract
We define the Bayes-mixture distribution and show that the posterior converges rapidly to the true posterior by exploiting bounds on the relative entropy. Finally, we show that the mixture predictor is also optimal in a decision-theoretic sense w.r.t. any bounded loss function.

Notation: Strings & Probabilities
Strings: x = x1:n := x1x2…xn with xt ∈ X, and x<t := x1:t−1.
Probabilities: ρ(x1:n) denotes the probability that a sequence starts with x1:n; the true generating distribution is denoted μ.

The Bayes-Mixture Distribution
• Assumption: the true environment μ is unknown, but is contained in a known (countable) class M of environments.
• The Bayes-mixture ξ is defined as
ξ(x1:m) := Σν∈M wν ν(x1:m)  with  Σν∈M wν = 1,  wν > 0 ∀ν.
• The weights wν may be interpreted as the prior degree of belief that the true environment is ν, or kν := ln wν^−1 as a complexity penalty
(prefix code length) of environment ν.
• Then ξ(x1:m) could be interpreted as the prior subjective belief probability in observing x1:m.
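As a concrete illustration, the mixture ξ and its posterior weights can be sketched for a small finite class M of Bernoulli environments. The class and parameter values below are hypothetical, chosen only for illustration:

```python
# Sketch of a Bayes-mixture over a finite class M of Bernoulli
# environments nu_theta, with nu_theta(x_t = 1) = theta.
thetas = [0.2, 0.5, 0.8]                       # hypothetical class M
w = {th: 1.0 / len(thetas) for th in thetas}   # prior weights, sum to 1

def nu(theta, x):
    """Probability nu_theta(x_{1:n}) of the binary string x."""
    p = 1.0
    for bit in x:
        p *= theta if bit == 1 else 1.0 - theta
    return p

def xi(x):
    """Bayes-mixture xi(x_{1:n}) = sum_{nu in M} w_nu * nu(x_{1:n})."""
    return sum(w[th] * nu(th, x) for th in thetas)

x = [1, 1, 0, 1, 1, 1, 0, 1]   # observed sequence (six 1s, two 0s)

# Posterior weight of each environment: w_nu * nu(x) / xi(x).
posterior = {th: w[th] * nu(th, x) / xi(x) for th in thetas}
```

On such data the posterior weight concentrates on the environment closest to the empirical frequency (here θ = 0.8), which is exactly the convergence phenomenon quantified in the relative-entropy bounds below.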

A Universal Choice of ξ and M
• We have to assume the existence of some structure on the
environment to avoid the No-Free-Lunch Theorems [Wolpert 96].
• We can only unravel effective structures which are describable by (semi)computable probability distributions.
• So we may include all (semi)computable (semi)distributions in M.
• Occam’s razor and Epicurus’ principle of multiple explanations tell
us to assign high prior belief to simple environments.
• Using Kolmogorov’s universal complexity measure K(ν) for environments ν, one should set wν = 2^−K(ν), where K(ν) is the length of the shortest program on a universal TM computing ν.
• The resulting mixture ξ is Solomonoff’s (1964) universal prior.
• In the following we consider generic M and wν.
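A minimal sketch of the complexity-penalized choice wν = 2^−K(ν). Since K is incomputable, the code lengths below are hypothetical stand-ins; the point is that prefix-code lengths satisfy the Kraft inequality, so the resulting weights sum to at most 1 and simpler environments get exponentially larger prior weight:

```python
# Illustrative stand-in code lengths K(nu) for four environments.
# Real Kolmogorov complexities are incomputable; these values merely
# satisfy the Kraft inequality, as prefix code lengths must.
K = {"nu1": 1, "nu2": 2, "nu3": 3, "nu4": 4}

# Complexity-penalized prior weights w_nu = 2^(-K(nu)).
w = {name: 2.0 ** -k for name, k in K.items()}

# Kraft inequality: sum over prefix code lengths of 2^(-K) <= 1.
assert sum(w.values()) <= 1.0

# Occam's razor in action: smaller K means larger prior weight.
assert w["nu1"] > w["nu4"]
```

With a class that is merely countable rather than finite, the same construction yields a semimeasure rather than a normalized distribution, which is why Solomonoff's ξ is only a semi-distribution.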

Relative Entropy
Relative entropy: D(p||q) := Σi pi ln(pi/qi)
Properties: D(p||q) ≥ 0 and D(p||q) = 0 ⇔ p = q
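The definition and both properties can be checked with a short sketch; the distributions p and q below are arbitrary examples, not from the text:

```python
import math

def D(p, q):
    """Relative entropy D(p||q) = sum_i p_i ln(p_i / q_i).
    Terms with p_i = 0 contribute 0 by the convention 0 ln 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

assert D(p, q) >= 0            # nonnegativity (Gibbs' inequality)
assert abs(D(p, p)) < 1e-12    # D(p||q) = 0 exactly when p = q
```

Note that D(p||q) is not symmetric in p and q, so it is a divergence rather than a metric; the bounds that follow always place the true distribution μ in the first argument.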
Instantaneous relative entropy: dt(x