On universal prediction and Bayesian confirmation
Under an Elsevier user license
open archive
Keywords
Sequence prediction
Bayes
Solomonoff prior
Kolmogorov complexity
Occam’s razor
Prediction bounds
Model classes
Philosophical issues
Symmetry principle
Confirmation theory
Black raven paradox
Reparametrization invariance
Old-evidence/updating problem
(non)Computable environments
References
- [1]
- T. BayesAn essay towards solving a problem in the doctrine of chancesPhilosophical Transactions of the Royal Society, 53 (1763), pp. 376-398Reprinted in Biometrika 45 (1958) 296–315
- [2]
- J.M. BernardoReference posterior distributions for Bayesian inference (with discussion)Journal of the Royal Statistical Society, B41 (1979), pp. 113-147
- [3]
- R. CarnapThe Continuum of Inductive MethodsUniversity of Chicago Press, Chicago (1952)
- [4]
- B.S. Clarke, A.R. BarronInformation-theoretic asymptotics of Bayes methodsIEEE Transactions on Information Theory, 36 (1990), pp. 453-471
- [5]
- A. Chernov, M. HutterMonotone conditional complexity bounds on future prediction errorsProc. 16th International Conf. on Algorithmic Learning Theory, ALT’05, Singapore, LNAI, vol. 3734, Springer, Berlin (2005), pp. 414-428
- [6]
- G.J. ChaitinA theory of program size formally identical to information theoryJournal of the ACM, 22 (3) (1975), pp. 329-340
- [7]
- A. Chernov, M. Hutter, J. SchmidhuberAlgorithmic complexity bounds on future prediction errorsInformation and Computation, 205 (2) (2007), pp. 242-261
- [8]
- R. Cilibrasi, P.M.B. VitányiClustering by compressionIEEE Transactions on Information Theory, 51 (4) (2005), pp. 1523-1545
- [9]
- R. Cilibrasi, P.M.B. VitányiSimilarity of objects and the meaning of wordsProc. 3rd Annual Conferene on Theory and Applications of Models of Computation, TAMC’06, LNCS, vol. 3959, Springer (2006), pp. 21-45
- [10]
- A.P. DawidStatistical theory. The prequential approachJournal of the Royal Statistical Society, Series A, 147 (1984), pp. 278-292
- [11]
- J. EarmanBayes or Bust? A Critical Examination of Bayesian Confirmation TheoryMIT Press, Cambridge, MA (1993)
- [12]
- P. GácsOn the symmetry of algorithmic informationSoviet Mathematics Doklady, 15 (1974), pp. 1477-1480
- [13]
- P. GácsOn the relation between descriptional complexity and algorithmic probabilityTheoretical Computer Science, 22 (1983), pp. 71-93
- [14]
- M. Hutter, An.A. MuchnikUniversal convergence of semimeasures on individual random sequencesProc. 15th International Conf. on Algorithmic Learning Theory, ALT’04, Padova, LNAI, vol. 3244, Springer, Berlin (2004), pp. 234-248
- [15]
- M. HutterNew error bounds for Solomonoff predictionJournal of Computer and System Sciences, 62 (4) (2001), pp. 653-667
- [16]
- M. HutterConvergence and loss bounds for Bayesian sequence predictionIEEE Transactions on Information Theory, 49 (8) (2003), pp. 2061-2067
- [17]
- M. HutterOn the existence and convergence of computable universal priorsProc. 14th International Conf. on Algorithmic Learning Theory, ALT’03, Sapporo, LNAI, vol. 2842, Springer, Berlin (2003), pp. 298-312
- [18]
- M. HutterOptimality of universal Bayesian prediction for general loss and alphabetJournal of Machine Learning Research, 4 (2003), pp. 971-1000
- [19]
- M. HutterUniversal Artificial Intelligence: Sequential Decisions Based on Algorithmic ProbabilitySpringer, Berlin (2004)300 pp.
- [20]
- M. HutterOn generalized computable universal priors and their convergenceTheoretical Computer Science, 364 (2006), pp. 27-41
- [21]
- E.T. JaynesProbability Theory: The Logic of ScienceCambridge University Press, Cambridge, MA (2003)
- [22]
- H. JeffreysAn invariant form for the prior probability in estimation problemsProc. Royal Society London, volume Series A 186 (1946), pp. 453-461
- [23]
- H. JeffreysTheory of Probability(3rd edition), Clarendon Press, Oxford (1961)
- [24]
- A.N. KolmogorovThree approaches to the quantitative definition of informationProblems of Information and Transmission, 1 (1) (1965), pp. 1-7
- [25]
- R.E. KrichevskiyLaplace’s law of succession and universal encodingIEEE Transactions on Information Theory, 44 (1998), pp. 296-303
- [26]
- R.E. Kass, L. WassermanThe selection of prior distributions by formal rulesJournal of the American Statistical Association, 91 (435) (1996), pp. 1343-1370
- [27]
- P. LaplaceThéorie analytique des probabilitésCourcier, Paris (1812)[English translation by F. W. Truscott and F. L. Emory: A Philosophical Essay on Probabilities, Dover, 1952, pp. 16–17]
- [28]
- L.A. LevinLaws of information conservation (non-growth) and aspects of the foundation of probability theoryProblems of Information Transmission, 10 (3) (1974), pp. 206-210
- [29]
- M. Li, P.M.B. VitányiAn Introduction to Kolmogorov Complexity and its Applications(2nd edition), Springer, New York (1997)
- [30]
- A. Lempel, J. ZivOn the complexity of finite sequencesIEEE Transactions on Information Theory, 22 (1976), pp. 75-81
- [31]
- P. MaherProbability captures the logic of scientific confirmationC. Hitchcock (Ed.), Contemporary Debates in Philosophy of Science, Blackwell Publishing (2004), pp. 69-93(Chapter 3)
- [32]
- N. Merhav, M. FederUniversal predictionIEEE Transactions on Information Theory, 44 (6) (1998), pp. 2124-2147
- [33]
- Markus Müller, Stationary algorithmic probability, Technical Report, TU Berlin, Berlin, 2006. http://arXiv.org/abs/cs/0608095
- [34]
- J. Poland, M. HutterOn the convergence speed of MDL predictions for Bernoulli sequencesProc. 15th International Conf. on Algorithmic Learning Theory, ALT’04, Padova, LNAI, vol. 3244, Springer, Berlin (2004), pp. 294-308
- [35]
- J. Poland, M. HutterMDL convergence speed for Bernoulli sequencesStatistics and Computing, 16 (2) (2006), pp. 161-175
- [36]
- J.J. RissanenA universal prior for integers and estimation by minimum description lengthAnnals of Statistics, 11 (2) (1983), pp. 416-431
- [37]
- J.J. RissanenStochastic Complexity in Statistical InquiryWorld Scientific, Singapore (1989)
- [38]
- J. SchmidhuberHierarchies of generalized Kolmogorov complexities and nonenumerable universal measures computable in the limitInternational Journal of Foundations of Computer Science, 13 (4) (2002), pp. 587-612
- [39]
- J. SchmidhuberThe speed prior: A new simplicity measure yielding near-optimal computable predictionsProc. 15th Conf. on Computational Learning Theory, COLT-2002, Sydney, LNAI, vol. 2375, Springer, Berlin (2002), pp. 216-228
- [40]
- J. SchmidhuberOptimal ordered problem solverMachine Learning, 54 (3) (2004), pp. 211-254
- [41]
- R.J. SolomonoffA formal theory of inductive inference: Parts 1 and 2Information and Control, 7 (1964), pp. 1-22, 224–254
- [42]
- R.J. SolomonoffComplexity-based induction systems: Comparisons and convergence theoremsIEEE Transactions on Information Theory, IT-24 (1978), pp. 422-432
- [43]
- V.N. VapnikThe Nature of Statistical Learning Theory(2nd edition), Springer, Berlin (1999)
- [44]
- P. WalleyInferences from multinomial data: learning about a bag of marblesJournal of the Royal Statistical Society B, 58 (1) (1996), pp. 3-57
- [45]
- C.S. WallaceStatistical and Inductive Inference by Minimum Message LengthSpringer, Berlin (2005)
- [46]
- D.H. Wolpert, W.G. MacreadyNo free lunch theorems for optimizationIEEE Transactions on Evolutionary Computation, 1 (1) (1997), pp. 67-82
- [47]
- A.K. Zvonkin, L.A. LevinThe complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithmsRussian Mathematical Surveys, 25 (6) (1970), pp. 83-124
Copyright © 2007 Elsevier Ltd. All rights reserved.