Title: Maximizing the information learned from finite data selects a simple model

Abstract: We use the language of uninformative Bayesian prior choice to study the selection of appropriately simple effective models. We advocate for the prior which maximizes the mutual information between parameters and predictions, learning as much as possible from limited data. When many parameters are poorly constrained by the available data, we find that this prior puts weight only on boundaries of the parameter manifold. Thus it selects a lower-dimensional effective theory in a principled way, ignoring irrelevant parameter directions. In the limit where there is sufficient data to tightly constrain any number of parameters, this reduces to Jeffreys prior. But we argue that this limit is pathological when applied to the hyper-ribbon parameter manifolds generic in science, because it leads to dramatic dependence on effects invisible to experiment.
Comments: 9 pages, 8 figures. v3 has improved discussion and adds an appendix about MDL and Bayes factors, and matches version to appear in PNAS (modulo comma placement). Title changed from "Rational Ignorance: Simpler Models Learn More Information from Finite Data"
Subjects: Data Analysis, Statistics and Probability (physics.data-an); Statistical Mechanics (cond-mat.stat-mech); Information Theory (cs.IT); Statistics Theory (math.ST); Machine Learning (stat.ML)
Journal reference: PNAS February 2018
DOI: 10.1073/pnas.1715306115
Cite as: arXiv:1705.01166 [physics.data-an]
  (or arXiv:1705.01166v3 [physics.data-an] for this version)
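
As a rough illustration of the kind of prior the abstract describes, the sketch below numerically maximizes the mutual information between a parameter and its predictions for a toy model. Everything specific here is an assumption made for illustration and is not taken from the paper: the model (a binomial with m trials and success probability theta), the uniform grid over theta, the function names, and the use of the standard Blahut-Arimoto iteration for channel capacity as the optimizer.

```python
import numpy as np
from math import comb


def binomial_likelihood(thetas, m):
    """Return p(x | theta_i) as a matrix: rows are grid points, columns are x = 0..m."""
    x = np.arange(m + 1)
    coeff = np.array([comb(m, k) for k in x], dtype=float)
    t = thetas[:, None]
    # NumPy evaluates 0.0 ** 0 as 1.0, so the endpoints theta = 0 and theta = 1 are handled.
    return coeff * t ** x * (1.0 - t) ** (m - x)


def kl_rows(lik, px):
    """KL divergence D(p(x|theta_i) || p(x)) in nats for each row, using 0 log 0 = 0."""
    ratio = np.where(lik > 0, lik / px, 1.0)
    return np.sum(lik * np.log(ratio), axis=1)


def blahut_arimoto(lik, iters=2000):
    """Blahut-Arimoto iteration for the prior weights that maximize mutual information.

    Starts from uniform weights on the grid and repeatedly reweights each grid
    point by exp(KL divergence of its predictions from the current marginal).
    Returns the weights and the mutual information (in nats) they achieve.
    """
    n = lik.shape[0]
    w = np.full(n, 1.0 / n)
    for _ in range(iters):
        px = w @ lik                  # marginal distribution over outcomes
        w = w * np.exp(kl_rows(lik, px))
        w /= w.sum()
    px = w @ lik
    mi = float(w @ kl_rows(lik, px))  # I(X; Theta) for the final weights
    return w, mi


if __name__ == "__main__":
    thetas = np.linspace(0.0, 1.0, 21)  # hypothetical grid over the parameter
    for m in (1, 5, 50):                # hypothetical amounts of data
        w, mi = blahut_arimoto(binomial_likelihood(thetas, m))
        support = thetas[w > 1e-3]
        print(f"m={m}: I = {mi:.4f} nats, weight above 1e-3 at theta = {support}")
```

With m = 1 the two outcomes can only be told apart perfectly at theta = 0 and theta = 1, so the iteration should place essentially all weight on those two boundary points, with mutual information approaching log 2 nats. Increasing m in this toy setting is one way to explore how the support of the optimized prior changes as the data become more constraining.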

Submission history

From: Michael Abbott
[v1] Tue, 2 May 2017 20:27:14 GMT (1530kb,D)
[v2] Fri, 1 Sep 2017 13:55:26 GMT (3194kb,D)
[v3] Wed, 14 Feb 2018 15:24:36 GMT (4780kb,D)