Journal of the American Statistical Association

Volume 96, Issue 454, 2001

Abstract

This article reviews the principle of minimum description length (MDL) for problems of model selection. By viewing statistical modeling as a means of generating descriptions of observed data, the MDL framework discriminates between competing models based on the complexity of each description. This approach began with Kolmogorov's theory of algorithmic complexity, matured in the literature on information theory, and has recently received renewed attention within the statistics community. Here we review both the practical and the theoretical aspects of MDL as a tool for model selection, emphasizing the rich connections between information theory and statistics. At the boundary between these two disciplines we find many interesting interpretations of popular frequentist and Bayesian procedures. As we show, MDL provides an objective umbrella under which rather disparate approaches to statistical modeling can coexist and be compared. We illustrate the MDL principle by considering problems in regression, nonparametric curve estimation, cluster analysis, and time series analysis. Because model selection in linear regression is an extremely common problem that arises in many applications, we present detailed derivations of several MDL criteria in this context and discuss their properties through a number of examples. Our emphasis is on the practical application of MDL, and hence we make extensive use of real datasets. In writing this review, we tried to make the descriptive philosophy of MDL natural to a statistics audience by examining classical problems in model selection. In the engineering literature, however, MDL is being applied to ever more exotic modeling situations. As a principle for statistical modeling in general, one strength of MDL is that it can be intuitively extended to provide useful tools for new problems.
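
To make the idea of comparing models by description length concrete, the following is a minimal sketch, not the article's own derivation. It assumes the common two-stage form of MDL for a Gaussian linear model with k coefficients fit to n observations, (n/2) log(RSS/n) + (k/2) log n, which coincides with BIC up to constants. The function names `two_stage_mdl` and `select_polynomial_degree` are hypothetical, chosen only for illustration.

```python
import math

import numpy as np


def two_stage_mdl(y, X):
    """Two-stage description length (in nats, up to additive constants).

    Assumed form: (n/2) * log(RSS/n) + (k/2) * log(n), where the first term
    is the cost of encoding the data given the fitted model and the second
    is the cost of encoding k estimated coefficients.
    """
    n, k = X.shape
    # Ordinary least squares fit of y on the columns of X.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return 0.5 * n * math.log(rss / n) + 0.5 * k * math.log(n)


def select_polynomial_degree(x, y, max_degree):
    """Score polynomial fits of degree 0..max_degree; return the shortest."""
    scores = {}
    for degree in range(max_degree + 1):
        # Design matrix with columns 1, x, x^2, ..., x^degree.
        X = np.vander(x, degree + 1, increasing=True)
        scores[degree] = two_stage_mdl(y, X)
    best = min(scores, key=scores.get)
    return best, scores
```

Calling `select_polynomial_degree(x, y, 5)` on noisy data returns the degree with the smallest score together with all scores. A richer model lowers the first term by fitting the data more closely, but each extra coefficient adds (1/2) log n to the second, so the shortest total description balances fit against complexity.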

Details

  • Published online: 31 Dec 2011

Author affiliations

  • Mark H. Hansen is Member of the Technical Staff, Statistics and Data Mining Research Department, Bell Laboratories, Murray Hill, NJ. Bin Yu is Associate Professor of Statistics at the University of California, Berkeley. Her research was partially supported by NSF grants DF98-02314 and DMS-9803063, and ARO grant DAAG55-98-1-0341. The authors thank Jianhua Huang for his help with a preliminary draft of this article. The authors also thank Ed George, Robert Kohn, Wim Sweldens, Martin Wells, Andrew Gelman, John Chambers, and two anonymous referees for helpful comments.
