Bayesian Analysis, Volume 11, Number 2 (2016), 573-597.
Importance Sampling Schemes for Evidence Approximation in Mixture Models
Jeong Eun Lee and Christian P. Robert
Full-text: Open access
Abstract
The marginal likelihood is a central tool for drawing Bayesian inference about the number of components in mixture models. It is usually approximated, since its exact form is unavailable. One source of bias in the approximation is incomplete exploration of the collection of posterior modes by a simulated Markov chain (e.g. a Gibbs sequence), a phenomenon also known as lack of label switching: the chain must visit all possible label permutations in order to converge and hence overcome the bias. In an importance sampling approach, imposing label switching on the importance function makes the computational cost grow exponentially with the number of components. In this paper, two importance sampling schemes are proposed through choices of the importance function: a maximum likelihood estimate (MLE) proposal and a Rao–Blackwellised importance function. The second scheme is called dual importance sampling, and we demonstrate that it yields a valid estimator of the evidence. To reduce the resulting high computational demand, the original importance function is approximated; a suitable approximation can produce an estimate with the same precision at a lower computational workload.
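The role of label switching in the importance function can be illustrated with a minimal sketch. This is not the authors' implementation of either scheme. It assumes a toy two-component Gaussian mixture with known equal weights and unit variances, a N(0, 10^2) prior on each mean, and a Gaussian importance function centred on a crude point estimate (split-half means standing in for the MLE). The function names, the proposal scale of 0.4 and the sample sizes are all hypothetical choices made for the illustration.

```python
import numpy as np

def log_norm_pdf(x, mean, sd):
    """Log density of a univariate normal, broadcast over its arguments."""
    return -0.5 * np.log(2 * np.pi) - np.log(sd) - 0.5 * ((x - mean) / sd) ** 2

def log_target(theta, y, prior_sd=10.0):
    """Log of likelihood times prior for each row (mu1, mu2) of theta."""
    comp = log_norm_pdf(y[None, :, None], theta[:, None, :], 1.0)  # (N, n, 2)
    log_lik = (np.logaddexp(comp[..., 0], comp[..., 1]) + np.log(0.5)).sum(axis=1)
    log_prior = log_norm_pdf(theta, 0.0, prior_sd).sum(axis=1)
    return log_lik + log_prior

def log_mean_exp(a):
    """Numerically stable log of the average of exp(a)."""
    m = a.max()
    return m + np.log(np.mean(np.exp(a - m)))

def log_evidence_is(y, centre, scale, n_draws, rng, symmetrise):
    """Importance sampling estimate of the log evidence.

    With symmetrise=True the importance function is the equal-weight
    mixture of the Gaussian proposal over both label permutations.
    """
    theta = centre + scale * rng.standard_normal((n_draws, 2))
    if symmetrise:
        # Draw from the permutation mixture: swap labels with probability 1/2.
        swap = rng.random(n_draws) < 0.5
        theta[swap] = theta[swap][:, ::-1]
        log_q = np.logaddexp(
            log_norm_pdf(theta, centre, scale).sum(axis=1),
            log_norm_pdf(theta, centre[::-1], scale).sum(axis=1),
        ) - np.log(2.0)
    else:
        log_q = log_norm_pdf(theta, centre, scale).sum(axis=1)
    return log_mean_exp(log_target(theta, y) - log_q)

# Simulated data (assumption: well-separated components at -2 and 2).
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(-2.0, 1.0, 25), rng.normal(2.0, 1.0, 25)])

# Crude point estimate used as the proposal centre (stand-in for an MLE).
y_sorted = np.sort(y)
centre = np.array([y_sorted[:25].mean(), y_sorted[25:].mean()])

log_z_single = log_evidence_is(y, centre, 0.4, 20000, rng, symmetrise=False)
log_z_sym = log_evidence_is(y, centre, 0.4, 20000, rng, symmetrise=True)
print("log evidence, single-mode proposal:", log_z_single)
print("log evidence, symmetrised proposal:", log_z_sym)
print("difference:", log_z_sym - log_z_single, "vs log(2) =", np.log(2.0))
```

In this toy setting the single-mode proposal in practice never reaches the second, permuted posterior mode, so its estimate should fall short of the symmetrised one by roughly log 2. With K components the symmetrised density sums over K! permutations, which is the cost that the abstract's approximation is meant to reduce.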
Article information
Source
Bayesian Anal., Volume 11, Number 2 (2016), 573-597.
Dates
First available in Project Euclid: 25 August 2015
Permanent link to this document
https://projecteuclid.org/euclid.ba/1440507475
Digital Object Identifier
doi:10.1214/15-BA970
Mathematical Reviews number (MathSciNet)
MR3472003
Zentralblatt MATH identifier
1357.62116
Keywords
model evidence; importance sampling; mixture models; marginal likelihood
Citation
Lee, Jeong Eun; Robert, Christian P. Importance Sampling Schemes for Evidence Approximation in Mixture Models. Bayesian Anal. 11 (2016), no. 2, 573–597. doi:10.1214/15-BA970. https://projecteuclid.org/euclid.ba/1440507475
