Statistics Surveys
- Statist. Surv.
- Volume 4 (2010), 40-79.
A survey of cross-validation procedures for model selection
Sylvain Arlot and Alain Celisse
Full-text: Open access
Abstract
Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its (apparent) universality. Many results exist on model selection performances of cross-validation procedures. This survey intends to relate these results to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results. As a conclusion, guidelines are provided for choosing the best cross-validation procedure according to the particular features of the problem in hand.
Article information
Source
Statist. Surv. Volume 4 (2010), 40-79.
Dates
First available in Project Euclid: 9 March 2010
Permanent link to this document
http://projecteuclid.org/euclid.ssu/1268143839
Digital Object Identifier
doi:10.1214/09-SS054
Mathematical Reviews number (MathSciNet)
MR2602303
Zentralblatt MATH identifier
1190.62080
Subjects
Primary: 62G08: Nonparametric regression
Secondary: 62G05: Estimation; 62G09: Resampling methods
Keywords
Model selection, cross-validation, leave-one-out
Citation
Arlot, Sylvain; Celisse, Alain. A survey of cross-validation procedures for model selection. Statist. Surv. 4 (2010), 40--79. doi:10.1214/09-SS054. http://projecteuclid.org/euclid.ssu/1268143839.
The American Statistical Association, the Bernoulli Society, the Institute of Mathematical Statistics, and the Statistical Society of Canada
