Abstract
We propose a nonparametric method for identifying parsimony and for producing a statistically efficient estimator of a large covariance matrix. We reparameterise a covariance matrix through the modified Cholesky decomposition of its inverse, or equivalently through the one-step-ahead predictive representation of the vector of responses, thereby reducing the nonintuitive task of modelling covariance matrices to the familiar task of model selection and estimation for a sequence of regression models. The Cholesky factor containing these regression coefficients is likely to have many off-diagonal elements that are zero or close to zero. Penalised normal likelihoods in this situation with L1 and L2 penalties are shown to be closely related to Tibshirani's (1996) lasso approach and to ridge regression. Adding either penalty to the likelihood helps to produce more stable estimators by shrinking the elements of the Cholesky factor, while the L1 penalty, because of its singularity at zero, will set some elements exactly to zero and so produce interpretable models. An algorithm is developed for computing the estimator and selecting the tuning parameter. The proposed maximum penalised likelihood estimator is illustrated using simulation and a real dataset involving estimation of a 102 × 102 covariance matrix.
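The reparameterisation described above can be sketched in code. In the modified Cholesky decomposition, Σ⁻¹ = T′D⁻¹T, where T is unit lower triangular and the below-diagonal entries of −T are the coefficients from regressing each response on its predecessors; D holds the prediction-error variances. A minimal illustrative sketch follows, using scikit-learn's `Lasso` as a stand-in for the L1-penalised regressions (the data, penalty level, and use of scikit-learn are my assumptions, not the authors' algorithm, which maximises a penalised likelihood with its own tuning-parameter selection):

```python
# Illustrative sketch: estimate a covariance matrix via the modified
# Cholesky decomposition of its inverse, fitting each of the sequence of
# regressions with an L1 (lasso) penalty.  Hypothetical data and penalty.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 200, 10
Y = rng.standard_normal((n, p))   # n observations of a p-dimensional response

# T is unit lower triangular; -T[j, :j] holds the coefficients from
# regressing y_j on y_1, ..., y_{j-1}.  d[j] is the residual variance.
T = np.eye(p)
d = np.empty(p)
d[0] = Y[:, 0].var()
for j in range(1, p):
    fit = Lasso(alpha=0.1).fit(Y[:, :j], Y[:, j])  # L1 penalty zeroes some coefficients
    T[j, :j] = -fit.coef_
    resid = Y[:, j] - fit.predict(Y[:, :j])
    d[j] = resid.var()

# Reassemble the estimated inverse covariance: Sigma^{-1} = T' D^{-1} T.
Sigma_inv = T.T @ np.diag(1.0 / d) @ T
Sigma_hat = np.linalg.inv(Sigma_inv)
```

Because D has positive entries and T is invertible, the resulting estimate is guaranteed positive definite, and zeros induced in T by the L1 penalty give the interpretable sparsity the abstract refers to.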
Received August 2004. Revised October 2005.
Author notes
1. Department of Statistics, Texas A&M University, College Station, Texas 77843-3143, U.S.A. jianhua@stat.tamu.edu
2. Department of Statistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6340, U.S.A. nliu@wharton.upenn.edu
3. Division of Statistics, Northern Illinois University, DeKalb, Illinois 60115-2854, U.S.A. pourahm@math.niu.edu
4. Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York 10032, U.S.A. lxliu@biostat.columbia.edu