Title: Network Cross-Validation for Determining the Number of Communities in Network Data

Abstract: The stochastic block model and its variants have been a popular tool for analyzing large network data with community structure. Model selection for these network models, such as determining the number of communities, has been a challenging statistical inference task. In this paper, we develop an efficient cross-validation approach to determine the number of communities, as well as to choose between the regular stochastic block model and the degree-corrected block model. Our method, called network cross-validation, is based on a block-wise edge splitting technique, combined with an integrated step of community recovery using sub-blocks of the adjacency matrix. The solid performance of our method is supported by theoretical analysis of the sub-block parameter estimation, and is demonstrated in extensive simulations and a data example. Extensions to more general network models are also discussed.
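The block-wise edge splitting mentioned in the abstract might be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the paper's implementation: it assumes nodes are randomly partitioned into folds, that the held-out edges for a fold are those with both endpoints in that fold, and that the remaining rows of the adjacency matrix form the sub-block used for fitting. The function name `blockwise_split` and its parameters are hypothetical, and the community recovery and scoring steps are omitted.

```python
import numpy as np

def blockwise_split(A, n_folds, seed=0):
    """Yield (held_out, train_rows, test_block) for each node fold.

    Hypothetical helper: the split convention here is an assumption,
    not taken from the paper's text.
    """
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    # Randomly partition node indices into n_folds nearly equal groups.
    folds = np.array_split(rng.permutation(n), n_folds)
    for held_out in folds:
        # Nodes outside the current fold.
        train_idx = np.setdiff1d(np.arange(n), held_out)
        # Rectangular sub-block: rows of training nodes, all columns.
        train_rows = A[train_idx, :]
        # Square held-out block: edges among the held-out nodes only.
        test_block = A[np.ix_(held_out, held_out)]
        yield held_out, train_rows, test_block
```

Under this convention, no entry of `test_block` appears in `train_rows`, so a model fitted on the rectangular sub-block could be scored on edges it has not seen; candidate numbers of communities would then be compared by their held-out loss.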
Comments: 22 pages
Subjects: Methodology (stat.ME)
Cite as: arXiv:1411.1715 [stat.ME]
  (or arXiv:1411.1715v1 [stat.ME] for this version)

Submission history

From: Jing Lei
[v1] Thu, 6 Nov 2014 19:44:33 GMT (776kb,D)