On the Difficulty of Selecting Ising Models With Approximate Recovery


Abstract:

In this paper, we consider the problem of estimating the underlying graph associated with an Ising model given a number of independent and identically distributed samples. We adopt an approximate recovery criterion that allows for a number of missed edges or incorrectly included edges, in contrast with the widely studied exact recovery problem. Our main results provide information-theoretic lower bounds on the sample complexity for graph classes imposing constraints on the number of edges, maximal degree, and other properties. We identify a broad range of scenarios where, either up to constant factors or logarithmic factors, our lower bounds match the best known lower bounds for the exact recovery criterion, several of which are known to be tight or near-tight. Hence, in these cases, approximate recovery has a similar difficulty to exact recovery in the minimax sense. Our bounds are obtained via a modification of Fano's inequality for handling the approximate recovery criterion, along with suitably designed ensembles of graphs that can broadly be classed into two categories: 1) those containing graphs that contain several isolated edges or cliques and are thus difficult to distinguish from the empty graph; 2) those containing graphs for which certain groups of nodes are highly correlated, thus making it difficult to determine precisely which edges connect them. We support our theoretical results on these ensembles with numerical experiments.
Page(s): 625 - 638
Date of Publication: 29 July 2016


I. Introduction

Graphical models are a widely used tool for providing compact representations of the conditional independence relations between random variables, and arise in areas such as image processing [1], statistical physics [2], computational biology [3], natural language processing [4], and social network analysis [5]. The problem of graphical model selection consists of recovering the graph structure given a number of independent samples from the underlying distribution.
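
To make the distinction between exact and approximate recovery concrete, the following minimal sketch counts missed and incorrectly included edges between a true graph and an estimate. It is an illustration only: the function names and the `max_errors` threshold are hypothetical, and scoring an estimate by the total number of edge errors is an assumption, since this excerpt does not state the precise distortion measure used in the paper.

```python
def normalize_edges(edges):
    """Represent undirected edges as frozensets so that (u, v) == (v, u)."""
    return {frozenset(edge) for edge in edges}


def edge_errors(true_edges, estimated_edges):
    """Return (missed, spurious): true edges absent from the estimate,
    and estimated edges absent from the true graph."""
    true_set = normalize_edges(true_edges)
    est_set = normalize_edges(estimated_edges)
    missed = len(true_set - est_set)
    spurious = len(est_set - true_set)
    return missed, spurious


def approximately_recovered(true_edges, estimated_edges, max_errors):
    """Assumed criterion: success if the total number of edge errors is at
    most max_errors (hypothetical parameter). max_errors = 0 corresponds
    to exact recovery."""
    missed, spurious = edge_errors(true_edges, estimated_edges)
    return missed + spurious <= max_errors


if __name__ == "__main__":
    true_graph = [(1, 2), (2, 3), (3, 4)]
    estimate = [(2, 1), (3, 4), (4, 5)]  # misses (2, 3), wrongly adds (4, 5)
    print(edge_errors(true_graph, estimate))
    print(approximately_recovered(true_graph, estimate, max_errors=2))
    print(approximately_recovered(true_graph, estimate, max_errors=0))
```

Under this assumed criterion, an estimate with one missed and one spurious edge counts as a success when two errors are tolerated but as a failure under exact recovery.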
