Title: A Log-Linear Graphical Model for Inferring Genetic Networks from High-Throughput Sequencing Data
(Submitted on 17 Apr 2012 (v1), last revised 28 May 2012 (this version, v2))
Abstract: Gaussian graphical models are often used to infer gene networks based on microarray expression data. Many scientists, however, have begun using high-throughput sequencing technologies to measure gene expression. As the resulting high-dimensional data consist of counts of sequencing reads for each gene, Gaussian graphical models are not optimal for modeling gene networks based on such discrete data. We develop a novel method for estimating high-dimensional Poisson graphical models, the Log-Linear Graphical Model, allowing us to infer networks based on high-throughput sequencing data. Our model assumes a pairwise Markov property: conditional on all other variables, each variable is Poisson. We estimate our model locally via neighborhood selection by fitting 1-norm penalized log-linear models. Additionally, we develop a fast parallel algorithm, an approach we call the Poisson Graphical Lasso, permitting us to fit our graphical model to high-dimensional genomic data sets. In simulations, we illustrate the effectiveness of our methods for recovering network structure from count data. A case study on breast cancer microRNAs, a novel application of graphical models, finds known regulators of breast cancer genes and discovers novel microRNA clusters and hubs that are targets for future research.
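The neighborhood selection idea in the abstract (regress each variable on all the others with a 1-norm penalized log-linear model, then read edges off the nonzero coefficients) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the proximal-gradient solver with backtracking, the standardization of predictors, and the "AND" rule for combining neighborhoods are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (sets small entries exactly to zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def poisson_nll(b0, beta, Z, y):
    """Mean Poisson negative log-likelihood (up to a constant), log link."""
    eta = b0 + Z @ beta
    with np.errstate(over="ignore"):  # an overflowing trial step is simply rejected
        return np.mean(np.exp(eta) - y * eta)

def poisson_lasso(Z, y, lam, n_iter=200):
    """1-norm penalized Poisson regression via proximal gradient descent.

    Hypothetical helper: the solver choice is an assumption, not the paper's.
    The intercept b0 is left unpenalized.
    """
    n, p = Z.shape
    b0, beta = np.log(y.mean() + 1e-8), np.zeros(p)
    for _ in range(n_iter):
        resid = np.exp(b0 + Z @ beta) - y
        g0, g = resid.mean(), Z.T @ resid / n
        f, t = poisson_nll(b0, beta, Z, y), 1.0
        while t > 1e-12:  # backtracking line search on the smooth part
            nb0 = b0 - t * g0
            nbeta = soft_threshold(beta - t * g, t * lam)
            d0, d = nb0 - b0, nbeta - beta
            bound = f + g0 * d0 + g @ d + (d0 ** 2 + d @ d) / (2 * t)
            if poisson_nll(nb0, nbeta, Z, y) <= bound + 1e-12:
                break
            t *= 0.5
        b0, beta = nb0, nbeta
    return b0, beta

def neighborhood_select(counts, lam):
    """Estimate an undirected graph from an (n samples x p variables) count matrix.

    Each variable is regressed on the standardized remaining variables
    (standardization is an assumption made here for numerical stability).
    An edge j-k is kept only if both regressions select it ("AND" rule).
    """
    counts = np.asarray(counts, dtype=float)
    n, p = counts.shape
    sd = counts.std(axis=0)
    Z = (counts - counts.mean(axis=0)) / np.where(sd > 0, sd, 1.0)
    B = np.zeros((p, p))
    for j in range(p):  # the p regressions are independent of one another
        others = [k for k in range(p) if k != j]
        _, beta = poisson_lasso(Z[:, others], counts[:, j], lam)
        B[j, others] = beta
    nonzero = B != 0
    return nonzero & nonzero.T

# Toy usage on simulated counts: variables 0 and 1 share a latent Poisson
# component, variable 2 is independent of both.
rng = np.random.default_rng(0)
u = rng.poisson(3.0, size=400)
toy = np.column_stack([u + rng.poisson(1.0, 400),
                       u + rng.poisson(1.0, 400),
                       rng.poisson(4.0, 400)])
adjacency = neighborhood_select(toy, lam=0.4)
```

Because each node's regression in the loop does not depend on the others, the per-node fits could in principle be distributed across workers, which is one plausible reading of why a parallel algorithm suits this setting. The penalty value `lam=0.4` is arbitrary; in practice it would need to be chosen by some model selection procedure.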
Submission history
From: Genevera Allen
[v1] Tue, 17 Apr 2012 23:34:57 GMT (4102kb,D)
[v2] Mon, 28 May 2012 20:46:35 GMT (2146kb,D)