Brief paperRobust dynamical network structure reconstruction☆
Introduction
One of the fundamental interests in systems biology is the discovery of the specific biochemical mechanisms that explain the observed behaviour of a particular biological system. In particular, we consider the problem of reconstructing the network structure from input and partially measured output data of a dynamical system, and in turn uncovering the underlying mechanisms responsible for the observed behaviour. The biological network reconstruction problem challenges come from the necessity to deal with noisy and partial measurements (in particular, the number of hidden/unobservable nodes and their position in the network is unknown) taken from a nonlinear and stochastic dynamical network.
There are several tools in the literature to infer causal network structures. These tools are mainly rooted in three fields: Bayesian inference (Dojer et al., 2006, Yu et al., 2004), information theory (ARACNe Basso et al., 2005, Butte and Kohane, 2000, Faith et al., 2007) and ODE methods (inferelator Bansal and di Bernardo, 2007, Bonneau et al., 2006, di Bernardo et al., 2005, Gardner et al., 2003, Sontag, 2008). Details on these and other methods can be found in several reviews of the field such as Bansal, Belcastro, Ambesi-Impiombato, and Bernardo (2007), Cantone et al. (2009), De Smet and Marchal (2010), Hecker, Lambeck, Toepfer, Someren, and Guthke (2009) and Koyuturk (2010). The vast majority of network reconstruction methods produce estimates of network structure regardless of the informativity of the underlying data. In particular, most methods produce estimates of network structure even in cases with data from only a few experiments. Such data may not contain enough information to enable the accurate reconstruction of the actual network, thus the obtained network estimates can be arbitrarily different from the true network structure (Cantone et al., 2009). To compensate for the lack of information in data, most methods have heuristics that try to “guess” at the remaining information, either by specifying prior distributions or by appealing to a priori beliefs about the nature of real biological networks, such as looking for the sparsest network. Nevertheless, these heuristics bias the results and lead to incorrect estimates of the network structure.
In contrast, our approach has been to identify the conditions when data is sufficiently informative to enable accurate network reconstruction. The results indicate that even in an ideal situation, when the underlying network is linear and time-invariant (LTI) and the measurements are noise-free, network reconstruction is impossible without additional information (Gonçalves & Warnick, 2008). Surprisingly, this information gap is not due to a lack of data, or a deficiency in the number of experiments, but rather it occurs because system states are only partially observed; the information gap is present in all data sets except those that satisfy certain experimental conditions. Our analysis identified a particular experimental protocol that satisfies these necessary conditions to ensure that data will be sufficiently informative to enable network reconstruction. This protocol suggests the following.
- 1.
A network composed of measured species demands experiments.
- 2.
Each experiment requires a distinct input that independently controls a measured species, i.e. experimental input must affect measured species and no other measured species except, possibly, indirectly through measured species .
The work in Gonçalves and Warnick (2008), however, did not take into account the realistic scenario that typically systems are nonlinear and data are noisy. This paper extends and details earlier results in Gonçalves and Warnick (2009) by developing an effective method to reconstruct networks in the presence of noise and nonlinearities, assuming that the conditions for network reconstruction presented above in (1) and (2) have been met. Steady-state (resp. time-series) data can be used to reconstruct the Boolean (resp. dynamical) network structure of the system.
The paper is organised as follows. After a motivating example showing that input–output data alone does not enable network reconstruction, Section 2 reviews dynamical structure functions and gives fundamental results concerning their usefulness in the network reconstruction problem. Section 3 presents the main results of the paper regarding robust network reconstruction from input–output data subject to noise and nonlinearities. Finally, we conclude the paper with biologically inspired examples in Section 4.
Notation. For a matrix denotes the element in the th row and th column while denotes its th column. For a column vector denotes its th element. We define . denotes the identity matrix. When it is clear from the context, we omit the explicit dependence of transfer functions on the Laplace variable , e.g. we write instead of .
Motivating example. Consider the transfer function obtained from data (partial observations) using system identification tools. For simplicity, assume that accurately represents the input–output relation of the original system. This transfer function is consistent with two state-space realisations given by , and (i.e., the third state is hidden/non-observable). Note that both realisations are minimal and correspond to very different network structures as seen in Fig. 1. This demonstrates that even in the idealised setting (LTI system, no noise and perfect system identification), network reconstruction in the presence of hidden/unobservable states is not possible without additional information about the system.
Section snippets
Dynamical structure functions and network reconstruction
In Gonçalves and Warnick (2008) we introduced the notion of dynamical structure functions and showed how they can be used to obtain necessary and sufficient conditions for network reconstruction. For the sake of clarity and completeness, we state these previously obtained results here without proofs. We refer the interested reader to Gonçalves and Warnick (2008) and Yuan, Stan, Warnick, and Goncalves (2009) for the corresponding proofs.
Consider a nonlinear system
Robust network structure reconstruction
In this section, we consider the problem of robustly reconstructing dynamical network structures. Data are obtained from input–output measurements of a noisy nonlinear system. From this type of data we aim to find the internal network structure associated with the linearised system (2). To average out the noise, data-collection experiments are repeated times. For simplicity of exposition, we assume that no a priori information on the internal network structure is available. The results
Biologically inspired examples
This section illustrates with two examples the theoretical results presented in the previous section. The corresponding sets of ordinary differential equation describing the dynamics of the considered networks are used to generate noisy data, which are then fed to our reconstruction algorithm in order to assess its ability to recover the correct network structure.
Conclusion and discussion
This paper proposes a new network reconstruction method in the presence of noise and nonlinearities based on dynamical structure functions. The key idea is to find minimal distances between the existent data and the data required to obtain particular network structures. The method was illustrated with two biologically oriented examples. They showed that even in the presence of nonlinearities and considerable noise network reconstruction was possible. Eventually, when the signal-to-noise ratio
Ye Yuan was born on October 1986. He received his B.Eng. degree (Valedictorian) from the Department of Automation, Shanghai Jiao Tong University in 2008, M. Phil. from the Department of Engineering, Cambridge University in 2009. He is currently a second-year Ph.D. student in Control Group, Department of Engineering, University of Cambridge. Ye was a visiting student at University of New Mexico, the Hong Kong University of Science and Technology, and Luxembourg Centre for Systems Biomedicine. He
References (27)
- et al.
A yeast synthetic network for in vivo assessment of reverse-engineering and modeling approaches
Cell
(2009) - et al.
Robust and optimal control
(1996) - Yuan, Y., Stan, G., Warnick, S., & Goncalves, J. (2010). Robust dynamical network reconstruction from noisy data. In...
- Yuan, Y., Stan, G., Warnick, S., & Goncalves, J. (2009). Minimal dynamical structure realisations with application to...
- et al.
Advances to Bayesian network inference for generating causal networks from observational biological data
Bioinformatics
(2004) - Young
An introduction to Hilbert space
(1988) - et al.
Making sense of it all: bacterial chemotaxis
Nature Reviews Molecular Cell Biology
(2004) - Sontag
Network reconstruction based on steady-state data
Essays in Biochemistry
(2008) - et al.
A model invalidation-based approach for elucidating biological signalling pathways, applied to the chemotaxis pathway in R. sphaeroides
BMC Systems Biology
(2009) - et al.
Interampatteness—a generic property of biochemical networks
IET Systems Biology
(2009)
Cited by (113)
Sparse DCM for whole-brain effective connectivity from resting-state fMRI data
2020, NeuroImageCitation Excerpt :Alternative, and possibly richer, modelling frameworks could of course be considered. For instance, in the control and system identification community several dynamical models with an underlying network structure have been studied (see (Yuan et al., 2011; Chiuso and Pillonetto, 2012; Weerts et al., 2018; Zorzi and Sepulchre, 2016) and references therein). The sparse DCM model provides a good trade-off between model complexity and the need to account for physiological insights and computational issues, all of which should be considered when estimating models for high dimensional data (fMRI recordings) from relatively small (i.e., measured for short time intervals) datasets.
Recommended articles (6)
- Research article
Decentralized Laplacian Eigenvalues Estimation and Collaborative Network Topology Identification
IFAC Proceedings Volumes, Volume 45, Issue 26, 2012, pp. 7-12 - Research article
Identifiability in dynamic network identification
IFAC-PapersOnLine, Volume 48, Issue 28, 2015, pp. 1409-1414 - Research article
Dynamic network structure identification with prediction error methods - basic examples
IFAC Proceedings Volumes, Volume 45, Issue 16, 2012, pp. 876-881 - Research article
Indiscernible topological variations in DAE networks
Automatica, Volume 101, 2019, pp. 280-289 - Research article
On the identifiability of dynamical networks
IFAC-PapersOnLine, Volume 50, Issue 1, 2017, pp. 10580-10585 - Research article
Identifiability of dynamic networks with part of the nodes noise-free
IFAC-PapersOnLine, Volume 49, Issue 13, 2016, pp. 19-24
Ye Yuan was born on October 1986. He received his B.Eng. degree (Valedictorian) from the Department of Automation, Shanghai Jiao Tong University in 2008, M. Phil. from the Department of Engineering, Cambridge University in 2009. He is currently a second-year Ph.D. student in Control Group, Department of Engineering, University of Cambridge. Ye was a visiting student at University of New Mexico, the Hong Kong University of Science and Technology, and Luxembourg Centre for Systems Biomedicine. He is now a visiting student at CDS, Caltech. His research interest lies in the mathematical control theory with applications to network and biology. He is the recipient of Dorothy Hodgkin Postgraduate Awards (Microsoft Research Ph.D. Scholarship), Cambridge Overseas Scholarship and Henry Lester Scholarship.
Guy-Bart Stan was born in Liège, Belgium, in 1977. He received his Ph.D. degree in Applied Sciences (Analysis and Control of Nonlinear Dynamical Systems) from the University of Liège, Belgium in 2005. In 2005, Dr. Stan worked as a Senior Digital Signal Processing Engineer at Philips Applied Technologies, Leuven, Belgium. From 2006 until 2009, he worked as Research Associate in the Control Group of the Department of Engineering at the University of Cambridge, UK, being supported by an EU-FP6 IEF Marie-Curie Fellowship and the UK EPSRC, successively. In 2008, he was a Visiting Scientist at the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, USA. Since 2009 Dr. Stan is a University Lecturer in Engineering Design for Synthetic Biology Systems in the Department of Bioengineering and the EPSRC-funded Centre for Synthetic Biology and Innovation at Imperial College London. His current research interests include the mathematical modelling, analysis and control of complex biological systems/networks occurring in synthetic biology, systems biology, and technological systems.
Sean Warnick received the B.S.E. degree from Arizona State University in 1993, and the S.M. and Ph.D. degrees in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology in 1995 and 2003, respectively. He attended ASU on scholarship from the Flinn Foundation, graduated summa cum laude, and was named the Outstanding Graduate of the College of Engineering and Applied Sciences. Since 2003 Dr. Warnick has been with the Computer Science Department at Brigham Young University, where he is currently an Associate Professor and Director of the interdisciplinary Information and Decision Algorithms Laboratories. Dr. Warnick has also held visiting appointments at Cambridge University (Summer 2006), the University of Maryland (Summer 2008), and the National Security Agency, where he was named the Distinguished Visiting Professor for three consecutive years (2008–2010). Dr. Warnick’s research interests focus on complex networks of uncertain dynamical systems, where he considers issues of representation, identification and reconstruction, control, security and robustness, and verification and validation.
Jorge Goncalves received his Licenciatura (5-year S.B.) degree from the University of Porto, Portugal, and the M.S. and Ph.D. degrees from the Massachusetts Institute of Technology, Cambridge, MA, all in Electrical Engineering and Computer Science, in 1993, 1995, and 2000, respectively. He then held two postdoctoral positions, first at the Massachusetts Institute of Technology for seven months, and from May 2001 to March 2004 at the California Institute of Technology with the Control and Dynamical Systems division. Since April 2004 he has been a Lecturer in the Information Engineering Division of the Department of Engineering at the University of Cambridge. Since 2005 he has been also a Fellow of Pembroke College, Cambridge. He was a visiting researcher at the University of Luxembourg from June to December 2010. Also he is a visiting researcher at California Institute of Technology (January–September 2011). He was the recipient of the Best Student Paper Award at the Automatic Control Conference, Chicago, IL, June 2000.
His research interests include modelling, analysis and control of complex and hybrid systems. In particular, modelling and analysis in systems and synthetic biology, collaborating with biologists in different areas such as circadian rhythms and gene regulatory networks.
- ☆
This work was supported in part by EPSRC grant numbers EP/G066477/1 and EP/I029753/1, AFRL FA8750-09-2-0219 and Microsoft Research through the Ph.D. Scholarship Program. The material in this paper was partially presented at the 49th IEEE Conference on Decision and Control, December 15–17, 2010, Atlanta, Georgia, USA. This paper was recommended for publication in revised form by Associate Editor Elling Jacobsen, under the direction of Guest Editor Francis J. Doyle III.