Information-based fitness and the emergence of criticality in living systems
Edited* by William Bialek, Princeton University, Princeton, NJ, and approved May 27, 2014 (received for review October 12, 2013)
Significance
Recently, evidence has been mounting that biological systems might operate at the borderline between order and disorder, i.e., near a critical point. A general mathematical framework for understanding this common pattern, explaining the possible origin and role of criticality in living adaptive and evolutionary systems, is still missing. We rationalize this apparently ubiquitous criticality in terms of adaptive and evolutionary functional advantages. We provide an analytical framework, which demonstrates that the optimal response to broadly different changing environments occurs in systems organizing spontaneously—through adaptation or evolution—to the vicinity of a critical point. Furthermore, criticality turns out to be the evolutionarily stable outcome of a community of individuals aiming to communicate with each other to create a collective entity.
Abstract
Empirical evidence suggesting that living systems might operate in the vicinity of critical points, at the borderline between order and disorder, has proliferated in recent years, with examples ranging from spontaneous brain activity to flock dynamics. However, a well-founded theory for understanding how and why interacting living systems could dynamically tune themselves to be poised in the vicinity of a critical point is lacking. Here we use tools from statistical mechanics and information theory to show that complex adaptive or evolutionary systems can be much more efficient in coping with diverse heterogeneous environmental conditions when operating at criticality. Analytical as well as computational evolutionary and adaptive models vividly illustrate that a community of such systems dynamically self-tunes close to a critical state as the complexity of the environment increases while they remain noncritical for simple and predictable environments. A more robust convergence to criticality emerges in coevolutionary and coadaptive setups in which individuals aim to represent other agents in the community with fidelity, thereby creating a collective critical ensemble and providing the best possible tradeoff between accuracy and flexibility. Our approach provides a parsimonious and general mechanism for the emergence of critical-like behavior in living systems needing to cope with complex environments or trying to efficiently coordinate themselves as an ensemble.
Physical systems undergo phase transitions from ordered to disordered states on changing control parameters (1, 2). Critical points, with all their remarkable properties (1, 2), are only observed upon parameter fine tuning. This is in sharp contrast to the ubiquity of critical-like behavior in complex living matter. Indeed, empirical evidence has proliferated that living systems might operate at criticality (3)—i.e., at the borderline between order and disorder—with examples ranging from spontaneous brain behavior (4) to gene expression patterns (5), cell growth (6), morphogenesis (7), bacterial clustering (8), and flock dynamics (9). Even if none of these examples is fully conclusive and even if the meaning of “criticality” varies across these works, the criticality hypothesis—as a general strategy for the organization of living matter—is a tantalizing idea worthy of further investigation.
Here we present a framework for understanding how self-tuning to criticality can arise in living systems. Unlike models of self-organized criticality in which some inanimate systems are found to become critical in a mechanistic way (10), our focus here is on general adaptive or evolutionary mechanisms, specific to biological systems. We suggest that the drive to criticality arises from functional advantages of being poised in the vicinity of a critical point.
However, why is a living system fitter when it is critical? Living systems need to perceive and respond to environmental cues and to interact with other similar entities. Indeed, biological systems constantly try to encapsulate the essential features of the huge variety of detailed information from their surrounding complex and changing environment into manageable internal representations, and they use these as a basis for their actions and responses. The successful construction of these representations, which extract, summarize, and integrate relevant information (11), provides a crucial competitive advantage, which can eventually make the difference between survival and extinction. We suggest here that criticality is an optimal strategy to effectively represent the intrinsically complex and variable external world in a parsimonious manner. This is in line with the hypothesis that living systems benefit from having attributes akin to criticality—either statistical or dynamical (3)—such as a large repertoire of dynamical responses, optimal transmission and storage of information, and exquisite sensitivity to environmental changes (2, 5, 12–16).
As conjectured long ago, the capability to perform complex computations, which turns out to be the fingerprint of living systems, is enhanced in “machines” operating near a critical point (17–19), i.e., at the border between two distinct phases: a disordered phase, in which perturbations and noise propagate unboundedly—thereby corrupting information transmission and storage—and an ordered phase, where changes are rapidly erased, hindering flexibility and plasticity. The marginal, critical situation provides a delicate compromise between these two impractical tendencies, an excellent tradeoff between reproducibility and flexibility (12, 13, 16) and, on larger time scales, between robustness and evolvability (20). A specific example of this general framework is genetic regulatory networks (19, 21). Cells ranging from those in complex organisms to single-celled microbes such as bacteria respond to signals in the environment by modifying the expression of their genes. Any given genetic regulatory network, formed by the genes (nodes) and their interactions (edges) (22), can be tightly controlled to robustly converge to a fixed almost-deterministic attractor—i.e., a fixed “phenotype”—or it can be configured to be highly sensitive to tiny fluctuations in input signals, leading to many different attractors, i.e., to large phenotypic variability (23). These two situations correspond to the ordered and disordered phases, respectively. The optimal way for genetic regulatory networks to reconcile controllability and sensitivity to environmental cues is to operate somewhere between the two impractical limits alluded to above (19), as has been confirmed in different experimental setups (5, 7, 24). Still, it is not clear how such tuning to criticality comes about.
Our goal here is to exploit general ideas from statistical mechanics and information theory to construct a quantitative framework showing that self-tuning to criticality is a convenient strategy adopted by living systems to effectively cope with the intrinsically complex external world in an efficient manner, thereby providing an excellent compromise between accuracy and flexibility. To provide some further intuition, we use genetic regulatory networks as a convenient guiding example, but one could equally well consider neural networks, models for the immune response, groups of animals exhibiting collective behavior, etc., with each specific realization requiring a more detailed modeling of its special attributes.
We uncover coevolutionary and coadaptive mechanisms by which communities of living systems, even in the absence of other forms of environmental complexity, converge to be almost critical in the process of understanding each other and creating a “collective entity.” The main result is that criticality is an evolutionary/adaptive stable solution reached by living systems in their striving to cope with complex heterogeneous environments or when trying to efficiently coordinate themselves as an ensemble.
Results
Mathematical Framework.
The external environment in which living systems operate is highly variable, largely unpredictable, and describable in terms of probability distribution functions. Living systems need to modify their internal state to cope with external conditions, and they do so in a probabilistic manner. To be specific, but without loss of generality, we represent an environmental cue “perceived” and processed by a living system as a string of N (binary) variables, s = (s1, s2, … sN). A specific environmental source is modeled by the probability distribution Psrc with which it produces each of the 2^N possible states. For concreteness, this distribution is assumed to depend on a set of parameters, α = (α1, α2, …), accounting for environmental variability. We turn now to an individual living system or “agent,” which seeks to adapt itself to cope with the perceived stimuli/signals emanating from a given environmental source. This is accomplished by changing its internal state, encapsulated in a second probability distribution function, Pint, specified by a different—in principle smaller—parameter set β = (β1, β2, …) aimed at capturing the essential features of Psrc in the most efficient—although in general imperfect—way (see Fig. 1). Henceforth, we denote the external source and its internal representation by Psrc(s|α) and Pint(s|β), respectively.
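As a concrete illustration of this setup (a minimal sketch, not the authors' implementation), both Psrc and Pint can be drawn from the same Curie–Weiss-like family of distributions over all 2^N states of N binary variables; the specific Hamiltonian, the value N = 8, and the function names below are illustrative assumptions.

# Minimal sketch in Python: explicit distributions over binary strings s of length N.
import itertools
import numpy as np

N = 8                                                         # small enough to enumerate all 2^N states
STATES = np.array(list(itertools.product([-1, 1], repeat=N)))

def boltzmann(coupling, field=0.0):
    # P(s) proportional to exp(-H(s)), with the toy Hamiltonian
    # H(s) = -(coupling/N)*(sum_i s_i)^2 - field*sum_i s_i
    m = STATES.sum(axis=1)
    energy = -(coupling / N) * m**2 - field * m
    w = np.exp(-energy)
    return w / w.sum()

p_src = boltzmann(0.7)   # an environmental source, playing the role of Psrc(s|alpha)
p_int = boltzmann(0.4)   # an agent's internal representation, playing the role of Pint(s|beta)

As the coupling is varied, this family interpolates between a disordered regime (nearly independent variables) and an ordered one (strongly aligned variables), which is all that the sketches below require.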
Fig. 1. Living systems coping with the environment. A illustrates a living system responding to an environmental source (e.g., a bacterium responding to external conditions such as the presence/absence of nutrients, pH, or temperature). A given source, labeled by the set of parameters α, can only be probabilistically gauged by the system. Psrc(s|α) is the most accurate representation that the system can potentially generate in terms of the Boolean variables (or bits) s. However, such a representation might not be accessible to the system by merely changing its internal state parameters, β, and the actual internal state, Pint(s|β) (e.g., the probability of a gene expression pattern), is usually an imperfect proxy for Psrc(s|α). The optimal choice of parameters β—aiming at capturing the most relevant features of the environment—is obtained by minimizing the KL divergence of Pint(s|β) from Psrc(s|α). In genetic networks, changing internal parameters is equivalent to changing the interactions between the different (Boolean) variables (nodes of the networks in the figure). B shows a more complex scenario, where the system has to cope with multiple and diverse sources; the internal state has to be able to accommodate each of them. In C, the environment is not imposed ad hoc but is instead composed of other individuals, and every agent needs to cope with (“understand”) the states of the others. Each agent evolves similarly to the others in the community, trying to exhibit the same kind of state, thereby generating a self-organized environment. In the case of sufficiently heterogeneous externally imposed sources, as well as in the self-organized case, we find that evolutionary/adaptive dynamics drive the systems to operate close to criticality.
In our guiding example, the external cues could be, for instance, the environmental conditions (temperature, pH, …), which are variable and can only be probabilistically gauged by a cell or bacterium. The binary vector s = (s1, s2, … sN) can be thought of as the on/off state of the N different genes in its (Boolean) genetic regulatory network (19, 21, 22). In this way, Psrc(s|α) can be interpreted as the probability that s is the most convenient state for the system to adopt in order to cope with a given environmental condition, while Pint(s|β) is the actual probability that the genetic network state (attractor) of a given individual—with its limitations—is s. Without loss of generality, we consider that there is (at least) one control parameter, say β1, such that—other parameters being fixed—it determines in which phase the network is operating.
Our thesis is that the capacity of living systems to tune their internal states to efficiently cope with variable external conditions provides them with a strong competitive advantage. Thus, the internal state Pint(s|β) should resemble as closely as possible the one most in accord with the environmental signal Psrc(s|α); in other words, one seeks the distribution that the system should express to best respond to the external conditions. Information theory provides us with a robust measure of the “closeness” between the target (source) and the actual (internal) probability distribution functions. Indeed, the Kullback−Leibler (KL) divergence (25), D(α|β), quantifies the information loss when the internal state is used to approximate the source (see Materials and Methods). The KL divergence is asymmetric in the two involved probability distributions, it is never negative, and it vanishes if and only if the two distributions are identical (SI Appendix, section S2). Minimizing the KL divergence with respect to the internal state parameters, β, generates the optimal, although in general imperfect, internal state aimed at representing or coping with a given source (see Fig. 1A).
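Continuing the sketch above, the KL divergence and its minimization over the internal parameter take only a few lines; the use of scipy.optimize.minimize_scalar and the parameter bounds are assumptions of convenience. Here the internal family happens to contain the source, so the match is essentially exact, whereas in general the internal representation remains imperfect.

from scipy.optimize import minimize_scalar

def kl_divergence(p, q):
    # D(P|Q) = sum_s P(s) * log(P(s)/Q(s)): information lost when Q is used to approximate P
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def best_internal_parameter(p_src, bounds=(0.0, 2.0)):
    # internal parameter beta minimizing the KL divergence of Pint from a given source
    res = minimize_scalar(lambda beta: kl_divergence(p_src, boltzmann(beta)),
                          bounds=bounds, method="bounded")
    return res.x

beta_star = best_internal_parameter(boltzmann(0.7))   # recovers beta close to 0.7 with D close to 0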
More generally, in an ever-changing world, the requirement for an individual is not just to reproduce a single source with utmost fidelity but rather to be able to successfully cope with a group of highly diverse sources (see Fig. 1B). A particularly interesting example of this would comprise a community of similar individuals who together strive to establish some kind of common collective language (see Fig. 1C). In any of these complex situations, our working hypothesis is that an individual has a larger “fitness” when a characteristic measure, e.g., the mean, of its KL divergences from the set of diverse sources is small; i.e., fit agents are those whose internal states are close to those required by the existing external conditions.
As an illustrative example, consider two individual agents A and B—the source for A is B and vice versa—each with its own probabilistic gene network. The relative fitnesses of A and B are determined by how well the set of cues (described by the probability distribution Psrc) of one organism is captured by the other with minimum information loss, and vice versa [for simplicity, we can assume that the distributions associated with A and B correspond to equilibrium distributions of an Ising model (1, 2) at similar inverse temperatures βA and βB]. If βA = βB, the two distributions would be identical and the KL divergence would vanish. However, this is not a stable solution: if the two parameters are close but not identical, the two KL divergences—from each agent to the other—generically differ (see Materials and Methods), so one of the two agents is always fitter than the other.
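A quick numerical illustration of this asymmetry, again within the toy single-parameter family introduced above (the couplings 0.50 and 0.55 are arbitrary choices):

p_A, p_B = boltzmann(0.50), boltzmann(0.55)
d_AB = kl_divergence(p_A, p_B)   # information lost when B's state is used to represent A
d_BA = kl_divergence(p_B, p_A)   # information lost when A's state is used to represent B
print(d_AB, d_BA, d_AB - d_BA)   # the difference is generically nonzero, so one agent fares better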
Computational Experiments
We have developed diverse computational evolutionary and adaptive models exploiting the ideas above. The dynamical rules used in these models are not necessarily meant to mimic the actual dynamics of living systems; rather, they are efficient ways to optimize fitness. In the evolutionary models, inspired by the genetic algorithm (21, 27), a community of M individuals—each one characterized by its own set of internal parameters β—evolves in time through the processes of death, birth, and mutation (see Materials and Methods). Individuals with larger fitness, i.e., with a smaller mean KL divergence from the set of sources, have a larger probability of producing an offspring, which—apart from small random mutations—inherits its parameters from its ancestor. On the other hand, agents with low fitness are more likely to die and be removed from the community. In the adaptive models, individuals can change their internal parameters if the attempted variation implies an increase of their corresponding fitness (see Materials and Methods). These evolutionary/adaptive rules result in the ensemble of agents converging to a steady-state distribution, which we aim to characterize. We obtain similar results in two families of models, which differ in the way the environment is treated: in the first, the environment is self-generated by a community of coevolving/coadapting individuals, while, in the second, the variable external world is defined ad hoc.
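The death-birth-mutation step shared by these models can be sketched generically as follows; the pairwise tournament rule, the mutation scale sigma, and the fixed random seed are illustrative assumptions rather than the authors' exact choices.

rng = np.random.default_rng(0)

def ga_step(betas, loss, sigma=0.02):
    # One death-birth-mutation event: of a randomly drawn pair of agents, the one with the
    # larger loss (lower fitness) dies and is replaced by a mutated copy of the other.
    i, j = rng.choice(len(betas), size=2, replace=False)
    loser, winner = (i, j) if loss(i, j, betas) > loss(j, i, betas) else (j, i)
    betas[loser] = betas[winner] + sigma * rng.normal()
    return betas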
Coevolutionary Model.
The environment perceived by each individual consists of the other M − 1 systems in the community, which it aims at “understanding” and coping with. In the simplest computational implementation of this idea (see Materials and Methods), a pair of individual agents is randomly selected from the community at each time step, and each of these two individuals constitutes the environmental source for the other. Given that the KL divergence is not symmetric (see Materials and Methods), one of the two agents has a larger fitness and thus a greater probability of generating progeny, while the less fit system is more likely to die. This corresponds to a fitness of agent i that is a decreasing function of its KL divergence from the other. In this case, as illustrated in Fig. 2 (and in Movies S1 and S2), the coevolution of M = 100 agents—which in their turn are sources—leads to a very robust evolutionarily stable steady-state distribution. Indeed, Fig. 2 Left shows that for three substantially different initial parameter distributions (very broad, and localized in the ordered and in the disordered phases, respectively), the community coevolves in time to a unique localized steady-state distribution, which turns out to be peaked at the critical point (i.e., where the Fisher information peaks; see Fig. 2 Right and SI Appendix, section S4). This conclusion is robust against model details and computational implementations: the solution peaked at criticality is an evolutionarily stable attractor of the dynamics. The same conclusions hold for an analogous coadaptive model in which the systems adapt rather than dying and replicating (see SI Appendix, section S6).
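A minimal version of this coevolutionary rule, reusing the toy family and the generic step above (one parameter per agent instead of the paper's two, purely for brevity):

def coevolution_loss(i, partner, betas):
    # Agent i's source is its partner: the loss is the KL divergence of i's internal
    # state from the partner's state.
    return kl_divergence(boltzmann(betas[partner]), boltzmann(betas[i]))

betas = rng.uniform(0.0, 2.0, size=100)
for _ in range(20000):
    betas = ga_step(betas, coevolution_loss)
# In this toy setting the surviving parameters are expected to cluster near the peak of the
# Fisher information, mirroring Fig. 2.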
Fig. 2. Coevolutionary model leads self-consistently to criticality: a community of M living systems (or agents) evolves according to a genetic algorithm dynamics (27); each agent i (i = 1, …, M) is characterized by a two-parameter internal state (β1, β2).
Movie S1. Simulation of the coevolutionary model leading self-consistently to criticality. A community of agents (cognitive systems) coevolves according to a genetic algorithm. Different colors represent different initial conditions, so individuals of different colors do not interact with each other. Each agent has an internal representation of the rest of the community, symbolized by a dot in the two-parameter space (β1, β2). Individuals with better representations are more likely to reproduce, and the offspring inherit the parameters from their parents with small mutations. The simulation shows how, independently of the initial condition, the individuals self-tune to the maximum of the Fisher information, i.e., the critical point.
Movie S2. Simulation of the coevolutionary model for small systems. As in Movie S1, a community of agents coevolves to understand each other. Each agent constructs a representation of the environment, symbolized by a dot in the two-parameter space (β1, β2). The information is encoded in strings of N binary variables: each agent has N = 10 in the upper panel and N = 100 in the lower one. The Fisher information, as a function of the two parameters, is plotted in the background. After iterating the genetic algorithm, the agents localize at the maximum of the Fisher information. When N increases, the peak approaches the critical point.
Evolutionary Model.
An ensemble of M agents is exposed at each particular time to a heterogeneous complex environment consisting of S independent environmental sources, each one with a different Psrc and thus parametrized by a different α (see Fig. 3). The set of S sources is randomly extracted from a broadly distributed pool of possible sources occurring with different probabilities, ρsrc(α). The fitness of an individual with parameters β with respect to any given environment is taken to be a decreasing function of its average KL divergence from the diverse external stimuli (see Materials and Methods).
Fig. 3. Evolutionary model leading to near-criticality in complex environments. A community of M agents undergoes a genetic algorithm dynamics (27). Each agent is simultaneously exposed to diverse stimuli s provided by S different sources, each one characterized by a probability Psrc(s|αu), with u = 1, …, S, fully specified by parameters αu. At each time step, S sources are randomly drawn with probability ρsrc(αu) (in this case, a uniform distribution with support in the colored region). Each agent i (i = 1, …, M) has an internal state Pint(s|βi) aimed at representing—or coping with—the environment. Agents’ fitness increases as the mean KL divergence from the set of sources to which they are exposed decreases. The higher the fitness of an individual, the lower its probability of dying. An agent that is killed is replaced by a new individual with a parameter β inherited from one of the other agents (and, with some probability, a small variation/mutation). The community dynamically evolves and eventually reaches a steady-state distribution of parameters, p(β). The six panels in the figure correspond to different supports (colored regions) of the uniform source distribution ρsrc(αu). The dashed line is the generalized susceptibility (Fisher information) of the internal probability distribution, which exhibits a peak at the critical point separating an ordered from a disordered phase. Heterogeneous source pools (Top and Middle) lead to distributions peaked at criticality, whereas for homogeneous sources (Bottom) the communities are not critical but specialized. The stimuli distributions are parametrized in a rather simplistic way (see Materials and Methods).
Movie S3. Simulation of the evolutionary model leading to criticality in complex environments. A community of agents in different environments evolves according to a genetic algorithm. Every agent is represented by a black dot on the vertical axis. At each time step, different sources are generated from the colored region, each one characterized by a parameter α. The agents have an internal representation of the sources, encoded in their own internal parameter β. Individuals with better representations have more chances to reproduce, and the offspring inherit the parameter β from their parents with a small mutation. When the source pools are heterogeneous, as in the left and right panels, the community evolves near the maximum of the Fisher information, i.e., the critical point; when the sources are very specific, as in the central panels, the agents do not become critical.
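The evolutionary model with externally imposed sources described above admits an analogous sketch; the uniform source pools, the choice of S = 10 sources per step, and the population size are illustrative assumptions.

def evolve_with_sources(source_pool, m_agents=100, s_sources=10, steps=10000, sigma=0.02):
    # Agents compete on their mean KL divergence from the same S randomly drawn sources.
    betas = rng.uniform(0.0, 2.0, size=m_agents)
    for _ in range(steps):
        sources = [boltzmann(a) for a in rng.choice(source_pool, size=s_sources)]
        i, j = rng.choice(m_agents, size=2, replace=False)
        p_i, p_j = boltzmann(betas[i]), boltzmann(betas[j])
        loss_i = np.mean([kl_divergence(src, p_i) for src in sources])
        loss_j = np.mean([kl_divergence(src, p_j) for src in sources])
        loser, winner = (i, j) if loss_i > loss_j else (j, i)
        betas[loser] = betas[winner] + sigma * rng.normal()
    return betas

betas_broad  = evolve_with_sources(np.linspace(0.0, 1.5, 200))    # heterogeneous source pool
betas_narrow = evolve_with_sources(np.linspace(0.10, 0.15, 200))  # nearly homogeneous pool

In this toy setting, the broad pool is expected to push the population toward the susceptibility peak, whereas the narrow pool should leave it specialized near the pool itself, in the spirit of Fig. 3.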
Analytical Results for the Dynamical Models
A generic probability distribution can be rewritten to parallel the standard notation in statistical physics, P(s|γ) = exp(−H(s|γ))/Z(γ), where the factor Z(γ) is fixed through normalization. The function H can be generically written as H(s|γ) = Σμ γμ φμ(s), where the φμ(s) are suitable functions (observables) of the state s and the parameters γμ play the role of generalized couplings.
To proceed further, we need to compute the internal state distribution in the presence of diverse sources distributed with ρsrc(α). In this case, we compute the value of β that minimizes the average KL divergence from the sources α, as written above (an alternative possibility—discussed in SI Appendix, section S3—is to identify the optimal β for each specific source and then average over the source distribution), leading to the condition ⟨φμ(s)⟩β = ∫dα ρsrc(α) ⟨φμ(s)⟩α for every μ: the optimal internal parameters are those for which the internal expectation value of each observable φμ matches its source-averaged value.
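This moment-matching condition can be verified numerically in the toy single-coupling family used in the earlier sketches (the source pool below and the use of minimize_scalar are illustrative assumptions):

# Check: at the beta minimizing the source-averaged KL divergence, the internal
# expectation of the observable m^2 = (sum_i s_i)^2 equals its source-averaged value.
m2 = STATES.sum(axis=1) ** 2
alphas = np.linspace(0.0, 1.5, 50)                        # an arbitrary broad pool of sources
target = np.mean([boltzmann(a) @ m2 for a in alphas])     # source-averaged expectation of m^2

def mean_kl(beta):
    return np.mean([kl_divergence(boltzmann(a), boltzmann(beta)) for a in alphas])

beta_star = minimize_scalar(mean_kl, bounds=(0.0, 2.0), method="bounded").x
print(boltzmann(beta_star) @ m2, target)                  # agree up to optimizer tolerance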
Discussion and Conclusions
Under the mild assumption that living systems need to construct good, although approximate, internal representations of the outer complex world and that such representations are encoded in terms of probability distributions, we have shown—by using concepts from statistical mechanics and information theory—that the encoding probability distributions necessarily lie where the generalized susceptibility, or Fisher information, exhibits a peak (25), i.e., in the vicinity of a critical point, providing the best possible compromise to accommodate both regular and noisy signals.
In the presence of broadly different ever-changing heterogeneous environments, computational evolutionary and adaptive models vividly illustrate how a collection of living systems eventually clusters near the critical state. A more accurate convergence to criticality is found in a coevolutionary/coadaptive setup in which individuals evolve/adapt to represent with fidelity other agents in the community, thereby creating a collective “language,” which turns out to be critical.
These ideas apply straightforwardly to genetic and neural networks—where they could contribute to a better understanding of why neural activity seems to be tuned to criticality—but have a broader range of implications for general complex adaptive systems (21). For example, our framework could be applicable to some bacterial communities for which a huge phenotypic (internal state) variability has been empirically observed (29). Such a large phenotypic diversification can be seen as a form of “bet hedging,” an adaptive survival strategy analogous to stock market portfolio management (30), which turns out to be a straightforward consequence of individuals in the community being critical. From this point of view, networks generically diversify their “assets” among multiple phenotypes to minimize the long-term risk of extinction and to maximize the long-term expected growth rate in the presence of environmental uncertainty (30). Similar bet-hedging strategies have been detected in viral populations and could be explained as a consequence of their respective communities having converged to a critical state, maximizing the hedging effect. Similarly, criticality has recently been shown to emerge through adaptive information processing in machine learning, where networks are trained to produce a desired output from a given input in a noisy environment; when tasks of very different complexity need to be simultaneously learned, networks adapt to a critical state to enhance their performance (31). In summary, criticality in some living systems could result from the interplay between their need for producing accurate representations of the world, their need to cope with many widely diverse environmental conditions, and their well-honed ability to react to external changes in an efficient way. Evolution and adaptation might drive living systems to criticality in response to this smart cartography.
Materials and Methods
Kullback–Leibler Divergence.
Given two probability distributions P(s) and Q(s) for variables s, the KL divergence of Q(s) from P(s), D(P|Q) = Σs P(s) log[P(s)/Q(s)], quantifies the loss of information when Q(s) is used to approximate P(s) (25). Indeed, in the large T limit, the probability that Q(s) generates a sequence of T independent states whose statistics are compatible with P(s) decays as exp[−T D(P|Q)]; the smaller the divergence, the more faithful the representation.
Fisher Information and Criticality.
Given a probability distribution P(s|γ)—where γ can stand either for α or β—the Fisher information is defined as χμν(γ) = ⟨∂ log P(s|γ)/∂γμ ⋅ ∂ log P(s|γ)/∂γν⟩γ, where μ and ν are parameter labels and the average ⟨⋅⟩γ is performed with respect to P(⋅|γ). It measures the amount of information encoded in the states s about the parameters γ (25). This follows from the Cramér−Rao inequality, which states that the error made when we estimate γ from one state s is, on average, greater than (or at least equal to) the inverse of the Fisher information (25). In particular, if χ happens to diverge at some point, it is possible to specify the associated parameters with maximal precision (26). With the parametrization used in the main text, H(s|γ) = Σμ γμ φμ(s), the Fisher information is the generalized susceptibility in the statistical mechanics terminology and measures the response of the system to parameter variations: χμν(γ) = ⟨φμ(s)φν(s)⟩γ − ⟨φμ(s)⟩γ⟨φν(s)⟩γ = −∂⟨φμ(s)⟩γ/∂γν.
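As a numerical counterpart in the toy family used in the sketches above, the Fisher information of the single coupling reduces to the variance of its conjugate observable, and its peak marks the finite-N analogue of the critical point (the grid and N = 8 are illustrative assumptions):

def fisher_information(beta):
    # chi(beta) = Var_beta[phi(s)] with phi(s) = (sum_i s_i)^2 / N; sign conventions drop out
    p = boltzmann(beta)
    phi = STATES.sum(axis=1) ** 2 / N
    return float(p @ phi**2 - (p @ phi) ** 2)

grid = np.linspace(0.0, 2.0, 400)
chi = np.array([fisher_information(b) for b in grid])
beta_peak = grid[np.argmax(chi)]   # location of the susceptibility peak for this finite N

As N grows, this peak sharpens and its location approaches the true critical coupling, consistent with the behavior described for Movie S2.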
Coevolutionary Model.
The kth agent of the community is described by a probability distribution Pint(s|βk) ∝ exp{−Hint(s|βk)}, with a Hamiltonian Hint specified by the two internal parameters βk = (β1, β2).
Evolutionary Model.
A community of agents receiving external stimuli from an outer and heterogeneous environment is modeled as follows. Every specific environmental source corresponds to a probability distribution Psrc(s|α) ∝ exp(−Hsrc(s|α)), with a Hamiltonian Hsrc specified by the source parameters α.
Acknowledgments
We are indebted to T. Hoang, D. Pfaff, J. Uriagereka, S. Vassanelli, and M. Zamparo for useful discussions and to W. Bialek and two anonymous referees for many insightful suggestions. A.M., J.G., and S.S. acknowledge Cariparo Foundation for financial support. M.A.M. and J.H. acknowledge support from J. de Andalucia P09-FQM-4682 and the Spanish MINECO FIS2009-08451.
Footnotes
1J.H. and J.G. contributed equally to this work.
2To whom correspondence may be addressed. Email: mamunoz@onsager.ugr.es or amos.maritan@pd.infn.it.
Author contributions: J.H., J.G., S.S., M.A.M., J.R.B., and A.M. designed research; J.H., J.G., and S.S. performed research; and J.H., J.G., S.S., M.A.M., J.R.B., and A.M. wrote the paper.
The authors declare no conflict of interest.
*This Direct Submission article had a prearranged editor.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1319166111/-/DCSupplemental.
References
1. Stanley HE
2. Binney J, Dowrick N, Fisher A, Newman M
3. Mora T, Bialek W
4. Beggs JM, Plenz D
5. Nykter M, et al.
6. Furusawa C, Kaneko K
7. Krotov D, Dubuis JO, Gregor T, Bialek W
8. Chen X, Dong X, Be’er A, Swinney HL, Zhang HP
9. Bialek W, et al.
10. Jensen HJ
11. Edlund JA, et al.
12. Chialvo DR
13. Beggs JM
14. Kinouchi O, Copelli M
15. Mora T, Walczak AM, Bialek W, Callan CG Jr.
16. Shew WL, Plenz D
17. Langton C
18. Bertschinger N, Natschläger T
19. Kauffman S
20. Wagner A
21. Gros C
22. de Jong H
23. Huang S
24. Balleza E, et al.
25. Cover TM, Thomas J
26. Mastromatteo I, Marsili M
27. Goldberg DE
28. Schwab DJ, Nemenman I, Mehta P
29. Kussell E, Leibler S
30. Wolf DM, Vazirani VV, Arkin AP
31. Goudarzi A, Teuscher C, Gulbahce N, Rohlf T