Perspective | Published:

From networks to optimal higher-order models of complex systems

Nature Physicsvolume 15pages313320 (2019) | Download Citation

AbstractAbstract

Rich data are revealing that complex dependencies between the nodes of a network may not be captured by models based on pairwise interactions. Higher-order network models go beyond these limitations, offering new perspectives for understanding complex systems.

Access provided by University of Bath

MainMain

Network science provides powerful analytical, statistical and computational methods to describe the behaviour of complex systems1. Complex systems are typically composed of a large number of components. Each component interacts directly with only relatively few others, while influencing more components indirectly via chains of direct interactions. Since both direct and indirect interactions determine the behaviour and function of a system, network models of complex systems capture both — generally in two steps.

First, components are represented as nodes xi. Direct interactions between them are represented with possibly weighted and directed pairwise links xixj, which are captured in adjacency matrices or associated to random walk and Laplacian matrices. Second, non-adjacent nodes are transitively connected by matrix algebraic methods; in applications such as eigenvector centrality or spectral clustering, for example, these would be given by products of matrices or eigenvalue decompositions. The application of these methods assumes that, given adjacent pairwise links xixj and xjxk, a node xi can indirectly influence another node xk through a transitive path xixjxk with two independent steps. This assumption is ubiquitous in network science. It is at the root of node-ranking and community detection algorithms2,3,4,5, of scalable techniques to calculate shortest paths, optimal flows and cuts6, as well as of visualization methods7.

The success of network models across the sciences rests on their ability to connect the structure, dynamics and function of arbitrary systems on the basis of abundant data on pairwise interactions between their components. Compared with mean-field approaches, where the interactions between all elements are summarized through a single averaged field, network models often have greater explanatory power because they account for the sparse and non-random topologies of social, biological and technological systems1. However, new forms of high-dimensional and time-resolved data have now also shed light on the limitations of these models.

Rich data indicate who interacts with whom, but also what different types of interactions exist, when and in which order they occur, and whether interactions involve pairs or larger sets of nodes. These seemingly disparate types of data have something in common: they provide us with information on higher-order dependencies between the components of a system, which lay beyond the reach of models that exclusively capture pairwise links. This has profound consequences for network models of relational data — a cornerstone in the interdisciplinary study of complex systems. For example, higher-order dependencies have been shown to either speed up or slow down dynamical processes8,9,10, change node rankings11,12,13 and alter community structures12,14,15,16,17,18.

An active community of researchers is developing higher-order network models that account for different types of higher-order dependencies in data on complex systems. Such models better capture how the components of complex systems directly and indirectly influence each other, promising improved explanatory power at the expense of increased model complexity. Further progress will require integrative approaches that combine novel network-analytic methods for rich data with scalable statistical inference and machine-learning techniques. These will help address open questions, such as finding models that optimally balance under- and overfitting in dependence of available data, or establishing the existence and scalability of a single framework that can capture different types of higher-order dependencies. Finding good answers can in turn further improve our understanding of the structure, dynamics and function of complex systems.

In this Perspective, after a brief overview of different classes of higher-order network models, we focus on the consideration of non-Markovian paths in time-stamped interaction data. We illustrate how that affects fundamental network science methods applied across disciplines, namely community detection, node ranking and the modelling of dynamical processes. Finally, we discuss challenges in developing optimal higher-order models that take advantage of rich data on higher-order dependencies while avoiding the risk of overfitting.

Modelling higher-order dependencies in complex systemsModelling higher-order dependencies in complex systems

Recent work on higher-order network models can be divided into three different yet related lines of research. The first line challenges the assumption that the influence between a system’s components can be decomposed into links of a single type, introducing instead multilayer higher-order models with multiple link types8,19. The second line questions the assumption that the influence between components in a complex system can be decomposed into pairwise links, developing models that generalize pairwise links to arbitrary node sets, which we refer to as combinatorial higher-order models14,20,21,22. The third line challenges the idea that the indirect influence between the components of a system can be understood based on transitive paths formed by independent links. Leveraging information on real paths inferred from time-series data, this research has introduced non-Markovian higher-order network models. They account for correlations in the sequence of nodes traversed by paths that cannot be captured by first-order Markov models10,12.

Multilayer models account for the fact that many real complex systems exhibit multiple types of interactions. Examples include multi-modal transportation systems23, interdependent layers of power and communication infrastructures24, multilayer financial networks25 or multi-faceted relationships between individuals in social systems26. Multilayer generalizations of networks seek to account for these features in, for example, the modelling of spreading processes8, the detection of modular structures27 and the ranking of nodes28,29.

Combinatorial models reproduce many-body interactions, which appear in many systems and necessitate higher-order models that capture information beyond pairwise interactions. Examples include triangles, which are known to be fundamental building blocks of social networks30, cliques in scientific co-authorship networks31, feed-forward loop network motifs in biochemical transcription networks32 and temporal social networks33, spatial coexistence relations between species in an ecosystem34 and trigenic interactions in gene regulatory networks35. Research on combinatorial models has introduced high-dimensional generalizations of graphs from topological data analysis. These include hypergraphs, in which links can join any set of nodes36, and more recently simplicial complexes, in which simplices can join any set of nodes and all subsets of those nodes37,38.

The need for non-Markovian models has been highlighted by a number of studies, which have used high-resolution time-series data to reveal complex higher-order patterns in paths that are not captured by standard network models. Examples include flight itineraries of passengers, patients moving between hospital wards39, time-stamped interactions in social networks40,41,42, scholarly citation networks12, temporal patterns in trade relations43,44, human mobility10,12,45, navigation paths of humans in information networks17,46, patient pathways in hospital networks47 and traces of dynamical processes in networked systems15. By leveraging applications of higher-order Markov chains in time-series analysis48, sequence mining49,50,51, behavioural modelling52,53 and natural language processing54,55, recent research on non-Markovian higher-order models has generalized networks to higher-dimensional representations that account for higher-order dependencies in paths.

Despite differences in motivation and mathematical underpinning, these approaches share a motivation: that standard network models are too simple to explain the complex paths of influence in high-dimensional and time-resolved data on biological, technical, economic and social systems, and thereby cannot adequately connect their structure, dynamics and function. In practice, this is achieved by modelling higher-order dependencies in complex systems and further constraining paths beyond what is expected from the network topology.

As an illustration, consider an ego network with five nodes in which ego communicates by different means with two friends and two colleagues, but rarely passes on information between them (Fig. 1). A standard network model would wash out this kind of higher-order node dependencies, whereas a random walk as an information flow model would form paths across independent pairwise links (Fig. 1a). In contrast, all higher-order network models better capture the constraints on the information paths so that they tend to stay among friends or colleagues. This is achieved by considering node dependencies in the underlying data in different ways: a non-Markovian model records the temporal order of messages so that paths continue depending on where they come from (Fig. 1b), a multilayer model differentiates communication means so that paths mainly stay within associated layers (Fig. 1c), and a combinatorial model combines group and multiple pairwise communication in a simplicial complex and considers paths that move between links that share a triangle (Fig. 1d). In this way, higher-order network models further constrain the indirect paths by which different parts of a system influence each other.

Fig. 1: Different approaches to model an ego network with higher-order dependencies between nodes.
Fig. 1

ad, Ego (central node) communicates by different means with two friends (left nodes) and two colleagues (right nodes). Green and purple arrows highlight paths from one friend (purple) and one colleague (green) through ego. To which nodes these paths can continue depends on the constraints set by a standard network model with Markovian dynamics (a), a non-Markovian network model (b), a multilayer network with Markovian dynamics within layers (c) and a simplicial complex where the paths move between links that share a triangle (d). The thickness of the arrows indicates the volume of flows between nodes.

In the following, we review how modelling of higher-order dependencies between nodes with proper constraints on paths can provide a better understanding of complex systems. We focus on non-Markovian higher-order network models, which explicitly question the assumption that indirect influence between distant nodes happens through transitive paths — common in standard network models.

A useful example to illustrate this concept is the reconstruction of paths from time-series data (Fig. 2a). The temporal information available in this data helps either directly infer the paths or cascades through which information propagates in a system, or indirectly capture time-stamped links that define the concept of causal, or time-respecting, paths9. Considering pairwise interactions, a standard network model would portray the link topology of the underlying system as shown in Fig. 2b. This representation discards information on the links’ contribution to paths, implicitly suggesting that nodes can indirectly influence each other via transitive paths that traverse nodes in a memoryless, Markovian fashion. In our example, nodes A and B can both indirectly influence D and E via four transitive, Markovian paths: ACD, ACE, BCD and BCE (Fig. 2c). However, a closer look at the interaction order in the time-series data (Fig. 2a) reveals that only two of these four possible paths exist in the sequence (Fig. 2d). Network analytic methods that assume transitive, Markovian paths, are therefore not valid. This shortcoming can be overcome by a path-centric view that generalizes networks to higher-order models of paths10,11,12,15,17,56. Figure 2e illustrates this idea with a second-order model that accounts for the topology of paths of length two. In the spirit of higher-order Markov chain models, this model can be represented with a memory network12, where state nodes represent states in a second-order state space and links encode possible transitions between states. Depending on the topology of paths, each of the five physical nodes A, B, C, D and E, which typically are the objects of interest in the real world, has one or more state nodes (Fig. 2e). These state nodes enable efficient higher-order network models of paths. A path described by a Markovian model on the state nodes, directed from one state node to the next with a probability that does not depend on previously visited state nodes, appears non-Markovian on the physical nodes (Fig. 2f). This modelling approach can be generalized to arbitrary order m by adding one state node for each prefix of m – 1 nodes that precedes the current physical node on a path. In this way, we can construct network models that capture higher-order effects in paths for any given order m.

Fig. 2: Non-Markovian higher-order models can better capture the topology of paths in complex systems.
Fig. 2

a, A rich source of path information is time-series data that capture interaction sequences between the components of a system. b,c, Focusing on pairwise interactions, network models abstract a system’s topology with nodes and links (b) while assuming that paths are transitive and Markovian (c). d, Due to the chronological ordering of interactions, the actual paths of indirect influence in time-series data can deviate from this assumption. e,f, Focusing on paths rather than pairwise interactions, higher-order network models with, for example, state nodes (e) can capture the actual topology of indirect influence (f).

Non-Markovian paths and community detectionNon-Markovian paths and community detection

Community detection57 is an umbrella term for a large number of algorithms that group nodes into distinct modules to simplify and highlight essential structures in the network topology. As higher-order network models can capture more complex forms of interactions, generalized community-detection algorithms can capture more complex forms of relational regularities. An example is citation flows between journals and scientific communities with long flow persistence times. A standard network model where journals are connected by weighted directed links that are built by aggregating citations between their articles fails to capture the complex citation flows through multidisciplinary journals such as Nature (Fig. 3a,c)12.

Fig. 3: Community detection of paths can capture overlapping communities.
Fig. 3

The underlying data from Thomson Reuters Web of Science65 are chains of citing articles aggregated in journals, like in Fig. 2a with nodes interpreted as articles in journals A–E. a, A standard first-order Markov representation of citation flows from four specialized journals through multidisciplinary Nature. b, A second-order representation with one state node for each citing journal. c, The standard network representation mixes flows and washes out the boundary between fields. d, A second-order Markov model captures the fact that citation flows through a multidisciplinary journal depends on where they come from and highlights overlapping fields in Nature.

Citation flows from different fields mix and move in a non-realistic way across fields, as the output citation flow from the multidisciplinary journal depends only on the total number of citations directed to another journal, irrespective of where the citations are coming from. For example, Fig. 3c illustrates that, within a standard first-order Markov model, most citation flows from two microbiology journals would continue to two plant science journals. As a result, all these journals would be best assigned to the same field. This showcases how community detection based on a standard network model can wash out boundaries between modules and fail to assign nodes to multiple overlapping modules.

In contrast, a second-order Markov representation of citation flows, which takes into account where citations come from, captures the fact that most citation flows coming to Nature from one field return to the same field (Fig. 3b,d). For example, when going from a first- to a second-order Markov representation, the relative amount of citation flows that return to the same journal after two steps, averaged over all journals, increases from 11% to 22% (ref. 12). Moreover, the non-returning citation flows behave in a more realistic way: Fig. 3b,d illustrates how citation flows from the Journal of Microbiology and the Journal of Bacteriology in microbiology mostly return to either journal and, similarly, how citation flows from Plant Cell and Plant Physiology in plant science mostly return to those journals. As a consequence, citation flows stay within their respective fields, highlighting the multidisciplinary character of the journal Nature. Averaged over all journals, the flow persistence within fields, the probability that citation flows stay within the same field in the next step, increases by 38% (ref. 58). A higher-order representation of non-Markovian citation paths is critical for capturing overlapping research fields in multidisciplinary journals.

Non-Markovian paths and node centralitiesNon-Markovian paths and node centralities

Algorithms that identify important nodes are among the success stories of network science. They help us locate critical elements in networked infrastructures, identify influential actors in social systems or find relevant pages in the World Wide Web. At the heart of these applications are measures for the centrality of nodes based on, for example, their occurrence on the shortest paths between other nodes, their role in flow processes or their influence on the steady state of stochastic dynamics1,59,60.

As these methods assume that node centrality can be characterized on the basis of the topology of pairwise interactions, they can be improved by accounting for the complex structure of paths in high-dimensional, time-resolved data. An example is shown in Fig. 4, which is based on time-stamped social interactions between software developers in a major open-source project. The network model of these interactions (Fig. 4a) allows us to estimate the importance of nodes, for example, using betweenness centrality, a measure that assigns high centrality to a node v if many shortest paths between pairs of other nodes pass through v (ref. 61). The resulting node centralities are represented by node sizes in Fig. 4a, indicating that node B is the most important node in the system.

Fig. 4: Non-Markovian paths change the centrality of nodes in time-stamped social network data.
Fig. 4

a,b, Betweenness centralities calculated based on shortest paths in a network model of time-stamped interactions (a) do not capture the true importance of nodes calculated based on causal paths that respect causality in the underlying time-series data (b). c,d, The alluvial diagrams highlight the fact that the chronological order of interactions alters the shortest causal paths passing through nodes A and B (d), compared with what we would expect based on the topology of direct interactions (c), thus considerably changing the betweenness centrality of nodes.

But is this a good estimate for the relative importance of different developers? We can answer this question by inferring causal paths in the underlying time-series data. That is, we consider which paths exist based on the chronological ordering and timing of time-stamped interactions. In a nutshell, for two interactions AB and BC, a causal path ABC can only exist if AB occurs before BC. Hence, time-stamped network data allow us to calculate causal path statistics that may or may not be consistent with the assumptions in transitive, Markovian paths of standard network models17. In the example shown in Fig. 4, a calculation of betweenness centralities based on actual shortest causal paths11 considerably shifts the relative importance of different developers. The alluvial diagrams in Fig. 4c,d visualize these differences, revealing that the shortest causal paths passing through node B are considerably more constrained than expected. This is due to temporal patterns in human communication behaviour that are not captured by a standard network model. As a result, node B is less central than we would assume based on the network topology. In contrast, node A, which ranks among the least central nodes from a topology perspective, turns out to be the most important node in terms of causal paths in the interaction sequence.

Higher-order models open new ways to address these limitations of existing centrality measures. We can, for instance, generalize networks to higher-order network models that resemble high-dimensional de Bruijn graphs10,17,62. Each node in such an m-dimensional model represents a path of length m – 1. Relative frequencies of paths of length m in time-series data are represented by weighted links, connecting nodes that overlap by m – 1 nodes. This simple construction generalizes standard network models to higher-order generative models of paths, each model of order m being a line graph of the model with order m – 1 (Fig. 5). Similar to memory networks, we can use such models to define higher-order generalizations of path-based centrality measures such as betweenness or closeness11. Moreover, spectral measures such as PageRank or eigenvector centrality can be redefined based on eigenvectors of linear operators derived from de Bruijn graphs or memory networks12,15,17. These novel measures help us better quantify the importance of elements in a complex system, considering a system’s topology as well as temporal patterns in non-Markovian paths. Besides statistical methods that can be used to detect correlations that warrant higher-order models, cross-validation analyses show that the predictions generated by such models indeed outperform those of standard network models17, which confirms that higher-order models can better generalize to unseen data.

Fig. 5: De Bruijn graphs with m dimensions help generalize network analytic methods to higher-order models.
Fig. 5

a, First-order model with dimensionality m = 1 for a set of observed causal paths between four nodes A, B, C and D. b, Second-order model with m = 2. c, Third-order model with m = 3. Starting from a first-order network model, higher-order models can be generated by an iterative line graph construction. The absence of transitions that correspond to a possible transitive path in the underlying first-order network, such as BDBDBD, indicates constraints in the observed paths that change the causal topology of the system.

Non-Markovian paths and dynamical processesNon-Markovian paths and dynamical processes

Along with giving us the ability to reason about topological features including community structures or node centralities, network science enables us to understand how the topology of a system influences dynamical processes, and thus its function. Much of this research is based on the analytical study of linear dynamical systems in which Laplacian, adjacency or transition matrices encode direct pairwise interactions between a system’s elements. The eigenvalues and eigenvectors of these matrix operators capture how the topology of a system influences the efficiency of diffusion and propagation processes, whether it enforces or mitigates the stability of dynamical systems, or if it hinders or fosters collective dynamics.

Although such algebraic methods help relate the structure and dynamics of complex systems, they also come with the assumption of transitive, Markovian paths, which is not justified in many real systems. Figure 6a illustrates an example of such a system — the London Tube system modelled as a network, where nodes represent train stations and links capture direct train connections. To understand how the topology of this transportation network influences its efficiency and robustness, it is common to study its influence on dynamical processes. As a simple example, consider a discrete-time model for the diffusion of passengers who start their journey at a single station at time t = 0 and travel one station per discrete time step. We further adjust each passenger’s probability to continue across a given link based on data on average passenger volumes between London Tube stations, making the passenger more likely to continue through links with high passenger volume. The flow diagram in Fig. 6b shows the first five steps in this process. Assuming transitive, Markovian paths, it highlights how the system’s topology shapes diffusion dynamics. Alternatively, using available data on actual passenger itineraries, we can study this diffusion process based on real paths (Fig. 6c). This study reveals that the topology of the system is not sufficient to explain the complex non-Markovian paths and flows in the system10. Specifically, Fig. 6c reveals a strong directional preference — which would be better captured by a non-backtracking random walk — rooted in the non-Markovian characteristics of paths and the underlying geography. These patterns considerably influence the process and limit what the topology alone can tell us about the robustness and efficiency of real transportation networks.

Fig. 6: Non-Markovian paths in networked systems influence the evolution of diffusion processes.
Fig. 6

a, Network model of the London Tube system, where links capture direct train connections between stations. b,c, The flow diagrams show the first five steps of a discrete-time diffusion process starting in node highlighted in red in a. The widths of flows capture the number of passengers moving on paths between particular nodes in the process. While b shows the dynamics of the process using transitive and Markovian paths in the network, c shows the evolution of diffusion across the non-Markovian paths created by the specific ordering of train connections in the London Tube system. The causal topology created by such non-Markovian paths influences dynamical processes and challenges our understanding of real complex systems.

Non-Markovian higher-order models help us to overcome these issues. We can, for instance, generalize Laplacian and transition matrices to high-dimensional de Bruijn graph models10 that capture the causal topology shaped by non-Markovian paths. Such higher-order representations enable the generalization of methods for dynamical systems, such as eigendecompositions, spectral analysis or stability theory, to systems with non-Markovian paths. They make it possible to analytically study the complex interplay between time and topology in networked systems, and explain why non-Markovian characteristics of paths can both decelerate and accelerate dynamical processes and collective dynamics10,63.

PerspectivesPerspectives

To explain the behaviour of complex systems, we must understand how its components influence each other. Network science provides powerful tools to address this challenge. Network abstractions of direct, pairwise interactions help us explain emergent phenomena that arise from essential features of a system’s topology, rather than from the details of a particular system. Moreover, by combining graph-theoretic methods with ensemble-based techniques, network science provides a solid foundation for statistical analysis, inference and machine learning in relational data. However, the limits of what network models can teach us about real systems are becoming increasingly evident as a result of rich, recently available data on social, technical and biological systems. Capturing complex paths in these data requires advanced modelling techniques, which comes with new challenges but also exciting opportunities for interdisciplinary exchanges between physics, computer science and statistics.

Model selection is an epistemological challenge. Given rich, high-dimensional and time-stamped data on complex systems, how do we know that our selected model explains how a system’s components influence each other? Referring to Ockham’s razor, a good model should be maximally parsimonious: it should make minimal assumptions to enable generalizable statements that go beyond the specific system under study. However, a good model must also be sufficiently sophisticated to explain paths observed in real systems, which is where standard network models often fall short. In other words, much like network science has exposed patterns in the link topology, we need higher-order models that best compress information by modelling higher-order dependencies in complex systems. Effectively, finding such optimal models based on rich data becomes a machine-learning problem, where standard networks are merely one of many possible outcomes.

Scalability is a computational challenge. The size of non-Markovian models often grows exponentially with their order so that an analysis becomes quickly infeasible. Moreover, statistically reliable inference of such models typically requires vast volumes of data, which may not be available. Finally, fixed higher-order models can simultaneously under- and overfit paths in real systems. These issues highlight the need for computational and statistical methods that use variable-order15,58 or multi-order models17 and model-order-reduction techniques16 to generate computationally tractable models that neither under- nor overfit the data (see Box 1). While model selection and statistical learning can be used to fit non-Markovian higher-order models in time-series data17, little is known about how we can address this challenge for other classes of higher-order models and data.

A unified, higher-order modelling framework is an interdisciplinary challenge. While multilayer, combinatorial and non-Markovian higher-order models enrich network science in different ways, a unified framework can potentially combine their strengths. For example, generalized links and paths in combinatorial models, which define them between arbitrary node sets, as well as multilayer models, which include heuristic interlayer links, can benefit from the path-centric view of non-Markovian models with generalized links from data on paths. Similarly, the non-Markovian perspective can benefit from advances made by the other approaches. For example, these other approaches offer generalizations of generative models64 that help us to detect structural patterns and identify the simple mechanisms by which they emerge. Since little is known about the mechanisms by which similar non-Markovian patterns emerge across different systems, a new class of higher-order generative network models would provide a step forward. Finally, a unified mathematical language can enable universal methods to select optimal models across different modelling approaches.

Addressing these challenges, higher-order modelling techniques will be able to leverage existing network methods and extend them towards optimal models that better explain the inner workings and behaviour of complex systems.

Additional informationAdditional information

Journal peer review information: Nature Physics thanks David Gleich and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

ReferencesReferences

  1. 1.

    Newman, M. E. The structure and function of complex networks. SIAM Rev. 45, 167–256 (2003).

  2. 2.

    Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. & Hwang, D. U. Complex networks: structure and dynamics. Phys. Rep. 424, 175–308 (2006).

  3. 3.

    Newman, M. E. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104 (2006).

  4. 4.

    Estrada, E. & Higham, D. J. Network properties revealed through matrix functions. SIAM Rev. 52, 696–714 (2010).

  5. 5.

    Porter, M. A. & Gleeson, J. P. Dynamical Systems on Networks: A Tutorial (Springer, 2006).

  6. 6.

    Even, S. Graph Algorithms (Cambridge Univ. Press, 2011).

  7. 7.

    Seary, A. J. & Richards, W. D. Spectral methods for analyzing and visualizing networks: an introduction. In Dynamic Social Network Modeling and Analysis: Workshop Summary and Papers 209–228 (The National Academies Press, 2000).

  8. 8.

    De Domenico, M., Granell, C., Porter, M. A. & Arenas, A. The physics of spreading processes in multilayer networks. Nat. Phys. 12, 901–906 (2016).

  9. 9.

    Holme, P. & Saramäki, J. Temporal networks. Phys. Rep. 519, 97–125 (2012).

  10. 10.

    Scholtes, I. et al. Causality-driven slow-down and speed-up of diffusion in non-Markovian temporal networks. Nat. Commun. 5, 5024 (2014).

  11. 11.

    Scholtes, I., Wider, N. & Garas, A. Higher-order aggregate networks in the analysis of temporal networks: path structures and centralities. Eur. Phys. J. B 89, 61 (2016).

  12. 12.

    Rosvall, M., Esquivel, A. V., Lancichinetti, A., West, J. D. & Lambiotte, R. Memory in network flows and its effects on spreading dynamics and community detection. Nat. Commun. 5, 5630 (2014).

  13. 13.

    Benson, A. R. Three hypergraph eigenvector centralities. Preprint at https://arxiv.org/abs/1807.09644 (2018).

  14. 14.

    Benson, A. R., Gleich, D. F. & Leskovec, J. Higher-order organization of complex networks. Science 353, 163–166 (2016).

  15. 15.

    Xu, J., Wickramarathne, T. L. & Chawla, N. V. Representing higher-order dependencies in networks. Sci. Adv. 2, e1600028 (2016).

  16. 16.

    Peixoto, T. P. & Rosvall, M. Modelling sequences and temporal networks with dynamic community structures. Nat. Commun. 8, 582 (2017).

  17. 17.

    Scholtes, I. When is a network a network? Multi-order graphical model selection in pathways and temporal networks. In Proc. 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ‘17 1037–1046 (ACM, 2017).

  18. 18.

    Schaub, M. T., Benson, A. R., Horn, P., Lippner, G. & Jadbabaie, A. Random walks on simplicial complexes and the normalized hodge laplacian. Preprint at https://arxiv.org/abs/1807.05044 (2018).

  19. 19.

    Kivelä, M. et al. Multilayer networks. J. Complex Netw. 2, 203–271 (2014).

  20. 20.

    Arenas, A., Fernandez, A., Fortunato, S. & Gomez, S. Motif-based communities in complex networks. J. Phys. A 41, 224001 (2008).

  21. 21.

    Jiang, X., Lim, L.-H., Yao, Y. & Ye, Y. Statistical ranking and combinatorial hodge theory. Math. Program. 127, 203–244 (2011).

  22. 22.

    Petri, G., Scolamiero, M., Donato, I. & Vaccarino, F. Topological strata of weighted complex networks. PLoS ONE 8, e66506 (2002).

  23. 23.

    Cardillo, A. et al. Modeling the multi-layer nature of the European Air Transport Network: resilience and passengers re-scheduling under random failures. Eur. Phys. J. Spec. Top. 215, 23–33 (2013).

  24. 24.

    Buldyrev, S. V., Parshani, R., Paul, G., Stanley, H. E. & Havlin, S. Catastrophic cascade of failures in interdependent networks. Nature 464, 1025–1028 (2010).

  25. 25.

    Battiston, S., Caldarelli, G. & D’Errico, M. in Interconnected Networks (ed. Garas, A.) 195–229 (Springer, 2016).

  26. 26.

    Moreno, J. L. Who Shall Survive? A New Approach to the Problem of Human Interrelations (Beacon House, 1934).

  27. 27.

    Mucha, P. J., Richardson, T., Macon, K., Porter, M. A. & Onnela, J.-P. Community structure in time-dependent, multiscale, and multiplex networks. Science 328, 876–878 (2010).

  28. 28.

    Halu, A., Mondragón, R. J., Panzarasa, P. & Bianconi, G. Multiplex PageRank. PLoS ONE 8, e78293 (2013).

  29. 29.

    De Domenico, M., Solé-Ribalta, A., Omodei, E., Gómez, S. & Arenas, A. Ranking in interconnected multilayer networks reveals versatile nodes. Nat. Commun. 6, 6868 (2015).

  30. 30.

    Granovetter, M. S. The strength of weak ties. Am. J. Sociol. 78, 1360–1380 (1973).

  31. 31.

    Patania, A., Petri, G. & Vaccarino, F. The shape of collaborations. EPJ Data Sci. 6, 18 (2017).

  32. 32.

    Mangan, S. & Alon, U. Structure and function of the feed-forward loop network motif. Proc. Natl Acad. Sci. USA 100, 11980–11985 (2003).

  33. 33.

    Paranjape, A., Benson, A. R. & Leskovec, J. Motifs in temporal networks. In Proc. 10th ACM International Conference on Web Search and Data Mining 601–610 (ACM, 2017).

  34. 34.

    Levine, J. M., Bascompte, J., Adler, P. B. & Allesina, S. Beyond pairwise mechanisms of species coexistence in complex communities. Nature 546, 56–64 (2017).

  35. 35.

    Kuzmin, E. et al. Systematic analysis of complex genetic interactions. Science 360, eaao1729 (2018).

  36. 36.

    Ghoshal, G., Zlatić, V., Caldarelli, G. & Newman, M. E. Random hypergraphs and their applications. Phys. Rev. E 79, 066118 (2009).

  37. 37.

    Mukherjee, S. & Steenbergen, J. Random walks on simplicial complexes and harmonics. Random Struct. Algorithms 49, 379–405 (2016).

  38. 38.

    Boissonnat, J.-D., Chazal, F. & Yvinec, M. Geometric and Topological Inference Vol. 57 (Cambridge Univ. Press, 2018).

  39. 39.

    Liljeros, F., Giesecke, J. & Holme, P. The contact network of inpatients in a regional healthcare system. A longitudinal case study. Math. Popul. Stud. 14, 269–284 (2007).

  40. 40.

    Karsai, M., Kaski, K. & Kertész, J. Correlated dynamics in egocentric communication networks. PLoS ONE 7, e40612 (2012).

  41. 41.

    Pfitzner, R., Scholtes, I., Garas, A., Tessone, C. J. & Schweitzer, F. Betweenness preference: quantifying correlations in the topological dynamics of temporal networks. Phys. Rev. Lett. 110, 198701 (2013).

  42. 42.

    Wei, W. & Carley, K. M. Measuring temporal patterns in dynamic social networks. ACM Trans. Knowl. Discov. Data 10, 9 (2015).

  43. 43.

    Lentz, H. H. K., Selhorst, T. & Sokolov, I. M. Unfolding accessibility provides a macroscopic approach to temporal networks. Phys. Rev. Lett. 110, 118701 (2013).

  44. 44.

    Koher, A., Lentz, H. H. K., Hövel, P. & Sokolov, I. M. Infections on temporal networks — a matrix-based approach. PLoS ONE 11, e0151209 (2016).

  45. 45.

    Matamalas, J. T., De Domenico, M. & Arenas, A. Assessing reliable human mobility patterns from higher order memory in mobile communications. J. R. Soc. Interface 13, 20160203 (2016).

  46. 46.

    Asztalos, A. & Toroczkai, Z. Network discovery by generalized random walks. Europhys. Lett. 92, 50008 (2010).

  47. 47.

    Palla, G. et al. Complex clinical pathways of an autoimmune disease. J. Complex Netw. 6, 206–214 (2017).

  48. 48.

    Box, G. E., Jenkins, G. M. & Reinsel, G. C. Time Series Analysis: Forecasting and Control (John Wiley & Sons, 2013).

  49. 49.

    Salzberg, S. L., Delcher, A. L., Kasif, S. & White, O. Microbial gene identification using interpolated Markov models. Nucleic Acids Res. 26, 544–548 (1998).

  50. 50.

    Benson, A. R., Gleich, D. F. & Lim, L.-H. The spacey random walk: a stochastic process for higher-order data. SIAM Rev. 59, 321–345 (2017).

  51. 51.

    Lu, X., Wetter, E., Bharti, N., Tatem, A. J. & Bengtsson, L. Approaching the limit of predictability in human mobility. Sci. Rep. 3, 2923 (2013).

  52. 52.

    Chierichetti, F., Kumar, R., Raghavan, P. & Sarlós, T. Are web users really Markovian? In Proc. 21st Int. Conf. on World Wide Web 609–618 (ACM, 2012).

  53. 53.

    Singer, P., Helic, D., Taraghi, B. & Strohmaier, M. Detecting memory and structure in human navigation patterns using Markov chain models of varying order. PLoS ONE 9, e114952 (2014).

  54. 54.

    Shannon, C. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948).

  55. 55.

    Dunning, T. Statistical Identification of Language Technical Report (Computing Research Laboratory, 1994).

  56. 56.

    Edler, D. et al. Mapping higher-order network flows in memory and multilayer networks with Infomap. Algorithms 10, 112 (2017).

  57. 57.

    Fortunato, S. Community detection in graphs. Phys. Rep. 486, 75–174 (2010).

  58. 58.

    Persson, C., Bohlin, L., Edler, D. & Rosvall, M. Maps of sparse Markov chains efficiently reveal community structure in network flows with memory. Preprint at https://arxiv.org/abs/1606.08328 (2016).

  59. 59.

    Borgatti, S. P. Centrality and network flow. Soc. Netw. 27, 55–71 (2005).

  60. 60.

    Brandes, U., Heine, M., Müller, J. & Ortmann, M. Positional dominance: concepts and algorithms. In Conf. Algorithms and Discrete Applied Mathematics 60–71 (Springer, 2017).

  61. 61.

    Freeman, L. A set of measures of centrality based on betweeneess. Sociometry 40, 35–41 (1977).

  62. 62.

    de Bruijn, N. G. A combinatorial problem. Proc. K. Ned. Akad. Wet. 49, 758–764 (1946).

  63. 63.

    Zhang, Y., Garas, A. & Scholtes, I. Controllability of temporal networks: an analysis using higher-order networks. Preprint at https://arxiv.org/abs/1701.06331 (2017).

  64. 64.

    Ciftcioglu, E. N., Ramanathan, R. & Basu, P. Generative models for global collaboration relationships. Sci. Rep. 7, 11160 (2017).

  65. 65.

    2013 Journal Citation Reports® (Thomson Reuters, 2013).

Download references

AcknowledgementsAcknowledgements

M.R. was supported by the Swedish Research Council, grant 2016-00796. I.S. acknowledges support by the Swiss National Science Foundation, grant 176938.

Author informationAuthor information

Affiliations

  1. Mathematical Institute, University of Oxford, Oxford, UK

    • Renaud Lambiotte
  2. Integrated Science Lab, Department of Physics, Umeå University, Umeå, Sweden

    • Martin Rosvall
  3. Data Analytics Group, Department of Informatics (IfI), University of Zurich, Zurich, Switzerland

    • Ingo Scholtes

Authors

  1. Search for Renaud Lambiotte in:

  2. Search for Martin Rosvall in:

  3. Search for Ingo Scholtes in:

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Ingo Scholtes.

About this articleAbout this article

Publication history

Received

Accepted

Published

Issue Date

DOI

https://doi.org/10.1038/s41567-019-0459-y

Share this article

Anyone you share the following link with will be able to read this content: