Nature Communications | Article
Robust classification of salient links in complex networks
- Journal name:
- Nature Communications
- Volume:
- 3,
- Article number:
- 864
- DOI:
- doi:10.1038/ncomms1847
- Received
- Accepted
- Published
Abstract
Complex networks in natural, social and technological systems generically exhibit an abundance of rich information. Extracting meaningful structural features from data is one of the most challenging tasks in network theory. Many methods and concepts have been proposed to address this problem such as centrality statistics, motifs, community clusters and backbones, but such schemes typically rely on external and arbitrary parameters. It is unknown whether generic networks permit the classification of elements without external intervention. Here we show that link salience is a robust approach to classifying network elements based on a consensus estimate of all nodes. A wide range of empirical networks exhibit a natural, network-implicit classification of links into qualitatively distinct groups, and the salient skeletons have generic statistical properties. Salience also predicts essential features of contagion phenomena on networks, and points towards a better understanding of universal features in empirical networks that are masked by their complexity.
Subject terms:
At a glance
Figures
-
Figure 1: Generic statistical properties of heterogeneous complex networks. (a) Geographical representation of the worldwide air traffic network (top), black dots represent airports, links represent passenger flux between them, link weights wij are colour encoded from dark (weak) to white (strong). Networks on the lower left and right represent the Florida Bay food web and the World trade network, respectively. Nodes in the food web are species and links represent the exchange of biomass; in the trade network nodes are countries and links quantify exchange in assets measured in USD. (b) Relative frequencies f(w)=kwlp(w/kwl) and p(b) of link weights w and link betweenness b of representative transportation, biological, ecological, social, and economic networks. Link weights are normalized by the mean weight kwl. Details on each network are provided in Methods. In all networks link weights and betweenness are distributed across many orders of magnitude, and both statistics exhibit heavy tails. The substantial variability in these quantities is also reflected in their coefficient of variation (see Table 1).
-
Figure 2: Computation of link salience and properties of the HSS. (a) For each reference node r in the weighted network on the left, the SPT T(r) is computed. The superposition of all trees according to equation (1) assigns a value sij to each link in the original network. Salience values are shown on the right with link colour: red is high salience and grey is low. (b) The collection of high-salience links (red) for the networks shown in Fig. 1. The full networks are shown in grey. (c) The relative frequency p(s) of non-zero salience values s. The distribution p(s) is bimodal in all networks under consideration. This key feature of bimodality of p(s) provides a plausible, parameter-insensitive classification of links, salient (s≈1) versus non-salient (s≈0), and implies that nodes in these networks typically agree whether a link is essential or not. The HSS is defined as the collection of links that accumulate near s≈1. Upper and lower insets depict, respectively, the degree distribution p(k) of the HSSs and mean next-neighbour degree kknnl as a function of node degree k. The HSS degree distribution is typically scale-free (see Supplementary Fig. S2) and the skeletons are typically strongly disassortative. Note that although they may be, and often are, divided into multiple components, the largest connected component of the skeleton typically dominates. This connectedness is not imposed, but is an emergent property of salience. (See Supplementary Table S2).
-
Figure 3: Salience and betweenness capture different aspects of centrality. (a) A schematic planar network in which the colour of links quantifies betweenness b (left) and salience s (right). High betweenness links tend to be located near the barycentre of the network,39 whereas high-salience links are distributed evenly throughout the network. (b) A simple linear chain shows the reason for this effect. A link in the centre serves as a shortest-path bridge between all pairs of nodes, and so has the highest betweenness. But as all SPTs are identical, all links have identical salience. (c) A scatter plot (red dots) of link salience s versus link betweenness b for the air traffic network (point density is quantified in grey). The vertical dotted line marks s=1/2 and the solid curves represent the theoretical bounds of equation (3). The projected density p(b) is shown on the left. The lack of any clear correlation in the scatter plot is typical of all networks in Fig. 1 (See Supplementary Fig. S3 for additional correlograms.) (d) Scatter plots (in light red) of betweenness b (left) and salience s (right) versus link weight w in the air traffic network. The bottom and top of the lower whiskers, the dot, and the bottom and top of the upper whiskers correspond to the 0, 25, 50, 75, and 100th percentiles, respectively. The dashed line indicates a scaling relationship w~bγ with γ≈0.2. Although the network exhibits a positive correlation between link weight and link betweenness, the HSS incorporates links with weights spanning the entire range of observed values; no clear correlation of weight with salience exists. These properties are observed in the other networks as well.
-
Figure 4: Salience in random networks. (a) Salience distributions p(s) in fully connected networks with 1,000 nodes and weights assigned using a power law p(w)~w−(1+α) for various tail exponents α. Complete networks serve as models of systems with all-to-all interactions, such as the inter-industry trade network. Only for unrealistically broad weight distributions (α≲1), does p(s) exhibit a bimodal character. If α>1, bimodality is absent. (b) Salience distributions in preferential attachment networks (1,000 nodes)23 with degree distribution p(k)~k−3 and uniform weights do not exhibit bimodal salience (heavy dashed line). If however the power-law weight distribution is superimposed on the preferential attachment topology, bimodal salience emerges for realistic values of α. (c) For the range of tail exponents α and β, the colour code quantifies the magnitude χ of bimodality in the salience distribution pα,β(s) of a network, with a scale-free degree distribution with exponent β (constructed using the configuration model4) and a scale-free weight distribution with exponent α. Small values of χ correspond to a bimodal pα,β(s). The bimodality measure χ was computed using Kolmogorov–Smirnov distance between pα,β(s) for s>0 and the idealized reference distribution q(s)=δ(s−1).
-
Figure 5: Salience predicts infection pathways in stochastic epidemic models. The scatter plots show the directed salience sd against the normalized frequency of appearance in infection pathways h for each link in an ensemble of 100 networks, averaged over 1,000 epidemic realizations for each member of the ensemble. As in Fig. 3, the plots are divided horizontally into bins, with the heavy black lines indicating quartiles within each bin. Insets show link betweenness b versus h, and correlation coefficients are listed in Table 3. Top, Weights distributed narrowly and uniformly around a constant w0. bottom, Weights distributed according to p(w)~w−(1+β) with α=2.
References
- Newman, M. E. J. The structure and function of complex networks. SIAM Rev 45, 167–256 (2003).
- Strogatz, S. H. Exploring complex networks. Nature 410, 268–76 (2001).
- Albert, R. & Barabási, A.- L. Statistical mechanics of complex networks. Rev. Modern Phys. 74, 47–97 (2002).
- Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. & Hwang, D. Complex networks: structure and dynamics. Phys. Rep. 424, 175–308 (2006).
- Vespignani, A. Predicting the behavior of techno-social systems. Science 325, 425–428 (2009).
- Thiemann, C., Theis, F., Grady, D., Brune, R. & Brockmann, D. The structure of borders in a small world. PLoS ONE 5, e15422 (2010).
- Brockmann, D. Following the money. Phys. World 23, 31–34 (2010).
- Barrat, A., Barthélemy, M., Pastor-Satorras, R. & Vespignani, A. The architecture of complex weighted networks. PNAS 101, 3747–3752 (2004).
- Guimerà, R., Mossa, S., Turtschi, A. & Amaral, L. A. N. The worldwide air transportation network: anomalous centrality, community structure, and cities' global roles. PNAS 102, 7794–7799 (2005).
- Brockmann, D., Hufnagel, L. & Geisel, T. The scaling laws of human travel. Nature 439, 462–465 (2006).
- Woolley-Meza, O. et al. Complexity in human transportation networks: a comparative analysis of worldwide air transportation and global cargo ship movements. Eur. Phys. J. B 84, 589–600 (2011).
- Allesina, S., Alonso, D. & Pascual, M. A general model for food web structure. Science 320, 658–661 (2008).
- Camacho, J., Guimerà, R. & Amaral, L. A. N. Robust patterns in food web structure. Phys. Rev. Lett. 88, 8–11 (2002).
- Lazer, D. et al. Computational social science. Science 323, 721–723 (2009).
- Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N. & Barabási, A.- L. The large-scale organization of metabolic networks. Nature 407, 651–654 (2000).
- Almaas, E., Kovács, B., Vicsek, T., Oltvai, Z. N. & Barabási, A.- L. Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature 427, 839–843 (2004).
- Barabási, A.- L. & Oltvai, Z. N. Network biology: understanding the cell's functional organization. Nat. Rev. Genet. 5, 101–113 (2004).
- Ravasz, E. & Barabási, A.- L. Hierarchical organization in complex networks. Phys. Rev. E 67, 1–7 (2003).
- Alon, U. Network motifs: theory and experimental approaches. Nat. Rev. Genet. 8, 450–461 (2007).
- Newman, M. E. J. & Girvan, M. Finding and evaluating community structure in networks. Phys. Rev. E 69, 26113 (2004).
- Liljeros, F., Edling, C. R., Amaral, L. A. N., Stanley, H. E. & Åberg, Y. The web of human sexual contacts. Nature 411, 907–908 (2001).
- Kleinberg, J. & Lawrence, S. The structure of the web. Science 294, 1849–1850 (2001).
- Barabási, A.- L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
- Newman, M. E. J. Analysis of weighted networks. Phys. Rev. E 70, 56131 (2004).
- Colizza, V., Barrat, A., Barthélemy, M. & Vespignani, A. The role of the airline transportation network in the prediction and predictability of global epidemics. PNAS 103, 2015–2020 (2006).
- Brockmann, D. & Theis, F. Moneycirculation, trackable items, and the emergence of universal human mobility patterns. IEEE Pervasive Computing 7, 28–35 (2008).
- Hufnagel, L., Brockmann, D. & Geisel, T. Forecast and control of epidemics in a globalized world. Proc. Natl. Acad. Sci. USA 101, 15124–15129 (2004).
- Borgatti, S. P. & Everett, M. G. A graph-theoretic perspective on centrality. Soc. Networks 28, 466–484 (2006).
- Wu, Z., Braunstein, L. A., Havlin, S. & Stanley, H. E. Transport in weighted networks: partition into superhighways and roads. Phys. Rev. Lett. 96, 1–4 (2006).
- Wang, H., Hernandez, J. M. & Van Mieghem, P. Betweenness centrality in a weighted network. Phys. Rev. E 77, 1–10 (2008).
- Tumminello, M., Aste, T., Di Matteo, T. & Mantegna, R. N. A tool for filtering information in complex systems. PNAS 102, 10421–10426 (2005).
- Serrano, M. A., Boguñá, M. & Vespignani, A. Extracting the multiscale backbone of complex weighted networks. PNAS 106, 6483–6488 (2009).
- Radicchi, F., Ramasco, J. J. & Fortunato, S. Information filtering in complex weighted networks. Phys. Rev. E 83, 1–9 (2011).
- Caldarelli, G Scale-Free Networks: Complex Webs in Nature and Technology (Oxford University Press, USA, 2007).
- Dijkstra, E. W. A note on two problems in connexion with graphs. Numer. Math. 1, 269–271 (1959).
- Newman, M. E. J. Networks: An Introduction (Oxford University Press, USA, 2010).
- Newman, M. E. J. Assortative mixing in networks. Phys. Rev. Lett. 89, 208701 (2002).
- Clauset, A., Shalizi, C. R. & Newman, M. E. J. Power-law distributions in empirical data. SIAM Rev. 51, 661–703 (2009).
- Barthélemy, M. Spatial networks. Phys. Rep. 499, 1–101 (2011).
- Freeman, L. C. A set of measures of centrality based on betweenness. Sociometry 40, 35–41 (1977).
- Holme, P. Core-periphery organization of complex networks. Phys. Rev. E 72, 46111 (2005).
- Dall'Asta, L., Barrat, A., Barthélemy, M. & Vespignani, A. Vulnerability of weighted networks. J. Stat. Mech. Theor. Exp. 2006, P04006–P04006 (2006).
- Guimerà, R., Sales-Pardo, M. & Amaral, L. A. N. Classes of complex networks defined by role-to-role connectivity profiles. Nat. Phys. 3, 63–69 (2007).
- Sales-Pardo, M., Guimerà, R., Moreira, A. A. & Amaral, L. A. N. Extracting the hierarchical organization of complex systems. PNAS 104, 15224–15229 (2007).
- Woolley-Meza, O., Grady, D., Bagrow, J.P., & Brockmann, D. Eyjafjallajökull and 9/11: The impact of disasters on the world-wide air-traffic, In Preparation (2012).
- Kaluza, P., Kölzsch, A., Gastner, M. T. & Blasius, B. The complex network of global cargo ship movements. J. R. Soc. Interface 7, 1093–1103 (2010).
- White, J. G., Southgate, E., Thomson, J. N. & Brenner, S. The structure of the nervous system of the nematode Caenorhabditis elegans. Phil. Trans. R. Soc. London 314, 1–340 (1986).
- Watts, D. J. & Strogatz, S. H. Collective dynamics of 'small-world' networks. Nature 393, 440–442 (1998).
- Reed, J. L., Vo, T. D., Schilling, C. H. & Palsson, B. O. An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol 4, R54 (2003).
- Ulanowicz, R. E., Bondavalli, C. & Egnotovich, M. S. Network analysis of trophic dynamics in south florida ecosystem. FY 97: the Florida Bay ecosystem. Technical report, CBL 98–123 (1998) .
- Garlaschelli, D. & Loffredo, M. Fitness-dependent topological properties of the world trade web. Phys. Rev. Lett. 93, 1–4 (2004).
- Garlaschelli, D. & Loffredo, M. Structure and evolution of the world trade network. Physica A: Stat. Mech. Appl. 355, 138–144 (2005).
- Serrano, M. & Boguñá, M. Topology of the world trade web. Phys. Rev. E 68, 1–4 (2003).
- Newman, M. E. J. The structure of scientific collaboration networks. PNAS 98, 404–409 (2001).
Author information
Affiliations
-
Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, Illinois, USA.
- Daniel Grady,
- Christian Thiemann &
- Dirk Brockmann
-
Max-Planck-Institut für Dynamik und Selbstorganisation, Göttingen, Germany.
- Christian Thiemann
-
Northwestern Institute on Complex Systems, Northwestern University, Evanston, Illinois, USA.
- Dirk Brockmann
Contributions
D.G., C.T. and D.B. contributed in designing the research. D.G. and D.B. helped in developing the theory. D.G., C.T. and D.B. analysed the data. D.G. performed epidemic simulations. D.G. and D.B. wrote the manuscript.
Competing financial interests
The authors declare no competing financial interests.
Author details
Daniel Grady
Search for this author in:
Christian Thiemann
Search for this author in:
Dirk Brockmann
Search for this author in:
Supplementary information
PDF files
- Supplementary Information (1,093 kB)
Supplementary Figures S1-S3, Supplementary Tables S1-S3, Supplementary Methods and Supplementary References.