Generalized network dismantling
See allHide authors and affiliations
Edited by H. Eugene Stanley, Boston University, Boston, MA, and approved February 15, 2019 (received for review April 12, 2018)
Significance
The proper functioning of many sociotechnical systems depends on their level of connectivity. By removing or deactivating a specific set of nodes, a network structure can be dismantled into isolated subcomponents, thereby disrupting the malfunctioning of a system or containing the spread of misinformation or an epidemic. We propose a generalized network-dismantling framework, which can take realistic removal costs into account such as the node price, the protection level, or removal energy. We discuss applications of cost-efficient dismantling strategies to real-world problems such as containing an epidemic or dismantling criminal or corruption networks.
Abstract
Finding an optimal subset of nodes in a network that is able to efficiently disrupt the functioning of a corrupt or criminal organization or contain an epidemic or the spread of misinformation is a highly relevant problem of network science. In this paper, we address the generalized network-dismantling problem, which aims at finding a set of nodes whose removal from the network results in the fragmentation of the network into subcritical network components at minimal overall cost. Compared with previous formulations, we allow the costs of node removals to take arbitrary nonnegative real values, which may depend on topological properties such as node centrality or on nontopological features such as the price or protection level of a node. Interestingly, we show that nonunit costs imply a significantly different dismantling strategy. To solve this optimization problem, we propose a method which is based on the spectral properties of a node-weighted Laplacian operator and combine it with a fine-tuning mechanism related to the weighted vertex cover problem. The proposed method is applicable to large-scale networks with millions of nodes. It outperforms current state-of-the-art methods and opens more directions for understanding the vulnerability and robustness of complex systems.
Networking the world creates many opportunities, but sometimes also produces undesired side effects. For instance, in a hyperconnected world, systemic instability can seriously undermine the functionality of a network based on cascading effects (1). The quick global spread of rumors and fake news may be seen as recent examples (2, 3), while the spread of epidemics (4, 5) or failure propagation (6, 7) is a problem that has been around much longer. Furthermore, it is known that the network structure, for example the exponent characterizing scale-free networks, is of particular importance for the controllability of cascading effects (8). For certain scaling exponents of scale-free networks, the variance or mean value of relevant quantities may not be well defined, which means that unpredictable or uncontrollable behavior may result. It may then be impossible to contain epidemic-spreading processes. Similar circumstances may make it impossible to contain the spread of computer viruses or misinformation—a problem that not only is relevant for the quick increase of cyberthreats, but also may undermine the functionality of markets or societal or political institutions.
However, the removal or deactivation of even a small set of nodes may dismantle the network into isolated subcomponents and thereby stop the malfunctioning of a system. The effectiveness of node removal depends on the network structure and the removal strategy. For example, scale-free networks (9, 10) are known to be more robust to random removals than Erdős–Rényi networks (11, 12), but at the same time more vulnerable to targeted attacks (13⇓⇓–16).
Finding the most efficient removal strategy which dismantles a network (17, 18) into isolated subcomponents of a specified maximum size at minimum overall cost belongs to a class of hard computational problems, called nondeterministic polynomial hard (NP-hard) problems. Essentially, this implies that there is currently no algorithm that can find the best dismantling solution for large-scale networks. It is, therefore, a challenge to find good approximations of the optimal dismantling strategy. For example, novel approximations (17⇓⇓⇓⇓⇓–23) have been proposed based on spin-glass and optimal percolation theory. However, all these methods make the implicit assumption that the cost of removing nodes is the same. Only recently have people been interested in the effect of removal costs that depends on the node degree (24, 25). However, these studies were restricted to random network structures (24) or edge-based strategies (25), but they did not tackle the generalized network-dismantling optimization problem.
This paper addresses the question of how to select the set of nodes in a network that, when removed or (de)activated, can stop the spread of (dis)information, mitigate an epidemic, or disrupt a malicious system by fragmenting it into small components at minimum overall cost. In the generalized network-dismantling problem, the cost of removing a node can be an arbitrary nonnegative real value, which can, for example, be specified as a function of a node’s centrality properties (26) or also nontopological variables. Understanding the key relationships between dismantling solutions and their overall costs enables one to increase the level of robustness of real-world systems.
The main contributions of this paper are as follows:(i) We argue that some problems such as containing crime, corruption, epidemics, or fake news cannot be adequately addressed with the well-known network-dismantling problem assuming identical node removal costs and that for nonunit removal costs another, more efficient solution strategy is needed. (ii) We study the generalized network-dismantling problem, which seeks to find a set of nodes that, when removed from a network, results in a network fragmentation into components of at most size C at minimal overall cost. Compared with the prevailing approach in network science (17⇓⇓⇓⇓⇓–23), we allow for removal costs that have arbitrary nonnegative real values. (iii) We formulate a node-weighted graph-cutting objective function, which determines the upper bound for the node-weighted bisection. We study its analytical solution and approximation and present analytical bounds and convergence proofs. (iv) To dismantle large-scale networks, we propose an efficient iterative node-weighted spectral bisection method, which has complexity
The generalized network-dismantling problem is related to the weighted partitioning problem in graph theory (29), which was addressed by means of vertex separators. The main difference is that the weighted partitioning problem specifies the number of partitions and the network-dismantling problem specifies only the target size C. For more differences between separator, partitioning, and dismantling problems, see SI Appendix, section 7. Although different versions of weighted separator problems were studied long ago by graph theory (30, 31), the main focus was not on realistic node removal costs for real-world networks. Moreover, according to a recent review of graph partitioning methods (32), these methods are not well applicable to large-scale complex networks due to their broad (or even heavy-tailed) degree distribution compared with traditional graphs. Probably for this reason, the weighted partitioning problem has not been applied as much in the network science community (17⇓⇓⇓⇓–22).
Problems with Nonuniform Removal Costs
While criminal and corruption networks are one of humanity’s biggest problems, it seems that effective ways to dismantle them are still needed. A typical approach to fight organized crime and corruption is to try to identify the underlying organization’s network and then to remove the leader of the organization. It turns out, however, that it often requires an extremely great effort to remove the higher echelons of such organizations, because of their special protection measures. Removal costs of criminals or corrupt persons largely depend on their position in the network. It has also been found that it is often ineffective to remove the boss of a corruption or criminal network, as someone else will quickly take the leadership position of the organization and continue running the criminal or corruption network (33); besides, the transition period is often characterized by an increase in the level of crime, until the power struggle is decided. Therefore, we generalize the dismantling problem to nonunit node removal costs. As we will show, this class of problems has different kinds of solutions. Specifically, the dismantling procedure does not go for the big nodes first. It is less costly (i.e., more effective) to dismantle the network by initially removing some medium-sized nodes. In this paper, we propose an algorithm to solve the generalized network-dismantling problem and apply it to a variety of problems ranging from crime networks to epidemic spreading to corruption networks.
Generalized Network-Dismantling Problem
For a network
Node-Weighted Partition.
Let us assume that we want to partition the network
When the cost matrix equals the identity matrix (
The node-weighted spectral cut is recursively applied to M and
Spectral Approximation.
To find the second-smallest eigenvectors for large-scale networks, we propose the following simple and elegant approximation algorithm, which falls into the class of power-iteration methods (36). Note that
The intuition that the random vector v converges exponentially to some eigenvector of
If
The computational complexity of recursively applying this procedure to smaller and smaller partitions is
Fine-Tuning of the Spectral Solution.
Let us represent by
(A) Steps of the GND algorithm. (A, 1) The inputs are the adjacency matrix A and the node removal cost matrix W. (A, 2) Construction of the cost-weighted network defined by the matrix B and its corresponding node-weighted Laplacian
Reinsertion.
Finally, as the proposed GND method is offering a recursive solution, some of the nodes from early stages of fragmentation do not contribute to the final stage of complete fragmentation. Thus, to produce better dismantling solutions [GND with reinsertion (GNDR)], we apply the reinsertion method (19).
Results
To demonstrate the applicability of the proposed generalized network-dismantling framework to realistic scenarios, we apply it to some real-world networks and show that the current state-of-the-art dismantling strategy (19) delivers different results from the nonunit cost problem, as expected. In addition to the complete dismantling, we also focus on the partial dismantling of the system’s giant connected component (GCC), which reflects the fact that the budget is usually limited such that only a partial dismantling is possible.
Fig. 2 shows some results of network dismantling, which correspond to suppressing the spread of misinformation, computer viruses, or other harmful contagion processes on the online social network [Petster–Hamster (37)]. The cost for the 80% partial dismantling with the state-of-the-art Min-Sum strategy (19) is 0.4. However, although the Min-Sum algorithm removes only 5% of nodes in this process, its cost is rather large. The reason for this becomes clear if we study the degree distribution of the removed nodes in Fig. 2C, where we note that the largest hubs tend to be removed by Min-Sum. In contrast, the random removal of nodes, also known as the site percolation process, with the same cost of 0.4 achieves fragmentation to ∼75% of the original GCC size. This implies that the state-of-the-art algorithms for unit-cost problems tend to be inefficient when applied to problems with nonunit costs. However, with the same cost of 0.4, our GND method fragments the network to 62% of the original GCC size, and for the target of 80% of the GCC size, the corresponding cost is only 0.2.
(A) Network dismantling of three strategies: Min-Sum algorithm (17), random removal (site percolation), and GND. For the same dismantling cost 0.4, the result of Min-Sum is 5% worse than the random removal. (B) Dynamical process [susceptible-infected-removed (SIR) model (27) with
Next, we study the dismantling on different real-world networks for three different state-of-the-art methods: equal graph partitioning (EGP) (39), Min-Sum (17), and belief propagation-guided decimation (BPD) (20). The networks in the main text include (i) a crime network with 754 nodes obtained by the projection of a bipartite network of persons and crimes (37), (ii) a corruption network (38) with 309 nodes and 3,281 edges, (iii) the online social network of Petster–Hamster (37) with 2,000 nodes and 16,631 edges, and (iv) a power-grid network (37) with 4,941 nodes and 6,594 edges. For more algorithms and networks see SI Appendix, section 8.
In Fig. 3, the results show that, for partial dismantling, the proposed methodology (GND and GNDR) achieves the same fragmentation level with a much smaller overall dismantling cost: c = 0.2 with 0.18 (GND) vs. 0.42 (BPD) for the crime network, c = 0.2 with 0.24 (GND) vs. 0.9 (Min-Sum) for the corruption network, c = 0.2 with 0.63 (GND) vs. 0.8 (BPD) for the Petster–Hamster network, and c = 0.3 with 0.11 (GND) vs. 0.17 (BPD) for the power-grid network. In Fig. 4, we show the performance of different algorithms for the unit-cost case on two networks, where the GND and GNDR algorithms show better or comparable performance. More detailed experiments for the unit and nonunit costs are in SI Appendix, section 8. Also in SI Appendix, section 8, we summarize the ratio of the removal cost of BPD, Min-Sum, and GNDR algorithms when the target size is 0.8, 0.6, 0.4, and 0.2, respectively. The results show that the performance of GNDR is better or at least comparable for both degree-based and unit node removal costs. In SI Appendix, section 7, we have constructed the benchmark network with many loops, for which we know the optimal solution. Interestingly, we observe that loops can create problems for the BPD and Min-Sum algorithms and that the optimal solutions for network dismantling and its generalized version may coincide or deviate, depending on the chosen value of c.
Dismantling of the criminal and corruption networks and creation of firewalls to stop the spread of misinformation or malicious software in online networks. We show the size of the GCC vs. the overall dismantling cost for four different networks: (A) crime network (37), (B) corruption network (38), (C) Petster–Hamster online social network (37), and (D) power-grid network (37). The dismantling strategy with a smaller area under the curve performs better. The EGP-W, Min-Sum-W, and BPD-W algorithms are the weighted versions of the original EGP, Min-Sum, and BPD algorithms, generalized to nonunit node removal costs (SI Appendix, section 8). For comparison with more algorithms see SI Appendix, section 8 and Fig. S4.
(A and B) Dismantling curves showing the performance of node removal for the unit-cost case, for the (A) Petster–Hamster online social network (37) and (B) power-grid network (37). We observe that even for unit costs and complete dismantling, our proposed dismantling strategy (GNDR) provides good solutions. For the comparison of more algorithms and more networks, see SI Appendix, section 8, Figs. S5 and S6, and Table S1.
If external information about the removal costs is available, we are able to incorporate it into the matrix W and proceed with our GND method. SI Appendix, section 2 gives spectral bounds for general nonnegative weights, for which the same spectral approximation method can be used. In Fig. 5, we show the results for the world airport network, where the cost
Comparison of the dismantling performance for the airport network, where the removal cost is the total passenger flow of the airport. (A) Setting the target size to
Summary and Conclusions
In this paper, we have studied the generalized network-dismantling problem, which seeks to find a set of nodes allowing one to dismantle a network into components up to small size C in the most cost-effective way. We do not make the assumption that the cost of removing nodes is the same for all of the nodes, which has been typically made before. Instead, we allow for node removal costs that are given by topological properties or nontopological features such as the price or protection level of a node. We acknowledge that, for the unit-cost scenario, the BPD and Min-Sum methods (17, 20) provide good solutions in many cases and have provided additional insights about the problem. However, for networked systems with nonunit node removal costs, current state-of-the-art dismantling methods will often not produce near-optimal results, while our proposed methods (GND and GNDR) do. These are based on a blend of spectral properties of a node-weighted Laplacian operator, a power-iteration method, and weighted vertex cover approximations. Understanding the theory behind network dismantling (17, 20, 24) opens up more research directions for all scientists interested in designing more robust and resilient systems in the future. Interestingly, our dismantling strategy is different from previous ones as it removes medium-size rather than big nodes first. Our results are relevant for the robustness and recommended (re)organization of current sociotechnical systems for different realistic costs. For example, we have demonstrated that generalized network dismantling can enable cost-effective immunization strategies against harmful contagion in social and transportation networks as well as the disruption of criminal and corruption networks.
Ethics
The method presented in this paper aims at offering a possible solution for emergencies where cutting a dysfunctional network into pieces can restore its functionality. However, we also warn of potential misuses or dual uses. When not applied in appropriate contexts and ways, the use of the dismantling approach may undermine the proper functionality of networks. Therefore, we point out that related ethical issues must be sufficiently, appropriately, and transparently addressed (40) when the method is applied. The method must be restricted to legitimate uses and actors. It may be justified to stop harmful cascading problems such as deadly epidemics and the spreading of disruptive computer malware or to dismantle criminal organizations or corruption networks. Note, however, that the use of dismantling strategies to contain misinformation can be potentially problematic, as it may result in censorship if a government, company, news agency, or other institution decides what is misinformation or not. See SI Appendix, section 9 for more details.
Acknowledgments
The authors thank K. K. Kleinberg and A. Lancic for useful comments. X.-L.R. thanks the China Scholarship Council for financial support. N.A.-F. and D.H. are grateful for support from the European Union Horizon 2020 projects: SoBigData under Grant 654024 and CIMPLEX under Grant 641191.
Footnotes
↵1X.-L.R. and N.G. contributed equally to this work.
- ↵2To whom correspondence should be addressed. Email: anino@ethz.ch.
Author contributions: D.H. and N.A.-F. designed research; X.-L.R. and N.A.-F. performed research; X.-L.R., N.G., and N.A.-F. contributed new methods/analytic tools; X.-L.R., N.G., D.H., and N.A.-F. analyzed data; and X.-L.R., N.G., D.H., and N.A.-F. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Data deposition: The code and data have been deposited at https://github.com/renxiaolong/Generalized-Network-Dismantling.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1806108116/-/DCSupplemental.
- Copyright © 2019 the Author(s). Published by PNAS.
This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).
References
- ↵
- Helbing D
- ↵
- Del Vicario M, et al.
- ↵
- Waldrop MM
- ↵
- Vespignani A
- ↵
- Brockmann D,
- Helbing D
- ↵
- Holme P,
- Kim BJ
- ↵
- Buldyrev SV,
- Parshani R,
- Paul G,
- Stanley HE,
- Havlin S
- ↵
- Dorogovtsev SN,
- Goltsev AV,
- Mendes JFF
- ↵
- Barabási A-L,
- Albert R
- ↵
- Dorogovtsev SN,
- Mendes JFF,
- Samukhin AN
- ↵
- Erdős P,
- Rényi A
- ↵
- Gilbert EN
- ↵
- Albert R,
- Jeong H,
- Barabási A-L
- ↵
- Cohen R,
- Erez K,
- Ben-Avraham D,
- Havlin S
- ↵
- Schneider CM,
- Moreira AA,
- Andrade JS,
- Havlin S,
- Herrmann HJ
- ↵
- Gallos LK, et al.
- ↵
- Braunstein A,
- Dall’Asta L,
- Semerjian G,
- Zdeborová L
- ↵
- Morone F,
- Makse HA
- ↵
- Zdeborová L,
- Zhang P,
- Zhou H-J
- ↵
- Mugisha S,
- Zhou H-J
- ↵
- Morone F,
- Min B,
- Bo L,
- Mari R,
- Makse HA
- ↵
- Tian L,
- Bashan A,
- Shi D-N,
- Liu Y-Y
- ↵
- Zhou H-J
- ↵
- Patron A,
- Cohen R,
- Li D,
- Havlin S
- ↵
- Ren X-L,
- Gleinig N,
- Tolić D,
- Antulov-Fantulin N
- ↵
- Lü L, et al.
- ↵
- Tolić D,
- Kleineberg K-K,
- Antulov-Fantulin N
- ↵
- Bar-Yehuda R,
- Even S
- ↵
- Feige U,
- Hajiaghayi MT,
- Lee JR
- ↵
- Fiedler M
- ↵
- Guattery S,
- Miller GL
- ↵
- Buluç A,
- Meyerhenke H,
- Safro I,
- Sanders P,
- Schulz C
- ↵
- Duijn PAC,
- Kashirin V,
- Sloot PMA
- ↵
- Janson S,
- Thomason A
- ↵
- Riolo MA,
- Newman MEJ
- ↵
- Alex P,
- Simon HD,
- Liou K-P
- ↵
- Schwabe D
- Kunegis J
- ↵
- Ribeiro HV,
- Alves LGA,
- Martins AF,
- Lenzi EK,
- Perc M
- ↵
- Chen Y,
- Paul G,
- Havlin S,
- Liljeros S,
- Stanley HE
- ↵
- Helbing D
- Helbing D, et al.
Article Classifications
- Physical Sciences
- Applied Mathematics