An Integrated Approach for Accelerating Stochastic Block Partitioning

Publisher: IEEE

Abstract:
Community detection, or graph partitioning, is a fundamental problem in graph analytics with applications in a wide range of domains including bioinformatics, social media analysis, and anomaly detection. Stochastic block partitioning (SBP) is a community detection algorithm based on sequential Bayesian inference. SBP is highly accurate even on graphs with a complex community structure. However, it does not scale well to large real-world graphs that can contain upwards of a million vertices due to its sequential nature. Approximate methods that break computational dependencies improve the scalability of SBP via parallelization and data reduction. However, these relaxations can lead to low accuracy on graphs with complex community structure. In this paper, we introduce additional synchronization steps through vertex-level data batching to improve the accuracy of such methods. We then leverage batching to develop a high-performance parallel approach that improves the scalability of SBP while maintaining accuracy. Our approach is the first to integrate data reduction, shared-memory parallelization, and distributed computation, thus efficiently utilizing distributed computing resources to accelerate SBP. On a one-million vertex graph processed on 64 compute nodes with 128 cores each, our approach delivers a speedup of 322x over the sequential baseline and 6.8x over the distributed-only implementation. To the best of our knowledge, this Graph Challenge submission is the highest-performing SBP implementation to date and the first to process the one-million vertex graph using SBP.
Date of Conference: 25-29 September 2023
Date Added to IEEE Xplore: 25 December 2023
ISBN Information:
ISSN Information:
Publisher: IEEE
Conference Location: Boston, MA, USA

I. Introduction

Community detection [1], also known as graph partitioning, is a fundamental problem in the field of graph analytics. It aims to find blocks of strongly connected vertices, such that vertices within a block are more strongly interconnected than vertices in different blocks. These communities tend to correspond to functional groups within a graph. Community detection has gained significant attention in a wide variety of domains including bioinformatics [2], [3], workload balancing [4], and recommendation systems [5].

References

References is not available for this document.