Scale-Free Graph Networks with Trillions of Edges: Rapid Generation using 1000 GPUs

Abstract

Synthetic networks are widely used to investigate complex systems across the sciences, including cyber-infrastructures, social networks, the Internet, and epidemiological contact networks. Scale-free networks, which are characterized by a power-law degree distribution, are among the most common synthetic networks used as formal approximations of complex real-life networks. As real-life complex systems grow larger, generating networks that sufficiently mimic reality becomes computationally expensive. To address this issue, we present the latest results from efficiently scaled execution of our novel parallel algorithm for generating random scale-free networks using the preferential-attachment model. The algorithm, named cuPPA, is custom-designed to exploit multiple graphics processing units (GPUs) in distributed computing nodes. Our generator is useful in the development, debugging, and testing phases of high-performance computing (HPC) applications that use very large network data. During those phases, the overhead of moving actual network structure data can be avoided by generating surrogate network structure data on the fly. The algorithm is fast enough to significantly reduce computational overhead during the development phases of a network application. Furthermore, our generator is deterministic, which makes HPC executions reproducible during development. Our algorithm generates extremely large scale-free networks of 4 trillion edges in less than 8 minutes using 1008 NVIDIA Volta GPUs of the Summit supercomputer. To our knowledge, this is the first graph network generation at this scale of parallel execution, with over a thousand GPUs.
Moreover, the algorithm applies equally to offline processing and online stream processing: it is uniquely suited to generating networks in a streaming mode, without explicitly storing (writing to disk) the entire network, and it can target even larger scales with quadrillions of edges.
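
To make the preferential-attachment model and the streaming, deterministic style of generation concrete, here is a minimal sequential sketch in Python. It is not cuPPA and makes no claim about how cuPPA works on GPUs; it is a textbook-style illustration, and the function name `stream_edges` and its parameters `num_nodes`, `m`, and `seed` are hypothetical. Each new node attaches to `m` existing nodes chosen with probability proportional to their current degree, edges are yielded one at a time rather than written to disk, and a fixed seed reproduces the same edge sequence. Note that this sketch still keeps an in-memory list of edge endpoints for sampling, so only the output is streamed.

```python
import random
from typing import Iterator, List, Tuple


def stream_edges(num_nodes: int, m: int, seed: int) -> Iterator[Tuple[int, int]]:
    """Yield (new_node, existing_node) edges of a preferential-attachment graph.

    Illustrative assumptions: nodes are numbered 0..num_nodes-1, the first
    attaching node (node m) links to nodes 0..m-1, and every later node adds
    m edges to distinct existing nodes.
    """
    if m < 1 or num_nodes <= m:
        raise ValueError("need m >= 1 and num_nodes > m")

    rng = random.Random(seed)  # private generator: same seed, same edge stream

    # One entry per edge endpoint, so a uniform draw from this list selects
    # a node with probability proportional to its degree.
    endpoints: List[int] = []
    targets = list(range(m))

    for source in range(m, num_nodes):
        for target in targets:
            yield (source, target)  # hand the edge to the consumer immediately

        endpoints.extend(targets)
        endpoints.extend([source] * m)

        # Pick m distinct targets for the next node, degree-proportionally.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = sorted(chosen)


if __name__ == "__main__":
    # Consume the stream without storing the edges: count degrees only.
    degree = {}
    for u, v in stream_edges(num_nodes=10_000, m=3, seed=42):
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    print("nodes:", len(degree), "max degree:", max(degree.values()))
```

Because the consumer receives edges through a generator, it can process each edge as it arrives, which mirrors the on-the-fly use described above at a much smaller scale.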

Kalyan Perumalla

Kalyan Perumalla is Founder and President of Discrete Computing, Inc. He led advanced research and development at ORNL and holds senior faculty appointments at UTK, GT, and UNL.
