Revisiting Cyclic Reduction and Parallel Prefix-Based Algorithms for Tri-diagonal Systems of Equations


Direct solvers based on prefix computation and cyclic reduction algorithms exploit the special structure of tridiagonal systems of equations to deliver better parallel performance than solvers designed for more general systems of equations. This performance advantage is even more pronounced for block tridiagonal systems. In this paper, we re-examine the performance of these two algorithms, taking the effects of block size into account. Depending on the block size, the parameter space spanned by the number of block rows, the size of the blocks, and the processor count is shown to favor one or the other of the two algorithms. A critical block size that separates these two regions is shown to emerge, and its dependence on both problem-dependent parameters and machine-specific constants is established. Empirical verification of these analytical findings is carried out on up to 2048 cores of a Cray XT4 system.
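As a rough illustration of the cyclic reduction idea named in the abstract, the following is a minimal serial sketch for a scalar tridiagonal system. It is not the paper's block, parallel implementation. The function name `cyclic_reduction` and the coefficient layout (`a` sub-diagonal, `b` diagonal, `c` super-diagonal, `d` right-hand side) are assumptions made for this example, and the sketch assumes a matrix, such as a diagonally dominant one, that needs no pivoting.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for x.

    Illustrative assumptions: a[0] == 0, c[-1] == 0, and no pivoting
    is required (for example, the matrix is diagonally dominant).
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction step: use rows i-1 and i+1 to eliminate the even-indexed
    # unknowns from every odd-indexed row i. Each odd row is updated
    # independently, which is where the parallelism comes from.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        has_next = i + 1 < n
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - (gamma * a[i + 1] if has_next else 0.0))
        rc.append(-gamma * c[i + 1] if has_next else 0.0)
        rd.append(d[i] - alpha * d[i - 1] - (gamma * d[i + 1] if has_next else 0.0))

    # The odd-indexed unknowns form a tridiagonal system of about half the size.
    odd = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitution: recover the even-indexed unknowns from their neighbors.
    x = [0.0] * n
    x[1::2] = odd
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

For example, calling `cyclic_reduction([0.0, 1.0, 1.0], [4.0, 4.0, 4.0], [1.0, 1.0, 0.0], [6.0, 12.0, 14.0])` should return values close to `[1.0, 2.0, 3.0]`. In the block tridiagonal setting the abstract discusses, the scalar divisions would become block solves, which is presumably where block size starts to affect the cost comparison.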

[Pub 117]

Kalyan Perumalla

As a Federal Program Manager in Advanced Scientific Computing Research at the U.S. Dept. of Energy, Office of Science, Kalyan Perumalla manages a $100-million R&D portfolio covering AI, HPC, Quantum, SciDAC, and Basic Computer Science. Within his 25 years of R&D leadership experience, he previously spent 17 years at the Oak Ridge National Laboratory (ORNL) as a Distinguished Research Staff Member, leading advanced R&D on scalable software and applications for the world’s largest supercomputers, including as a line manager and a founding group leader. He has held senior faculty and adjunct appointments at UTK, GT, and UNL, and was an IAS Fellow at Durham University.
