Graphics processing units (GPUs) are now established as efficient alternative computing platforms for certain niche applications. Computationally intensive simulations are among the applications that can use GPUs as computing co-processors. This tutorial introduces the concepts and algorithms for executing simulations on GPUs. Algorithmic aspects of multi-pass execution of time-stepped simulations, and refinements for discrete event execution, are described. Examples from applications such as agent-based simulations are used to illustrate implementations, with source code extracts. Advanced material is also briefly introduced, such as the use of clusters of multiple GPUs through a combination of the Message Passing Interface (MPI) and the Compute Unified Device Architecture (CUDA). Implementation challenges, such as memory hierarchies and the need for latency hiding, are described. The tutorial is structured to minimize duplication of existing GPU literature while remaining self-contained and customized for simulation applications.
[Pub 80]