The Grail Cluster
The Grail cluster is available only to CIERA researchers. It is exceptional in that it provides not only general-purpose compute nodes with 1684 compute ‘cores’ for high-performance computation, but also 10 special-purpose nodes with General-Purpose Graphics Processing Units (GPGPUs) suited to scientific computing, which enable accelerated parallel computation. The cluster also includes three additional “high-memory” nodes. Grail is used not only for running research simulations, but also as a training ground for undergraduate students in high-performance computing and computational research.
In more detail, the cluster consists of 48 standard compute nodes, 10 GPU compute nodes, and 3 high-memory nodes. Almost all nodes have dual 14-core CPUs (Intel Xeon E5-2680 v4 [‘Broadwell’ cores], running at 2.4 GHz). The standard compute nodes and GPU nodes have 128 GB of RAM per node; the high-memory nodes have 512 GB per node. The nodes are interconnected with InfiniBand FDR. The ten GPU nodes each contain dual NVIDIA Tesla K80 cards; each card has two NVIDIA Kepler GK210 GPUs with 12 GB of RAM per GPU. The cluster also provides significant storage for computations and simulation results: 44 TB of high-speed disk storage in a RAID configuration, directly connected to the cluster.
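Access to the different node types on a cluster like this is typically mediated by a batch scheduler. As an illustrative sketch only, the script below assumes a Slurm-style scheduler; the partition name, GRES label, module name, and executable are hypothetical placeholders (consult the cluster's own documentation for the real values), while the core, memory, and GPU counts reflect the hardware described above:

```shell
#!/bin/bash
#SBATCH --job-name=gpu-example     # hypothetical example job
#SBATCH --partition=gpu            # partition name is an assumption; check local docs
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=28       # dual 14-core Broadwell CPUs per node
#SBATCH --gres=gpu:k80:2           # request both K80 cards on a GPU node (GRES name assumed)
#SBATCH --mem=120G                 # standard and GPU nodes have 128 GB of RAM
#SBATCH --time=01:00:00

module load cuda                   # module name is an assumption
srun ./my_gpu_simulation           # placeholder for your own executable
```

A job needing more than 128 GB of RAM would instead target the high-memory nodes (512 GB each), e.g. with a larger `--mem` request on the appropriate partition.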
Northwestern’s Research Computing group houses and maintains the Grail cluster alongside its Quest high-performance computer. The facility provides excellent space, cooling, and power (with UPS support) for the system.
Funding for the Grail computer cluster was provided by the National Science Foundation’s Major Research Instrumentation (MRI) program, grant PHY-1126812.
If you use Grail for your research, here is some example text to acknowledge NSF support in your paper: “This work used computing resources at CIERA funded by NSF PHY-1126812.”