Abstraction of Computation and Data Motion in High-Performance Computing Systems

Date

2019-12

Abstract

Supercomputers are at the forefront of science and technology and play a crucial role in advancing contemporary scientific and industrial domains. This progress stems from rapid developments in the underlying hardware, which in turn have led to complicated hardware designs. Whereas supercomputers of past decades were homogeneous in design, current systems are widely heterogeneous in order to lower their energy and power consumption.

As the hardware architectures of supercomputers become more complex, so do the software applications that target them. In recent years, scientists have turned to directive-based programming models, such as OpenMP and OpenACC, to mitigate the complexity of developing software. These programming models enable scientists to parallelize their code with minimal changes to the source. However, targeting heterogeneous systems effectively remains a challenge, even with productive programming environments.

In this dissertation, we introduce a directive-based programming model and a hierarchical model that improve the usability and portability of several scientific applications and prepare them for the exascale era. The first, our pointerchain directive, replaces a chain of pointers with its corresponding effective address inside a parallel region of code. Pointerchain enables developers to perform deep copies of data structures efficiently on heterogeneous platforms. Based on our analysis, pointerchain led to 39% and 38% reductions in the amount of generated code and the total number of executed instructions, respectively.
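
The idea behind replacing a pointer chain with its effective address can be illustrated by doing the same transformation by hand. The sketch below is not the pointerchain directive itself and does not use its syntax; the struct names (Simulation, Atoms, Positions) and both functions are hypothetical, chosen only to resemble a molecular dynamics data layout. It contrasts a loop that walks the full chain on every access with one that resolves the chain once before the loop.

```c
#include <stddef.h>

/* Hypothetical nested data structures; the names are illustrative
   assumptions and do not come from the dissertation. */
typedef struct { double *x; size_t n; } Positions;
typedef struct { Positions *pos; } Atoms;
typedef struct { Atoms *atoms; } Simulation;

/* Baseline: the chain sim->atoms->pos->x is dereferenced on every
   iteration, so the loop body depends on the whole nested structure. */
void shift_chained(Simulation *sim, double dx) {
    for (size_t i = 0; i < sim->atoms->pos->n; ++i)
        sim->atoms->pos->x[i] += dx;
}

/* Manual version of the idea: resolve the chain once, outside the loop,
   and use the resulting address inside it. */
void shift_hoisted(Simulation *sim, double dx) {
    double *x = sim->atoms->pos->x;   /* effective address of the array */
    size_t n = sim->atoms->pos->n;    /* loop bound read once */

    /* In an OpenACC build, a directive such as
         #pragma acc parallel loop copy(x[0:n])
       could be placed here; only the flat array x would then need to be
       transferred, rather than every level of the nested structure. */
    for (size_t i = 0; i < n; ++i)
        x[i] += dx;
}
```

Both functions produce the same result on the host. The hoisted form is the one that lends itself to offloading, because the parallel region refers only to a plain array and a scalar bound.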

Second, our hierarchical model, Gecko, abstracts the underlying memory hierarchy of exascale platforms. This abstraction paves the way for developing scientific applications on supercomputers. To demonstrate its feasibility, we implemented Gecko as a directive-based programming model. To evaluate its effectiveness, we ported real scientific benchmark applications, ranging from linear algebra to fluid dynamics, to Gecko. We also demonstrated how Gecko helps developers with code portability and ease of use in real scenarios. Gecko achieved a 3.3x speedup on a four-GPU system relative to a single GPU while maintaining a single source-code base.

Keywords

Abstraction, Hierarchy, Heterogeneity, Portable, Shared Memory, Programming, Model, Language, Exascale, HPC, Directives, Supercomputer

Citation

Portions of this document appear in:

Ghane, Millad, Sunita Chandrasekaran, and Margaret S. Cheung. "Gecko: Hierarchical Distributed View of Heterogeneous Shared Memory Architectures." In Proceedings of the 10th International Workshop on Programming Models and Applications for Multicores and Manycores, pp. 21-30. ACM, 2019.

Ghane, Millad, Sunita Chandrasekaran, and Margaret S. Cheung. "pointerchain: Tracing pointers to their roots – A case study in molecular dynamics simulations." Parallel Computing 85 (2019): 190-203.

Ghane, Millad, Sunita Chandrasekaran, and Margaret S. Cheung. "Assessing Performance Implications of Deep Copy Operations via Microbenchmarking." arXiv preprint arXiv:1906.01128 (2019).