Performance Tuning and Modeling of Communication in Parallel Applications






The goal of high performance computing is to execute very large problems in the least amount of time, typically by deploying parallelization techniques. However, introducing parallelization to an application also introduces synchronization and communication overhead, which in turn creates a performance bottleneck. Performance modeling and tuning can be used to predict and ease this bottleneck and thereby improve the overall performance of the application. Two aspects of an application can be improved from a performance point of view, namely the computational section and the communication section. The time spent in communication operations is a major factor in determining the scalability of parallel applications. Tuning the parameters of a communication library can be used to adapt its characteristics to a particular platform, minimizing the communication time of an application. Performance modeling, on the other hand, can be used to predict performance from network and application attributes.

The goal of this dissertation is to improve the performance of parallel applications through performance tuning and performance modeling. Specifically, we introduce the notion of a personalized MPI library, highlighting why each application needs a communication library tuned for its particular platform and the methodology for obtaining one. Secondly, this dissertation contributes to the theoretical understanding of the impact and limitations of point-to-point communication performance on collective communication and on the overall application. This study is further extended to develop performance models for the communication aspect of collective I/O for one- and two-dimensional data decomposition, and for two file partitioning strategies, namely even and static partitioning.



Performance tuning, Performance models, Collective communication, Point-to-point communication


Portions of this document appear in: Jha, Shweta, and Edgar Gabriel. "Impact and limitations of point-to-point performance on collective algorithms." In 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp. 261-266. IEEE, 2016. And in: Jha, Shweta, Edgar Gabriel, and Saber Feki. "A Personalized MPI library for Exascale Applications and Environments." Workshop on Exascale MPI at Supercomputing Conference 2014, November 17, 2014, New Orleans, LA, USA.