Gabriel, Edgar
2019-09-13
May 2017
Portions of this document appear in: Jha, Shweta, and Edgar Gabriel. "Impact and limitations of point-to-point performance on collective algorithms." In 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp. 261-266. IEEE, 2016. And in: Jha, Shweta, Edgar Gabriel, and Saber Feki, "A Personalized MPI library for Exascale Applications and Environments," Workshop on Exascale MPI at Supercomputing Conference 2014, November 17, 2014, New Orleans, LA, USA.
https://hdl.handle.net/10657/4509
The goal of high performance computing is to execute very large problems in the least amount of time, typically by deploying parallelization techniques. However, introducing parallelization to an application also introduces synchronization and communication overhead, which in turn creates a performance bottleneck. Performance modeling and tuning can be used to predict and ease this bottleneck and so improve the overall performance of the application. Two aspects of an application can be improved from a performance point of view: the computational section and the communication section. The time spent in communication operations is a major factor in determining the scalability of parallel applications. Tuning the parameters of a communication library adapts its characteristics to a particular platform, minimizing the communication time of an application. Performance modeling, on the other hand, predicts performance from network and application attributes.
The goal of this dissertation is to improve the performance of a parallel application through performance tuning and performance modeling. Specifically, we introduce the notion of a personalized MPI library, highlighting why each application needs a communication library tuned for its particular platform and the methodology for obtaining one.
Secondly, this dissertation contributes to the theoretical understanding of the impact and limitations of point-to-point communication performance on collective communication and on the overall application. This study is further extended to develop performance models for the communication aspect of collective I/O for one- and two-dimensional data decomposition, and for two file partitioning strategies, namely even and static partitioning.
application/pdf
eng
The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. UH Libraries has secured permission to reproduce any and all previously published materials contained in the work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
Performance tuning
Performance models
Collective communication
Point-to-point communication
Performance Tuning and Modeling of Communication in Parallel Applications
Thesis
born digital