A Compiler Optimization Framework for Directive-Based GPU Computing

dc.contributor.advisorChapman, Barbara M.
dc.contributor.committeeMemberGabriel, Edgar
dc.contributor.committeeMemberSubhlok, Jaspal
dc.contributor.committeeMemberShi, Weidong
dc.contributor.committeeMemberRodgers, Gregory
dc.creatorTian, Xiaonan 1983-
dc.date.accessioned2018-11-30T21:08:31Z
dc.date.available2018-11-30T21:08:31Z
dc.date.createdAugust 2016
dc.date.issued2016-08
dc.date.submittedAugust 2016
dc.date.updated2018-11-30T21:08:31Z
dc.description.abstractIn the past decade, accelerators, commonly Graphics Processing Units (GPUs), have played a key role in achieving Petascale performance and driving efforts to reach Exascale. However, significant advances in programming models for accelerator-based systems are required in order to close the gap between achievable and theoretical peak performance. These advances should not come at the cost of programmability. Directive-based programming models for accelerators, such as OpenACC and OpenMP, help non-expert programmers to parallelize applications productively and enable incremental parallelization of existing codes, but typically will result in lower performance than CUDA or OpenCL. The goal of this dissertation is to shrink this performance gap by supporting fine-grained parallelism and locality-awareness within a chip. We propose a comprehensive loop scheduling transformation to help users more effectively exploit the multi-dimensional thread topology within the accelerator. An innovative redundant execution mode is developed in order to reduce unnecessary synchronization overhead. Our data locality optimizations utilize the different types of memory in the accelerator. Where the compiler analysis is insufficient, an explicit directive-based approach is proposed to guide the compiler to perform specific optimizations. We have chosen to implement and evaluate our work using the OpenACC programming model, as it is the most mature and well-supported of the directive-based accelerator programming models, having multiple commercial implementations. However, the proposed methods can also be applied to OpenMP. For the hardware platform, we choose GPUs from Advanced Micro Devices (AMD) and NVIDIA, as well as Accelerated Processing Units (APUs) from AMD. We evaluate our proposed compiler framework and optimization algorithms with SPEC and NAS OpenACC benchmarks; the result suggests that these approaches will be effective for improving overall performance of code executing on GPUs. With the data locality optimizations, we observed up to 3.22 speedup running NAS and 2.41 speedup while running SPEC benchmarks. For the overall performance, the proposed compiler framework generates code with competitive performance to the state-of-art of commercial compiler from NVIDIA PGI.
dc.description.departmentComputer Science, Department of
dc.format.digitalOriginborn digital
dc.format.mimetypeapplication/pdf
dc.identifier.urihttp://hdl.handle.net/10657/3546
dc.language.isoeng
dc.rightsThe author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
dc.subjectOpenACC
dc.subjectCompiler Construction
dc.subjectCompiler Optimization
dc.subjectData Locality Optimization
dc.subjectLoop Scheduling
dc.subjectRegister Optimization
dc.titleA Compiler Optimization Framework for Directive-Based GPU Computing
dc.type.dcmiText
dc.type.genreThesis
thesis.degree.collegeCollege of Natural Sciences and Mathematics
thesis.degree.departmentComputer Science, Department of
thesis.degree.disciplineComputer Science
thesis.degree.grantorUniversity of Houston
thesis.degree.levelDoctoral
thesis.degree.nameDoctor of Philosophy

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
TIAN-DISSERTATION-2016.pdf
Size:
2.55 MB
Format:
Adobe Portable Document Format

License bundle

Now showing 1 - 1 of 1
No Thumbnail Available
Name:
LICENSE.txt
Size:
1.81 KB
Format:
Plain Text
Description: