High-Performance Sparse Fourier Transform on Parallel Architectures

dc.contributor.advisorChapman, Barbara M.
dc.contributor.committeeMemberGnawali, Omprakash
dc.contributor.committeeMemberShah, Shishir Kirit
dc.contributor.committeeMemberShi, Weidong
dc.contributor.committeeMemberMay, Elebeoba E.
dc.creatorWang, Cheng 1987-
dc.creator.orcid0000-0003-1801-637X
dc.date.accessioned2018-07-13T21:47:37Z
dc.date.available2018-07-13T21:47:37Z
dc.date.createdMay 2016
dc.date.issued2016-05
dc.date.submittedMay 2016
dc.date.updated2018-07-13T21:47:38Z
dc.description.abstractFast Fourier Transform (FFT) is one of the most important numerical algorithms widely used in numerous scientific and engineering computations. With the emergence of big data problems, however, in which the size of the processed data can easily exceed terabytes, it is challenging to acquire, process and store a sufficient amount of data to compute the FFT in the first place. The recently developed \textit{sparse} FFT (sFFT) algorithm provides a solution to this problem. The sFFT can compute a compressed Fourier transform by using only a small subset of the input data, thus achieves significant performance improvement. Modern homogeneous and heterogeneous multicore and manycore architectures are now part of the mainstream computing scene and can offer impressive performance for many applications. The computations that arise in sFFT lend it naturally to efficient parallel implementations. In this dissertation, we present efficient parallel implementations of the sFFT algorithm on three state-of-the-art parallel architectures, namely multicore CPUs, GPUs and a heterogeneous multicore embedded system. While the increase in the number of cores and memory bandwidth on modern architectures provide an opportunity to improve the performance through sophisticated parallel algorithm design, the sFFT is inherently complex, and numerous challenges need to address to deliver the optimal performance. In this dissertation, various parallelization and performance optimization techniques are proposed and implemented. Our parallel sFFT is more than 5x and 20x faster than the sequential sFFT on multicore CPUs and GPUs, respectively. Compared to full-size FFT libraries, the parallel sFFT achieves more than 9x speedup on multicore CPUs and 12x speedup on GPUs for a broad range of signal spectra.
dc.description.departmentComputer Science, Department of
dc.format.digitalOriginborn digital
dc.format.mimetypeapplication/pdf
dc.identifier.citationPortions of this document appear in: Wang, Cheng, Mauricio Araya-Polo, Sunita Chandrasekaran, Amik St-Cyr, Barbara Chapman, and Detlef Hohl. "Parallel sparse FFT." In Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms, p. 10. ACM, 2013. DO: 10.1145/2535753.2535764.
dc.identifier.urihttp://hdl.handle.net/10657/3269
dc.language.isoeng
dc.rightsThe author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. UH Libraries has secured permission to reproduce any and all previously published materials contained in the work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
dc.subjectSparse FFT
dc.subjectParallel computing
dc.subjectCompressive Sensing
dc.subjectGPU
dc.titleHigh-Performance Sparse Fourier Transform on Parallel Architectures
dc.type.dcmiText
dc.type.genreThesis
thesis.degree.collegeCollege of Natural Sciences and Mathematics
thesis.degree.departmentComputer Science, Department of
thesis.degree.disciplineComputer Science
thesis.degree.grantorUniversity of Houston
thesis.degree.levelDoctoral
thesis.degree.nameDoctor of Philosophy

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
WANG-DISSERTATION-2016.pdf
Size:
1.36 MB
Format:
Adobe Portable Document Format

License bundle

Now showing 1 - 1 of 1
No Thumbnail Available
Name:
LICENSE.txt
Size:
1.81 KB
Format:
Plain Text
Description: