Chapman, Barbara M.2018-07-132018-07-13May 20162016-05May 2016Portions of this document appear in: Wang, Cheng, Mauricio Araya-Polo, Sunita Chandrasekaran, Amik St-Cyr, Barbara Chapman, and Detlef Hohl. "Parallel sparse FFT." In Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms, p. 10. ACM, 2013. DO: 10.1145/2535753.2535764.http://hdl.handle.net/10657/3269Fast Fourier Transform (FFT) is one of the most important numerical algorithms widely used in numerous scientific and engineering computations. With the emergence of big data problems, however, in which the size of the processed data can easily exceed terabytes, it is challenging to acquire, process and store a sufficient amount of data to compute the FFT in the first place. The recently developed \textit{sparse} FFT (sFFT) algorithm provides a solution to this problem. The sFFT can compute a compressed Fourier transform by using only a small subset of the input data, thus achieves significant performance improvement. Modern homogeneous and heterogeneous multicore and manycore architectures are now part of the mainstream computing scene and can offer impressive performance for many applications. The computations that arise in sFFT lend it naturally to efficient parallel implementations. In this dissertation, we present efficient parallel implementations of the sFFT algorithm on three state-of-the-art parallel architectures, namely multicore CPUs, GPUs and a heterogeneous multicore embedded system. While the increase in the number of cores and memory bandwidth on modern architectures provide an opportunity to improve the performance through sophisticated parallel algorithm design, the sFFT is inherently complex, and numerous challenges need to address to deliver the optimal performance. In this dissertation, various parallelization and performance optimization techniques are proposed and implemented. Our parallel sFFT is more than 5x and 20x faster than the sequential sFFT on multicore CPUs and GPUs, respectively. Compared to full-size FFT libraries, the parallel sFFT achieves more than 9x speedup on multicore CPUs and 12x speedup on GPUs for a broad range of signal spectra.application/pdfengThe author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. UH Libraries has secured permission to reproduce any and all previously published materials contained in the work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).Sparse FFTParallel computingCompressive SensingGPUHigh-Performance Sparse Fourier Transform on Parallel Architectures2018-07-13Thesisborn digital