Parallel I/O on Compressed Data Files

dc.contributor.advisor: Subhlok, Jaspal
dc.contributor.committeeMember: Gabriel, Edgar
dc.contributor.committeeMember: Wu, Panruo
dc.contributor.committeeMember: Shah, Shishir Kirit
dc.contributor.committeeMember: Lindner, Peggy
dc.creator: Singh, Siddhesh P.
dc.date.accessioned: 2023-05-26T15:19:37Z
dc.date.created: May 2022
dc.date.issued: 2022-04-28
dc.date.updated: 2023-05-26T15:19:39Z
dc.description.abstract: The increase in processing power of modern computing hardware has not been accompanied by a proportional increase in the performance of storage technology, leading to an imbalance in cluster and parallel computing architectures where input-output (I/O) operations may bottleneck the overall performance of the system. This necessitates sophisticated software solutions to overcome limitations on I/O performance. One method is to apply specialized parallel I/O algorithms to optimize data transfer. Another is to use data compression to reduce the amount of data transferred between processing and storage units. An underexamined area of research is the intersection of parallel I/O and data compression, and how these two techniques can be combined in High Performance Computing (HPC) environments. This dissertation presents a general model for incorporating data compression within existing parallel I/O algorithms and evaluates the performance benefits obtained by performing parallel I/O on compressed data files. In particular, the dissertation presents an Open MPI-I/O (OMPIO) implementation which incorporates arbitrary compression libraries within the two-phase I/O algorithm through a new file format. The results indicate significant performance and space-saving benefits through this approach, and the parallel compression semantics presented in this dissertation provide a theoretical basis for future research in parallel I/O and data compression.
dc.description.department: Computer Science, Department of
dc.format.digitalOrigin: born digital
dc.format.mimetype: application/pdf
dc.identifier.citation: Portions of this document appear in: Singh, Siddhesh Pratap, and Edgar Gabriel. "Parallel I/O on Compressed Data Files: Semantics, Algorithms, and Performance Evaluation." In 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), pp. 192-201. IEEE, 2020.
dc.identifier.uri: https://hdl.handle.net/10657/14271
dc.language.iso: eng
dc.rights: The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. UH Libraries has secured permission to reproduce any and all previously published materials contained in the work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
dc.subject: Parallel I/O
dc.subject: Data compression
dc.subject: MPI
dc.subject: Open MPI
dc.title: Parallel I/O on Compressed Data Files
dc.type.dcmi: Text
dc.type.genre: Thesis
dcterms.accessRights: The full text of this item is not available at this time because the student has placed this item under an embargo for a period of time. The Libraries are not authorized to provide a copy of this work during the embargo period.
local.embargo.lift: 2024-05-01
local.embargo.terms: 2024-05-01
thesis.degree.college: College of Natural Sciences and Mathematics
thesis.degree.department: Computer Science, Department of
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Houston
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy

Files

Original bundle

Name: SINGH-DISSERTATION-2022.pdf
Size: 2.55 MB
Format: Adobe Portable Document Format

License bundle

Name: PROQUEST_LICENSE.txt
Size: 4.43 KB
Format: Plain Text

Name: LICENSE.txt
Size: 1.81 KB
Format: Plain Text