ALGORITHMS AND DATA STRUCTURES TO DETECT ONCOVIRUSES IN HUMAN CANCER USING NEXT GENERATION SEQUENCING DATA
dc.contributor.advisor | Fofanov, Yuriy | |
dc.contributor.committeeMember | Widger, William R. | |
dc.contributor.committeeMember | Tsekos, Nikolaos V. | |
dc.creator | Zhu, Rui 1980- | |
dc.date.accessioned | 2014-12-19T13:28:45Z | |
dc.date.available | 2014-12-19T13:28:45Z | |
dc.date.created | December 2012 | |
dc.date.issued | 2012-12 | |
dc.date.updated | 2014-12-19T13:28:45Z | |
dc.description.abstract | Evidence suggests that human cancer can be induced by viruses. One way to test this hypothesis is to look for viral sequences in the human cancer genome. Next Generation Sequencing (NGS) technology can sequence the whole human genome in a short period of time. This opens the door to a systematic analysis of the human genome and a thorough search for oncogenic viral sequences in cancer. However, the huge number of sequencing reads generated by NGS poses a great computational challenge for data analysis in terms of computing speed and memory usage. Data structures such as hash tables and trees are widely used to improve the performance of computing algorithms. Here, I described both data structures, as developed in our center, and compared their performance. The hash-based structure outperformed the tree-based one when mapping reads to a small reference sequence database. Subsequently, real human cancer data were analyzed using the hash-based mapper, and different oncoviral sequences were found in different cancers. | |
dc.description.department | Computer Science, Department of | |
dc.format.digitalOrigin | born digital | |
dc.format.mimetype | application/pdf | |
dc.identifier.uri | http://hdl.handle.net/10657/834 | |
dc.language.iso | eng | |
dc.rights | The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s). | |
dc.subject | Next-generation sequencing | |
dc.subject | Oncovirus | |
dc.subject | Cancer | |
dc.subject | Hash | |
dc.subject | Tree | |
dc.subject | Sequence reads | |
dc.subject.lcsh | Computer science | |
dc.title | ALGORITHMS AND DATA STRUCTURES TO DETECT ONCOVIRUSES IN HUMAN CANCER USING NEXT GENERATION SEQUENCING DATA | |
dc.type.dcmi | Text | |
dc.type.genre | Thesis | |
thesis.degree.college | College of Natural Sciences and Mathematics | |
thesis.degree.department | Computer Science, Department of | |
thesis.degree.discipline | Computer Science | |
thesis.degree.grantor | University of Houston | |
thesis.degree.level | Masters | |
thesis.degree.name | Master of Science |
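
Illustrative note on the abstract: the record does not describe how the thesis's hash-based mapper is implemented, so the following is only a minimal sketch of the general idea of hash-based read mapping against a small reference database. It assumes exact matching with a fixed seed length `k`; the function names `build_kmer_index` and `map_read` and the toy sequences are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def build_kmer_index(references, k):
    """Map every k-mer in each reference to the (name, offset) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in references.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index


def map_read(read, references, index, k):
    """Return sorted (name, offset) positions where the read matches a reference exactly."""
    if len(read) < k:
        return []
    hits = []
    # Look up the read's first k-mer in the hash, then verify the full read
    # at each candidate position.
    for name, pos in index.get(read[:k], []):
        if references[name].startswith(read, pos):
            hits.append((name, pos))
    return sorted(hits)


# Toy data invented purely for illustration (not real viral sequences).
references = {"toy_virus_A": "ACGTACGTTAGC", "toy_virus_B": "TTAGCCGTACGA"}
index = build_kmer_index(references, k=4)
hits = map_read("GTACG", references, index, k=4)
```

In this sketch each seed lookup is an average constant-time dictionary access, which is one plausible reason a hash can do well on a small reference database; a real mapper would also need to handle mismatches, reverse complements, and memory limits, none of which are covered here.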