Effect of Repeatable Regions on Ability to Estimate Copy Number Variation in Human Genome by High Throughput Sequencing

dc.contributor.advisor: Fofanov, Yuriy
dc.contributor.committeeMember: Widger, William R.
dc.contributor.committeeMember: Ordonez, Carlos
dc.contributor.committeeMember: Tsekos, Nikolaos V.
dc.contributor.committeeMember: Shah, Shishir Kirit
dc.creator: Golovko, Georgiy 1983-
dc.date.accessioned: 2018-02-15T20:06:54Z
dc.date.available: 2018-02-15T20:06:54Z
dc.date.created: December 2012
dc.date.issued: 2012-12
dc.date.submitted: December 2012
dc.date.updated: 2018-02-15T20:06:54Z
dc.description.abstract: Genomic differences (mutations) in humans are profoundly influenced by whether they are germ line (inherited) or somatic (developed over one’s life span). Such mutations can vary from a single nucleotide insertion, deletion, or substitution in a gene to a complete duplication or deletion of a large amount of genomic material, ranging from thousands of nucleotides to an entire chromosome; the latter are referred to as Copy Number Variations (CNVs). While a large number of genomic variations have no significant influence on overall quality of life, certain types of variations in the human genome, called abnormalities, are known to be associated with genetic disorders, including cancer, autism, and schizophrenia. Recent advancements in DNA sequencing technologies have made it possible to use High Throughput Sequencing (HTS) to detect CNVs. The focus of this research is the development of computational methods that address the challenges of analyzing high throughput DNA sequence data, including quality assessment and data representation, in relatively large genomes (e.g., the human genome) to detect copy number variations. An evolutionary programming approach has been developed that uses the set of novel algorithms and data structures introduced in this dissertation to map genomic reads to one or more reference genomes efficiently and accurately. I have developed computational tools that make it possible to identify the undesirable effects of repetitive regions in the human genome on the ability to identify CNVs, and I propose a novel approach to reduce their influence on genomic analysis.
dc.description.department: Computer Science, Department of
dc.format.digitalOrigin: born digital
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10657/2188
dc.language.iso: eng
dc.rights: The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
dc.subject: HTS
dc.subject: NGS
dc.subject: Repeatable Regions in Human Genome
dc.subject: CNV
dc.subject: Bioinformatics
dc.subject: Computer science
dc.title: Effect of Repeatable Regions on Ability to Estimate Copy Number Variation in Human Genome by High Throughput Sequencing
dc.type.dcmi: Text
dc.type.genre: Thesis
thesis.degree.college: College of Natural Sciences and Mathematics
thesis.degree.department: Computer Science, Department of
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Houston
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy

Files

Original bundle

Name:
Georgiy Golovko(0633390), Final .pdf
Size:
3.09 MB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
2.12 KB
Format:
Plain Text