Pavlidis, Ioannis T.
2018-12-03
2016-08
Portions of this document appear in: Kosoy, Michael, Ying Bai, Russell Enscore, Maria Rosales Rizzo, Scott Bender, Vsevolod Popov, Levent Albayrak, Yuriy Fofanov, and Bruno Chomel. "Bartonella melophagi in blood of domestic sheep (Ovis aries) and sheep keds (Melophagus ovinus) from the southwestern US: cultures, genetic characterization, and ecological connections." Veterinary microbiology 190 (2016): 43-49. DOI: 10.1016/j.vetmic.2016.05.009.
http://hdl.handle.net/10657/3624

Understanding the basic rules of bacterial evolution and adaptation is critical for developing new antibacterial drugs, for using bacteria in biotechnology applications, and for combating the undesired consequences of bacterial presence in industrial and environmental settings, such as corrosion, product spoilage, and degradation. The accumulation of single-nucleotide mutations that are beneficial (or neutral) for bacterial survival is a well-studied mechanism of bacterial adaptation, and it also reflects the time since species separated from a common ancestor (the molecular clock hypothesis). Gene loss or gain due to horizontal gene transfer is another, much more dynamic mechanism of bacterial adaptation. Using these mechanisms, bacteria can acquire new features such as virulence factors, locomotion ability (flagella), and heat or drug resistance.
A major functional characteristic of a bacterial species is the presence of a particular gene set common to the whole species (core genome), together with genes that are present only in individual genomes or groups of genomes (pan genome). The technical difficulty, however, lies in identifying the same genes or gene families in evolutionarily distant organisms, which raises three challenges:

1. Identification of a sequence-similarity threshold
2. Computational complexity of sequence clustering algorithms
3. Creation of a biologically meaningful cluster topology

In this work, we have developed methods to improve the quality and performance of gene clustering. These include novel, heuristic-free sequence alignment algorithms that can cluster a large number of sequences significantly faster than traditional methods (a few days compared to months of computation). They also permit the identification of appropriate similarity thresholds and the formation of a biologically meaningful cluster topology. The developed algorithms were used to build a “functional similarity” tree of the species that reflects the similarity of their gene composition. The analysis also identified co-appearance and avoidance patterns of genes in bacterial species. We have applied the proposed methods to 22 genomes from Bartonella spp., comprising 34,060 genes.

application/pdf
eng
The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. UH Libraries has secured permission to reproduce any and all previously published materials contained in the work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
Bioinformatics, Sequence alignment, Sequence clustering, Clustering algorithm, Global alignment, Gene profiles, Functional similarity, Bacteria, Bartonella
Novel Alignment Based Clustering Algorithms for Pan Genome Analysis of Bacteria Species
2018-12-03
Thesis
born digital