Are Parallel and Convergent Changes in Proteins Indicative of Adaptive Evolution?






The independent evolution of identical or similar phenotypic traits in different lineages is referred to as convergent evolution. At the phenotypic level, convergent changes are usually regarded as evidence for adaptation due to natural selection. It is, however, unclear whether such changes at the molecular level are also indicative of adaptive evolution. This dissertation addresses parallel and convergent amino acid replacements and their evolutionary significance. First, I examined parallel and convergent replacements in all one-to-one orthologous proteins from nine mammals and twelve Drosophila species. I compared the numbers of inferred parallel and convergent amino acid replacements with the expectations derived from two evolutionary models: the JTT model and a new selection-free amino acid replacement model. When the selection-free model was used, no excess of parallel or convergent replacements was found, and the observed numbers of such replacements could be explained without invoking positive selection. I also demonstrated that many parallel and convergent changes reported in the literature constitute false positives due to the discordance that sometimes exists between the gene tree and the species tree. Second, I investigated the effects of taxonomic sample size, tree balance, choice of multiple sequence alignment method, and choice of ancestral state reconstruction method on the accuracy of parallel and convergent amino acid replacement identification. Sample size has a profound impact: identification accuracy increases significantly as sample size grows. The other three factors and some interactions (e.g., sample size by reconstruction method and tree balance by reconstruction method) also influence identification accuracy, but to a much lesser extent. My study indicates that the best method combinations are T-COFFEE+CODEML and MUSCLE+CODEML, and that the Clustal Ω+FastML combination should be avoided.
Third, to facilitate future studies of parallel and convergent replacements, I developed a Python package called ProtParCon that can be used to process molecular data and identify parallel and convergent amino acid replacements. As demonstrated in a case study of lysozyme c sequences, ProtParCon is capable of rapidly and efficiently analyzing parallel and convergent replacements in real biological datasets.
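
The kind of identification described above can be illustrated with a minimal Python sketch. It assumes the commonly used definitions, which the abstract itself does not spell out: a parallel replacement is one in which two independent lineages change from the same ancestral residue to the same derived residue, and a convergent replacement is one in which they change from different ancestral residues to the same derived residue. The function names `classify_site` and `find_replacements` are hypothetical and are not part of the ProtParCon API; the sequences are made-up toy data, assumed to be already aligned and to have their ancestral states already reconstructed.

```python
def classify_site(anc_a, desc_a, anc_b, desc_b):
    """Classify one aligned site for two independent branches.

    anc_* is the (reconstructed) ancestral residue at the start of each
    branch and desc_* is the residue at its end. Returns "parallel",
    "convergent", or None. Hypothetical helper, not ProtParCon's API.
    """
    changed_a = anc_a != desc_a
    changed_b = anc_b != desc_b
    # Both branches must change, and both must end in the same residue.
    if not (changed_a and changed_b) or desc_a != desc_b:
        return None
    # Same starting residue: parallel. Different starting residues: convergent.
    return "parallel" if anc_a == anc_b else "convergent"


def find_replacements(anc_a, desc_a, anc_b, desc_b):
    """Scan four aligned sequences; return {1-based position: label}."""
    if len({len(anc_a), len(desc_a), len(anc_b), len(desc_b)}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    results = {}
    for pos, states in enumerate(zip(anc_a, desc_a, anc_b, desc_b), start=1):
        if "-" in states:  # skip alignment gaps
            continue
        label = classify_site(*states)
        if label is not None:
            results[pos] = label
    return results


# Toy data (not from the dissertation): site 2 changes K->R on both
# branches; site 3 changes T->S on one branch and A->S on the other.
hits = find_replacements("MKTL", "MRSL", "MKAL", "MRSL")
print(hits)
```

In practice the ancestral residues are themselves inferred, so the accuracy of such a scan depends on the alignment and ancestral state reconstruction steps discussed in the abstract.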



Parallelism, Convergence, Adaptive Evolution, Amino Acid Replacements, Protein Evolution