Doctoral thesis
Open access

MetaPIGA 4: an evolutionary computation approach for estimating phylogenetic trees

Contributors: Grbic, Dorde
Defense date: 2018-01-16

Phylogeny inference is a field of biology that attempts to reconstruct the hierarchical historical relationships among homologous sets of characters. Inferring phylogenetic trees is a crucial step in many biological fields, ranging from molecular biology and ecology to epidemiology and forensics. Inferring phylogenies is, for the most part, a very computationally intensive task, a problem amplified by the recent introduction of complex probabilistic (including Bayesian) evolutionary models and the maximum likelihood (ML) criterion. ML assumes that trees with a higher likelihood of producing the observed data should be preferred over trees with a lower likelihood. Early algorithms for inferring phylogenies focused on finding the best (highest-likelihood) tree. However, given the computational intractability of that problem, the field of phylogeny inference moved from the quest for the best tree to the estimation of the probability distribution of solutions. Basically, the goal is to find excellent (likelihood-wise) solutions and provide statistical (possibly Bayesian) support for each branch of the tree. Evolutionary algorithms (EAs) are potentially efficient approaches for reaching that goal. An EA keeps a population of phylogenies (solutions) in memory, proposes new phylogenies through mutations and recombinations, and uses the fitness value (here, the likelihood) of the solutions to select which proposals survive and which are discarded. The selection scheme defines how likely sub-optimal solutions are to survive, in the hope of escaping local optima and finding yet better solutions. Conversely, hill-climbing is an approach that replaces a solution in memory only if a better proposal has been found. By definition, hill-climbing quickly reaches a local optimum but is unable to escape it. In this thesis, I have implemented a hybrid EA that combines these two approaches.
This EA uses the Step-wise Addition Parsimony algorithm to produce 4 starting solutions, occasionally uses hill-climbing, and performs a (lambda, mu) selection to allow populations to escape the current local optimum and further explore the solution space. Furthermore, the EA implemented here maintains multiple populations (in a meta-population setting) that use the consensus-pruning principle to exchange structural information during the exploration phase. I have shown that this combination of optimization elements outperforms both classical EA and hill-climbing algorithms. The hybrid EA can successfully explore the solution space throughout the analysis and is robust to population size. This hybrid EA, implemented as the MetaPIGA 4 software package, generates results of a quality comparable to those produced by current state-of-the-art algorithms for phylogeny inference. Additionally, I implemented new features and improvements in MetaPIGA 4, such as GPU computing, codon models, ancestral state reconstruction, and a MetaPIGA 4 web service.
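
The contrast drawn above between hill-climbing and a selection scheme that lets sub-optimal solutions survive can be illustrated with a minimal sketch. It is not the MetaPIGA 4 implementation: solutions here are integers rather than trees, `fitness` is a made-up two-peak function standing in for the likelihood, and `mutate`, `hill_climb`, `comma_select` and `hybrid_search` are hypothetical names. The selection step assumes that the "(lambda, mu) selection" of the abstract means comma selection, in which survivors are drawn from the offspring only. Recombination, consensus pruning and multiple populations are left out.

```python
import random

def fitness(x):
    """Toy stand-in for a tree's log-likelihood (assumption): higher is better."""
    # Local peak at x = 20 (height 5) and global peak at x = 70 (height 10).
    return max(5 - 0.5 * abs(x - 20), 10 - 0.5 * abs(x - 70))

def mutate(x, rng, step):
    """Propose a neighbour; stands in for a tree-rearrangement operator."""
    return x + rng.randint(-step, step)

def hill_climb(x, rng, iterations, step=1):
    """Replace the solution only when the proposal is strictly better."""
    for _ in range(iterations):
        candidate = mutate(x, rng, step)
        if fitness(candidate) > fitness(x):
            x = candidate
    return x

def comma_select(parents, rng, n_offspring, n_survivors, step):
    """Comma selection: survivors come from the offspring only, so the
    best fitness in the population may decrease between generations."""
    offspring = [mutate(rng.choice(parents), rng, step) for _ in range(n_offspring)]
    return sorted(offspring, key=fitness, reverse=True)[:n_survivors]

def hybrid_search(starts, rng, generations=50, n_offspring=20, step=8, climb_every=10):
    """Alternate comma selection with occasional hill-climbing.

    Returns the best solution seen at any point, including the starts.
    """
    population = list(starts)
    best = max(population, key=fitness)
    for gen in range(1, generations + 1):
        population = comma_select(population, rng, n_offspring, len(starts), step)
        if gen % climb_every == 0:
            population = [hill_climb(x, rng, iterations=20) for x in population]
        best = max(population + [best], key=fitness)
    return best

if __name__ == "__main__":
    rng = random.Random(0)
    starts = [5, 10, 15, 25]  # four toy starting solutions
    print("hill-climbing only:", hill_climb(10, rng, iterations=500))
    print("hybrid search:", hybrid_search(starts, rng))
```

With unit steps, `hill_climb` started at 10 settles on the local peak at 20 and stays there, because every neighbour is worse. Whether `hybrid_search` crosses the valley to the higher peak depends on the mutation step, the number of offspring and the random seed; the sketch makes no claim about how often it succeeds.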

  • Phylogeny inference
  • Evolutionary computation
  • Genetic algorithm
  • Bayesian inference
  • Maximum likelihood
  • Phylogenetics
Citation (ISO format)
GRBIC, Dorde. MetaPIGA 4: an evolutionary computation approach for estimating phylogenetic trees. 2018. doi: 10.13097/archive-ouverte/unige:104677
Technical information

Creation: 05/18/2018 5:11:00 PM
First validation: 05/18/2018 5:11:00 PM
Update time: 03/15/2023 8:15:01 AM
Status update: 03/15/2023 8:15:00 AM
Last indexation: 01/29/2024 9:28:50 PM
All rights reserved by Archive ouverte UNIGE and the University of Geneva