Published on Sat Oct 09 2021

StarBeast3: Adaptive Parallelised Bayesian Inference of the Multispecies Coalescent

Douglas, J., Jimenez-Silva, C., Bouckaert, R.


Abstract

As genomic sequence data becomes increasingly available, it can be enticing to infer the species phylogeny directly from concatenated genomic data. However, this approach yields a biased estimator of branch lengths and substitution rates and an inconsistent estimator of tree topology. Bayesian multispecies coalescent methods address these issues by embedding a set of gene trees within a species tree and jointly inferring both under a Bayesian framework, but this comes at the cost of increased computational demand. Here, we introduce StarBeast3, a software package for efficient Bayesian inference under the multispecies coalescent model via Markov chain Monte Carlo. We gain efficiency by introducing cutting-edge proposal kernels and adaptive operators, and StarBeast3 is particularly efficient when a relaxed clock model is applied. Furthermore, gene tree inference is parallelised, allowing the software to scale with the size of the problem. We validated our software and benchmarked its performance using three real and two synthetic datasets. Our results indicate that StarBeast3 is up to one-and-a-half orders of magnitude faster than StarBeast2, and therefore more than two orders of magnitude faster than *BEAST, depending on the dataset and on the parameter, and that it is suitable for multispecies coalescent inference on large datasets (100+ genes). StarBeast3 is open-source and easy to set up through a friendly graphical user interface.
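
As a rough guide to reading the speed-up figures quoted in the abstract, the short sketch below converts "orders of magnitude" into multiplicative factors. The helper name `speedup_factor` is hypothetical and is not part of StarBeast3; the numbers are plain arithmetic on the abstract's stated bounds, not benchmark results.

```python
def speedup_factor(orders_of_magnitude: float) -> float:
    """Convert a speed-up given in orders of magnitude (base 10) to a factor."""
    return 10.0 ** orders_of_magnitude


# "Up to one-and-a-half orders of magnitude" faster than StarBeast2
# corresponds to an upper bound of roughly 31.6x.
upper_vs_starbeast2 = speedup_factor(1.5)

# "More than two orders of magnitude" faster than *BEAST corresponds to
# a factor above 100x.
bound_vs_beast = speedup_factor(2.0)

print(f"{upper_vs_starbeast2:.1f}x, {bound_vs_beast:.0f}x")
```

Both figures are bounds that, as the abstract notes, depend on the dataset and on the parameter being estimated.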