Timed phylogenies with BactDating
You want to calculate the time to most recent common ancestor (tMRCA) and dates for internal splits for a given phylogeny. BactDating is an R package available at: https://github.com/xavierdidelot/BactDating
This example assumes you have already removed recombination using ClonalFrameML. The ClonalFrameML output is imported into BactDating with the loadCFML function, and the dating itself is run with bactdate.
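A minimal sketch of the import and dating step. The file names, the helper names align_dates and run_bactdate, the two-column layout of the dates file and the choice of nbIts are all assumptions made for illustration; loadCFML and bactdate are the BactDating functions. Passing useRec = TRUE is meant to make bactdate use the recombination information from ClonalFrameML, so check the package help for your installed version.

```r
# Put sampling dates in the same order as the tree's tip labels (base R only).
# `meta` is a data frame with columns `tip` (label) and `date` (decimal year).
align_dates <- function(tip_labels, meta) {
  dates <- meta$date[match(tip_labels, meta$tip)]
  if (anyNA(dates)) stop("some tips have no date in the metadata")
  dates
}

# Load ClonalFrameML output and run BactDating on it.
# Needs the BactDating package installed; both arguments are hypothetical
# paths that you replace with your own.
run_bactdate <- function(cfml_prefix, dates_file, nb_its = 1e5) {
  tree <- BactDating::loadCFML(prefix = cfml_prefix)
  meta <- read.delim(dates_file, header = FALSE,
                     col.names = c("tip", "date"))
  dates <- align_dates(tree$tip.label, meta)
  # useRec = TRUE: use the recombination information loaded from ClonalFrameML
  BactDating::bactdate(tree, dates, nbIts = nb_its, useRec = TRUE)
}

# Example call (hypothetical file names):
# res <- run_bactdate("cfml_out", "dates.tsv")
```

Matching the dates to the tip labels explicitly avoids the easy mistake of passing dates in file order rather than tree order.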
BactDating will generate a number of plots:
- trace: Shows the parameter values at each iteration. These should look like they are converging (stabilizing); if not, increase the number of iterations (nbIts).
- treeRoot: Plots the tree and the estimate of which branch holds the root.
- treeCI: Timed phylogeny with confidence intervals.
- t: Just the tree itself - no funny business.
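The plots listed above could be drawn in one go with something like the sketch below. The helper name plot_bactdate is made up for this post, and the type strings ("trace", "treeRoot", "treeCI", "tree") are the names as I understand them from the package, so check ?plot.resBactDating if one of them is rejected.

```r
# Draw the four plot types in turn for a fitted result `res`
# (the object returned by bactdate). With BactDating loaded, plot()
# dispatches to the package's plot method for that object.
plot_bactdate <- function(res,
                          types = c("trace", "treeRoot", "treeCI", "tree")) {
  for (type in types) {
    plot(res, type)
  }
  invisible(types)
}

# Example call, after library(BactDating) and a bactdate run:
# plot_bactdate(res)
```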
In the timed phylogeny (the treeCI plot), the blue bars indicate the 95% confidence interval for the date of each node.
Has it converged?
You can tell by looking at the plot(res, 'trace') output. You are looking for traces whose mean and variance stay similar over the course of the run, rather than drifting.
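As a rough numeric companion to eyeballing the trace, you could compare the mean and variance of the first and second half of a parameter's samples. This is a generic check written for this post, not something BactDating provides, and the helper name compare_halves is made up. The commented usage assumes the samples are stored in res$record with a column named "mu", which you should verify on your own result object.

```r
# Compare the mean and variance of the first and second half of a trace.
# `trace` is a numeric vector of sampled values for one parameter.
compare_halves <- function(trace) {
  n <- length(trace)
  half <- n %/% 2
  first <- trace[seq_len(half)]
  second <- trace[(half + 1):n]
  c(mean_first = mean(first), mean_second = mean(second),
    var_first = var(first), var_second = var(second))
}

# Example call (assumes res$record has a "mu" column):
# compare_halves(res$record[, "mu"])
```

If the two halves differ a lot, the chain is probably still drifting and needs more iterations.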
Root to tip
The roottotip function plots root-to-tip distance against sampling date. This won't adjust for any recombination (unlike the bactdate result, res, above). You want to see an even spread of points within the dotted lines.
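A short sketch of the root-to-tip step, assuming tree is the phylogeny and dates holds the tip dates in tip-label order. The wrapper name plot_root_to_tip is made up; initRoot and roottotip are BactDating functions. Rooting first with initRoot is my assumption about a sensible order of steps, and you may not need it if your tree is already rooted the way you want.

```r
# Root the tree using the sampling dates, then draw the root-to-tip regression.
# `tree` is an ape phylo object, `dates` the tip dates in tip-label order.
# Needs the BactDating package installed.
plot_root_to_tip <- function(tree, dates) {
  rooted <- BactDating::initRoot(tree, dates)
  BactDating::roottotip(rooted, dates)
}

# Example call:
# plot_root_to_tip(tree, dates)
```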
Is the clock signal significant?
One way to test this is to repeat the analysis with nonsense dates, either giving every tip the same date or assigning a random set of dates. The results generated with the real dates should be clearly better than those generated with the nonsense dates.
One approach is to set every tip to the same date and compare the two results via the deviance information criterion (DIC). The DIC balances how well the model fits the data against how complex the model is. The lower the DIC, the better. It gives a quantitative way to see whether the result with the real dates is really significant.
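The same-date comparison could look roughly like this. The helper names nonsense_dates and compare_clock_signal are made up for this post, and using the latest real date as the shared date is an arbitrary choice; bactdate and modelcompare are the BactDating functions, with modelcompare reporting the DIC of both runs.

```r
# Build a "nonsense" date vector from the real tip dates (base R only).
#   "same":    every tip gets the same date (here the latest real date)
#   "shuffle": the real dates are randomly reassigned to tips
nonsense_dates <- function(dates, how = c("same", "shuffle")) {
  how <- match.arg(how)
  if (how == "same") {
    rep(max(dates, na.rm = TRUE), length(dates))
  } else {
    dates[sample.int(length(dates))]
  }
}

# Run the real and the nonsense analysis and compare them by DIC.
# Needs the BactDating package installed.
compare_clock_signal <- function(tree, dates, nb_its = 1e5) {
  res_real <- BactDating::bactdate(tree, dates, nbIts = nb_its)
  res_null <- BactDating::bactdate(tree, nonsense_dates(dates, "same"),
                                   nbIts = nb_its)
  BactDating::modelcompare(res_real, res_null)
  # Return both runs so each can be checked for convergence
  invisible(list(real = res_real, null = res_null))
}

# Example call:
# runs <- compare_clock_signal(tree, dates)
```

Returning both results makes it easy to inspect the trace of the nonsense run as well as the real one.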
Because the DIC is taking into account model complexity, it is important to make sure that both the random nonsense result and the real result have actually converged.
Further tests of convergence
For further testing of convergence, you can export the BactDating result to the format required by the coda package using the as.mcmc.resBactDating function.
You can then compute, for example, the effective sample size of the parameters using coda's effectiveSize function.
The larger the effective sample size the better, but each parameter should reach at least 100 before you take your results seriously.
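The export to coda and the effective sample size check described above can be sketched as follows. The helper names bactdate_ess and low_ess are made up for this post; as.mcmc.resBactDating comes from BactDating and effectiveSize from coda. The threshold of 100 is the rule of thumb from the text.

```r
# Return the names of parameters whose effective sample size is below a
# threshold (base R only). `ess` is a named numeric vector.
low_ess <- function(ess, min_ess = 100) {
  names(ess)[ess < min_ess]
}

# Convert a bactdate result to a coda mcmc object and compute the
# effective sample size of each parameter.
# Needs the BactDating and coda packages installed.
bactdate_ess <- function(res) {
  mcmc <- BactDating::as.mcmc.resBactDating(res)
  coda::effectiveSize(mcmc)
}

# Example call:
# ess <- bactdate_ess(res)
# low_ess(ess)   # parameters that still need more iterations
```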