Nabil-Fareed Alikhan

MicroBinfie Podcast, 01 What bioinformatics software not to write part 1

microbinfie, podcast · 2 min read

In this episode we identify areas of “Peak-bioinformatics”. There is a lot of existing bioinformatics software out there; more often than not, the tool you want to write already exists, or a new tool cannot meaningfully improve on what is available. We discuss this in terms of genome assembly, read mapping and phylogenetics.

Questions and comments? microbinfie@gmail.com

SHOW NOTES

Generally, novel software is not needed if:
There is a plethora of existing tools
The problem is more or less solved, or it has been shown to be unsolvable
The underlying technology or problem is now obsolete and/or superseded by other methods

Multiple sequence aligners:
MAFFT https://mafft.cbrc.jp/alignment/software/
MUSCLE https://www.drive5.com/muscle/

Whole genome aligners:
Mauve http://darlinglab.org/mauve/mauve.html
Mugsy http://mugsy.sourceforge.net/
Sibelia http://bioinf.spbau.ru/sibelia
Parsnp https://github.com/marbl/parsnp

Assemblers:
SPAdes https://github.com/ablab/spades
SKESA https://github.com/ncbi/SKESA
Velvet https://www.ebi.ac.uk/~zerbino/velvet/
ABySS https://github.com/bcgsc/abyss
Edena http://www.genomic.ch/edena.php
Ray http://denovoassembler.sourceforge.net/

Long read assemblers:
HGAP https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/HGAP-2.0
Flye https://github.com/fenderglass/Flye
Canu https://github.com/marbl/canu
Ra https://www.biorxiv.org/content/10.1101/656306v1
Unicycler https://github.com/rrwick/Unicycler

Read mapping:
BWA http://bio-bwa.sourceforge.net/
Bowtie2 http://bowtie-bio.sourceforge.net/bowtie2/
Minimap2 https://github.com/lh3/minimap2 (BWA is better for short reads: https://lh3.github.io/2018/04/02/minimap2-and-the-future-of-bwa)
BBtools https://jgi.doe.gov/data-and-tools/bbtools/
BLAST/BLAT https://genome.ucsc.edu/FAQ/FAQblat.html
SMALT https://www.sanger.ac.uk/science/tools/smalt-0
Snippy https://github.com/tseemann/snippy
PHEnix https://github.com/phe-bioinformatics/PHEnix

Variant callers:
GATK https://software.broadinstitute.org/gatk/
VIPR https://www.viprbrc.org/brc/home.spg?decorator=vipr
VarScan2 http://varscan.sourceforge.net/

Workflow managers:
Bespoke example https://github.com/VertebrateResequencing/vr-codebase
Snakemake https://snakemake.readthedocs.io/en/stable/
Nextflow https://www.nextflow.io/
Galaxy https://usegalaxy.org/
Bpipe https://github.com/ssadedin/bpipe

Phylogenetics:
RAxML and RAxML-NG https://cme.h-its.org/exelixis/software.html
IQ-TREE http://www.iqtree.org/
FastTree http://www.microbesonline.org/fasttree/
BEAST 1 & 2 https://www.beast2.org/
RevBayes https://revbayes.github.io/

Metagenomics:
Taxonomic classification: MEGAN, Kraken, SIGMA, MIDAS, MetaPhlAn2, mOTUs.
Assemblers: metaSPAdes, metaFlye, MEGAHIT, MetaVelvet. Many single-isolate assemblers have been tweaked to run on metagenomes.
Kalamari https://github.com/lskatz/Kalamari

AMR:
https://github.com/arpcard/amr_curation
https://food-safety-bioinformatics-hackathon.github.io/AMR-protocols/
ABRicate https://github.com/tseemann/abricate
ARIBA https://www.sanger.ac.uk/science/tools/ariba
Too many detection tools: https://docs.google.com/spreadsheets/d/18XGWpDiaE249qQKDAL7gdBCka0Z1drpA_s3FElfMJe0/edit#gid=0

Other mentioned resources:
Recent review mentioned: Zhang W, Chen J, Yang Y, Tang Y, Shang J, Shen B (2011) A Practical Comparison of De Novo Genome Assembly Software Tools for Next-Generation Sequencing Technologies. PLoS ONE 6(3): e17915. https://doi.org/10.1371/journal.pone.0017915
The Assemblathon: https://assemblathon.org/
Blog post describing why BWA is better for short reads: https://lh3.github.io/2018/04/02/minimap2-and-the-future-of-bwa
The Science Web: https://thescienceweb.wordpress.com/2015/03/23/each-bioinformatician-to-have-their-own-personal-short-read-aligner-by-2016/