Episode 06: Writing good bioinformatics software with Torsten Seemann
📅28 November 2019
⏱️00:43:46
🎙️Microbial Bioinformatics
👥Guest
Microbiological Diagnostic Unit, University of Melbourne
In this episode of the microbinfie podcast, Associate Professor Torsten Seemann shares insights into developing effective bioinformatics software, drawing on his extensive experience creating widely used tools such as Prokka, Snippy, and Barrnap.
Join us for a discussion with Torsten Seemann, a bioinformatician and software developer whose tools are used throughout microbial genomics. His notable tools include:
- Prokka: Rapid prokaryotic genome annotation.
- Snippy: Efficient variant calling and core genome alignment.
- Barrnap: Fast ribosomal RNA prediction.
- Abricate: Mass screening of contigs for antimicrobial resistance or virulence genes using BLAST.
- Shovill: A pipeline for assembling bacterial isolate genomes from Illumina reads.
- Nullarbor: A pipeline for bacterial genome assembly, annotation, and reporting.
For more information and to explore Torsten's work, visit his GitHub page.
Key Points
1. Principles of Good Bioinformatics Software
- Focus on creating robust, user-friendly tools that solve real research problems
- Prioritize simplicity, minimal dependencies, and out-of-the-box functionality
- Use standard file formats to ensure interoperability
2. Software Development Philosophy
- Develop tools based on personal research needs
- Create solutions with minimal configuration requirements
- Use curated, high-quality databases to improve annotation reliability
3. Challenges in Bioinformatics Tool Adoption
- Difficulty in finding and installing new software
- Importance of clear documentation and installation instructions
- Need for responsive support and user-friendly interfaces
Take-Home Messages
- Less is more: simplify dependencies and parameters
- Prioritize user experience in software development
- Use standard formats and curated databases for reliable results
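As a rough illustration of these take-home messages, here is a minimal sketch of a small tool that has no third-party dependencies, needs no configuration, reads a standard format (FASTA), and writes a standard format (tab-separated text). It is not taken from the episode or from any of Torsten's tools; the function names `read_fasta`, `summarise`, and `main`, and the choice of output columns, are assumptions made for the example.

```python
import sys
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_id, sequence) pairs from FASTA-formatted text."""
    seq_id = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # The ID is the first whitespace-delimited word after '>'.
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            chunks = []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def summarise(handle_in: TextIO, handle_out: TextIO) -> None:
    """Write one tab-separated row per sequence: id, length, GC fraction."""
    handle_out.write("id\tlength\tgc_fraction\n")
    for seq_id, seq in read_fasta(handle_in):
        upper = seq.upper()
        gc = upper.count("G") + upper.count("C")
        gc_fraction = gc / len(seq) if seq else 0.0
        handle_out.write(f"{seq_id}\t{len(seq)}\t{gc_fraction:.3f}\n")


def main() -> None:
    # Reading stdin and writing stdout lets the tool sit in a shell pipeline
    # without any options or configuration files.
    summarise(sys.stdin, sys.stdout)
```

Calling `main()` under the usual `if __name__ == "__main__":` guard would turn this into a command-line filter. Because both input and output are plain, widely supported formats, other tools can produce or consume them without custom parsers.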