Nabil-Fareed Alikhan
Senior bioinformatician and software engineer with 15 years building production pipelines and genomic data platforms used by researchers and public health agencies worldwide. Proven track record delivering at national scale: 80,000+ pathogen genomes processed for COVID-19 surveillance, 620,000+ for global AMR monitoring. Fluent in Python, JavaScript, Nextflow/NF-core, cloud (AWS/GCP), and HPC.
Email: nabil@happykhan.com · Website: happykhan.com · GitHub: happykhan · Location: Oxford, UK
Education
2010–2015
PhD in Microbiology
University of Queensland
2009
BSc (1st Class Hons) in Microbiology
University of Queensland
Professional Experience
2024–Present
Senior Bioinformatician
Centre for Genomic Pathogen Surveillance, University of Oxford
- Lead development of PathogenWatch, AMRwatch, and vaccines.watch: web platforms integrating over 620,000 pathogen genomes for global AMR surveillance, used by public health agencies in 90+ countries
- Build and maintain production ETL pipelines that process genomic data, epidemiological data, and metadata from heterogeneous sources
- Architect systems for FAIR data delivery: standardised outputs consumable by downstream analytical and reporting tools
- Stack: Python, JavaScript, Docker, HPC, cloud (AWS/GCP), Nextflow, NF-core
2018–2023
Bioinformatics Scientific Programmer / Interim Head of Informatics
Quadram Institute Bioscience
- Ran computational infrastructure for a team of 20+ scientists; responsible for pipeline deployment, HPC cluster management, and cloud migration (CLIMB-BIG-DATA, £1.9M MRC)
- Built high-throughput processing pipelines for COG-UK: released 80,000+ SARS-CoV-2 genomes through infrastructure I designed and maintained; results contributed to UK government briefings
- Developed CoronaHiT (Genome Medicine 2021), an Illumina-based SARS-CoV-2 sequencing workflow adopted nationally; developed RonaQC, a QC pipeline for national surveillance
- Led automated testing, CI/CD, and containerisation (Docker/Singularity) standards across the informatics team
2014–2018
Senior Research Fellow / Research Fellow in Pathogen Bioinformatics
University of Warwick
- Built comparative genomics pipelines for Salmonella, E. coli, and Campylobacter at population scale
- Co-developed EnteroBase: analytical infrastructure for 400,000+ bacterial genomes; co-developed GrapeTree (Genome Research 2018), a visualisation tool for large-scale population structure
Key Projects
| Project | Scale | Stack | Outcome |
|---|---|---|---|
| COG-UK pipeline (CoronaHiT / RonaQC) | 80,000+ genomes | Python, Nextflow, Docker, HPC | National SARS-CoV-2 surveillance |
| AMRwatch | 620,000+ genomes | Python, JS, PostgreSQL, cloud | Used by WHO/ECDC-adjacent agencies |
| EnteroBase | 400,000+ genomes | Python, HPC, web | Standard tool in molecular epidemiology |
| BRIG | 94,000+ downloads | Java | 3,000+ citations, taught in universities |
| PathogenWatch | Multi-pathogen, cloud | JS, Python, cloud | Production platform, CGPS flagship |
Technical Skills
- Pipeline development: Nextflow, NF-core, Snakemake, shell scripting
- Languages: Python (expert), JavaScript (proficient), R, Bash, Java
- Infrastructure: HPC (SLURM), Docker, Singularity, AWS, GCP, Linux server admin
- Data: PostgreSQL, ETL design, FAIR data principles, REST APIs
- Dev practices: Git, CI/CD (GitHub Actions), automated testing (pytest), code review, Agile
- Bioinformatics: Genome assembly, read mapping, phylogenetics, population genomics, metagenomics, transcriptomics
Selected Publications
- CoronaHiT: high-throughput sequencing of SARS-CoV-2 genomes. Baker et al. Genome Medicine (2021).
- GrapeTree: visualisation of core genomes at scale. Zhou et al. Genome Research (2018).
- BRIG: BLAST Ring Image Generator. Alikhan et al. BMC Genomics (2011). 3,000+ citations.
Full list: scholar.google.com/citations?user=qP5cpssAAAAJ · h-index 30, 11,369 citations, 49 publications