Nabil-Fareed Alikhan

Bioinformatics · Microbial Genomics · Software Development

Industry CV

Email: nabil@happykhan.com · Website: happykhan.com · GitHub: happykhan

Senior bioinformatician and software engineer with 15 years building production pipelines and genomic data platforms used by researchers and public health agencies worldwide. Proven track record delivering at national scale: 80,000+ pathogen genomes processed for COVID-19 surveillance, 620,000+ for global AMR monitoring. Fluent in Python, JavaScript, Nextflow/NF-core, cloud (AWS/GCP), and HPC.

H-Index: 30
Citations: 11,369
Years Experience: 15
Software Downloads: 94,150
Coffees Drunk: 0

Education

PhD in Microbiology
University of Queensland, Australia
2010–2015
Thesis: Escherichia coli virulence: a genomic approach
Supervisor: Scott Beatson

BSc (Hons, 1st Class) in Microbiology
University of Queensland
2009
Thesis: Comparative genome analysis of Escherichia coli VR50

BSc in Biochemistry & Bachelor of Information Technology
University of Queensland
2004–2008

Experience

2024 – Present

Senior Bioinformatician

Centre for Genomic Pathogen Surveillance, University of Oxford
Lead development of PathogenWatch, AMRwatch, and vaccines.watch: web platforms integrating over 620,000 pathogen genomes for global AMR surveillance, used by public health agencies in 90+ countries. Build and maintain production ETL pipelines that process genomic data, epidemiological data, and metadata from heterogeneous sources. Architect systems for FAIR data delivery.
2018 – 2023

Bioinformatics Scientific Programmer / Interim Head of Informatics

Quadram Institute Bioscience
Ran computational infrastructure for a team of 20+ scientists. Built high-throughput pipelines for COG-UK: released 80,000+ SARS-CoV-2 genomes through infrastructure I designed and maintained. Developed CoronaHiT (Genome Medicine 2021) and RonaQC for national surveillance. Managed grants and infrastructure totalling over £5M.
2014 – 2018

Senior Research Fellow / Research Fellow in Pathogen Bioinformatics

University of Warwick
Built comparative genomics pipelines for Salmonella, E. coli, and Campylobacter at population scale. Co-developed EnteroBase: analytical infrastructure for 400,000+ bacterial genomes. Co-developed GrapeTree (Genome Research 2018), a visualisation tool for large-scale population structure.

Key Projects

Project | Scale | Stack | Outcome
COG-UK pipeline (CoronaHiT / RonaQC) | 80,000+ genomes | Python, Nextflow, Docker, HPC | National SARS-CoV-2 surveillance
AMRwatch | 620,000+ genomes | Python, JS, PostgreSQL, cloud | Used by WHO/ECDC-adjacent agencies
EnteroBase | 400,000+ genomes | Python, HPC, web | Standard tool in molecular epidemiology
BRIG | 94,000+ downloads | Java | 3,000+ citations, taught in universities
PathogenWatch | Multi-pathogen, cloud | JS, Python, cloud | Production platform, CGPS flagship

Technical Skills

Pipeline Development: Nextflow / NF-core, Snakemake, Shell scripting
Languages: Python, JavaScript, Bash, R, Java
Infrastructure: HPC (SLURM), Docker / Singularity, AWS, GCP, Linux admin
Data & Dev Practices: PostgreSQL / ETL, REST APIs, Git / GitHub Actions, pytest / CI

Selected Publications

Full publication list: Google Scholar · Full academic CV · Print version