Nabil-Fareed Alikhan

Senior bioinformatician and software engineer with 15 years building production pipelines and genomic data platforms used by researchers and public health agencies worldwide. Proven track record delivering at national scale: 80,000+ pathogen genomes processed for COVID-19 surveillance, 620,000+ for global AMR monitoring. Fluent in Python, JavaScript, Nextflow/NF-core, cloud (AWS/GCP), and HPC.

Email: nabil@happykhan.com · Website: happykhan.com · GitHub: happykhan · Location: Oxford, UK

Education

2010–2015

PhD in Microbiology

University of Queensland

2009

BSc (1st Class Hons) in Microbiology

University of Queensland

Professional Experience

2024–Present

Senior Bioinformatician

Centre for Genomic Pathogen Surveillance, University of Oxford

Lead development of PathogenWatch, AMRwatch, and vaccines.watch: web platforms integrating over 620,000 pathogen genomes for global AMR surveillance, used by public health agencies in 90+ countries
Build and maintain production ETL pipelines that process genomic data, epidemiological data, and metadata from heterogeneous sources
Architect systems for FAIR data delivery: standardised outputs consumable by downstream analytical and reporting tools
Stack: Python, JavaScript, Docker, HPC, cloud (AWS/GCP), Nextflow, NF-core

2018–2023

Bioinformatics Scientific Programmer / Interim Head of Informatics

Quadram Institute Bioscience

Ran computational infrastructure for a team of 20+ scientists; responsible for pipeline deployment, HPC cluster management, and cloud migration (CLIMB-BIG-DATA, £1.9M MRC)
Built high-throughput processing pipelines for COG-UK: released 80,000+ SARS-CoV-2 genomes through infrastructure I designed and maintained; results contributed to UK government briefings
Developed CoronaHiT (Genome Medicine 2021), an Illumina-based SARS-CoV-2 sequencing workflow adopted nationally; developed RonaQC, a QC pipeline for national surveillance
Led automated testing, CI/CD, and containerisation (Docker/Singularity) standards across the informatics team

2014–2018

Senior Research Fellow / Research Fellow in Pathogen Bioinformatics

University of Warwick

Built comparative genomics pipelines for Salmonella, E. coli, and Campylobacter at population scale
Co-developed EnteroBase: analytical infrastructure for 400,000+ bacterial genomes; co-developed GrapeTree (Genome Research 2018), a visualisation tool for large-scale population structure

Key Projects

Project	Scale	Stack	Outcome
COG-UK pipeline (CoronaHiT / RonaQC)	80,000+ genomes	Python, Nextflow, Docker, HPC	National SARS-CoV-2 surveillance
AMRwatch	620,000+ genomes	Python, JS, PostgreSQL, cloud	Used by WHO/ECDC-adjacent agencies
EnteroBase	400,000+ genomes	Python, HPC, web	Standard tool in molecular epidemiology
BRIG	94,000+ downloads	Java	3,000+ citations, taught in universities
PathogenWatch	Multi-pathogen, cloud	JS, Python, cloud	Production platform, CGPS flagship

Technical Skills

Pipeline development: Nextflow, NF-core, Snakemake, shell scripting
Languages: Python (expert), JavaScript (proficient), R, Bash, Java
Infrastructure: HPC (SLURM), Docker, Singularity, AWS, GCP, Linux server admin
Data: PostgreSQL, ETL design, FAIR data principles, REST APIs
Dev practices: Git, CI/CD (GitHub Actions), automated testing (pytest), code review, Agile
Bioinformatics: Genome assembly, read mapping, phylogenetics, population genomics, metagenomics, transcriptomics

Selected Publications

CoronaHiT: high-throughput sequencing of SARS-CoV-2 genomes. Alikhan et al. Genome Medicine (2021).
GrapeTree: visualisation of core genomes at scale. Zhou et al. Genome Research (2018).
BRIG: BLAST Ring Image Generator. Alikhan et al. BMC Genomics (2011). 3,000+ citations.

Full list: scholar.google.com/citations?user=qP5cpssAAAAJ · h-index 30, 11,369 citations, 49 publications