Nabil-Fareed Alikhan

Bioinformatics · Microbial Genomics · Software Development

Publications

amr.watch - Monitoring Antimicrobial Resistance Trends from Global Genomics Data

David et al. (2025) bioRxiv 2025.04.17.649298

Authors:

S David, JD Caballero, N Couto, K Abudahab, NF Alikhan, C Yeats, A Underwood, A Molloy, D Connor, HM Shane, PM Ashton, H Grundmann, MT Holden, EJ Feil, SB Sia, P Donado-Godoy, RK Lingegowda, IN Okeke, S Argimón, DM Aanensen

Abstract:

Background: Whole genome sequencing (WGS) is increasingly supporting routine pathogen surveillance at local and national levels, providing comparable data that can inform on the emergence and spread of antimicrobial resistance (AMR) globally. However, the potential for shared WGS data to guide interventions around AMR remains under-exploited, in part due to challenges in collating and transforming the growing volumes of data into timely insights. We present an interactive platform, amr.watch (https://amr.watch), that enables interrogation of AMR trends from public WGS data on an ongoing basis to support research and policy. Methods: The amr.watch platform incorporates, analyses and visualises high-quality WGS data from WHO-defined priority bacterial pathogens. Analytics are performed using community-standard methods with bespoke species-specific curation of AMR mechanisms. Findings: By 31 March 2025, amr.watch included data from 620,700 pathogen genomes with geotemporal information, with highly variable representation of different species and geographic regions. By integrating WGS data with sampling information, amr.watch enables users to assess geotemporal trends among genotypic variants (e.g. sequence types) and AMR mechanisms, with implications for interventions including antimicrobial prescribing and drug and vaccine development. Interpretation: amr.watch is an information platform for scientists and policy-makers delivering ongoing situational awareness of AMR trends from genomic data. As broad adoption of WGS continues, amr.watch is positioned to monitor both pathogen populations and our global efforts in genomic surveillance, guiding control strategies tailored to each pathogen's characteristics.

DOI: 10.1101/2025.04.17.649298

Monitoring of Vaccine Targets and Interventions Using Global Genome Data: vaccines.watch

David et al. (2025) bioRxiv 2025.06.13.659488

Authors:

S David, K Abudahab, N Couto, WOD Daningrat, C Yeats, A Molloy, PM Ashton, NF Alikhan, DM Aanensen

Abstract:

The expansion of pathogen genome sequencing into routine disease surveillance programmes is set to bring rapidly-growing volumes of increasingly structured data on a global scale. This has the potential to deliver exciting opportunities for accelerating vaccine development and monitoring. Here we present an interactive platform, vaccines.watch (https://vaccines.watch), which aims to support decision-making around vaccine formulations and roll-out by enabling interrogation of vaccine target diversity from global genome data. We have initially focused on targets included in existing or prospective multivalent polysaccharide-based vaccines for Streptococcus pneumoniae, Klebsiella pneumoniae (and related species) and Acinetobacter baumannii. The platform currently displays data for >100k high-quality genomes with geotemporal sampling information (post-2010), with new genomes assembled, analysed and incorporated on an ongoing basis (every 4 hours) as public data are newly deposited. Crucially, users can view vaccine target information in the broader context of genotypic variants (e.g. sequence types) and antimicrobial resistance markers.
The platform also enables users to review the composite serotypes of pneumococcal vaccine formulations among the available genomes. For example, using data in vaccines.watch from 3 June 2025, we observed that serotypes included in the PCV13 and PCV21 formulations accounted for 36.2% (11,907/32,918) and 87.4% (28,764/32,918) of global public genomes, respectively. The platform also enables continuous review of the global genomic landscape of the included pathogens, enabling identification of gaps (e.g. in geographic coverage) that should be targeted with increased genomic surveillance. Indeed we demonstrate that substantial geographic gaps remain in the coverage of available genomes, with over half of countries contributing no genomes for each of the three pathogens. However, while caution in interpretation is important, as global representativeness of genome data grows, vaccines.watch is positioned to support different stages of the vaccine pipeline, from selection of target antigens to post-rollout monitoring of population changes.

DOI: 10.1101/2025.06.13.659488

PATH-SAFE Consortium Recommendations for Genomic Surveillance of Food-Borne Diseases Escherichia coli and Listeria monocytogenes

Gally et al. (2025) FSA Research and Evidence

Authors:

D Gally, M Maiden, K Jolley, KM McIntyre, S Ott, A Darby, N Loman, RA Kingsley, A Chalka, K Holt, A McNally, K Baker, M Avison, M AbuOun, D Graham, C Jenkins, M Chattaway, S Nair, T Connor, A Vallejo-Trujillo, J King, E Haynes, R Ellis, J McElhiney, D Dorey-Robinson, M Gilmour, A Painset, A Egli, A Reimer, A Mather, M Allard, E Stevens, K Yahara, P Lehours, T Seemann, C Jenkins, RS Hendriksen, F Aarestrup, D Aanensen, R Acton, NF Alikhan, A Blanton, J Baker, J Walker, G Lewis-Woodhouse, D Connor, C Yeats, K Abudahab, P Shinde, C Vegvari

Abstract:

Whole-genome sequencing (WGS) for food-borne disease (FBD) surveillance provides many benefits, including new insights in disease transmission, virulence and antimicrobial resistance (AMR), fast and precise outbreak tracing and source attribution, as well as streamlined and reproducible analysis through digital data that, from a technical point of view, can be easily shared. The National foodborne disease genomic data platform (the PATH-SAFE platform) will offer a trusted environment for WGS data sharing and analysis for UK agencies involved in FBD surveillance. Following the successful implementation of the platform for Salmonella, in the second phase the platform will be expanded to Escherichia coli and Listeria monocytogenes. Where possible, the platform will draw on existing and validated solutions. For de novo genome assembly, EToKI and vanilla SPAdes provide the best results for E. coli, and Pathogenwatch provides the best results for L. monocytogenes. Analysis of genomic data is greatly enhanced by assigning genomes into well-defined cluster groups, which should be available on the PATH-SAFE platform. Specifically, tools for MLST and cgMLST should be available on the PATH-SAFE platform for both E. coli and L. monocytogenes. In addition, HierCC and ClermonTyping tools should be available for E. coli. The platform should implement tools for clustering E. coli and L. monocytogenes based on MLST/cgMLST profiles. Clusters should be named according to their HierCC codes. The PATH-SAFE platform should implement a tool for predicting E. coli serotypes from sequence data. ECTyper has been selected as the only up-to-date tool for this purpose. Although serotype determination of E. coli isolates is useful for historical reasons, the platform should be designed in a way that makes it easy to switch to hierarchical clustering of isolates. The identification of genetic virulence determinants is essential in the analysis of E. coli and L. monocytogenes. 
VirulenceFinder and AdhesiomeR should be implemented in the PATH-SAFE platform for virulence determinant identification in E. coli. The Pasteur L. monocytogenes Scheme has been selected for virulence determinant identification in L. monocytogenes, although none of the available databases seems to include all known genes determining virulence in L. monocytogenes. As the PATH-SAFE platform is expanded to new food-borne pathogens, integrating a tool to differentiate species will be useful. Speciator has been validated and shown to be 99.9% in agreement with Kraken in correctly assigning E. coli genomes. The number of metadata fields should be small initially to facilitate upload of data and use of the PATH-SAFE platform and to be consistent with UK GDPR obligations. Minimum metadata requirements of the platform should be compatible with the metadata collected by each agency. They should also be compatible with concerns around data sharing and legal obligations. All metadata must be processed in line with UK GDPR guidelines and align with organisational policies and relevant legislation. The PATH-SAFE platform should implement a gated access model that will allow participating agencies to share additional metadata with trusted partners and at the same time minimise the risk of leaking sensitive information. The PATH-SAFE platform should have an automated QC mechanism for validating uploaded metadata. Options for both bulk upload of metadata and for upload of individual metadata fields should be offered by the platform. In addition, functionality for regular automated uploads could be provided. Experiences with implementing WGS for FBD surveillance in the UK, Switzerland and Canada show that collaboration of reference laboratories carrying out sequencing analyses and epidemiological and One Health units providing metadata is critical for prioritising isolates for outbreak investigations.

DOI: 10.46756/001c.143984

Epidemiological Characterization and Genetic Variation of the SARS-CoV-2 Delta Variant in Palestine

Ereqat et al. (2024) Pathogens 13:6 521

Authors:

S Ereqat, NF Alikhan, A Al-Jawabreh, M Matthews, A Al-Jawabreh, L de Oliveira Martins, AJ Trotter, M Al-Kaila, AJ Page, MJ Pallen, A Nasereddin

Abstract:

The emergence of new SARS-CoV-2 variants in Palestine highlights the need for continuous genetic surveillance and accurate screening strategies. This case series study aimed to investigate the geographic distribution and genetic variation of the SARS-CoV-2 Delta Variant in Palestine in August 2021. Samples were collected at random in August 2021 (n = 571) from eight districts in the West Bank, Palestine. All samples were confirmed as positive for COVID-19 by RT-PCR. The samples passed the quality control test and were successfully sequenced using the ARTIC protocol. The Delta Variant was revealed to have four dominant lineages: B.1.617 (19%), AY.122 (18%), AY.106 (17%), and AY.121 (13%). The study revealed eight significant purely spatial clusters (p < 0.005) distributed in the northern and southern parts of Palestine. Phylogenetic analysis of SARS-CoV-2 genomes (n = 552) showed no geographically specific clades. The haplotype network revealed three haplogroups without any geographic distribution. Chronologically, the Delta Variant peak in Palestine was shortly preceded by the one in the neighboring Israeli community and shortly followed by the peak in Jordan. In addition, the study revealed an extremely intense transmission network of the Delta Variant circulating between the Palestinian districts as hubs (SHR 0.5), with Al-Khalil, the district with the highest prevalence of COVID-19, witnessing the highest frequency of transitions. Genetic diversity analysis indicated closely related haplogroups, as haplotype diversity (Hd) is high but has low nucleotide diversity (π). However, nucleotide diversity (π) in Palestine is still higher than the global figures. Neutrality tests were significantly (p < 0.05) low, including Tajima's D, Fu-Li's F, and Fu-Li's D, suggesting one or more of the following: population expansion, selective sweep, and natural negative selection. Wright's F-statistic (Fst) showed genetic differentiation (Fst > 0.25) with low to medium gene flow (Nm). Recombination events were minimal between clusters (Rm) and between adjacent sites (Rs). The study confirms the utility of the whole genome sequence as a surveillance system to track the emergence of new SARS-CoV-2 variants for any possible geographical association and the use of genetic variation analysis and haplotype networking to delineate any minimal change or slight deviation in the viral genome from a reference strain.

DOI: 10.3390/pathogens13060521

Genomic Diversity and Epidemiological Significance of Non-Typhoidal Salmonella Found in Retail Food Collected in Norfolk, UK

Bloomfield et al. (2023) Microbial Genomics 9:7 001075

Authors:

SJ Bloomfield, N Janecko, R Palau, NF Alikhan, AE Mather

Abstract:

Non-typhoidal Salmonella (NTS) is a major cause of bacterial gastroenteritis. Although many countries have implemented whole genome sequencing (WGS) of NTS, there is limited knowledge on NTS diversity on food and its contribution to human disease. In this study, the aim was to characterise the NTS genomes from retail foods in a particular region of the UK and assess the contribution to human NTS infections. Raw food samples were collected at retail in a repeated cross-sectional design in Norfolk, UK, including chicken (n=311), leafy green (n=311), pork (n=311), prawn (n=279) and salmon (n=157) samples. Up to eight presumptive NTS isolates per positive sample underwent WGS and were compared to publicly available NTS genomes from UK human cases. NTS was isolated from chicken (9.6%), prawn (2.9%) and pork (1.3%) samples and included 14 serovars, of which Salmonella Infantis and Salmonella Enteritidis were the most common. The S. Enteritidis isolates were only isolated from imported chicken. No antimicrobial resistance determinants were found in prawn isolates, whilst 5.1% of chicken and 0.64% of pork samples contained multi-drug resistant NTS. The maximum number of pairwise core non-recombinant single nucleotide polymorphisms (SNPs) amongst isolates from the same sample was used to measure diversity and most samples had a median of two SNPs (range: 0–251). NTS isolates that were within five SNPs to clinical UK isolates belonged to specific serovars: S. Enteritidis and S. Infantis (chicken), and S. I 4,[5],12:i:- (pork and chicken). Most NTS isolates that were closely related to human-derived isolates were obtained from imported chicken, but further epidemiological data are required to assess definitively the probable source of the human cases. Continued WGS surveillance of Salmonella on retail food involving multiple isolates from each sample is necessary to capture the diversity of Salmonella and determine the relative importance of different sources of human disease.

DOI: 10.1099/mgen.0.001075

Investigation of Hospital Discharge Cases and SARS-CoV-2 Introduction into Lothian Care Homes

Cotton et al. (2023) Journal of Hospital Infection 135:28-36

Authors:

S Cotton, MP McHugh, R Dewar, JG Haas, K Templeton, SC Robson, TR Connor, NJ Loman, T Golubchik, RT Martinez Nunez, D Bonsall, A Rambaut, LB Snell, R Livett, C Ludden, S Corden, E Nastouli, G Nebbia, I Johnston, JA Prieto, K Saeed, DK Jackson, C Houlihan, D Frampton, WL Hamilton, ..., EE Vamos, HJ Webster, M Whitehead, C Wierzbicki, A Angyal, LR Green, M Whiteley, E Betteridge, IF Bronner, BW Farr, S Goodwin, SV Lensing, SA McCarthy, MA Quail, D Rajan, NM Redshaw, C Scott, L Shirley, SA Thurston, W Rowe, A Gaskin, T Le-Viet, J Bonfield, J Liddle, A Whitwham

Abstract:

Background: The first epidemic wave of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) in Scotland resulted in high case numbers and mortality in care homes. In Lothian, over one-third of care homes reported an outbreak, while there was limited testing of hospital patients discharged to care homes. Aim: To investigate patients discharged from hospitals as a source of SARS-CoV-2 introduction into care homes during the first epidemic wave. Methods: A clinical review was performed for all patient discharges from hospitals to care homes from 1st March 2020 to 31st May 2020. Episodes were ruled out based on coronavirus disease 2019 (COVID-19) test history, clinical assessment at discharge, whole-genome sequencing (WGS) data and an infectious period of 14 days. Clinical samples were processed for WGS, and consensus genomes generated were used for analysis using Cluster Investigation and Virus Epidemiological Tool software. Patient timelines were obtained using electronic hospital records. Findings: In total, 787 patients discharged from hospitals to care homes were identified. Of these, 776 (99%) were ruled out for subsequent introduction of SARS-CoV-2 into care homes. However, for 10 episodes, the results were inconclusive as there was low genomic diversity in consensus genomes or no sequencing data were available. Only one discharge episode had a genomic, time and location link to positive cases during hospital admission, leading to 10 positive cases in their care home. Conclusion: The majority of patients discharged from hospitals were ruled out for introduction of SARS-CoV-2 into care homes, highlighting the importance of screening all new admissions when faced with a novel emerging virus and no available vaccine.

DOI: 10.1016/j.jhin.2023.02.010

Dynamics of Salmonella Enterica and Antimicrobial Resistance in the Brazilian Poultry Industry and Global Impacts on Public Health

Alikhan et al. (2022) PLOS Genetics 18:6 e1010174

Authors:

NF Alikhan, LZ Moreno, LR Castellanos, MA Chattaway, J McLauchlin, M Lodge, J O'Grady, R Zamudio, E Doughty, L Petrovska, MPV Cunha, T Knöbl, AM Moreno, AE Mather

Abstract:

Non-typhoidal Salmonella enterica is a common cause of diarrhoeal disease; in humans, consumption of contaminated poultry meat is believed to be a major source. Brazil is the world's largest exporter of chicken meat globally, and previous studies have indicated the introduction of Salmonella serovars through imported food products from Brazil. Here we provide an in-depth genomic characterisation and evolutionary analysis to investigate the most prevalent serovars and antimicrobial resistance (AMR) in Brazilian chickens and assess the impact to public health of products contaminated with S. enterica imported into the United Kingdom from Brazil. To do so, we examine 183 Salmonella genomes from chickens in Brazil and 357 genomes from humans, domestic poultry and imported Brazilian poultry products isolated in the United Kingdom. S. enterica serovars Heidelberg and Minnesota were the most prevalent serovars in Brazil and in meat products imported from Brazil into the UK. We extended our analysis to include 1,259 publicly available Salmonella Heidelberg and Salmonella Minnesota genomes for context. The Brazil genomes form clades distinct from global isolates, with temporal analysis suggesting emergence of these Salmonella Heidelberg and Salmonella Minnesota clades in the early 2000s, around the time of the 2003 introduction of the Enteritidis vaccine in Brazilian poultry. Analysis showed genomes within the Salmonella Heidelberg and Salmonella Minnesota clades shared resistance to sulphonamides, tetracyclines and beta-lactams conferred by sul2, tetA and blaCMY-2 genes, not widely observed in other co-circulating serovars despite similar selection pressures. The sul2 and tetA genes were concomitantly carried on IncC plasmids, whereas blaCMY-2 was either co-located with the sul2 and tetA genes on IncC plasmids or independently on IncI1 plasmids.
Long-term surveillance data collected in the UK showed no increase in the incidence of Salmonella Heidelberg or Salmonella Minnesota in human cases of clinical disease in the UK following the increase of these two serovars in Brazilian poultry. In addition, almost all of the small number of UK-derived genomes which cluster with the Brazilian poultry-derived sequences could either be attributed to human cases with a recent history of foreign travel or were from imported Brazilian food products. These findings indicate that even should Salmonella from imported Brazilian poultry products reach UK consumers, they are very unlikely to be causing disease. No evidence of the Brazilian strains of Salmonella Heidelberg or Salmonella Minnesota were observed in UK domestic chickens. These findings suggest that introduction of the Salmonella Enteritidis vaccine, in addition to increasing antimicrobial use, could have resulted in replacement of salmonellae in Brazilian poultry flocks with serovars that are more drug resistant, but less associated with disease in humans in the UK. The plasmids conferring resistance to beta-lactams, sulphonamides and tetracyclines likely conferred a competitive advantage to the Salmonella Minnesota and Salmonella Heidelberg serovars in this setting of high antimicrobial use, but the apparent lack of transfer to other serovars present in the same setting suggests barriers to horizontal gene transfer that could be exploited in intervention strategies to reduce AMR. The insights obtained reinforce the importance of One Health genomic surveillance.

DOI: 10.1371/journal.pgen.1010174

Clinical Performance of Direct RT-PCR Testing of Raw Saliva for Detection of SARS-CoV-2 in Symptomatic and Asymptomatic Individuals

Castillo-Bravo et al. (2022) Microbiology Spectrum 10:6 e02229-22

Authors:

R Castillo-Bravo, N Lucca, L Lai, K Marlborough, G Brychkova, MS Sakhteh, C Lonergan, J O'Grady, NF Alikhan, AJ Trotter, AJ Page, B Smyth, PC McKeown, JDM Feenstra, C Ulekleiv, O Sorel, M Gandhi, C Spillane

Abstract:

RT-PCR tests based on RNA extraction from nasopharyngeal swabs (NPS) are promoted as the "gold standard" for SARS-CoV-2 detection. However, the use of saliva samples offers noninvasive self-collection more suitable for high-throughput testing. This study evaluated performance of the TaqPath COVID-19 Fast PCR Combo kit 2.0 assay for detection of SARS-CoV-2 in raw saliva relative to a lab-developed direct RT-PCR test (SalivaDirect-based PCR, SDB-PCR) and an RT-PCR test based on RNA extraction from NPS. Saliva and NPS samples were collected from symptomatic and asymptomatic individuals (N = 615). Saliva samples were tested for SARS-CoV-2 using the TaqPath COVID-19 Fast PCR Combo kit 2.0 and the SDB-PCR, while NPS samples were tested by RT-PCR in RNA extracts according to the Irish national testing system. TaqPath COVID-19 Fast PCR Combo kit 2.0 detected SARS-CoV-2 in 52 saliva samples, of which 51 were also positive with the SDB-PCR. Compared to the NPS "gold standard" biospecimen method, 49 samples displayed concordant results, while three samples (35

DOI: 10.1128/spectrum.02229-22

Characterising the Persistence of RT-PCR Positivity and Incidence in a Community Survey of SARS-CoV-2

Eales et al. (2022) Wellcome Open Research 7 102

Authors:

O Eales, CE Walters, H Wang, D Haw, KEC Ainslie, CJ Atchison, AJ Page, S Prosolek, AJ Trotter, T Le Viet, NF Alikhan, LM Jackson, C Ludden, COVID-19 Genomics UK Consortium, D Ashby, CA Donnelly, G Cooke, W Barclay, H Ward, A Darzi, P Elliott, S Riley

Abstract:

Background: The REal-time Assessment of Community Transmission-1 (REACT-1) study has provided unbiased estimates of swab-positivity in England approximately monthly since May 2020 using RT-PCR testing of self-administered throat and nose swabs. However, estimating infection incidence requires an understanding of the persistence of RT-PCR swab-positivity in the community. Methods: During round 8 of REACT-1 from 6 January to 22 January 2021, we collected up to two additional swabs from 896 initially RT-PCR positive individuals approximately 6 and 9 days after their initial swab. Results: Test sensitivity and duration of positivity were estimated using an exponential decay model, for all participants and for subsets by initial N-gene cycle threshold (Ct) value, symptom status, lineage and age. A P-spline model was used to estimate infection incidence for the entire duration of the REACT-1 study. REACT-1 test sensitivity was estimated at 0.79 (0.77, 0.81) with median duration of positivity at 9.7 (8.9, 10.6) days. We found greater duration of positivity in those exhibiting symptoms, with low N-gene Ct values, or infected with the Alpha variant. Test sensitivity was found to be higher for those who were pre-symptomatic or with low N-gene Ct values. Compared to swab-positivity, our estimates of infection incidence included sharper features with evident transient increases around the time of changes in social distancing measures. Conclusions: These results validate previous efforts to estimate incidence of SARS-CoV-2 from swab-positivity data and provide a reliable means to obtain community infection estimates to inform policy response.

DOI: 10.12688/wellcomeopenres.17723.1

Future-Proofing and Maximizing the Utility of Metadata: The PHA4GE SARS-CoV-2 Contextual Data Specification Package

Griffiths et al. (2022) GigaScience 11 giac003

Authors:

EJ Griffiths, RE Timme, CI Mendes, AJ Page, NF Alikhan, D Fornika, F Maguire, J Campos, D Park, IB Olawoye, PE Oluniyi, D Anderson, A Christoffels, AGC da Silva, R Cameron, D Dooley, LS Katz, A Black, I Karsch-Mizrachi, T Barrett, A Johnston, TR Connor, SM Nicholls, AA Witney, GH Tyson, SH Tausch, AR Raphenya, B Alcock, DM Aanensen, E Hodcroft, WWL Hsiao, ATR Vasconcelos, DR MacCannell

Abstract:

Background: The Public Health Alliance for Genomic Epidemiology (PHA4GE) (https://pha4ge.org) is a global coalition that is actively working to establish consensus standards, document and share best practices, improve the availability of critical bioinformatics tools and resources, and advocate for greater openness, interoperability, accessibility, and reproducibility in public health microbial bioinformatics. In the face of the current pandemic, PHA4GE has identified a need for a fit-for-purpose, open-source SARS-CoV-2 contextual data standard. Results: As such, we have developed a SARS-CoV-2 contextual data specification package based on harmonizable, publicly available community standards. The specification can be implemented via a collection template, as well as an array of protocols and tools to support both the harmonization and submission of sequence data and contextual information to public biorepositories. Conclusions: Well-structured, rich contextual data add value, promote reuse, and enable aggregation and integration of disparate datasets. Adoption of the proposed standard and practices will better enable interoperability between datasets and systems, improve the consistency and utility of generated data, and ultimately facilitate novel insights and discoveries in SARS-CoV-2 and COVID-19. The package is now supported by the NCBI's BioSample database.

DOI: 10.1093/gigascience/giac003

Replacement of the Alpha Variant of SARS-CoV-2 by the Delta Variant in Lebanon between April and June 2021

Merhi et al. (2022) Microbial Genomics 8:7

Authors:

G Merhi, AJ Trotter, L de Oliveira Martins, J Koweyes, T Le-Viet, H Abou Naja, M Al Buaini, SJ Prosolek, NF Alikhan, M Lott, T Tohmeh, B Badran, OJ Jupp, S Gardner, MW Felgate, KA Makin, JM Wilkinson, R Stanley, AK Sesay, MA Webber, RK Davidson, N Ghosn, M Pallen, H Hasan, AJ Page, S Tokajian

Abstract:

The COVID-19 pandemic continues to expand globally, with case numbers rising in many areas of the world, including the Eastern Mediterranean Region. Lebanon experienced its largest wave of COVID-19 infections from January to April 2021. Limited genomic surveillance was undertaken, with just 26 SARS-CoV-2 genomes available for this period, nine of which were from travellers from Lebanon detected by other countries. Additional genome sequencing is thus needed to allow surveillance of variants in circulation. In total, 905 SARS-CoV-2 genomes were sequenced using the ARTIC protocol. The genomes were derived from SARS-CoV-2-positive samples, selected retrospectively from the sentinel COVID-19 surveillance network, to capture diversity of location, sampling time, sex, nationality and age. Although 16 PANGO lineages were circulating in Lebanon in January 2021, by February there were just four, with the Alpha variant accounting for 97% of samples. In the following 2 months, all samples contained the Alpha variant.
However, this had changed dramatically by June and July 2021, when all samples belonged to the Delta variant. This study documents a ten-fold increase in the number of SARS-CoV-2 genomes available from Lebanon. The Alpha variant, first detected in the UK, rapidly swept through Lebanon, causing the country's largest wave to date, which peaked in January 2021. The Alpha variant was introduced to Lebanon multiple times despite travel restrictions, but the source of these introductions remains uncertain. The Delta variant was detected in Gambia in travellers from Lebanon in mid-May, suggesting community transmission in Lebanon several weeks before this variant was detected in the country. Prospective sequencing in June/July 2021 showed that the Delta variant had completely replaced the Alpha variant in under 6 weeks.

DOI: 10.1099/mgen.0.000838

Naming the Unnamed: Over 65,000 Candidatus Names for Unnamed Archaea and Bacteria in the Genome Taxonomy Database

Pallen et al. (2022) International Journal of Systematic and Evolutionary Microbiology 72:9 005482

Authors:

MJ Pallen, LM Rodriguez-R, NF Alikhan

Abstract:

Thousands of new bacterial and archaeal species and higher-level taxa are discovered each year through the analysis of genomes and metagenomes. The Genome Taxonomy Database (GTDB) provides hierarchical sequence-based descriptions and classifications for new and as-yet-unnamed taxa. However, bacterial nomenclature, as currently configured, cannot keep up with the need for new well-formed names. Instead, microbiologists have been forced to use hard-to-remember alphanumeric placeholder labels. Here, we exploit an approach to the generation of well-formed arbitrary Latinate names at a scale sufficient to name tens of thousands of unnamed taxa within GTDB. These newly created names represent an important resource for the microbiology community, facilitating communication between bioinformaticians, microbiologists and taxonomists, while populating the emerging landscape of microbial taxonomic and functional discovery with accessible and memorable linguistic labels.

DOI: 10.1099/ijsem.0.005482

Combined Epidemiological and Genomic Analysis of Nosocomial SARS-CoV-2 Infection Early in the Pandemic and the Role of Unidentified Cases in Transmission

Snell et al. (2022) Clinical Microbiology and Infection 28:1 93-100

Authors:

LB Snell, CL Fisher, U Taj, O Stirrup, B Merrick, A Alcolea-Medina, T Charalampous, AW Signell, HD Wilson, G Betancor, MT Kia Ik, E Cunningham, PR Cliff, S Pickering, RP Galao, R Batra, SJ Neil, MH Malim, KJ Doores, ST Douthwaite, G Nebbia, JD Edgeworth, AR Awan

Abstract:

Objectives: To analyse nosocomial transmission in the early stages of the coronavirus 2019 (COVID-19) pandemic at a large multisite healthcare institution. Nosocomial incidence is linked with infection control interventions. Methods: Viral genome sequence and epidemiological data were analysed for 574 consecutive patients, including 86 nosocomial cases, with a positive PCR test for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) during the first 19 days of the pandemic. Results: Forty-four putative transmission clusters were found through epidemiological analysis; these included 234 cases and all 86 nosocomial cases. SARS-CoV-2 genome sequences were obtained from 168/234 (72%) of these cases in epidemiological clusters, including 77/86 nosocomial cases (90%). Only 75/168 (45%) of epidemiologically linked, sequenced cases were not refuted by applying genomic data, creating 14 final clusters accounting for 59/77 sequenced nosocomial cases (77%). Viral haplotypes from these clusters were enriched 1–14x (median 4x) compared to the community. Three factors implicated unidentified cases in transmission: (a) community-onset or indeterminate cases were absent in 7/14 clusters (50%), (b) four clusters (29%) had additional evidence of cryptic transmission, and (c) in three clusters (21%) diagnosis of the earliest case was delayed, which may have facilitated transmission. Nosocomial cases decreased to low levels (0–2 per day) despite continuing high numbers of admissions of community-onset SARS-CoV-2 cases (40–50 per day) and before the impact of introducing universal face masks and banning hospital visitors. Conclusion: Genomics was necessary to accurately resolve transmission clusters. Our data support unidentified cases (such as healthcare workers or asymptomatic patients) as important vectors of transmission. Evidence is needed to ascertain whether routine screening increases case ascertainment and limits nosocomial transmission.

DOI: 10.1016/j.cmi.2021.07.040

Hospital Admission and Emergency Care Attendance Risk for SARS-CoV-2 Delta (B.1.617.2) Compared with Alpha (B.1.1.7) Variants of Concern: A Cohort Study

Twohig et al. (2022) The Lancet Infectious Diseases 22:1 35-42

Authors:

KA Twohig, T Nyberg, A Zaidi, S Thelwall, MA Sinnathamby, S Aliabadi, SR Seaman, RJ Harris, R Hope, J Lopez-Bernal, E Gallagher, A Charlett, D De Angelis, AM Presanis, G Dabrera, C Koshy, A Ash, E Wise, N Moore, M Mori, N Cortes, J Lynch, S Kidd, D Fairley, T Curran, ..., J Sillitoe, M Spencer Chapman, S Thurston, G Tonkin-Hill, D Weldon, D Rajan, I Bronner, L Aigrain, N Redshaw, S Lensing, R Davies, A Whitwham, J Liddle, K Lewis, J Tovar-Corona, S Leonard, J Durham, A Bassett, S McCarthy, R Moll, K James, K Oliver, A Makunin, J Barrett, R Gunson

Abstract:

Background: The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. Methods: This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. Findings: Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17–43]) were included in our analysis.
196 (2$ḑot$3%) patients with the delta variant versus 764 (2$ḑot$2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2$ḑot$26 [95% CI 1$ḑot$32--3$ḑot$89]). 498 (5$ḑot$7%) patients with the delta variant versus 1448 (4$ḑot$2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1$ḑot$45 [1$ḑot$08--1ḑot95]). Most patients were unvaccinated (32,078 [74Background The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. Methods This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. 
Findings Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17--43]) were included in our analysis. 196 (2$ḑot$3%) patients with the delta variant versus 764 (2$ḑot$2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2$ḑot$26 [95% CI 1$ḑot$32--3$ḑot$89]). 498 (5$ḑot$7%) patients with the delta variant versus 1448 (4$ḑot$2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1$ḑot$45 [1$ḑot$08--1$ḑot$95]). Most patients were unvaccinated (32,078 [74ḑot0%] across both groups). The HRs for vaccinated patients with the delta variant versus the alpha variant (adjusted HR for hospital admission 1Background The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. Methods This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). 
The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. Findings Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17--43]) were included in our analysis. 196 (2$ḑot$3%) patients with the delta variant versus 764 (2$ḑot$2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2$ḑot$26 [95% CI 1$ḑot$32--3$ḑot$89]). 498 (5$ḑot$7%) patients with the delta variant versus 1448 (4$ḑot$2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1$ḑot$45 [1$ḑot$08--1$ḑot$95]). Most patients were unvaccinated (32,078 [74$ḑot$0%] across both groups). The HRs for vaccinated patients with the delta variant versus the alpha variant (adjusted HR for hospital admission 1ḑot94 [95% CI 0Background The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. Methods This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. 
Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. Findings Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17--43]) were included in our analysis. 196 (2$ḑot$3%) patients with the delta variant versus 764 (2$ḑot$2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2$ḑot$26 [95% CI 1$ḑot$32--3$ḑot$89]). 498 (5$ḑot$7%) patients with the delta variant versus 1448 (4$ḑot$2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1$ḑot$45 [1$ḑot$08--1$ḑot$95]). Most patients were unvaccinated (32,078 [74$ḑot$0%] across both groups). The HRs for vaccinated patients with the delta variant versus the alpha variant (adjusted HR for hospital admission 1$ḑot$94 [95% CI 0ḑot47--8Background The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. 
We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. Methods This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. Findings Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17--43]) were included in our analysis. 196 (2$ḑot$3%) patients with the delta variant versus 764 (2$ḑot$2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2$ḑot$26 [95% CI 1$ḑot$32--3$ḑot$89]). 498 (5$ḑot$7%) patients with the delta variant versus 1448 (4$ḑot$2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1$ḑot$45 [1$ḑot$08--1$ḑot$95]). Most patients were unvaccinated (32,078 [74$ḑot$0%] across both groups). 
The HRs for vaccinated patients with the delta variant versus the alpha variant (adjusted HR for hospital admission 1$ḑot$94 [95% CI 0$ḑot$47--8ḑot05] and for hospital admission or emergency care attendance 1Background The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. Methods This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. Findings Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17--43]) were included in our analysis. 
196 (2$ḑot$3%) patients with the delta variant versus 764 (2$ḑot$2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2$ḑot$26 [95% CI 1$ḑot$32--3$ḑot$89]). 498 (5$ḑot$7%) patients with the delta variant versus 1448 (4$ḑot$2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1$ḑot$45 [1$ḑot$08--1$ḑot$95]). Most patients were unvaccinated (32,078 [74$ḑot$0%] across both groups). The HRs for vaccinated patients with the delta variant versus the alpha variant (adjusted HR for hospital admission 1$ḑot$94 [95% CI 0$ḑot$47--8$ḑot$05] and for hospital admission or emergency care attendance 1ḑot58 [0Background The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. Methods This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). 
The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. Findings Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17--43]) were included in our analysis. 196 (2$ḑot$3%) patients with the delta variant versus 764 (2$ḑot$2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2$ḑot$26 [95% CI 1$ḑot$32--3$ḑot$89]). 498 (5$ḑot$7%) patients with the delta variant versus 1448 (4$ḑot$2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1$ḑot$45 [1$ḑot$08--1$ḑot$95]). Most patients were unvaccinated (32,078 [74$ḑot$0%] across both groups). The HRs for vaccinated patients with the delta variant versus the alpha variant (adjusted HR for hospital admission 1$ḑot$94 [95% CI 0$ḑot$47--8$ḑot$05] and for hospital admission or emergency care attendance 1$ḑot$58 [0ḑot69--3Background The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. 
Methods This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. Findings Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17--43]) were included in our analysis. 196 (2$ḑot$3%) patients with the delta variant versus 764 (2$ḑot$2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2$ḑot$26 [95% CI 1$ḑot$32--3$ḑot$89]). 498 (5$ḑot$7%) patients with the delta variant versus 1448 (4$ḑot$2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1$ḑot$45 [1$ḑot$08--1$ḑot$95]). Most patients were unvaccinated (32,078 [74$ḑot$0%] across both groups). 
The HRs for vaccinated patients with the delta variant versus the alpha variant (adjusted HR for hospital admission 1$ḑot$94 [95% CI 0$ḑot$47--8$ḑot$05] and for hospital admission or emergency care attendance 1$ḑot$58 [0$ḑot$69--3ḑot61]) were similar to the HRs for unvaccinated patients (2Background The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. Methods This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. Findings Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17--43]) were included in our analysis. 
196 (2$ḑot$3%) patients with the delta variant versus 764 (2$ḑot$2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2$ḑot$26 [95% CI 1$ḑot$32--3$ḑot$89]). 498 (5$ḑot$7%) patients with the delta variant versus 1448 (4$ḑot$2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1$ḑot$45 [1$ḑot$08--1$ḑot$95]). Most patients were unvaccinated (32,078 [74$ḑot$0%] across both groups). The HRs for vaccinated patients with the delta variant versus the alpha variant (adjusted HR for hospital admission 1$ḑot$94 [95% CI 0$ḑot$47--8$ḑot$05] and for hospital admission or emergency care attendance 1$ḑot$58 [0$ḑot$69--3$ḑot$61]) were similar to the HRs for unvaccinated patients (2ḑot32 [1Background The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. Methods This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). 
The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. Findings Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17--43]) were included in our analysis. 196 (2$ḑot$3%) patients with the delta variant versus 764 (2$ḑot$2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2$ḑot$26 [95% CI 1$ḑot$32--3$ḑot$89]). 498 (5$ḑot$7%) patients with the delta variant versus 1448 (4$ḑot$2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1$ḑot$45 [1$ḑot$08--1$ḑot$95]). Most patients were unvaccinated (32,078 [74$ḑot$0%] across both groups). The HRs for vaccinated patients with the delta variant versus the alpha variant (adjusted HR for hospital admission 1$ḑot$94 [95% CI 0$ḑot$47--8$ḑot$05] and for hospital admission or emergency care attendance 1$ḑot$58 [0$ḑot$69--3$ḑot$61]) were similar to the HRs for unvaccinated patients (2$ḑot$32 [1ḑot29--4Background The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. 
Methods This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. Findings Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17--43]) were included in our analysis. 196 (2$ḑot$3%) patients with the delta variant versus 764 (2$ḑot$2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2$ḑot$26 [95% CI 1$ḑot$32--3$ḑot$89]). 498 (5$ḑot$7%) patients with the delta variant versus 1448 (4$ḑot$2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1$ḑot$45 [1$ḑot$08--1$ḑot$95]). Most patients were unvaccinated (32,078 [74$ḑot$0%] across both groups). 
The HRs for vaccinated patients with the delta variant versus the alpha variant (adjusted HR for hospital admission 1$ḑot$94 [95% CI 0$ḑot$47--8$ḑot$05] and for hospital admission or emergency care attendance 1$ḑot$58 [0$ḑot$69--3$ḑot$61]) were similar to the HRs for unvaccinated patients (2$ḑot$32 [1$ḑot$29--4ḑot16] and 1Background The SARS-CoV-2 delta (B.1.617.2) variant was first detected in England in March, 2021. It has since rapidly become the predominant lineage, owing to high transmissibility. It is suspected that the delta variant is associated with more severe disease than the previously dominant alpha (B.1.1.7) variant. We aimed to characterise the severity of the delta variant compared with the alpha variant by determining the relative risk of hospital attendance outcomes. Methods This cohort study was done among all patients with COVID-19 in England between March 29 and May 23, 2021, who were identified as being infected with either the alpha or delta SARS-CoV-2 variant through whole-genome sequencing. Individual-level data on these patients were linked to routine health-care datasets on vaccination, emergency care attendance, hospital admission, and mortality (data from Public Health England's Second Generation Surveillance System and COVID-19-associated deaths dataset; the National Immunisation Management System; and NHS Digital Secondary Uses Services and Emergency Care Data Set). The risk for hospital admission and emergency care attendance were compared between patients with sequencing-confirmed delta and alpha variants for the whole cohort and by vaccination status subgroups. Stratified Cox regression was used to adjust for age, sex, ethnicity, deprivation, recent international travel, area of residence, calendar week, and vaccination status. Findings Individual-level data on 43,338 COVID-19-positive patients (8682 with the delta variant, 34,656 with the alpha variant; median age 31 years [IQR 17--43]) were included in our analysis. 
196 (2·3%) patients with the delta variant versus 764 (2·2%) patients with the alpha variant were admitted to hospital within 14 days after the specimen was taken (adjusted hazard ratio [HR] 2·26 [95% CI 1·32–3·89]). 498 (5·7%) patients with the delta variant versus 1448 (4·2%) patients with the alpha variant were admitted to hospital or attended emergency care within 14 days (adjusted HR 1·45 [1·08–1·95]). Most patients were unvaccinated (32,078 [74·0%] across both groups). The HRs for vaccinated patients with the delta variant versus the alpha variant (adjusted HR for hospital admission 1·94 [95% CI 0·47–8·05] and for hospital admission or emergency care attendance 1·58 [0·69–3·61]) were similar to the HRs for unvaccinated patients (2·32 [1·29–4·16] and 1·43 [1·04–1·97]; p=0·82 for both) but the precision for the vaccinated subgroup was low. Interpretation This large national study found a higher hospital admission or emergency care attendance risk for patients with COVID-19 infected with the delta variant compared with the alpha variant. Results suggest that outbreaks of the delta variant in unvaccinated populations might lead to a greater burden on health-care services than the alpha variant.
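As a quick arithmetic check, the crude percentages quoted in this abstract can be recomputed from the reported counts. The sketch below does only that; it does not reproduce the adjusted hazard ratios, which would need the individual-level data and the stratified Cox model. The variable and function names are illustrative.

```python
# Counts as reported in the abstract (delta vs alpha cohort sizes and events).
cohort = {"delta": 8682, "alpha": 34656}
admitted = {"delta": 196, "alpha": 764}
admitted_or_emergency = {"delta": 498, "alpha": 1448}
unvaccinated = 32078


def percent(events, total):
    """Crude percentage, rounded to one decimal place."""
    return round(100 * events / total, 1)


# Crude (unadjusted) proportions per variant.
crude_admission = {v: percent(admitted[v], cohort[v]) for v in cohort}
crude_any_attendance = {
    v: percent(admitted_or_emergency[v], cohort[v]) for v in cohort
}

# Share of the whole cohort that was unvaccinated.
unvaccinated_share = percent(unvaccinated, sum(cohort.values()))

print(crude_admission, crude_any_attendance, unvaccinated_share)
```

The crude admission proportions are nearly equal (2·3% versus 2·2%), whereas the adjusted HR is 2·26. The reported estimate therefore depends on the adjustment for age, calendar week and the other covariates listed in the Methods.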

DOI: 10.1016/S1473-3099(21)00475-8

Benchmark Datasets for SARS-CoV-2 Surveillance Bioinformatics

Xiaoli et al. (2022) PeerJ 10:e13821

Show Details

Authors:

L Xiaoli, JV Hagey, DJ Park, CA Gulvik, EL Young, NF Alikhan, A Lawsin, N Hassell, K Knipe, KF Oakeson, AC Retchless, M Shakya, CC Lo, P Chain, AJ Page, BJ Metcalf, M Su, J Rowell, E Vidyaprakash, CR Paden, AD Huang, D Roellig, K Patel, K Winglee, MR Weigand, LS Katz

Abstract:

Background Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of coronavirus disease 2019 (COVID-19), has spread globally and is being surveilled with an international genome sequencing effort. Surveillance consists of sample acquisition, library preparation, and whole genome sequencing. This has necessitated a classification scheme detailing Variants of Concern (VOC) and Variants of Interest (VOI), and the rapid expansion of bioinformatics tools for sequence analysis. These bioinformatic tools are means for major actionable results: maintaining quality assurance and checks, defining population structure, performing genomic epidemiology, and inferring lineage to allow reliable and actionable identification and classification. Additionally, the pandemic has required public health laboratories to reach high throughput proficiency in sequencing library preparation and downstream data analysis rapidly. However, both processes can be limited by a lack of a standardized sequence dataset. Methods We identified six SARS-CoV-2 sequence datasets from recent publications, public databases and internal resources. In addition, we created a method to mine public databases to identify representative genomes for these datasets. Using this novel method, we identified several genomes as either VOI/VOC representatives or non-VOI/VOC representatives. To describe each dataset, we utilized a previously published datasets format, which describes accession information and whole dataset information. Additionally, a script from the same publication has been enhanced to download and verify all data from this study. Results The benchmark datasets focus on the two most widely used sequencing platforms: long read sequencing data from the Oxford Nanopore Technologies platform and short read sequencing data from the Illumina platform. 
There are six datasets: three were derived from recent publications; two were derived from data mining public databases to answer common questions not covered by published datasets; one unique dataset representing common sequence failures was obtained by rigorously scrutinizing data that did not pass quality checks. The dataset summary table, data mining script and quality control (QC) values for all sequence data are publicly available on GitHub: https://github.com/CDCgov/datasets-sars-cov-2. Discussion The datasets presented here were generated to help public health laboratories build sequencing and bioinformatics capacity, benchmark different workflows and pipelines, and calibrate QC thresholds to ensure sequencing quality. Together, improvements in these areas support accurate and timely outbreak investigation and surveillance, providing actionable data for pandemic management. Furthermore, these publicly available and standardized benchmark data will facilitate the development and adjudication of new pipelines.

DOI: 10.7717/peerj.13821

Genomic Diversity of Salmonella Enterica - The UoWUCC 10K Genomes Project

Achtman et al. (2021) Wellcome Open Research 5:223

Show Details

Authors:

M Achtman, Z Zhou, NF Alikhan, W Tyne, J Parkhill, M Cormican, CS Chiou, M Torpdahl, E Litrup, DM Prendergast, JE Moore, S Strain, C Kornschober, R Meinersmann, A Uesbeck, FCX Weill, A Coffey, H Andrews-Polymenis, R Curtiss 3rd, S Fanning

Abstract:

Most publicly available genomes of Salmonella enterica are from human disease in the US and the UK, or from domesticated animals in the US. Here we describe a historical collection of 10,000 strains isolated between 1891-2010 in 73 different countries. They encompass a broad range of sources, ranging from rivers through reptiles to the diversity of all S. enterica isolated on the island of Ireland between 2000 and 2005. Genomic DNA was isolated, and sequenced by Illumina short read sequencing. The short reads are publicly available in the Short Reads Archive. They were also uploaded to EnteroBase, which assembled and annotated draft genomes. 9769 draft genomes which passed quality control were genotyped with multiple levels of multilocus sequence typing, and used to predict serovars. Genomes were assigned to hierarchical clusters on the basis of numbers of pair-wise allelic differences in core genes, which were mapped to genetic Lineages within phylogenetic trees. The University of Warwick/University College Cork (UoWUCC) project greatly extends the geographic sources, dates and core genomic diversity of publicly available S. enterica genomes. We illustrate these features by an overview of core genomic Lineages within 33,000 publicly available Salmonella genomes whose strains were isolated before 2011. We also present detailed examinations of HC400, HC900 and HC2000 hierarchical clusters within exemplar Lineages, including serovars Typhimurium, Enteritidis and Mbandaka. These analyses confirm the polyphyletic nature of multiple serovars while showing that discrete clusters with geographical specificity can be reliably recognized by hierarchical clustering approaches. The results also demonstrate that the genomes sequenced here provide an important counterbalance to the sampling bias which is so dominant in current genomic sequencing.
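The abstract describes assigning genomes to hierarchical clusters from the number of pairwise allelic differences in core genes. A minimal sketch of that idea follows, assuming single-linkage clustering at a fixed threshold over toy allelic profiles. The profiles, function names and the threshold of 1 are illustrative assumptions, not the EnteroBase implementation; levels such as HC400, HC900 and HC2000 would correspond to much larger thresholds over thousands of loci.

```python
from itertools import combinations


def allelic_distance(a, b):
    """Number of loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def single_linkage_clusters(profiles, threshold):
    """Group genomes so that each one is within `threshold` allelic
    differences of at least one other member of its cluster."""
    parent = {name: name for name in profiles}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Merge every pair of genomes that is close enough.
    for a, b in combinations(profiles, 2):
        if allelic_distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical six-locus profiles (allele numbers per core gene).
toy_profiles = {
    "g1": [1, 1, 1, 1, 1, 1],
    "g2": [1, 1, 1, 1, 1, 2],
    "g3": [1, 1, 1, 1, 2, 2],
    "g4": [5, 6, 7, 8, 9, 9],
}

clusters = single_linkage_clusters(toy_profiles, threshold=1)
print(clusters)
```

With single linkage, g1 and g3 end up in the same cluster even though they differ at two loci, because each is within one difference of g2. Chaining of this kind lets a single cluster span a gradually diverging lineage.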

DOI: 10.12688/wellcomeopenres.16291.2

Defining the Analytical and Clinical Sensitivity of the ARTIC Method for the Detection of SARS-CoV-2

Alikhan et al. (2021)

Show Details

Authors:

NF Alikhan, J Quick, AJ Trotter, SC Robson, M Bashton, GL Kay, M Loose, S Rooke, M McHugh, AC Darby, SM Nicholls, NJ Loman, The COVID-19 Genomics UK (COG-UK) consortium, S Dervisevic, AJ Page, J O'Grady

Abstract:

The SARS-CoV-2 ARTIC amplicon protocol is the most widely used genome sequencing method for SARS-CoV-2, accounting for over 43% of publicly-available genome sequences. The protocol utilises 98 primers to amplify ~400bp fragments of the SARS-CoV-2 genome covering all 30,000 bases. Understanding the analytical performance metrics of this protocol will improve how the data is used and interpreted. Different concentrations of SARS-CoV-2 control material were used to establish the limit of detection (LoD) of the ARTIC protocol. Results demonstrated the LoD was a minimum of 25-50 virus particles per mL. The sensitivity of ARTIC was comparable to the published sensitivities of commercial diagnostics assays and could therefore be used to confirm diagnostic testing results. A set of over 3,600 clinical samples from three UK regions were then evaluated to compare the protocol's performance to clinical diagnostic assays (Roche Lightcycler 480 II, AusDiagnostics, Roche Cobas, Hologic Panther, Corman RdRp, Roche Flow, ABI QuantStudio 5, Seegene Nimbus, Qiagen Rotorgene, Abbott M2000, Thermo TaqPath, Xpert). We developed a Python tool, RonaLDO, to perform this validation (available under the GNU GPL3 open-source licence from https://github.com/quadram-institute-bioscience/ronaldo). Positives detected by diagnostic platforms were generally supported by sequencing data; platforms that used RT-qPCR were the best predictors of whether the sample would subsequently sequence successfully. To maximise success of sample sequencing for phylogenetic analysis, samples with Ct <31 should be chosen. For diagnostic tests that do not provide a quantifiable Ct value, adding a quantification step is recommended. The ARTIC SARS-CoV-2 sequencing protocol is highly sensitive, capable of detecting SARS-CoV-2 in samples with Cts in the high 30s. However, to routinely obtain whole genome coverage, samples with Ct <31 are recommended. Comparing different virus detection methods close to their LoD was challenging and significant discordance was observed.

DOI: 10.1101/2021.10.09.21264695

CoronaHiT: High-Throughput Sequencing of SARS-CoV-2 Genomes

Baker et al. (2021) Genome Medicine 13:1 21

Show Details

Authors:

DJ Baker, A Aydin, T Le-Viet, GL Kay, S Rudder, L de Oliveira Martins, AP Tedim, A Kolyva, M Diaz, NF Alikhan, L Meadows, A Bell, AV Gutierrez, AJ Trotter, NM Thomson, R Gilroy, L Griffith, EM Adriaenssens, R Stanley, IG Charles, N Elumogo, J Wain, R Prakash, E Meader, AE Mather, MA Webber, S Dervisevic, AJ Page, J O'Grady

Abstract:

We present CoronaHiT, a platform and throughput flexible method for sequencing SARS-CoV-2 genomes (≤ 96 on MinION or > 96 on Illumina NextSeq) depending on changing requirements experienced during the pandemic. CoronaHiT uses transposase-based library preparation of ARTIC PCR products. Method performance was demonstrated by sequencing 2 plates containing 95 and 59 SARS-CoV-2 genomes on nanopore and Illumina platforms and comparing to the ARTIC LoCost nanopore method. Of the 154 samples sequenced using all 3 methods, ≥ 90% genome coverage was obtained for 64.3% using ARTIC LoCost, 71.4% using CoronaHiT-ONT and 76.6% using CoronaHiT-Illumina, with almost identical clustering on a maximum likelihood tree. This protocol will aid the rapid expansion of SARS-CoV-2 genome sequencing globally.

DOI: 10.1186/s13073-021-00839-5

The Impact of Viral Mutations on Recognition by SARS-CoV-2 Specific T Cells

de Silva et al. (2021) iScience 24:11 103353

Show Details

Authors:

TI de Silva, G Liu, BB Lindsey, D Dong, SC Moore, NS Hsu, D Shah, D Wellington, AJ Mentzer, A Angyal, R Brown, MD Parker, Z Ying, X Yao, L Turtle, S Dunachie, MK Maini, G Ogg, JC Knight, Y Peng, SL Rowland-Jones, T Dong, DM Aanensen, K Abudahab, H Adams, ..., M Twagira, N Vallotton, R Vancheeswaran, L Vincent-Smith, S Visuvanathan, A Vuylsteke, S Waddy, R Wake, A Walden, I Welters, T Whitehouse, P Whittaker, A Whittington, P Papineni, M Wijesinghe, M Williams, L Wilson, S Cole, S Winchester, M Wiselka, A Wolverson, DG Wootton, A Workman, B Yates, P Young

Abstract:

We identify amino acid variants within dominant SARS-CoV-2 T cell epitopes by interrogating global sequence data. Several variants within nucleocapsid and ORF3a epitopes have arisen independently in multiple lineages and result in loss of recognition by epitope-specific T cells assessed by IFN-γ and cytotoxic killing assays. Complete loss of T cell responsiveness was seen due to Q213K in the A*01:01-restricted CD8+ ORF3a epitope FTSDYYQLY207-215; due to P13L, P13S, and P13T in the B*27:05-restricted CD8+ nucleocapsid epitope QRNAPRITF9-17; and due to T362I and P365S in the A*03:01/A*11:01-restricted CD8+ nucleocapsid epitope KTFPPTEPK361-369. CD8+ T cell lines unable to recognize variant epitopes have diverse T cell receptor repertoires. These data demonstrate the potential for T cell evasion and highlight the need for ongoing surveillance for variants capable of escaping T cell as well as humoral immunity.

DOI: 10.1016/j.isci.2021.103353

Genomic Diversity of Escherichia Coli from Healthy Children in Rural Gambia

Foster-Nyarko et al. (2021) PeerJ 9:e10572

Show Details

Authors:

E Foster-Nyarko, NF Alikhan, UN Ikumapayi, G Sarwar, C Okoi, PEM Tientcheu, M Defernez, J O'Grady, M Antonio, MJ Pallen

Abstract:

Little is known about the genomic diversity of Escherichia coli in healthy children from sub-Saharan Africa, even though this is pertinent to understanding bacterial evolution and ecology and their role in infection. We isolated and whole-genome sequenced up to five colonies of faecal E. coli from 66 asymptomatic children aged three-to-five years in rural Gambia (n = 88 isolates from 21 positive stools). We identified 56 genotypes, with an average of 2.7 genotypes per host. These were spread over 37 seven-allele sequence types and the E. coli phylogroups A, B1, B2, C, D, E, F and Escherichia cryptic clade I. Immigration events accounted for three-quarters of the diversity within our study population, while one-quarter of variants appeared to have arisen from within-host evolution. Several isolates encode putative virulence factors commonly found in Enteropathogenic and Enteroaggregative E. coli, and 53% of the isolates encode resistance to three or more classes of antimicrobials. Thus, resident E. coli in these children may constitute reservoirs of virulence- and resistance-associated genes. Moreover, several study strains were closely related to isolates that caused disease in humans or originated from livestock. Our results suggest that within-host evolution plays a minor role in the generation of diversity compared to independent immigration and the establishment of strains among our study population. Also, this study adds significantly to the number of commensal E. coli genomes, a group that has been traditionally underrepresented in the sequencing of this species.

DOI: 10.7717/peerj.10572

Extensive Microbial Diversity within the Chicken Gut Microbiome Revealed by Metagenomics and Culture

Gilroy et al. (2021) PeerJ 9:e10941

Show Details

Authors:

R Gilroy, A Ravi, M Getino, I Pursley, DL Horton, NF Alikhan, D Baker, K Gharbi, N Hall, M Watson, EM Adriaenssens, E Foster-Nyarko, S Jarju, A Secka, M Antonio, A Oren, RR Chaudhuri, R La Ragione, F Hildebrand, MJ Pallen

Abstract:

Background The chicken is the most abundant food animal in the world. However, despite its importance, the chicken gut microbiome remains largely undefined. Here, we exploit culture-independent and culture-dependent approaches to reveal extensive taxonomic diversity within this complex microbial community. Results We performed metagenomic sequencing of fifty chicken faecal samples from two breeds and analysed these, alongside all (n = 582) relevant publicly available chicken metagenomes, to cluster over 20 million non-redundant genes and to construct over 5,500 metagenome-assembled bacterial genomes. In addition, we recovered nearly 600 bacteriophage genomes. This represents the most comprehensive view of taxonomic diversity within the chicken gut microbiome to date, encompassing hundreds of novel candidate bacterial genera and species. To provide a stable, clear and memorable nomenclature for novel species, we devised a scalable combinatorial system for the creation of hundreds of well-formed Latin binomials. We cultured and genome-sequenced bacterial isolates from chicken faeces, documenting over forty novel species, together with three species from the genus Escherichia, including the newly named species Escherichia whittamii. Conclusions Our metagenomic and culture-based analyses provide new insights into the bacterial, archaeal and bacteriophage components of the chicken gut microbiome. The resulting datasets expand the known diversity of the chicken gut microbiome and provide a key resource for future high-resolution taxonomic and functional studies on the chicken gut microbiome.

DOI: 10.7717/peerj.10941

Changes in Symptomatology, Reinfection, and Transmissibility Associated with the SARS-CoV-2 Variant B.1.1.7: An Ecological Study

Graham et al. (2021) The Lancet Public Health 6:5 e335-e345

Show Details

Authors:

MS Graham, CH Sudre, A May, M Antonelli, B Murray, T Varsavsky, K Kläser, LS Canas, E Molteni, M Modat, DA Drew, LH Nguyen, L Polidori, S Selvachandran, C Hu, J Capdevila, A Hammers, AT Chan, J Wolf, TD Spector, CJ Steves, S Ourselin, C Koshy, A Ash, E Wise, ..., J Sillitoe, MH Spencer Chapman, SA Thurston, G Tonkin-Hill, D Weldon, D Rajan, IF Bronner, L Aigrain, NM Redshaw, SV Lensing, R Davies, A Whitwham, J Liddle, K Lewis, JM Tovar-Corona, S Leonard, J Durham, AR Bassett, S McCarthy, RJ Moll, K James, K Oliver, A Makunin, J Barrett, RN Gunson

Abstract:

The SARS-CoV-2 variant B.1.1.7 was first identified in December, 2020, in England. We aimed to investigate whether increases in the proportion of infections with this variant are associated with differences in symptoms or disease course, reinfection rates, or transmissibility.

DOI: 10.1016/S2468-2667(21)00055-4

Invasive Atypical Non-Typhoidal Salmonella Serovars in The Gambia

Kanteh et al. (2021) Microbial Genomics 7:11

Show Details

Authors:

A Kanteh, AK Sesay, NF Alikhan, UN Ikumapayi, R Salaudeen, J Manneh, Y Olatunji, AJ Page, G Mackenzie

Abstract:

Invasive non-typhoidal Salmonella (iNTS) disease continues to be a significant public health problem in sub-Saharan Africa. Common clinical misdiagnosis, antimicrobial resistance, high case fatality and lack of a vaccine make iNTS a priority for global health research. Using whole genome sequence analysis of 164 invasive Salmonella isolates obtained through population-based surveillance between 2008 and 2016, we conducted genomic analysis of the serovars causing invasive Salmonella diseases in rural Gambia. The incidence of iNTS varied over time. The proportion of atypical serovars causing disease increased over time from 40 to 65% compared to the typical serovars Enteritidis and Typhimurium that decreased from 30 to 12%. Overall iNTS case fatality was 10%, but case fatality associated with atypical iNTS alone was 10%. Genetic virulence factors were identified in 14/70 (20%) typical serovars and 45/68 (66%) of the atypical serovars and were associated with: invasion, proliferation and/or translocation (Clade A); and host colonization and immune modulation (Clade G). Among Enteritidis isolates, 33/40 were resistant to four or more of the antimicrobials tested, except ciprofloxacin, to which all isolates were susceptible. Resistance was low in Typhimurium isolates, but all 16 isolates were resistant to gentamicin. The increase in incidence and proportion of iNTS disease caused by atypical serovars is concerning. The increased proportion of atypical serovars and the high associated case fatality may be related to acquisition of specific genetic virulence factors. These factors may provide a selective advantage to the atypical serovars. Investigations should be conducted elsewhere in Africa to identify potential changes in the distribution of iNTS serovars and the extent of these virulence elements.

DOI: 10.1099/mgen.0.000677

Recurrent Emergence of SARS-CoV-2 Spike Deletion H69/V70 and Its Role in the Alpha Variant B.1.1.7

Meng et al. (2021) Cell Reports 35:13 109292

Show Details

Authors:

B Meng, SA Kemp, G Papa, R Datir, IA Ferreira, S Marelli, WT Harvey, S Lytras, A Mohamed, G Gallo, N Thakur, DA Collier, P Mlcochova, LM Duncan, AM Carabelli, JC Kenyon, AM Lever, A De Marco, C Saliba, K Culap, E Cameroni, NJ Matheson, L Piccoli, D Corti, LC James, ..., D Toombs, B Topping, J Tovar-Corona, D Ungureanu, J Uphill, J Urbanova, PJ Van, V Vancollie, P Voak, D Walker, M Walker, M Waller, G Ward, C Weatherhogg, N Webb, A Wells, E Wells, L Westwood, T Whipp, T Whiteley, G Whitton, S Widaa, M Williams, M Wilson, S Wright

Abstract:

We report severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike ΔH69/V70 in multiple independent lineages, often occurring after acquisition of receptor binding motif replacements such as N439K and Y453F, known to increase binding affinity to the ACE2 receptor and confer antibody escape. In vitro, we show that, although ΔH69/V70 itself is not an antibody evasion mechanism, it increases infectivity associated with enhanced incorporation of cleaved spike into virions. ΔH69/V70 is able to partially rescue infectivity of spike proteins that have acquired N439K and Y453F escape mutations by increased spike incorporation. In addition, replacement of the H69 and V70 residues in the Alpha variant B.1.1.7 spike (where ΔH69/V70 occurs naturally) impairs spike incorporation and entry efficiency of the B.1.1.7 spike pseudotyped virus. Alpha variant B.1.1.7 spike mediates faster kinetics of cell-cell fusion than wild-type Wuhan-1 D614G, dependent on ΔH69/V70. Therefore, as ΔH69/V70 compensates for immune escape mutations that impair infectivity, continued surveillance for deletions with functional effects is warranted.

DOI: 10.1016/j.celrep.2021.109292

Large-Scale Sequencing of SARS-CoV-2 Genomes from One Region Allows Detailed Epidemiology and Enables Local Outbreak Management

Page et al. (2021) Microbial Genomics 7:6

Show Details

Authors:

AJ Page, AE Mather, T Le-Viet, EJ Meader, NF Alikhan, GL Kay, L de Oliveira Martins, A Aydin, DJ Baker, AJ Trotter, S Rudder, AP Tedim, A Kolyva, R Stanley, M Yasir, M Diaz, W Potter, C Stuart, L Meadows, A Bell, AV Gutierrez, NM Thomson, EM Adriaenssens, T Swingler, RAJ Gilroy, L Griffith, DK Sethi, D Aggarwal, CS Brown, RK Davidson, RA Kingsley, L Bedford, LJ Coupland, IG Charles, N Elumogo, J Wain, R Prakash, MA Webber, SJL Smith, M Chand, S Dervisevic, J O'Grady, The COVID-19 Genomics UK (COG-UK) Consortium

Abstract:

The COVID-19 pandemic has spread rapidly throughout the world. In the UK, the initial peak was in April 2020; in the county of Norfolk (UK) and surrounding areas, which has a stable, low-density population, over 3200 cases were reported between March and August 2020. As part of the activities of the national COVID-19 Genomics Consortium (COG-UK) we undertook whole genome sequencing of the SARS-CoV-2 genomes present in positive clinical samples from the Norfolk region. These samples were collected by four major hospitals, multiple minor hospitals, care facilities and community organizations within Norfolk and surrounding areas. We combined clinical metadata with the sequencing data from regional SARS-CoV-2 genomes to understand the origins, genetic variation, transmission and expansion (spread) of the virus within the region and provide context nationally. Data were fed back into the national effort for pandemic management, whilst simultaneously being used to assist local outbreak analyses. Overall, 1565 positive samples (172 per 100 000 population) from 1376 cases were evaluated; for 140 cases between two and six samples were available providing longitudinal data. This represented 42.6% of all positive samples identified by hospital testing in the region and encompassed those with clinical need, and health and care workers and their families. In total, 1035 cases had genome sequences of sufficient quality to provide phylogenetic lineages. These genomes belonged to 26 distinct global lineages, indicating that there were multiple separate introductions into the region. Furthermore, 100 genetically distinct UK lineages were detected demonstrating local evolution, at a rate of ~2 SNPs per month, and multiple co-occurring lineages as the pandemic progressed. Our analysis: identified a discrete sublineage associated with six care facilities; found no evidence of reinfection in longitudinal samples; ruled out a nosocomial outbreak; identified 16 lineages in key workers which were not in patients, indicating infection control measures were effective; and found the D614G spike protein mutation which is linked to increased transmissibility dominates the samples and rapidly confirmed relatedness of cases in an outbreak at a food processing facility. The large-scale genome sequencing of SARS-CoV-2-positive samples has provided valuable additional data for public health epidemiology in the Norfolk region, and will continue to help identify and untangle hidden transmission chains as the pandemic evolves.

DOI: 10.1099/mgen.0.000589

REACT-1 Round 11 Report: Low Prevalence of SARS-CoV-2 Infection in the Community Prior to the Third Step of the English Roadmap out of Lockdown

Riley et al. (2021)

Show Details

Authors:

S Riley, D Haw, CE Walters, H Wang, O Eales, KEC Ainslie, C Atchison, C Fronterre, PJ Diggle, AJ Page, AJ Trotter, T Le Viet, NF Alikhan, J O'Grady, The COVID-19 Genomics UK (COG-UK) Consortium, D Ashby, CA Donnelly, G Cooke, W Barclay, H Ward, A Darzi, P Elliott

Abstract:

Background: National epidemic dynamics of SARS-CoV-2 infections are being driven by: the degree of recent indoor mixing (both social and workplace), vaccine coverage, intrinsic properties of the circulating lineages, and prior history of infection (via natural immunity). In England, infections, hospitalisations and deaths fell during the first two steps of the "roadmap" for exiting the third national lockdown. The third step of the roadmap in England takes place on 17 May 2021. Methods: We report the most recent findings on community infections from the REal-time Assessment of Community Transmission-1 (REACT-1) study in which a swab is obtained from a representative cross-sectional sample of the population in England and tested using PCR. Round 11 of REACT-1 commenced self-administered swab-collection on 15 April 2021 and completed collections on 3 May 2021. We compare the results of REACT-1 round 11 to round 10, in which swabs were collected from 11 to 30 March 2021. Results: Between rounds 10 and 11, prevalence of swab-positivity dropped by 50% in England from 0.20% (0.17%, 0.23%) to 0.10% (0.08%, 0.13%), with a corresponding R estimate of 0.90 (0.87, 0.94). Rates of swab-positivity fell in the 55 to 64 year old group from 0.17% (0.12%, 0.25%) in round 10 to 0.06% (0.04%, 0.11%) in round 11. Prevalence in round 11 was higher in the 25 to 34 year old group at 0.21% (0.12%, 0.38%) than in the 55 to 64 year olds and also higher in participants of Asian ethnicity at 0.31% (0.16%, 0.60%) compared with white participants at 0.09% (0.07%, 0.11%). Based on sequence data for positive samples for which a lineage could be identified, we estimate that 92.3% (75.9%, 97.9%, n=24) of infections were from the B.1.1.7 lineage compared to 7.7% (2.1%, 24.1%, n=2) from the B.1.617.2 lineage. Both samples from the B.1.617.2 lineage were detected in London from participants not reporting travel in the previous two weeks. Also, allowing for suitable lag periods, the prior close alignment between prevalence of infections and hospitalisations and deaths nationally has diverged. Discussion: We observed marked reductions in prevalence from March to April and early May 2021 in England reflecting the success of the vaccination programme and despite easing of restrictions during lockdown. However, there is potential upwards pressure on prevalence from the further easing of lockdown regulations and presence of the B.1.617.2 lineage. If prevalence rises in the coming weeks, policy-makers will need to assess the possible impact on hospitalisations and deaths. In addition, consideration should be given to other health and economic impacts if increased levels of community transmission occur.

DOI: 10.1101/2021.05.13.21257144

REACT-1 Round 12 Report: Resurgence of SARS-CoV-2 Infections in England Associated with Increased Frequency of the Delta Variant

Riley et al. (2021)

Show Details

Authors:

S Riley, H Wang, O Eales, D Haw, CE Walters, KEC Ainslie, C Atchison, C Fronterre, PJ Diggle, AJ Page, SJ Prosolek, AJ Trotter, T Le Viet, NF Alikhan, LM Jackson, C Ludden, The COVID-19 Genomics UK (COG-UK) Consortium, D Ashby, CA Donnelly, G Cooke, W Barclay, H Ward, A Darzi, P Elliott

Abstract:

Background: England entered a third national lockdown from 6 January 2021 due to the COVID-19 pandemic. Despite a successful vaccine rollout during the first half of 2021, cases and hospitalisations have started to increase since the end of May as the SARS-CoV-2 Delta (B.1.617.2) variant increases in frequency. The final step of relaxation of COVID-19 restrictions in England has been delayed from 21 June to 19 July 2021. Methods: The REal-time Assessment of Community Transmission-1 (REACT-1) study measures the prevalence of swab-positivity among random samples of the population of England. Round 12 of REACT-1 obtained self-administered swab collections from participants from 20 May 2021 to 7 June 2021; results are compared with those for round 11, in which swabs were collected from 15 April to 3 May 2021. Results: Between rounds 11 and 12, national prevalence increased from 0.10% (0.08%, 0.13%) to 0.15% (0.12%, 0.18%). During round 12, we detected exponential growth with a doubling time of 11 (7.1, 23) days and an R number of 1.44 (1.20, 1.73). The highest prevalence was found in the North West at 0.26% (0.16%, 0.41%) compared to 0.05% (0.02%, 0.12%) in the South West. In the North West, the locations of positive samples suggested a cluster in Greater Manchester and the east Lancashire area. Prevalence in those aged 5-49 was 2.5 times higher at 0.20% (0.16%, 0.26%) compared with those aged 50 years and above at 0.08% (0.06%, 0.11%). At the beginning of February 2021, the link between infection rates and hospitalisations and deaths started to weaken, although in late April 2021, infection rates and hospital admissions started to reconverge. When split by age, the weakened link between infection rates and hospitalisations at ages 65 years and above was maintained, while the trends converged below the age of 65 years. The majority of the infections in the younger group occurred in the unvaccinated population or those without a stated vaccine history. We observed the rapid replacement of the Alpha (B.1.1.7) variant of SARS-CoV-2 with the Delta variant during the period covered by rounds 11 and 12 of the study. Discussion: The extent to which exponential growth continues, or slows down as a consequence of the continued rapid roll-out of the vaccination programme, including to young adults, requires close monitoring. Data on community prevalence are vital to track the course of the epidemic and inform ongoing decisions about the timing of further lifting of restrictions in England.

DOI: 10.1101/2021.06.17.21259103

SARS-CoV-2 Variants of Concern Dominate in Lahore, Pakistan in April 2021

Sarwar et al. (2021) Microbial Genomics 7:11

Show Details

Authors:

MB Sarwar, M Yasir, NF Alikhan, N Afzal, L de Oliveira Martins, T Le Viet, AJ Trotter, SJ Prosolek, GL Kay, E Foster-Nyarko, S Rudder, DJ Baker, ST Muntaha, M Roman, MA Webber, A Shafiq, B Shabbir, J Akram, AJ Page, S Jahan

Abstract:

The SARS-CoV-2 pandemic continues to expand globally, with case numbers rising in many areas of the world, including the Indian sub-continent. Pakistan has one of the world's largest populations, of over 200 million people and is experiencing a severe third wave of infections caused by SARS-CoV-2 that began in March 2021. In Pakistan, during the third wave until now only 12 SARS-CoV-2 genomes have been collected and among these nine are from Islamabad. This highlights the need for more genome sequencing to allow surveillance of variants in circulation. In fact, more genomes are available among travellers with a travel history from Pakistan, than from within the country itself. We thus aimed to provide a snapshot assessment of circulating lineages in Lahore and surrounding areas with a combined population of 11.1 million. Within a week of April 2021, 102 samples were sequenced. The samples were randomly collected from two hospitals with a diagnostic PCR cutoff value of less than 25 cycles. Analysis of the lineages shows that the Alpha variant of concern (first identified in the UK) dominates, accounting for 97.9% (97/99) of cases, with the Beta variant of concern (first identified in South Africa) accounting for 2.0% (2/99) of cases. No other lineages were observed. In depth analysis of the Alpha lineages indicated multiple separate introductions and subsequent establishment within the region. Eight samples were identical to genomes observed in Europe (seven UK, one Switzerland), indicating recent transmission. Genomes of other samples show evidence that these have evolved, indicating sustained transmission over a period of time either within Pakistan or other countries with low-density genome sequencing. Vaccines remain effective against Alpha, however, the low level of Beta against which some vaccines are less effective demonstrates the requirement for continued prospective genomic surveillance.

DOI: 10.1099/mgen.0.000693

Evaluating the Effects of SARS-CoV-2 Spike Mutation D614G on Transmissibility and Pathogenicity

Volz et al. (2021) Cell 184:1 64-75.e11

Show Details

Authors:

E Volz, V Hill, JT McCrone, A Price, D Jorgensen, Á O'Toole, J Southgate, R Johnson, B Jackson, FF Nascimento, SM Rey, SM Nicholls, RM Colquhoun, A da Silva Filipe, J Shepherd, DJ Pascall, R Shah, N Jesudason, K Li, R Jarrett, N Pacchiarini, M Bull, L Geidelberg, I Siveroni, I Goodfellow, ..., I Martincorena, C Puethe, JP Keatley, G Tonkin-Hill, C Smith, D Jamrozy, MA Beale, M Patel, C Ariani, M Spencer-Chapman, E Drury, S Lo, S Rajatileka, C Scott, K James, SK Buddenborg, DJ Berger, G Patel, MV Garcia-Casado, T Dibling, S McGuigan, HA Rogers, AD Hunter, E Souster, AS Neaverson

Abstract:

Global dispersal and increasing frequency of the SARS-CoV-2 spike protein variant D614G are suggestive of a selective advantage but may also be due to a random founder effect. We investigate the hypothesis for positive selection of spike D614G in the United Kingdom using more than 25,000 whole genome SARS-CoV-2 sequences. Despite the availability of a large dataset, well represented by both spike 614 variants, not all approaches showed a conclusive signal of positive selection. Population genetic analysis indicates that 614G increases in frequency relative to 614D in a manner consistent with a selective advantage. We do not find any indication that patients infected with the spike 614G variant have higher COVID-19 mortality or clinical severity, but 614G is associated with higher viral load and younger age of patients. Significant differences in growth and size of 614G phylogenetic clusters indicate a need for continued study of this variant.

DOI: 10.1016/j.cell.2020.11.020

ICTV Virus Taxonomy Profile: Herelleviridae

Barylski et al. (2020) Journal of General Virology 101:4 362-363

Show Details

Authors:

J Barylski, AM Kropinski, NF Alikhan, EM Adriaenssens, ICTV Report Consortium

Abstract:

Members of the family Herelleviridae are bacterial viruses infecting members of the phylum Firmicutes. The virions have myovirus morphology and virus genomes comprise a linear dsDNA of 125–170 kb. This is a summary of the International Committee on Taxonomy of Viruses (ICTV) Report on the family Herelleviridae, which is available at ictv.global/report/herelleviridae.

DOI: 10.1099/jgv.0.001392

Evolution of Salmonella enterica Serotype Typhimurium Driven by Anthropogenic Selection and Niche Adaptation

Bawn et al. (2020) PLOS Genetics 16:6 e1008850

Show Details

Authors:

M Bawn, NF Alikhan, G Thilliez, M Kirkwood, NE Wheeler, L Petrovska, TJ Dallman, EM Adriaenssens, N Hall, RA Kingsley

Abstract:

Salmonella enterica serotype Typhimurium (S. Typhimurium) is a leading cause of gastroenteritis and bacteraemia worldwide, and a model organism for the study of host-pathogen interactions. Two S. Typhimurium strains (SL1344 and ATCC14028) are widely used to study host-pathogen interactions, yet genotypic variation results in strains with diverse host range, pathogenicity and risk to food safety. The population structure of diverse strains of S. Typhimurium revealed a major phylogroup of predominantly sequence type 19 (ST19) and a minor phylogroup of ST36. The major phylogroup had a population structure with two high order clades (α and β) and multiple subclades on extended internal branches, that exhibited distinct signatures of host adaptation and anthropogenic selection. Clade α contained a number of subclades composed of strains from well characterized epidemics in domesticated animals, while clade β contained multiple subclades associated with wild avian species. The contrasting epidemiology of strains in clade α and β was reflected by the distinct distribution of antimicrobial resistance (AMR) genes, accumulation of hypothetically disrupted coding sequences (HDCS), and signatures of functional diversification. 
These observations were consistent with elevated anthropogenic selection of clade $$ lineages from adaptation to circulation in populations of domesticated livestock, and the predisposition of clade e̱ta lineages to undergo adaptation to an invasive lifestyle by a process of convergent evolution with of host adapted Salmonella serotypes. Gene flux was predominantly driven by acquisition and recombination of prophage and associated cargo genes, with only occasional loss of these elements. The acquisition of large chromosomally-encoded genetic islands was limited, but notably, a feature of two recent pandemic clones (DT104 and monophasic S. Typhimurium ST34) of clade Salmonella enterica serotype Typhimurium (S. Typhimurium) is a leading cause of gastroenteritis and bacteraemia worldwide, and a model organism for the study of host-pathogen interactions. Two S. Typhimurium strains (SL1344 and ATCC14028) are widely used to study host-pathogen interactions, yet genotypic variation results in strains with diverse host range, pathogenicity and risk to food safety. The population structure of diverse strains of S. Typhimurium revealed a major phylogroup of predominantly sequence type 19 (ST19) and a minor phylogroup of ST36. The major phylogroup had a population structure with two high order clades ($$ and $e̱ta$) and multiple subclades on extended internal branches, that exhibited distinct signatures of host adaptation and anthropogenic selection. Clade $$ contained a number of subclades composed of strains from well characterized epidemics in domesticated animals, while clade $e̱ta$ contained multiple subclades associated with wild avian species. The contrasting epidemiology of strains in clade $$ and $e̱ta$ was reflected by the distinct distribution of antimicrobial resistance (AMR) genes, accumulation of hypothetically disrupted coding sequences (HDCS), and signatures of functional diversification. 
These observations were consistent with elevated anthropogenic selection of clade $$ lineages from adaptation to circulation in populations of domesticated livestock, and the predisposition of clade $e̱ta$ lineages to undergo adaptation to an invasive lifestyle by a process of convergent evolution with of host adapted Salmonella serotypes. Gene flux was predominantly driven by acquisition and recombination of prophage and associated cargo genes, with only occasional loss of these elements. The acquisition of large chromosomally-encoded genetic islands was limited, but notably, a feature of two recent pandemic clones (DT104 and monophasic S. Typhimurium ST34) of clade α (SGI-1 and SGI-4).

DOI: 10.1371/journal.pgen.1008850

Gambian Poultry Isolates from Hyperendemic Group of AMR Escherichia coli Strains in Sub-Saharan Africa

Foster-Nyarko et al. (2020) Access Microbiology 2:7A

Show Details

Authors:

E Foster-Nyarko, NF Alikhan, A Ravi, N Thomson, S Jarju, A Secka, M Antonio, M J. Pallen

Abstract:

Chickens and guinea fowl are commonly reared in Gambian homes as affordable sources of protein. Using standard microbiological techniques, we obtained 68 caecal isolates of Escherichia coli from ten chickens and nine guinea fowl in rural Gambia. After Illumina whole-genome sequencing, 28 sequence types were detected in the isolates (four of them novel), of which ST155 was the most common (22/68, 32%). These strains span four of the eight main phylogroups of E. coli, with phylogroups B1 and A being most prevalent. Nearly a third of the isolates harboured at least one antimicrobial resistance gene, while most of the ST155 isolates (14/22, 64%) encoded resistance to ≥3 classes of clinically relevant antibiotics, as well as putative virulence factors, suggesting pathogenic potential in humans. Furthermore, hierarchical clustering revealed that several Gambian poultry strains were closely related to isolates from humans. Although the ST155 lineage is common in poultry from Africa and South America, the Gambian ST155 isolates sit within a tight genomic cluster (100 alleles difference) of strains from poultry and livestock in sub-Saharan Africa (the Gambia, Uganda and Kenya). Continued surveillance of E. coli and other potential pathogens in rural backyard poultry from sub-Saharan Africa is warranted.

DOI: 10.1099/acmi.ac2020.po0824

Non-Human Primates in the Gambia Harbour Human-Associated Pathogenic Escherichia coli Strains

Foster-Nyarko et al. (2020) Access Microbiology 2:7A

Show Details

Authors:

E Foster-Nyarko, NF Alikhan, A Ravi, G Thilliez, N Thomson, D Baker, G Kay, J D. Cramer, J O'Grady, M Antonio, M Pallen

Abstract:

Increasing contact between humans and non-human primates provides an opportunity for the transfer of potential pathogens or antimicrobial resistance between different host species. We have investigated genetic diversity and antimicrobial resistance in Escherichia coli isolates from a range of non-human primates dispersed across the Gambia: patas monkey (n=1), western colobus monkey (n=6), green monkey (n=14) and guinea baboon (n=22). From 43 stools, we recovered 99 isolates. We performed Illumina whole-genome shotgun sequencing on all isolates and nanopore long-read sequencing on isolates with antimicrobial resistance genes. We inferred the evolution of E. coli in this population using the EnteroBase software environment. We identified 43 sequence types (ten of them novel), spanning five of the eight known phylogroups of E. coli. Many of the observed sequence types and phylotypes from non-human primates have been associated with human extra-intestinal infection and carry virulence characteristics associated with disease in humans, particularly ST73, ST217 and ST681. However, we found a low prevalence of antimicrobial resistance genes in isolates from non-human primates. Hierarchical clustering showed that ST442 and ST349 from non-human primates are closely related to isolates from human infections, suggesting recent exchange of bacteria between humans and monkeys. Our results are of public health importance, considering the increasing contact between humans and wild primates.

DOI: 10.1099/acmi.ac2020.po0781

Emergence of Human-Adapted Salmonella enterica Is Linked to the Neolithization Process

Key et al. (2020) Nature Ecology & Evolution 4:3 324-333

Show Details

Authors:

FM Key, C Posth, LR Esquivel-Gomez, R Hübler, MA Spyrou, GU Neumann, A Furtwängler, S Sabin, M Burri, A Wissgott, AK Lankapalli, ÅJ Vågene, M Meyer, S Nagel, R Tukhbatova, A Khokhlov, A Chizhevsky, S Hansen, AB Belinsky, A Kalmykov, AR Kantorovich, VE Maslov, PW Stockhammer, S Vai, M Zavattaro, A Riga, D Caramelli, R Skeates, J Beckett, MG Gradoli, N Steuri, A Hafner, M Ramstein, I Siebke, S Lösch, YS Erdal, NF Alikhan, Z Zhou, M Achtman, K Bos, S Reinhold, W Haak, D Kühnert, A Herbig, J Krause

Abstract:

It has been hypothesized that the Neolithic transition towards an agricultural and pastoralist economy facilitated the emergence of human-adapted pathogens. Here, we recovered eight Salmonella enterica subsp. enterica genomes from human skeletons of transitional foragers, pastoralists and agropastoralists in western Eurasia that were up to 6,500 yr old. Despite the high genetic diversity of S. enterica, all ancient bacterial genomes clustered in a single previously uncharacterized branch that contains S. enterica adapted to multiple mammalian species. All ancient bacterial genomes from prehistoric (agro-)pastoralists fall within a part of this branch that also includes the human-specific S. enterica Paratyphi C, illustrating the evolution of a human pathogen over a period of 5,000 yr. Bacterial genomic comparisons suggest that the earlier ancient strains were not host specific, differed in pathogenic potential and experienced convergent pseudogenization that accompanied their downstream host adaptation. These observations support the concept that the emergence of human-adapted S. enterica is linked to human cultural transformations.

DOI: 10.1038/s41559-020-1106-9

Rapid Mycobacterium Tuberculosis Spoligotyping from Uncorrected Long Reads Using Galru

Page et al. (2020)

Show Details

Authors:

AJ Page, NF Alikhan, M Strinden, T Le Viet, T Skvortsov

Abstract:

Spoligotyping of Mycobacterium tuberculosis provides a subspecies classification of this major human pathogen. Spoligotypes can be predicted from short read genome sequencing data; however, no methods exist for long read sequence data such as from Nanopore or PacBio. We present a novel software package, Galru, which can rapidly detect the spoligotype of a Mycobacterium tuberculosis sample from as little as a single uncorrected long read. It allows for near real-time spoligotyping from long read data as it is being sequenced, giving rapid sample typing. We compare it to the existing state of the art software and find that it yields results identical to those obtained from short read sequencing data. Galru is freely available from https://github.com/quadram-institute-bioscience/galru under the GPLv3 open source licence.

DOI: 10.1101/2020.05.31.126490

Emergence of Resistance to Fluoroquinolones and Third-Generation Cephalosporins in Salmonella Typhi in Lahore, Pakistan

Rasheed et al. (2020) Microorganisms 8:9 1336

Show Details

Authors:

F Rasheed, M Saeed, NF Alikhan, D Baker, M Khurshid, EV Ainsworth, AK Turner, AA Imran, MH Rasool, M Saqalein, MA Nisar, M Fayyaz ur Rehman, J Wain, M Yasir, GC Langridge, A Ikram

Abstract:

Extensively drug-resistant (XDR) Salmonella Typhi has been reported in Sindh province of Pakistan since 2016. The potential for further spread is of serious concern as remaining treatment options are severely limited. We report the phenotypic and genotypic characterization of 27 XDR S. Typhi isolated from patients attending Jinnah Hospital, Lahore, Pakistan. Isolates were identified by biochemical profiling; antimicrobial susceptibility was determined by a modified Kirby-Bauer method. These findings were confirmed using Illumina whole genome nucleotide sequence data. All sequences were compared to the outbreak strain from Southern Pakistan and typed using the S. Typhi genotyping scheme. All isolates were confirmed by a sequence analysis to harbor an IncY plasmid and the CTX-M-15 ceftriaxone resistance determinant. All isolates were of the same genotypic background as the outbreak strain from Sindh province. We report the first emergence of XDR S. Typhi in Punjab province of Pakistan confirmed by whole genome sequencing.

DOI: 10.3390/microorganisms8091336

The EnteroBase User's Guide, with Case Studies on Salmonella Transmissions, Yersinia Pestis Phylogeny, and Escherichia Core Genomic Diversity

Zhou et al. (2020) Genome Research 30:1 138-152

Show Details

Authors:

Z Zhou, NF Alikhan, K Mohamed, Y Fan, the Agama Study Group, M Achtman

Abstract:

EnteroBase is an integrated software environment that supports the identification of global population structures within several bacterial genera that include pathogens. Here, we provide an overview of how EnteroBase works, what it can do, and its future prospects. EnteroBase has currently assembled more than 300,000 genomes from Illumina short reads from Salmonella, Escherichia, Yersinia, Clostridioides, Helicobacter, Vibrio, and Moraxella and genotyped those assemblies by core genome multilocus sequence typing (cgMLST). Hierarchical clustering of cgMLST sequence types allows mapping a new bacterial strain to predefined population structures at multiple levels of resolution within a few hours after uploading its short reads. Case Study 1 illustrates this process for local transmissions of Salmonella enterica serovar Agama between neighboring social groups of badgers and humans. EnteroBase also supports single nucleotide polymorphism (SNP) calls from both genomic assemblies and after extraction from metagenomic sequences, as illustrated by Case Study 2 which summarizes the microevolution of Yersinia pestis over the last 5000 years of pandemic plague. EnteroBase can also provide a global overview of the genomic diversity within an entire genus, as illustrated by Case Study 3, which presents a novel, global overview of the population structure of all of the species, subspecies, and clades within Escherichia.

DOI: 10.1101/gr.251678.119

Multiple Evolutionary Trajectories for Non-O157 Shiga Toxigenic Escherichia coli

Alikhan et al. (2019)

Show Details

Authors:

NF Alikhan, NL Bachmann, NLB Zakour, NK Petty, M Stanton-Cook, JA Gawthorne, DM Easton, TJ Mahony, R Cobbold, MA Schembri, SA Beatson

Abstract:

Background: Shiga toxigenic Escherichia coli (STEC) is an emerging global pathogen and remains a major cause of food-borne illness with more severe symptoms including hemorrhagic colitis and hemolytic-uremic syndrome. Since the characterization of the archetypal STEC serotype, E. coli O157:H7, more than 250 STEC serotypes have been defined. Many of these non-O157 STEC are associated with clinical cases of equal severity as O157. In this study, we utilize whole genome sequencing of 44 STEC strains from eight serogroups associated with human infection to establish their evolutionary relationships and contrast this with their virulence gene profiles and established typing methods. Results: Our phylogenomic analysis delineated these STEC strains into seven distinct lineages, each with a characteristic repertoire of virulence factors. Some lineages included commensal or other E. coli pathotypes. Multiple independent acquisitions of the Locus for Enterocyte Effacement were identified, each associated with a distinct repertoire of effector genes. Lineages were inconsistent with O-antigen typing in several instances, consistent with lateral gene transfer within the O-antigen locus. STEC lineages could be defined by the conservation of clustered regularly interspaced short palindromic repeats (CRISPRs); however, no CRISPR profile could differentiate STEC from other E. coli strains. Six genomic regions (ranging from 500 bp to 10 kbp) were found to be conserved across all STEC in this dataset and may dictate interactions with Stx phage lysogeny. Conclusions: The genomic analyses reported here present non-O157 STEC as a diverse group of pathogenic E. coli emerging from multiple lineages that independently acquired mobile genetic elements that promote pathogenesis.

DOI: 10.1101/549998

Genome-Wide Identification and Characterization of a Superfamily of Bacterial Extracellular Contractile Injection Systems

Chen et al. (2019) Cell Reports 29:2 511-521.e2

Show Details

Authors:

L Chen, N Song, B Liu, N Zhang, NF Alikhan, Z Zhou, Y Zhou, S Zhou, D Zheng, M Chen, A Hapeshi, J Healey, NR Waterfield, J Yang, G Yang

Abstract:

Several phage-tail-like nanomachines were shown to play an important role in the interactions between bacteria and their eukaryotic hosts. These apparatuses appear to represent a new injection paradigm. Here, with three verified extracellular contractile injection systems (eCISs), a protein profile and genomic context-based iterative approach was applied to identify 631 eCIS-like loci from the 11,699 publicly available complete bacterial genomes. The eCIS superfamily, which is phylogenetically diverse and sub-divided into six families, is distributed among Gram-negative and -positive bacteria in addition to archaea. Our results show that very few bacteria are seen to possess intact operons of both eCIS and type VI secretion systems (T6SSs). An open access online database of all detected eCIS-like loci is presented to facilitate future studies. The presence of this bacterial injection machine in a multitude of organisms suggests that it may play an important ecological role in the life cycles of many bacteria.

DOI: 10.1016/j.celrep.2019.08.096

Within-Host Diversity and Vertical Transmission of Group B Streptococcus Among Mother-infant Dyads in The Gambia

Foster-Nyarko et al. (2019)

Show Details

Authors:

E Foster-Nyarko, M Senghore, BA Kwambana-Adams, NF Alikhan, A Ravi, J Jafali, K Jawneh, A Jah, M Jarju, F Ceesay, S Bojang, A Worwui, A Odutola, E Ogundare, MJ Pallen, M Ota, M Antonio

Abstract:

Introduction: Understanding mother-to-infant transmission of Group B Streptococcus (GBS) is vital to the prevention and control of GBS disease. We investigated the transmission and phylogenetic relationships of mothers colonised by GBS and their infants in a peri-urban setting in The Gambia. Methods: We collected nasopharyngeal swabs from 35 mother-infant dyads at weekly intervals from birth until six weeks post-partum. GBS was isolated by conventional microbiology techniques. Whole-genome sequencing was performed on GBS isolates from one mother-infant dyad (dyad 17). Results: We recovered 85 GBS isolates from the 245 nasopharyngeal swabs. GBS was isolated from 16.33% and 18.37% of sampled mothers and infants, respectively. In 87% of cultured swabs, the culture status of an infant agreed with that of the mother (Kappa p-value <0.001). In dyad 17, phylogenetic analysis revealed within-host strain diversity in the mother and clonal transmission to her infant. Conclusion: GBS colonisation in mothers presents a significant risk of colonisation in their infants. We confirm vertical transmission from mother to child in dyad 17, accompanied by within-host diversity.

DOI: 10.1101/760512

A Genomic Overview of the Population Structure of Salmonella

Alikhan et al. (2018) PLoS Genetics 14:4 e1007261

Show Details

Authors:

NF Alikhan, Z Zhou, MJ Sergeant, M Achtman

Abstract:

For many decades, Salmonella enterica has been subdivided by serological properties into serovars or further subdivided for epidemiological tracing by a variety of diagnostic tests with higher resolution. Recently, it has been proposed that so-called eBurst groups (eBGs) based on the alleles of seven housekeeping genes (legacy multilocus sequence typing [MLST]) corresponded to natural populations and could replace serotyping. However, this approach lacks the resolution needed for epidemiological tracing and the existence of natural populations had not been validated by independent criteria. Here, we describe EnteroBase, a web-based platform that assembles draft genomes from Illumina short reads in the public domain or that are uploaded by users. EnteroBase implements legacy MLST as well as ribosomal gene MLST (rMLST), core genome MLST (cgMLST), and whole genome MLST (wgMLST) and currently contains over 100,000 assembled genomes from Salmonella. It also provides graphical tools for visual interrogation of these genotypes and those based on core single nucleotide polymorphisms (SNPs). eBGs based on legacy MLST are largely consistent with eBGs based on rMLST, thus demonstrating that these correspond to natural populations. rMLST also facilitated the selection of representative genotypes for SNP analyses of the entire breadth of diversity within Salmonella. In contrast, cgMLST provides the resolution needed for epidemiological investigations. These observations show that genomic genotyping, with the assistance of EnteroBase, can be applied at all levels of diversity within the Salmonella genus.

DOI: 10.1371/journal.pgen.1007261

Principles of Systems Biology, No. 31

Cho et al. (2018) Cell Systems 7:2 133-135

Show Details

Authors:

H Cho, B Berger, J Peng, C Galitzine, O Vitek, PMJ Beltran, IM Cristea, F Görtler, S Solbrig, T Wettig, PJ Oefner, R Spang, M Altenbuchinger, RS Basso, D Hochbaum, F Vandin, D Silverbush, S Cristea, G Yanovich, T Geiger, N Beerenwinkel, R Sharan, Z Zhou, N Luhmann, NF Alikhan, M Achtman

Abstract:

This month: selected work from the 2018 RECOMB meeting, organized by Ecole Polytechnique and held last April in Paris.

DOI: 10.1016/j.cels.2018.08.005

Comparative Analysis of Core Genome MLST and SNP Typing within a European Salmonella Serovar Enteritidis Outbreak

Pearce et al. (2018) International Journal of Food Microbiology 274 1-11

Show Details

Authors:

ME Pearce, NF Alikhan, TJ Dallman, Z Zhou, K Grant, MCJ Maiden

Abstract:

Multi-country outbreaks of foodborne bacterial disease present challenges in their detection, tracking, and notification. As food is increasingly distributed across borders, such outbreaks are becoming more common. This increases the need for high-resolution, accessible, and replicable isolate typing schemes. Here we evaluate a core genome multilocus typing (cgMLST) scheme for the high-resolution reproducible typing of Salmonella enterica (S. enterica) isolates, by its application to a large European outbreak of S. enterica serovar Enteritidis. This outbreak had been extensively characterised using single nucleotide polymorphism (SNP)-based approaches. The cgMLST analysis was congruent with the original SNP-based analysis, the epidemiological data, and whole genome MLST (wgMLST) analysis. Combination of the cgMLST and epidemiological data confirmed that the genetic diversity among the isolates predated the outbreak, and was likely present at the infection source. There was consequently no link between country of isolation and genetic diversity, but the cgMLST clusters were congruent with date of isolation. Furthermore, comparison with publicly available Enteritidis isolate data demonstrated that the cgMLST scheme presented is highly scalable, enabling outbreaks to be contextualised within the Salmonella genus. The cgMLST scheme is therefore shown to be a standardised and scalable typing method, which allows Salmonella outbreaks to be analysed and compared across laboratories and jurisdictions.

DOI: 10.1016/j.ijfoodmicro.2018.02.023

Accurate Reconstruction of Microbial Strains from Metagenomic Sequencing Using Representative Reference Genomes

Zhou et al. (2018) Springer International Publishing 10812 225-240

Show Details

Authors:

Z Zhou, N Luhmann, NF Alikhan, C Quince, M Achtman

Abstract:

Exploring the genetic diversity of microbes within the environment through metagenomic sequencing first requires classifying these reads into taxonomic groups. Current methods compare these sequencing data with existing biased and limited reference databases. Several recent evaluation studies demonstrate that current methods either lack sufficient sensitivity for species-level assignments or suffer from false positives, overestimating the number of species in the metagenome. Both are especially problematic for the identification of low-abundance microbial species, e.g. detecting pathogens in ancient metagenomic samples. We present a new method, SPARSE, which improves taxonomic assignments of metagenomic reads. SPARSE balances existing biased reference databases by grouping reference genomes into similarity-based hierarchical clusters, implemented as an efficient incremental data structure. SPARSE assigns reads to these clusters using a probabilistic model, which specifically penalizes non-specific mappings of reads from unknown sources and hence reduces false-positive assignments. Our evaluation on simulated datasets from two recent evaluation studies demonstrated the improved precision of SPARSE in comparison to other methods for species-level classification. In a third simulation, our method successfully differentiated multiple co-existing Escherichia coli strains from the same sample. In real archaeological datasets, SPARSE identified ancient pathogens with ≤0.02% abundance, consistent with published findings that required additional sequencing data. In these datasets, other methods either missed targeted pathogens or reported non-existent ones.

DOI: 10.1007/978-3-319-89929-9_15

GrapeTree: Visualization of Core Genomic Relationships among 100,000 Bacterial Pathogens

Zhou et al. (2018) Genome Research 28:9 1395-1404

Show Details

Authors:

Z Zhou, NF Alikhan, MJ Sergeant, N Luhmann, C Vaz, AP Francisco, JA Carriço, M Achtman

Abstract:

Current methods struggle to reconstruct and visualize the genomic relationships of large numbers of bacterial genomes. GrapeTree facilitates the analyses of large numbers of allelic profiles by a static "GrapeTree Layout" algorithm that supports interactive visualizations of large trees within a web browser window. GrapeTree also implements a novel minimum spanning tree algorithm (MSTree V2) to reconstruct genetic relationships despite high levels of missing data. GrapeTree is a stand-alone package for investigating phylogenetic trees plus associated metadata and is also integrated into EnteroBase to facilitate cutting edge navigation of genomic relationships among bacterial pathogens.

DOI: 10.1101/gr.232397.117

Pan-Genome Analysis of Ancient and Modern Salmonella Enterica Demonstrates Genomic Stability of the Invasive Para C Lineage for Millennia

Zhou et al. (2018) Current Biology 28:15 2420-2428.e10

Show Details

Authors:

Z Zhou, I Lundstrøm, A Tran-Dien, S Duchêne, NF Alikhan, MJ Sergeant, G Langridge, AK Fotakis, S Nair, HK Stenøien, SS Hamre, S Casjens, A Christophersen, C Quince, NR Thomson, FCX Weill, SY Ho, MTP Gilbert, M Achtman

Abstract:

Salmonella enterica serovar Paratyphi C causes enteric (paratyphoid) fever in humans. Its presentation can range from asymptomatic infections of the blood stream to gastrointestinal or urinary tract infection or even a fatal septicemia [1]. Paratyphi C is very rare in Europe and North America except for occasional travelers from South and East Asia or Africa, where the disease is more common [2, 3]. However, early 20th-century observations in Eastern Europe [3, 4] suggest that Paratyphi C enteric fever may once have had a wide-ranging impact on human societies. Here, we describe a draft Paratyphi C genome (Ragna) recovered from the 800-year-old skeleton (SK152) of a young woman in Trondheim, Norway. Paratyphi C sequences were recovered from her teeth and bones, suggesting that she died of enteric fever and demonstrating that these bacteria have long caused invasive salmonellosis in Europeans. Comparative analyses against modern Salmonella genome sequences revealed that Paratyphi C is a clade within the Para C lineage, which also includes serovars Choleraesuis, Typhisuis, and Lomita. Although Paratyphi C only infects humans, Choleraesuis causes septicemia in pigs and boar [5] (and occasionally humans), and Typhisuis causes epidemic swine salmonellosis (chronic paratyphoid) in domestic pigs [2, 3]. These different host specificities likely evolved in Europe over the last ~4,000 years since the time of their most recent common ancestor (tMRCA) and are possibly associated with the differential acquisitions of two genomic islands, SPI-6 and SPI-7. The tMRCAs of these bacterial clades coincide with the timing of pig domestication in Europe [6].

DOI: 10.1016/j.cub.2018.05.058

Comparison of Classical Multi-Locus Sequence Typing Software for Next-Generation Sequencing Data

Page et al. (2017) Microbial Genomics 3:8 e000124

Show Details

Authors:

AJ Page, NF Alikhan, HA Carleton, T Seemann, JA Keane, LS Katz

Abstract:

Multi-locus sequence typing (MLST) is a widely used method for categorizing bacteria. Increasingly, MLST is being performed using next-generation sequencing (NGS) data by reference laboratories and for clinical diagnostics. Many software applications have been developed to calculate sequence types from NGS data; however, there has been no comprehensive review to date on these methods. We have compared eight of these applications against real and simulated data, and present results on: (1) the accuracy of each method against traditional typing methods, (2) the performance on real outbreak datasets, (3) the impact of contamination and varying depth of coverage, and (4) the computational resource requirements.

DOI: 10.1099/mgen.0.000124

Mechanisms Involved in Acquisition of blaNDM Genes by IncA/C2 and IncFIIY Plasmids

Wailan et al. (2016) Antimicrobial Agents and Chemotherapy 60:7 4082-4088

Show Details

Authors:

AM Wailan, HE Sidjabat, WK Yam, NF Alikhan, NK Petty, AL Sartor, DA Williamson, BM Forde, MA Schembri, SA Beatson, DL Paterson, TR Walsh, SR Partridge

Abstract:

blaNDM genes confer carbapenem resistance and have been identified on transferable plasmids belonging to different incompatibility (Inc) groups. Here we present the complete sequences of four plasmids carrying a blaNDM gene, pKP1-NDM-1, pEC2-NDM-3, pECL3-NDM-1, and pEC4-NDM-6, from four clinical samples originating from four different patients. Different plasmids carry segments that align to different parts of the blaNDM region found on Acinetobacter plasmids. pKP1-NDM-1 and pEC2-NDM-3, from Klebsiella pneumoniae and Escherichia coli, respectively, were identified as type 1 IncA/C2 plasmids with almost identical backbones. Different regions carrying blaNDM are inserted in different locations in the antibiotic resistance island known as ARI-A, and ISCR1 may have been involved in the acquisition of blaNDM-3 by pEC2-NDM-3. pECL3-NDM-1 and pEC4-NDM-6, from Enterobacter cloacae and E. coli, respectively, have similar IncFIIY backbones, but different regions carrying blaNDM are found in different locations. Tn3-derived inverted-repeat transposable elements (TIME) appear to have been involved in the acquisition of blaNDM-6 by pEC4-NDM-6 and the rmtC 16S rRNA methylase gene by IncFIIY plasmids. Characterization of these plasmids further demonstrates that even very closely related plasmids may have acquired blaNDM genes by different mechanisms. These findings also illustrate the complex relationships between antimicrobial resistance genes, transposable elements, and plasmids and provide insights into the possible routes for transmission of blaNDM genes among species of the Enterobacteriaceae family.

DOI: 10.1128/AAC.00368-16

Molecular Analysis of Asymptomatic Bacteriuria Escherichia coli Strain VR50 Reveals Adaptation to the Urinary Tract by Gene Acquisition

Beatson et al. (2015) Infection and Immunity 83:5 1749-1764

Show Details

Authors:

SA Beatson, NL Ben Zakour, M Totsika, BM Forde, RE Watts, AN Mabbett, JM Szubert, S Sarkar, MD Phan, KM Peters, NK Petty, NF Alikhan, MJ Sullivan, JA Gawthorne, M Stanton-Cook, NTK Nhu, TM Chong, WF Yin, KG Chan, V Hancock, DW Ussery, GC Ulett, MA Schembri

Abstract:

Urinary tract infections (UTIs) are among the most common infectious diseases of humans, with Escherichia coli responsible for >80% of all cases. One extreme of UTI is asymptomatic bacteriuria (ABU), which occurs as an asymptomatic carrier state that resembles commensalism. To understand the evolution and molecular mechanisms that underpin ABU, the genome of the ABU E. coli strain VR50 was sequenced. Analysis of the complete genome indicated that it most resembles E. coli K-12, with the addition of a 94-kb genomic island (GI-VR50-pheV), eight prophages, and multiple plasmids. GI-VR50-pheV has a mosaic structure and contains genes encoding a number of UTI-associated virulence factors, namely, Afa (afimbrial adhesin), two autotransporter proteins (Ag43 and Sat), and aerobactin. We demonstrated that the presence of this island in VR50 confers its ability to colonize the murine bladder, as a VR50 mutant with GI-VR50-pheV deleted was attenuated in a mouse model of UTI in vivo. We established that Afa is the island-encoded factor responsible for this phenotype using two independent deletion (Afa operon and AfaE adhesin) mutants. E. coli VR50afa and VR50afaE displayed significantly decreased ability to adhere to human bladder epithelial cells. In the mouse model of UTI, VR50afa and VR50afaE displayed reduced bladder colonization compared to wild-type VR50, similar to the colonization level of the GI-VR50-pheV mutant. Our study suggests that E. coli VR50 is a commensal-like strain that has acquired fitness factors that facilitate colonization of the human bladder.

DOI: 10.1128/IAI.02810-14

BLAST Ring Image Generator (BRIG): Simple Prokaryote Genome Comparisons

Alikhan et al. (2011) BMC Genomics 12:402

Show Details

Authors:

NF Alikhan, NK Petty, NL Ben Zakour, SA Beatson

Abstract:

BACKGROUND: Visualisation of genome comparisons is invaluable for helping to determine genotypic differences between closely related prokaryotes. New visualisation and abstraction methods are required in order to improve the validation, interpretation and communication of genome sequence information; especially with the increasing amount of data arising from next-generation sequencing projects. Visualising a prokaryote genome as a circular image has become a powerful means of displaying informative comparisons of one genome to a number of others. Several programs, imaging libraries and internet resources already exist for this purpose, however, most are either limited in the number of comparisons they can show, are unable to adequately utilise draft genome sequence data, or require a knowledge of command-line scripting for implementation. Currently, there is no freely available desktop application that enables users to rapidly visualise comparisons between hundreds of draft or complete genomes in a single image. RESULTS: BLAST Ring Image Generator (BRIG) can generate images that show multiple prokaryote genome comparisons, without an arbitrary limit on the number of genomes compared. The output image shows similarity between a central reference sequence and other sequences as a set of concentric rings, where BLAST matches are coloured on a sliding scale indicating a defined percentage identity. Images can also include draft genome assembly information to show read coverage, assembly breakpoints and collapsed repeats. In addition, BRIG supports the mapping of unassembled sequencing reads against one or more central reference sequences. Many types of custom data and annotations can be shown using BRIG, making it a versatile approach for visualising a range of genomic comparison data. BRIG is readily accessible to any user, as it assumes no specialist computational knowledge and will perform all required file parsing and BLAST comparisons automatically. 
CONCLUSIONS: There is a clear need for a user-friendly program that can produce genome comparisons for a large number of prokaryote genomes with an emphasis on rapidly utilising unfinished or unassembled genome data. Here we present BRIG, a cross-platform application that enables the interactive generation of comparative genomic images via a simple graphical-user interface. BRIG is freely available for all operating systems at http://sourceforge.net/projects/brig/.

DOI: 10.1186/1471-2164-12-402

Genome Sequence of the Emerging Pathogen Aeromonas caviae

Beatson et al. (2011) Journal of Bacteriology 193:5 1286-1287

Show Details

Authors:

SA Beatson, M das Graças de Luna, NL Bachmann, NF Alikhan, KR Hanks, MJ Sullivan, BA Wee, AC Freitas-Almeida, PA dos Santos, JTB de Melo, DJP Squire, AF Cunningham, JR Fitzgerald, IR Henderson

Abstract:

Aeromonas caviae is a Gram-negative, motile and rod-shaped facultative anaerobe that is increasingly being recognized as a cause of diarrhea in children. Here we present the first genome sequence of an A. caviae strain that was isolated as the sole pathogen from a child with profuse diarrhea.

DOI: 10.1128/JB.01337-10