Hello, and thank you for listening to the MicroBinfie podcast. Here we will be discussing topics in microbial bioinformatics. We hope that we can give you some insights, tips, and tricks along the way. There's so much information we all know from working in the field, but nobody writes it down. There is no manual, and it's assumed you'll pick it up. We hope to fill in a few of these gaps. My co-hosts are Dr. Nabil-Fareed Alikhan and Dr. Andrew Page. I am Dr. Lee Katz. Andrew and Nabil work in the Quadram Institute in Norwich, UK, where they work on microbes in food and the impact on human health. I work at Centers for Disease Control and Prevention and am an adjunct member at the University of Georgia in the US. This episode is a panel discussion which was recorded in The Gambia in West Africa. Unfortunately, the audio from Nick Loman was not of sufficient quality to include in the podcast as he was Skyping in.
The question is, has nanopore rendered on-site Illumina sequencing obsolete? And today we have Nick Loman from the University of Birmingham, David Baker from the Quadram Institute, Ozan Gundogdu from London School of Hygiene and Tropical Medicine, or LSHTM, but actually in London. And we have Abdul Sesay from the London School of Hygiene and Tropical Medicine here in The Gambia. I personally find that, in terms of bioinformatics, there's no point anymore in writing any software for short-read sequencing, because it's on the way out. We need to reinvent everything for nanopore, so we might as well; you get so much more data out of it, and you can find so many more biologically interesting things that you can't find with Illumina.
Like when you do a short-read shotgun metagenomic assembly, you'll get bins with maybe 10,000 or 20,000 contigs in there, and that's crazy to work with, whereas when you do a PromethION assembly of a gut metagenomic sample, you get these beautiful bins where you might have 10 or 20 contigs, and they're massive chunks of your genome. So you can actually do proper biology, proper metagenomics, unlike with short-read Illumina metagenomics, where you're heavily relying on algorithms to tell you that maybe this is related or maybe it's not, and then you end up with MAGs, which are sometimes just these crazy accumulations of stuff that happens to look slightly similar but is totally different. What do you think?
I just personally think, in terms of being obsolete completely, I can't see the accuracy of nanopore ever going up significantly. They've made small advancements over the years, but when you're applying a voltage across a pore and looking at a squiggle, I just can't see that that will ever become high enough quality to get, you know, SNP resolution. It's just down to application: maybe 80 or 90 percent of applications can be covered by nanopore long-read sequencing, but there'll always be a small percentage of work that will need the high-quality Illumina reads, or indeed PacBio, if people have got access to PacBio. Overall, I do see nanopore as maybe the leader. We've gone through a period where over 90 percent of sequencing is Illumina sequencing, and in the coming years I can see that switching around, because of the accessibility of the nanopore platform as well: fewer and fewer people, when they go into sequencing, will go down the Illumina route, and they will probably go straight to nanopore.
Abdul, what's your opinion?
We had a grant that the Gates Foundation gave us, and they had a deal with Illumina that they would buy this 20, the small iSeq for the metagenomics on sepsis. So actually, it was a really good project that they gave us, but they tied it to an Illumina platform. But when we had the meeting, and Nick was there in Addis, they actually extended it so that we could do the work on nanopore. And I think in our setting, when we talk about obsolete, it would be obsolete much quicker in Africa than it would probably be in your high-end setting. So for me, I think that it would be quicker to get rid of Illumina for us, because we have to go to providers, and access to it is much more difficult than it is for you guys. We don't have NovaSeqs sitting next to us and all of those, so we need to develop something that's easy, quite cheap and accessible for us. So we might not worry about error rate; if we can do 90% of the stuff that we want to do in Africa, then it might be obsolete in our setting quicker than it would be in your setting.
Yeah, I mean, just going back to the error rate, I agree with you, Dave, but I think there are more and more methodologies out there that are trying to circumvent this error rate problem, and I think that's one thing to bear in mind. The other thing is, we've had an Illumina MiSeq, for example, at the London School for a number of years, and there are individual groups that have the Oxford Nanopores, and the PromethION is now coming in, and we're moving towards that more and more. I think sometimes, though, when we look back at the microarray story, when the infrastructure is there at some institutes, it does take a long time to actually change that. So going to your point, if you're buying something now, it might be better to just buy the Oxford Nanopore and, you know, have yourself future-proofed, as we were saying the other day.
And I guess the barrier to entry is so much lower because, you know, it's $1,000 to get into it rather than having to build an entire building to house these beautiful NovaSeqs and all the wet lab support that goes with it. Has anyone actually done a comparison, or even tried to use nanopore to do analysis of a bacterial outbreak, to see if you can call SNPs accurately enough to compare with what you would get with Illumina, to see Staph aureus going through a hospital, or Acinetobacter, or anything? Has that actually been done with bacterial genomes and shown to work, or shown not to work? It's something we're quite used to, and if you look back at the days of the GAII, when the quality was so much lower, we're well used to having to throw away lots and lots of SNPs and making these judgement calls, so I think it's just more of the same. Muna wanted to speak.
Yes, I wanted to say that I think it depends on what your question is. For us, where we're looking at antimicrobial resistance and at outbreaks, and trying to understand what resistance is causing an outbreak, a difference of one SNP can mean that you have a different gene, or that you have a resistant or a sensitive isolate. So what we do is the hybrid: we use the MinION as well as the Illumina, so we look at the hybrid sequence as well. And there's work we did with Oxford, with Derrick Crook's group, where we actually compared PacBio with the MinION, and you can see that they both work really well, but with PacBio the accuracy is so much higher. But if you're doing high throughput, where you're looking at hundreds and hundreds of isolates like we're doing for our surveillance activities, it just doesn't make sense because it's just so expensive; you can't use it.
So presumably you can use nanopore to work out if you have an outbreak going on, so you can say we have a bunch of Salmonellas that look sufficiently similar, coming from Newcastle and London and wherever, that we've got an outbreak, but you can't yet work out that actually it's that kebab shop in Newcastle that gave it to that individual, and that kebab shop needs to be closed down, or in your case the farm needs to...
That's right, for us accuracy is the most important thing, because based on that you might go and put some farmer out of business, and it costs the UK government millions of pounds if they sue us.
And if you did nanopore sequencing on everything and then threw things into barcodes to say these things look quite similar, we've got a thousand isolates and 50 of them look the same, you could then go and do Illumina sequencing just on those 50 to get the fine resolution. Yeah, and we do it almost the other way at the moment, we do the Illumina first. We do the Illumina first, but I'm just saying that maybe that's a little bit flippant, but we could do that in the ES, yeah, yeah, yeah.
I was going to say, our emphasis is slightly different from Muna's. In our setting we really want a public health response, so when there's an outbreak, first of all we want to detect the pathogen, and we want to know the serotype or the genotype, because this is where WHO or the UK government or partners are bringing in vaccines in order to stop the transmission of that outbreak. Of course, this is really important, but in terms of responding quickly, it's really about detecting the pathogen and the serotype, because these are the ones that the vaccines are available for.
So I suppose the one thing that nanopore and these long-read technologies give you is that you can maybe go and do what I call direct sequencing. So you can cut out that entire day or two, or however long it takes to culture your bug.
And you don't just get the small chunks of it, you get full, big chunks, and you can say, well, this Salmonella has these resistance genes because I'm able to sequence such a long piece of it.
Well, that depends on how much biomass of the bacterium you have. In a stool, that would be okay, but if you're trying to detect pneumo in a nasopharyngeal swab, I suspect there's probably not enough there for you to do it directly. You might have to do some culture first, or, as Nick said, PCR; for viruses they do the PCR step first, but yeah.
I've got a point for Arno, from the perspective of an end-user, somebody who doesn't do the sequencing themselves but helps with the analysis: how much of a step is it to go from Illumina sequencing, where you just get your FASTQ files, to the type of files that nanopore uses? Can you just easily convert everything, can you use your same programs, or do you have to actually change your pipelines or your setups to work with nanopore data?
So basically for nanopore and PacBio, you need totally different pipelines from Illumina because of all the errors, and you have to account for those.
I've done hybrid assembly where you just basically mix the long reads and the short reads, and as long as you get FASTQ files or FASTA files you can do that, but that already requires a conversion, so you can't just work with the native files. So that's what I'm saying, can you work with the native files quite easily in different programs, or do you have to completely change it if you start working with nanopore?
So the native files for Illumina are BCL files, but you don't work with those. And you work with the FASTQ only?
Yeah, and I think people have had confidence in Illumina's base calling over the past few years, that if it's an A it's an A, whereas with nanopore, because Guppy and all the base calling software changes so rapidly, you do have to go back every now and again and re-basecall the same data over and over again, because you will get better data at the end. I think it's only when we're really, really confident that the base calling is as good as it's going to get and it's stable that we'll probably just keep the FASTQ files and use those as the base, but at the moment you do have to keep the native squiggles.
Okay, so the rumours of the death of Illumina are greatly exaggerated. It's inevitable, I think. So Abdul, what's your plan for using nanopore here? I know you've got a GridION and you've got lots of MinIONs, but are you thinking about bringing in a PromethION?
I think one of the things, we're writing our QQ now, and that's one of the things that I'm playing with in my head: whether we get a bigger piece for our human genetics stuff, or, since we've got time (two years is a long time in the sequencing business), I'll get the money and then we will look at what technology we'll bring in. I think we've got two years and it gives us more time to think where we go. Yeah, I completely agree. The appointment for human genetics, I think, is still under discussion because of the cost, but that happens every day, we need high-throughput sequencing.
So would you consider doing Plasmodium on it, or Anopheles, or anything like that? On PromethION, yes, if we have. We're already doing that with Alfred, and yesterday's talk, and this was done through the Sanger, wasn't it? Yes. With Dominic? Yeah. But the idea is to do everything in-house.
Yes, my question is: would the change from Illumina sequencing to long-read sequencing as a service affect the data rights of the sample donors, or is that something that is still going to be the same as it is now? Or will it have any effect on the data rights for specimen donors?
I don't think so. I mean, when you go and give a blood sample for a test, if you've given permission for a test to be done, you don't actually decide which machine they're going to run it on. So I think that once you've got ethics... is that not right, Martin? Yeah, that's right. I don't think it matters, really. I mean, it matters about protecting your data when it comes out, but I really don't think the way you... I guess that if you do better genomics on humans, you might get more human DNA out, longer stretches of human DNA in assemblies. It might mean it's easier to identify the person. But would you say, if it's not a human project, those sequences will be thrown out? Yeah, exactly. They should just be thrown away anyway, so they should never be released. It wouldn't matter in any way. But I guess it would make it easier to do sequencing in-country, so you don't have to ship it off to Sanger or to Broad or wherever.
I think, really, again, this is why I go back to our setting: it will change things in terms of taking infectious samples out of here. I mean, people are changing in animal transportation on aeroplanes, so that might change whether you send samples out. So, I mean, doing something in-house might probably be... And the governments are also trying to change that, so that a sample doesn't belong to a research scientist, it belongs to the country. So they can come here and seize our freezers. If Jami was here, that was one of his plans. Yeah, I mean, it can be in the past. Any sample that is shipped out, we needed to keep an aliquot here anyway, as part of the ethics approval.
So in our freezer, we have a lot of samples that are practically useless, but we had to keep an aliquot back, and it's not linked to the metadata, which is a pain, but it's a regulation. I think it's standard for government projects. I mean, for us, anything which is funded by the government, which is 95% of our work, belongs to the UK government, so it does not belong to the research scientists. So if people leave, all of that comes back to the government. And if you need permission, you have to go to our policy people to ask for permission for certain things. So I think that's standard across.
Any final words from Ozan?
Yeah, I mean, I agree with pretty much most of that, and I think somewhere like the London School of Hygiene and Tropical Medicine, we have a core facility with Illumina, and now we have to convince the school to invest more money in the core facility for Oxford Nanopore. As it is, a lot of individual groups have their own pieces of equipment. We do a lot of work abroad, so people just want to take it and do the sequencing on site. So it's in that sort of middle playing field where we're not exactly sure what's going to happen in terms of sustainability. We've always got issues about warranty: who's going to pay for the warranty on this capital equipment? And, yeah, these are ongoing issues, and I guess that's why we're here discussing this today.
David?
Yeah, I mean, I just feel that, you know, Muna mentioned earlier that a single PacBio assembly was actually better quality than a nanopore-Illumina hybrid. And I just think that PacBio are missing a trick. If they could come out with an instrument that would encompass both Illumina quality and nanopore read length, they wouldn't necessarily do the field work, but if they could make it accessible in terms of cost, I still think the future could be bright for PacBio, and that might be the all-in-one solution for someone setting up their own core sequencing facility.
Cost of PacBio, even if they make it cheaper, would be an issue in our setting, and maintaining it would be an issue in our setting. So maybe we're still leaning towards the smaller machine.
Okay, thank you very much to everyone for participating, and thank you very much to Nick as well from the University of Birmingham.
Thank you all so much for listening to us at home. If you like this podcast, please subscribe and like us on iTunes, Spotify, SoundCloud, or the platform of your choice. And if you don't like this podcast, please don't do anything. This podcast was recorded by the Microbial Bioinformatics Group and edited by Nick Waters. The opinions expressed here are our own and do not necessarily reflect the views of CDC or the Quadram Institute.