Hello, and thank you for listening to the MicroBinfie podcast. Here we will be discussing topics in microbial bioinformatics. We hope that we can give you some insights, tips, and tricks along the way. There's so much information we all know from working in the field, but nobody writes it down. There is no manual, and it's assumed you'll pick it up. We hope to fill in a few of these gaps. My co-hosts are Dr. Nabil Ali Khan and Dr. Andrew Page. I am Dr. Lee Katz. Andrew and Nabil work in the Quadram Institute in Norwich, UK, where they work on microbes in food and the impact on human health. I work at the Centers for Disease Control and Prevention and am an adjunct member at the University of Georgia in the U.S. Hello, and welcome to the MicroBinfie podcast. Today we're giving a rapid roundup of what's been changing for SARS-CoV-2 COVID-19 genomics. Things are changing very quickly, so we should mention that it's the 9th of March, 2021, and some of what we mention might change by the time you hear this. Today we're putting a spotlight on COVID-19 in Denmark, and we're joined by a special guest, Mads Albertsen, who is Professor MSO at the Center for Microbial Communities at Aalborg University. Welcome, Mads. Thanks. Happy to be here. Let's get started with reviewing the latest changes regarding variants of concern. Well, first of all, there's some language difficulty here. Is it variant of concern, VUI, VOI, VOC? It's getting a bit confusing now because everyone is using different things and no one agrees on anything. So in the UK, we say VOC, variant of concern, and VUI is variant under investigation. That's kind of like the initial "something looks a bit off here", and then VOI is variant of interest. And again, that's something like "we think there might be a problem here". I think really we need to do a bit of work to actually have a global set of words that we can use for these things, because it's getting so confusing now.
Everyone is confused because every group and every country seems to use different terms to mean the same thing. We might need a number system, like declare a variant as DEFCON 1 or DEFCON 5 or something. Yeah, but every country wants their own variants. And now it's become like this kind of token, a badge of honor: you have to have your own variants. And these are kind of hype variants, people are calling them now. Like there's one the other day, B.1.1.318, which Finland have claimed for themselves as FIN-796H, and yeah, they've claimed it for themselves, but they didn't really release the data. So, you know, they said, yeah, this is a problem, but then they didn't release the data to the world. So the rest of the world was looking and going, okay, yeah, this is a problem. It's associated with, say, Nigeria, West Africa. It has some interesting mutations there, which may make it concerning. But Finland, you know, had already claimed it for themselves. No one really knew that they were talking about the same thing. So it actually took, you know, a little bit of time to figure out that what they were describing kind of abstractly was actually what we were seeing in the data in GISAID, but from other countries. I think the lesson there is, before you announce a hype variant, you actually go and release your data and then also declare it, so that we can add it to the lineages correctly. And of course, the UK is now calling this a VUI, a variant under investigation. And so now we're up to eight, which is a fair few, and they seem to be coming thick and fast every couple of weeks, which is quite a lot, you know, compared to December when we only had two. So in the UK, we've now got eight different variants under investigation or of concern. Interesting. And they're calling it FIN-796H. What happened to bird names? I wanted to do that. That's an American-only thing.
Just to be clear, there are only three variants of concern right now. There is B.1.351, which is the variant which originated in South Africa. Then there's P.1, which originated in Northern Brazil. And then there's the Kent variant, B.1.1.7. But actually the same mutations that are concerning are appearing everywhere. I believe there have been a few in the past week where people have said these contain the same constellations of mutations that we've seen in other ones, and they're now spreading like wildfire in different areas. There's B.1.526, which is the New York variant. And of course, New York was hit heavily by COVID back in April. And now we have this variant spreading very, very quickly, which has E484K, and it's popping up everywhere now in the US, which isn't very good. I presume it's because there's still so much travel happening internally, domestically within the US, and it's allowing it to spread. There is also a new one I've seen, even newer, which is B.1.324, which again is a US-centred variant. And that's got N501Y and P681H. Now, I think really the issue, it's not a bad thing. It's more that genomic surveillance in the US is just ramping up very, very, very rapidly. And now you're starting to get a much better indication of what's going on as the US starts doing sequencing, whereas the UK has been doing this for quite a while, and Denmark has been doing it for quite a while. So we have a really good idea of what's within our populations, whereas the US has kind of been a bit blind, because they're only doing, what, half a percent of all cases being sequenced, compared to Denmark, where, what are you guys up to now? Do you sequence everything? Yeah, we sequence everything these days. I think we are 90% plus for most of this year. And is that attempted sequencing or actual successful sequencing? That's attempted sequencing. So around 70% of cases have a genome with fewer than 3,000 Ns. That's awesome. That's so good.
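That "fewer than 3,000 Ns" completeness cutoff is simple to apply in practice. Here is a minimal Python sketch of counting ambiguous bases in consensus FASTA records; the threshold matches the one Mads quotes, but the helper names and the toy records are illustrative, not the Danish pipeline's actual code:

```python
# Sketch: flag consensus genomes with too many ambiguous bases (Ns).
# The 3,000-N threshold comes from the episode; everything else is illustrative.

MAX_AMBIGUOUS = 3000  # genomes with this many Ns or more are counted as failed

def parse_fasta(text):
    """Yield (header, sequence) pairs from a FASTA-formatted string."""
    header, seq = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:].strip(), []
        else:
            seq.append(line.strip())
    if header is not None:
        yield header, "".join(seq)

def passes_qc(sequence, max_n=MAX_AMBIGUOUS):
    """True if the consensus has fewer than max_n ambiguous bases."""
    return sequence.upper().count("N") < max_n

# Toy data: one nearly complete genome, one dominated by masked positions.
fasta = ">sample1\nACGT" + "N" * 10 + "ACGT\n>sample2\n" + "N" * 3500 + "ACGT\n"
results = {header: passes_qc(seq) for header, seq in parse_fasta(fasta)}
```

On real data you would run the same count over each record of the multi-FASTA that the consensus-calling pipeline emits.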
I think you're the envy of the world, because you've got scale and you have very, very high rates, whereas the rest of us are kind of lagging behind. So fair play. In terms of lineages, what are you seeing in Denmark at the moment? So in Denmark, we've been watching B.1.1.7 grow. It's been like a bad movie. You can just follow along and predict. So right now we're at 80% in Denmark, and the rest are mostly old lineages. So some of the summer lineages that spread around, they're declining now. And then we have one of these variants of interest, or whatever we call them, called B.1.525. It also has E484K. That's around a few percent in Denmark now. So we've seen around 200 cases of this B.1.525. We've seen it in Denmark and a few other countries, among them Nigeria, I think. Right. So do you think that B.1.1.7 is going to hit 100% in your area? No, I think we'll always have small variants that lurk around, especially now we are opening Denmark again. What's been happening in Denmark is that we basically crushed all the variants until now. Then B.1.1.7 has taken over. But we haven't crushed everything completely, so I think we'll still have these 5% of other variants that will circulate now that we open Denmark. But let's see. It's both scary, but also interesting to sit and watch what happens. Absolutely. We have some weird things here. We're reopening schools this week. We'll see what happens. Hopefully it won't be a boom, but they did get the number of cases down very, very much. Like, so much that we're kind of stuck for work. You know, we don't have samples coming in the door, which is a fantastic thing. But we just fear what's going to happen in whatever a month or six weeks down the road if things ramp up again. If anyone out there is interested in what's going on in Denmark, there's a nice website for that, which is covid19genomics.dk. That has all of the breakdowns of all of the different lineages we're talking about.
It's nice to see that a lot of countries have set up these dashboards. I think we mentioned the US one last week. I think every country is slowly making their own version of these sorts of dashboards, making that data available for everyone. I presume at some stage we're going to have just an ECDC dashboard or a WHO dashboard and that's it. Every country will just have a clone of one dashboard instead of having 190-something dashboards. I did a little bit of a survey with some companies over in the US and also just looked at what's out there. I do see some common trends. Some people are using Tableau as their dashboard and making it a really nice visual. Or else people are breaking into camps, like looking at Microreact or Outbreak.info. I do think you're right. I think that people are starting to go into some groups. Yeah. Tableau is really nice for everything in general, just for data visualization. It's lovely. I'd highly recommend it. In Denmark, our site is simply just an R Markdown file. What you see on the page is actually what we deliver to the government also. It's just online on our page a few days later. So this is like the basic breakdown, and it's an auto-generated R Markdown file. Is that on your website, so that you can kind of go through it? So that's on covid19genomics.dk. There's a statistics page, which is basically an R Markdown file. There's a version of Nextstrain as well. So can I ask, in Denmark, have you seen E484K coming into your B.1.1.7 samples? No, not yet. I guess we're expecting it to come at some point, so we are watching it closely. Right now, because we sequence everything, hopefully we can stamp it out before it spreads. That's at least the plan right now. So that's what we've been building up the last couple of months.
Capacity to actually really do variant-specific stamping out in society, both with extremely high testing capacity, sequencing, and also surge testing. Last week, there was surge testing in areas with B.1.1.7 and B.1.351 to make sure we can start trying to stamp out some of these variants of concern. And how successful has your surge testing been? Because in the UK, surge testing just hasn't picked up the cases compared to just randomly doing community surveillance. I think it seems to work. So it's really highly intensive testing in blocks around where cases have been. So if there's been suspicion of community spread, they've done surge testing to try to see if they can pick up the missing links. But so far, the majority of our other variants of concern have been related to travel history. And it seems like we've been able to stop most of those transmission chains. But let's see, there seem to be many more cases going around the rest of Europe. So we also start to see many more imports. So it starts to be more and more difficult to keep them out. Let's talk about some of the tools and resources that we've seen come online in the last few weeks. Following on from the discussion of the dashboards, outbreak.info is a great resource, because it's got a compilation of all this different data on cases broken down by geography or regions. You can find doubling rates. You can find all sorts of metadata available for download. And that's coming out of the NIH, I think, and the Andersen lab putting that together. I think the next one is pretty fun, which is something I spotted the other day: a tweet from David Clements saying that Galaxy is going to have a new public health community based around it. I think this is kind of in a similar vein to GenomeTrakr, but I think it's going to be more opened up and allow a lot more people to participate in that.
That'll be really good, having some standardized, more public-health integration with what Galaxy provides. I thought it was just kind of funny. It says, please contact me if you want to join, and I tried to contact him and it says he doesn't receive direct messages. I'm going to email him separately though. I think Galaxy is a great area to go, because it has such a deep history in bioinformatics and people know how to use it. There are large communities already that might be using it for public health, especially from the FDA's GenomeTrakr. And I think it has huge potential. It's definitely worth signing up. It's easy to use for biologists. So that's what makes it really good for non-technical people: they can dive in. You don't need to spend days or weeks playing around with Nextflow pipelines or whatever. You can just click, click, click, and there you go. And we've had people with next to zero technical skills, just in an hour, flying with Galaxy. They can do assemblies, annotation, whatever. So it is really, really useful. And I think people should make better use of it. I don't know if this is something that you are knowledgeable about, but when I was trying to make a tool in Galaxy, it was XML-based and you'd have to write the tool wrapper in XML. That's to add a new tool into the Galaxy repositories, but that's on the developer side; they have to do that. But once that's in the repo, then anyone can install it with a click and get down to using it. Yeah, yeah, yeah. But I mean, since we're a bioinformatics podcast, it's interesting to see. I think in time, they're going to simplify that down a bit and it's going to be a bit easier to do. Well, it only has to be done once in the world. One person goes and does a tool, that's it. And then forevermore, it's in the Tool Shed. You can click and it magically works, or drag and drop and put it into a pipeline. So it's a reasonable amount of pain, as long as people share their work.
I think it's fun to use words like that, like pain, but I've made a very simple tool one time and it actually is very straightforward with the XML, as long as it's just a few parameters. I thought it was very nice. I think the last thing, just to round up the new tools and resources that are out there: we've got a SARS-CoV-2 Nextflow pipeline that's coming out from Johan Bernal, Varun Shamanna, and Anthony Underwood. I don't think it's deviating too much from what's available out there in Nextflow pipelines for SARS-CoV-2, but it does have stuff for creating and visualizing trees integrated into it. I think this is sort of an attempt to take you from start to finish completely, where I think at the moment the typical Nextflow pipeline is kind of like: they call the variants, they make the consensus sequences, and then it's like, well, have fun. It's up to you now to align them and go off and make trees or do whatever you want to do. So this looks like a one-stop-shop kind of workflow. And that's now available on GitLab. We'll put a link in the show notes if people want to have a look at it. So should we switch over to some publications? So the first publication is quite a bad one, unfortunately. It's a comparison of performance of different SARS-CoV-2 sequencing protocols. In this single-author paper, they've gone and just assembled some ARTIC data with SPAdes. Now, I know we've talked about this before, but you don't assemble amplicon data. It doesn't work very well. And if you do, you're going to get terrible results. And that's exactly what they did. And they got terrible results. They got terrible N50s as well. There you go. And they had blindly downloaded the data from the archives. So this is a cautionary tale that you shouldn't just take data blindly from an archive where you don't know how it's been generated and how the method works. You're going to get bad results in that instance.
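For listeners wondering what the consensus-based alternative to de novo assembly actually looks like: reads are aligned to the Wuhan-Hu-1 reference, each position gets the majority base, and positions with too little coverage (for instance amplicon dropouts) are masked as N. A toy sketch of that logic, with an illustrative depth threshold rather than any specific pipeline's defaults:

```python
# Sketch: reference-based consensus calling, the approach used for
# SARS-CoV-2 genomes instead of assembling amplicon reads de novo.
# The pileup below is a toy stand-in for real aligned read data.

MIN_DEPTH = 10  # positions covered by fewer reads than this become N

def call_consensus(pileup, min_depth=MIN_DEPTH):
    """pileup: list of dicts mapping base -> read count, one per reference position."""
    consensus = []
    for counts in pileup:
        depth = sum(counts.values())
        if depth < min_depth:
            consensus.append("N")  # not enough evidence at this position: mask it
        else:
            consensus.append(max(counts, key=counts.get))  # majority base wins
    return "".join(consensus)

toy_pileup = [
    {"A": 50},          # well covered, unambiguous
    {"C": 30, "T": 2},  # majority C despite a few erroneous reads
    {"G": 3},           # amplicon dropout: depth below threshold, masked as N
    {"T": 40},
]
```

This is why the output is called a consensus sequence: every position is a vote over aligned reads against a fixed reference coordinate system, which assemblers of tiled amplicon data cannot reproduce.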
In this case, the paper, I think, seemed to be more of an advertisement for the author's own assembly method. So yes, please don't assemble. You generate consensus sequences. That's why people always talk about consensus sequences and not assemblies. And if people do occasionally talk about an assembly, usually they're just misspeaking and really they mean a consensus sequence. I couldn't agree more. I think it's important to demonstrate your tools out there in the literature. It's also important to put it by your peers before publishing, that kind of thing, to catch some common-sense things. Absolutely. There's a lot of really poor work on bioRxiv and medRxiv. But of course, hopefully once it's peer reviewed, that will get rid of a lot of it or make it a lot better. Anyway, moving on. The WHO have proposed different definitions of variants of interest and variants of concern. And this feeds back into what I was talking about earlier, where we're all talking about different things. We're all using different words for the same things and no one has any consensus on anything. So the WHO are trying to fix that. I'm still seeing people use novel coronavirus-19, or nCoV-19, and hCoV-19. I thought we'd fixed all of that nearly a year ago, where we're talking about SARS-CoV-2. But people stick with their terminologies and they hardcode them into stuff. And that's that. Well, hCoV-19 is baked into GISAID. And nCoV as well. I'm surprised that's stuck around. There's no longer a novel coronavirus from 2019. We still call it next-generation sequencing. And next-next-generation sequencing. When they started talking about new variants, I was thinking, is there going to be a new-new variant and then a new-new-new variant? Perhaps. Yeah, the new variant stuff didn't last very long. It wasn't scalable. This next one is a paper that came from a group in New York.
And the authors pooled samples from 10 patients together to obtain consensus sequences for the SARS-CoV-2 genomes. And then they deposited those genomes in GISAID, which doesn't have any way of distinguishing that kind of data from anything else. So that's problematic, because obviously people are expecting each genome to be from one sample, from one isolation. It could look like there's a combination, or people might take it seriously if there are any sort of mixed SNPs in there. It might not be likely in this case, but people need to avoid doing that sort of thing. This is a mix of ideas that just shouldn't have been mixed: pooling and genome sequencing. At the beginning, you know, when they didn't have testing capacity, it did make sense to pool stuff to save some money and save reagents when you couldn't get any reagents. But in this day and age, you know, it's not really worth it. That puts it better in context. Okay. A lot of people were trying that to do mass screening: just pool all the samples. Pooling samples for diagnostics makes sense to me, as you can drastically reduce the number of tests, logarithmically or whatever the adverb is you want to choose. That's great. But then when you genome sequence, you don't want to do a metagenome sequence of all different SARS-CoV-2. I don't think you do. Mads, given your metagenomics background, what do you think of this? If you had a hypothetical study that was sequencing from pooled samples? It seems very creative to do. And of course it needs a label on it. So then you need to put it, not in GISAID, but in a proper database where you can actually label this stuff, so that we can filter it out afterwards. Then it's fine. And I agree with you. It's actually a nice way to do mass testing, but not sequencing.
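The test-saving arithmetic behind pooled diagnostics is worth seeing once. Under simple Dorfman pooling, each pool of k swabs gets one test, and only positive pools are retested individually, so at low prevalence the expected number of tests per person drops well below one. A sketch with illustrative numbers, not any real screening programme's parameters:

```python
# Sketch: expected tests per person under Dorfman (two-stage) pooling.
# A pool of k swabs is tested once; if positive, all k members are retested.
# Assuming independent infections at prevalence p:
#     E = 1/k + 1 - (1 - p)**k

def expected_tests_per_person(p, k):
    """Expected number of PCR tests per person with pool size k and prevalence p."""
    return 1 / k + 1 - (1 - p) ** k

p = 0.01                                       # illustrative 1% prevalence
individual = 1.0                               # one test each, no pooling
pooled = expected_tests_per_person(p, k=10)    # roughly 0.2 tests per person
```

So at 1% prevalence a pool size of ten needs about a fifth of the tests that individual testing does, which is exactly why pooling made sense for diagnostics when reagents were scarce, while contributing nothing to genome sequencing, where each consensus must come from one sample.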
I know in some places they would pool households. Say in the University of Cambridge, they're doing asymptomatic screening, and everyone in the household would take a swab and then put it into the same physical tube. And then that tube would go off to be tested, or at least that's what they're talking about. And that makes sense, because then you're going to have a pooled sample from a household. But then you'd really want to go back to the original person or the original household and sequence every single individual from an individual swab, not the actual pool itself. Okay. So the next paper on the list is SARS-CoV-2 within-host diversity and transmission, from Tanya in Oxford. This is actually a really awesome paper. I saw her give a talk the other day. So what Oxford use is actually hybrid capture rather than the ARTIC protocol, which is just a nice way to pull down exactly what you need. It has some limitations, so it doesn't work in high-Ct samples, but they're able to actually see minority variants much more clearly than you can with ARTIC, where ARTIC has a lot of issues around that. But they can see minority variants, and then they're able to track these variants as they went through different people. And sometimes the minority variants they could actually see would get transmitted. Usually it would be the main dominant variant. Sometimes you get, say, two different variants being transmitted, the major and the minor. So like a cloud of infection, which you see with other pathogens, which is super interesting. So this is a really, really major paper, and fair play to them, you know, they've done a huge amount of work on it, and I would highly recommend you go and read it. Often we also talk about some queries that have come up. So questions that get fielded to us, or we hear people asking around, and then we talk about them here.
So this one is: what's the most up-to-date masking strategy for SARS-CoV-2 phylogenetics? I remember back when I was trying to learn SARS-CoV-2 assembly back in the summer, somebody hosted a VCF file of the sites that you ignore, and I can't remember who it was. I think that that list is going to get bigger and bigger and bigger all the time, because so many of these variants are arising independently, and it's just going to cause chaos. You can see with the variants of concern, with N501Y and whatever, and E484K, they're all arising independently everywhere, and it just messes up phylogenetics. We don't do global trees anymore. It's just too much data, and it's too difficult to do. So we call lineages, and then we build sub-trees of what's of interest; then we build a tree of a specific lineage if you're looking for variants in there. So that avoids many of those problems, which are maybe not solvable. I had a case the other day, actually, of lineage calling going terribly wrong. Colleagues told me, oh, yeah, we've got a P.1 here, and obviously alarm bells are ringing, because it's the first in this particular country. People were getting quite concerned about it. But actually, digging into it, it was just that this was a particular region of the world which hadn't been sequenced very well. There are very few genomes at all from the entire continent, which is Africa. Actually it was a different story altogether. It wasn't that it was a P.1. It was that this particular genome was 18 SNPs away from the nearest ancestor that had been seen. So nothing had been seen in between in this transmission chain since April last year. And basically, travel shut down. No one was moving by air. And so it got cut off, and then it was just rumbling along at a normal clock rate and just knocking around. And by coincidence, we spotted it just because it was flagged up as a P.1.
But actually, it does highlight the dangers when you're dealing with genomes from under-sequenced areas: you are going to have these lineages being miscalled, because the lineages are defined by basically what we see in the UK and Denmark and the US, where you have huge amounts of sequencing. But actually, there is a lot of transmission elsewhere in the world which we're not seeing at all. We're not even getting a glimpse of it. And actually, it's only when it accidentally pops up in travelers that we actually see there's a problem. I think there are quite a few whales out there, big sea monsters that we haven't seen and are lurking below the surface, that we're going to be finding as time goes on. By chance, we should find many of these if they are at high frequency. We may not either. And it may be quite a while before we see them. The lesson to be learned here is don't blindly just take your lineage calls. If it's an important lineage, double-check the mutations you expect are there. In this particular case, there was like one out of about a dozen mutations that we expected for P.1. And so it was clear it wasn't a P.1; it was a weak call based on sparse data. So Public Health England now have a GitHub repository, phe-genomics/variant_definitions. And there, they actually list in YAML the different mutations that define each lineage and what to look out for. So that's quite nice, actually, because it's machine-readable. And you can ingest that, and you can double-check exactly which mutations you expect for a given lineage and which ones are there. It can give you, then, an idea of: is this a confident call of this lineage, or is it a probable, or is it kind of a low-quality best guess? So yeah, that's quite useful. Check it out. I know Lee has views on that. He thinks VCF is better. I have opinions, guys, on VCF and YAML. If you're going to define SNPs, I think that VCF is an awful format, but it is also the format to describe SNPs.
And it's kind of awkward to put that into YAML, to me. And some people have already told me, well, YAML's better because it's freeform. But I would say that VCF is also freeform. It's just annoying because you have to define the freeform items up in your header first. I think the problem with VCF is it will get you 90% of the way there. But then that last 10% is probably going to take you months of shoehorning it in. And really, maybe YAML is a quicker way to get to the end result. Yeah, it's easier to get to the definition when you're writing out YAML. But VCF is the thing that actually works in all these other workflows. Other pipelines, other software actually read VCFs appropriately, and they'll be able to use them appropriately. But if you want to use something standard like BCFtools, you can't import a YAML. Let's see which one wins. Maybe you'll just end up writing a converter between YAML and VCF. I know I'm on the losing side here, but I have a soapbox, guys. Mads. Mads, how do you QC your samples? So first of all, we run a lot of negative controls. So we run four negative controls per plate. And then we look for several things. We look for strange long branches that look weird. Then we look for the number of ambiguous sites, which indicates some sort of contamination. It could also be iSNVs or whatever you call them, so internal variation within the host. But 99% of what I've seen is contamination. So that's what we look out for most. Is it some kind of standard software you're using? Or is it just pretty straightforward, so it's not even worth doing something specific? So we do, again, an R Markdown report that pulls in a tree and then puts SNPs beside it and shows the branch lengths so you can see them. And then we manually QC everything. Another thing I've seen to look out for is: say you have a B.1.1.7, and then at N501Y there's actually the wild type. That's an indication that there's a major problem.
Either you're miscalling it, or there's contamination, because you shouldn't have the wild type if you've got all the other defining mutations, unless something's gone totally wrong and it's switched back or something, a reverse mutation. So are you saying, Andrew, that some of your QC involves just straightforward sanity checks too? So for important samples where people have queries, clinical queries, or where something is not right or something that's very important, we always do a manual check, which is me. I do a manual check, and I double-check that the computer is actually working as intended. And mostly it works, but occasionally you'll spot these things which are odd. And you would say, okay, that's just an artifact, or that's contamination, or actually that is right, but people are misinterpreting what it's saying. You always have to look at the data, and that's the hard bit. Computers can only get you so far. It's that last bit where you really do need a human to eyeball everything. That sounds like maybe there's a war story in there, and I'm unearthing it. Every week I do reports for various different entities, and a lot of that just involves eyeballing data. So taking the data that's been produced by all the pipelines and sequencing, and then just munging it and making some sense of it, and sanity-checking a lot of that as well. It's been made easier and easier over time as stuff gets automated, but it does require a human to look at it. Particularly in the UK, or where we are in Norwich, in Norfolk, where we have such a high density of sequencing, we can then go and look at outbreaks, say within hospitals or care homes, or within small geographic locations. It is important that you do dig into it. I'm sure Mads does something similar, because you have very high-density sequencing, so you can look at these things as well. Yeah, sure. We look at strange stuff, so it's not only sequence QC, it's also sample QC.
Have the samples been switched around? Have plates been switched around? We've seen everything. Having QC'd 100,000 samples now, we've seen everything. And even though it's at a low rate, you will have cases where, by chance, two plates are switched around, and you need to spot that. And one of the things we do is compare with the Ct values. So for most of the daily samples, we actually have Ct values as well, and those predict the sequencing success rate quite well. So I can actually see from the pattern of Ct values on a plate if it has been switched around with another. And these things happen at scale. What's your cutoff for the Ct, where you start saying it'll fail? Yeah, above 35 we drop way below a 50% success rate; it starts to drop around 32. But it's a bit difficult to compare Ct values; it depends on how it's actually set up in each country. Yeah, and some of the instruments actually have a burn-in where they don't report the first couple of cycles. So what we've been doing recently as well, where plates are questionable, is we looked at the sex of the sample, and more or less, with about 90% accuracy, you can tell if the sample is from a female. It's hard to say either way, but it does give us an indication of: is the plate totally messed up, or is it roughly in the right ballpark? And that's been quite useful, by looking at the human reads within a sample, so the RNA and DNA that were originally in there. And Nabil is looking at me very strangely. In this particular case, we had samples coming from a different lab that we don't normally deal with, and there were some issues with sample sheets, so we weren't very confident about the actual samples, because this particular lab was telling us there were 97 samples on a 96-well plate. So that raised some alarm bells straight off. But yeah, we're using the sex of the person who provided the sample to look for things like rotations of plates and plate swaps. Wow.
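The plate-orientation check Andrew describes can be sketched as predicting each sample's sex from leftover human reads and comparing against the sample sheet. The read-count cutoffs and the 5% Y-fraction threshold below are illustrative assumptions, not the actual Quadram pipeline:

```python
# Sketch: sanity-checking plate orientation via sex predicted from human reads.
# A sample with essentially no chrY-mapped reads is predicted female; a clear
# Y fraction predicts male. All thresholds here are illustrative.

Y_FRACTION_CUTOFF = 0.05  # fraction of X+Y reads mapping to chrY to call male
MIN_HUMAN_READS = 50      # below this, too little signal to make a call

def predict_sex(x_reads, y_reads, cutoff=Y_FRACTION_CUTOFF):
    """Return 'male', 'female', or 'unknown' from human read counts."""
    total = x_reads + y_reads
    if total < MIN_HUMAN_READS:
        return "unknown"
    return "male" if y_reads / total >= cutoff else "female"

def plate_concordance(samples):
    """samples: list of (recorded_sex, x_reads, y_reads) tuples.
    Returns the fraction of callable samples matching the sample sheet,
    or None if nothing was callable. A low value suggests a rotated or
    swapped plate."""
    calls = [(recorded, predict_sex(x, y)) for recorded, x, y in samples]
    scored = [(recorded, pred) for recorded, pred in calls if pred != "unknown"]
    if not scored:
        return None
    return sum(recorded == pred for recorded, pred in scored) / len(scored)
```

A plate that scores near the roughly 90% concordance Andrew quotes is probably in the right orientation; a plate scoring near 50% is a red flag worth chasing back to the sample sheet.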
Do you actually get enough human reads for that? You do, actually, yeah. Surprisingly. Now, it can be as low as just a dozen, but often you'll get a few hundred reads or a few thousand reads, so it's a fair bit, particularly because we sequence on Illumina and we can get 3,000 to 5,000x for a sample, so actually you do get a fair few human reads in there. Obviously, we filter those out before depositing; those reads only ever stay within our institute and never go out into the world. So it's not something you can do at a large scale. It's just something we're using as a QC check. Can you think of any other markers? If you're putting your blanks in the same coordinates, you can obviously use that to check the orientation. Mads, do you have any war stories on this stuff? No, but we've seen a lot of mix-ups of different sorts, having run 100,000 samples. I've seen a handful of them, and most we can actually spot, again, in the manual QC. For example, often we get plates that are not completely full, so we know some places should be empty, and then we can easily spot it from there. So we've had at least a handful of cases where we had to go back and revert it. Normally, we can actually just look at the data and then revert it automatically afterwards and don't have to re-sequence, but you need to really take care. There are so many steps involved in these pipelines. Well, I think we have a really good ending with a high note from Mads there. I want to thank Mads for joining us today. Thank you all so much for listening to us at home. If you like this podcast, please subscribe and like us on iTunes, Spotify, SoundCloud, or the platform of your choice, and if you don't like this podcast, please don't do anything. This podcast was recorded by the Microbial Bioinformatics Group and edited by Nick Waters. The opinions expressed here are our own and do not necessarily reflect the views of CDC or the Quadram Institute.