How tracking coronavirus variants will prepare us for the next global public health threat
Genomic surveillance of SARS-CoV-2 is helping us spot new variants and figure out how to respond. What else could it help us do?
Tulio de Oliveira cuts a lonely figure as he paces across the parking lot of Stellenbosch University’s new $100 million biomedical research building. It’s early January, the height of summer in South Africa, and most students and staff are on holiday. But not de Oliveira. He’s on a conference call with the country’s president, Cyril Ramaphosa.
It’s the second time they’ve spoken in just over a month. The first was right after his genomic sequencing lab had discovered a new covid variant. Today, they’re talking about something else—de Oliveira doesn’t want to give specifics. The call has lasted for some time, and de Oliveira is pacing because he is, by his own admission, not a patient man. Still, his science is directly influencing policy—something he relishes.
“The pandemic has changed the way that science is done,” he tells me once he’s off his call. For one thing, science is happening faster. Six weeks earlier, his team had suddenly felt “a little weirded out” by a resurgence of covid cases in Gauteng, the country’s most populous province. This uneasy feeling prompted a flurry of sequencing activity. It took them only a day to identify the highly transmissible new variant now called omicron. They briefed the health minister and the president, and spent another day checking their work. Then, on November 25, de Oliveira announced the discovery to the world.
Omicron, which is better at evading the body’s immune defenses than its cousins, has since spread everywhere, fueling unprecedented waves of the pandemic. In early January, it was causing weekly infections to rise by 65% in Europe, 78% in Southeast Asia, and 100% in the Americas. Deaths were rising too, albeit more slowly.
Still, the team’s detection and identification of the new variant in November provided a crucial early warning to the rest of the world. In the weeks after the discovery, scientists subjected omicron to a battery of tests. Massive efforts quickly got underway to understand its sensitivity to existing covid vaccines and figure out just how infectious and lethal it was. Policy-wise, the discovery sparked an uptick in booster vaccinations, renewed restrictions, and travel bans.
Four shiny new sequencing machines in de Oliveira’s lab would have cost a total of $3 million off the shelf. But they were all gifts from donors.
Before covid, de Oliveira was sequencing scourges like Zika, chikungunya, and tuberculosis from his base in Durban. The pandemic injected unprecedented resources into his field and generated immense political interest in his work. The lab at his new Centre for Epidemic Response and Innovation is stocked with millions of dollars’ worth of equipment—much of it donated by wealthy labs, international health organizations, and manufacturers. Eventually, it’s expected to be the most powerful sequencing lab in Africa.
The spread of SARS-CoV-2 set off an avalanche of genomic sequencing all around the world. More than 7.5 million viral sequences have been uploaded to the global database GISAID, and scientists have sorted millions into tree diagrams tracking the virus’s evolution. That’s the most sequences generated and shared for any germ by far. Sequencing has also grown more common in parts of the world that didn’t have the technology before, which will be vital in spotting any new threats as they arise.
The glut of data on SARS-CoV-2 has allowed scientists to track, in close to real time, how the virus is evolving. And it’s transformed the way we use genomic sequencing to inform health policy, says Sharon Peacock, a microbiologist at the University of Cambridge who leads the UK’s covid genomics consortium. “If we look at previous threats, sequencing was used as a research tool, in a retrospective way,” Peacock says. “Now we have seen that sequencing can provide actionable information.”
With covid, scientists have tried to forecast how variations in a germ’s genetic sequence might influence its behavior in people, and even to predict its next move. While there remain gaps in the global sequencing effort, which might allow variants to circulate undetected, capacity on continents like Africa has increased dramatically. And de Oliveira and his colleagues want to go further.
While keeping an eye on covid, they want to use the momentum and funding they’ve amassed for genomic sequencing to tackle other diseases, like tuberculosis, HIV, and viral hepatitis. And they want to see if the technology can give us an edge in the ongoing and deadly fight against antimicrobial resistance.
Indeed, the full value of this work may not become clear until long after the pandemic recedes.
The science behind genomic tracking for viruses these days is relatively simple. To sequence a SARS-CoV-2 genome, scientists isolate the viral RNA from a patient sample such as a positive covid test swab. Then they process the RNA into a form that sequencing machines can read.
Twenty years ago, it cost $100 million to sequence one human genome. Today, it costs less than $1,000. However, it’s still not cost-effective to sequence every positive covid test to look for new mutations. So de Oliveira and his team prioritize samples from areas with unexpected upticks in infections reported by national laboratories or doctors in the field.
Using this targeted approach, which he calls “nonlinear sequencing,” they were able to raise the alarm about not one but two variants of concern discovered in South Africa: omicron in November 2021, and beta a year earlier in the Eastern Cape province.
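The idea of sequencing first where infections rise unexpectedly can be illustrated with a small sketch. This is not the team's actual method, which the article does not describe in detail. The function name, the four-week baseline, the ratio-based score, and the case counts are all assumptions invented for illustration.

```python
from statistics import mean


def rank_regions_by_uptick(weekly_cases, baseline_weeks=4):
    """Rank regions by how far the latest week exceeds the recent baseline.

    weekly_cases maps a region name to a list of weekly case counts,
    oldest first. The score is the latest week's count divided by the
    mean of the preceding `baseline_weeks` weeks (hypothetical heuristic).
    """
    ranked = []
    for region, counts in weekly_cases.items():
        if len(counts) < baseline_weeks + 1:
            continue  # not enough history to form a baseline
        baseline = mean(counts[-(baseline_weeks + 1):-1])
        latest = counts[-1]
        # Floor the baseline at 1 so quiet regions do not divide by zero.
        ratio = latest / max(baseline, 1.0)
        ranked.append((region, ratio))
    # Highest ratio first: these regions would be sampled first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Invented numbers, for illustration only.
example_cases = {
    "Region A": [100, 110, 105, 95, 102],
    "Region B": [20, 18, 22, 20, 80],
    "Region C": [300, 280, 260, 240, 220],
}
ranking = rank_regions_by_uptick(example_cases)
for region, ratio in ranking:
    print(f"{region}: {ratio:.2f}x recent baseline")
```

In this toy data, the region with the fewest total cases ranks first because its latest week is far above its own baseline. That is the point of a targeted approach: the signal is the unexpected change, not the raw volume.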
While the rapid detection of new, more infectious versions of SARS-CoV-2 has certainly spurred governments and health systems to act, it has not stopped those variants from spreading. However, knowing about new variants does offer some time to prepare, Peacock says. Many countries rolled out booster vaccinations after scientists established that omicron was better at dodging our immune defenses. “There can’t be any doubt in anyone’s mind that real-time sequencing has had a real impact during the pandemic,” she says.
It’s difficult to quantify the exact impact sequencing has had on hospitalizations or deaths. As one measure, using data from South Africa and elsewhere on how omicron behaved, scientists in the US estimated in early January that doubling the number of boosters given could prevent 41,000 deaths and more than 400,000 hospitalizations by the end of April.
Emma Griffiths, a genomic data expert based at Simon Fraser University in Vancouver, Canada, says genomic surveillance has helped us understand why our diagnostics, vaccines, and therapeutics have become less effective over time, allowing us to update our arsenal of weapons. The pandemic has also built a backbone for sharing genome data across borders, she adds, which could be useful in future outbreaks. And just as important, the global SARS-CoV-2 sequencing boom helped establish sequencing technology in areas that did not have it, or had very little, before the pandemic.
In Peru, genomic sequencing was very “niche” before covid, says Pablo Tsukayama, a microbiologist at Universidad Peruana Cayetano Heredia in Lima. But since April 2020 his lab has received increased funding from the government, which allowed it to discover a new variant of concern, lambda, by the middle of 2021.
While this discovery came too late to prevent the huge wave of illness that lambda caused in Peru, which racked up some of the world’s highest excess mortality rates, he says Peru’s new sequencing capacity will help it track new pathogens coming out of the Amazon rainforest, an area rich in biodiversity where diseases are at high risk of spilling over from animals to humans.
“We need to be able to ring the alarm early,” he says. “That’s where my lab is going.”
Two thousand miles south of Lima, in Argentina’s capital city, Buenos Aires, Josefina Campos is also reaping the rewards of the pandemic’s sequencing boom. Before covid, her genomics lab didn’t actually do surveillance. Sequencing was a retrospective scientific pursuit rather than a tool to assist in public health. Now “everyone wants to sequence,” she says.
“The pandemic put us on the map,” Campos says. She’s in charge of a new institute that will do real-time surveillance, not just of covid but of other diseases, too. One project will closely monitor tuberculosis: cases have been rising by more than 2% annually since 2013 in Argentina after a period of decline. Her lab will study the bacterium to look for drug resistance and help doctors decide which treatments to use.
Campos is also involved in an effort to decentralize sequencing in Argentina through a federal network of sequencing nodes dotted throughout the country. “We cannot wait for the samples to come from northern Argentina to Buenos Aires and then go back,” she says.
Africa is another region that’s seen a rapid scale-up of its sequencing ability, not least thanks to people like de Oliveira. In October 2020, the Africa Centres for Disease Control and Prevention in Addis Ababa, Ethiopia, announced a $100 million pathogen genomics initiative backed by sequencing equipment companies Illumina and Oxford Nanopore, the Bill & Melinda Gates Foundation, and Microsoft. This, like other genomics investment initiatives, will target diseases beyond covid.
The Stellenbosch medical campus where de Oliveira is now based hosts some of the country’s best human geneticists and top experts on local killers like tuberculosis. De Oliveira’s new lab will allow them to sequence samples here that they previously would have had to send outside of Africa, and it will support sequencing across the continent.
In November 2021 CERI held its first sequencing workshop, training participants from 15 African countries to sequence SARS-CoV-2. Since most participants came from countries with more modest laboratory setups, de Oliveira did not bring them to his cutting-edge laboratories to train. Instead, he set up a temporary laboratory in a conference room. Working on a simpler setup showed the workshop participants what’s possible in their own countries, de Oliveira says. Many have hit the ground running and are now sequencing coronavirus samples back home.
The billions of research dollars spent to understand SARS-CoV-2, and the avalanche of data arising from sequencing it, have given scientists a new crack at something that has so far proved elusive: predicting how viruses like SARS-CoV-2 and influenza will evolve.
Every year, scientists pick the flu strain that they think will dominate the following season, which becomes the basis for that year’s vaccine. They get it right only about half the time, and there are recent examples of spectacular failures. Simply put, it’s very difficult to determine which of last year’s flu strains will prove dominant next year just by looking at them.
Now they are trying to use the data describing the evolution of SARS-CoV-2 to make such predictions, with varying results. In a preprint paper released in June 2021, scientists combed the GISAID database and analyzed more than 900,000 versions of SARS-CoV-2’s spike protein—the “key” the virus uses to enter human cells, where many variant-creating mutations occur—to see if the evolution of previous variants could help them pinpoint which mutations might be likely to arise in future ones.
When omicron was discovered six months later, it “pressure tested” the concept, says Amalio Telenti, one of the authors of the paper and chief data scientist at Vir Biotechnology, an immunology company based in San Francisco. This January he and his colleagues confirmed that omicron’s mutations were among those they had identified as likely. However, he says, there’s a big difference between saying that an already observed mutation was likely to arise and predicting which specific mutation will come next.
What makes the latter so tough? Scientists have studied how changing every single building block in the SARS-CoV-2 spike protein affects its characteristics in the lab. But looking at multiple mutations at once makes the task vastly more difficult.
To test every possible combination of two mutations in the spike protein, you’d need to create approximately 800,000 versions. For triple mutations, that number would rise to a billion—an unmanageable sum to test in the lab, at least using current methods.
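The scale of this problem can be checked with a quick count. The sketch below assumes a spike protein of 1,273 amino acids, the length of the reference sequence; that figure is not given in the article. Counting pairs of positions gives roughly 810,000, in line with the figure above. Counting triples the same way gives a few hundred million, so the rounder "billion" presumably rests on a different counting convention. The function names are illustrative.

```python
from math import comb

# Assumption: length of the reference SARS-CoV-2 spike protein in amino acids.
SPIKE_LENGTH = 1273


def position_combinations(k, length=SPIKE_LENGTH):
    """Number of ways to choose k distinct positions to mutate together."""
    return comb(length, k)


def variant_combinations(k, length=SPIKE_LENGTH, alternatives=19):
    """Number of distinct k-mutation variants if each chosen position
    may change to any of the 19 other standard amino acids."""
    return comb(length, k) * alternatives ** k


pairs = position_combinations(2)
triples = position_combinations(3)
print(f"position pairs:   {pairs:,}")
print(f"position triples: {triples:,}")
print(f"pairs counting every amino acid swap: {variant_combinations(2):,}")
```

Whichever convention is used, the count grows steeply with each added mutation, which is why exhaustive lab testing becomes impractical beyond single changes.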
Other scientists are using artificial intelligence trained on SARS-CoV-2 data to see if that could help develop models for viral evolution. But even if this could give us some general insights, using such methods to predict exactly how a virus will evolve is extremely difficult or even impossible, says Jesse Bloom, an expert in viral evolution based at the Fred Hutchinson Cancer Research Center in Seattle. Viral evolution is believed to be inherently unpredictable, he says, since mutations are random by nature. So if we rewound and let the whole pandemic run again, we’d be unlikely to end up with the same variants as we did this time.
But Bloom argues that even if we can’t predict how the virus will evolve, the genomic data we collect should be able to help us predict how a given variant is likely to behave.
With omicron, scientists were pretty quick to gauge how it fared against different types of immune defenses by submitting it to lab tests, he notes. However, he says, “we need to get much better, and quicker, at doing it.” He argues that by studying how variants with known sequences act, and using data gathered about them to improve our understanding of how mutations influence viral behavior, we may eventually be able to determine, simply by looking at a new variant’s sequence, whether it’s likely to be more infectious or cause more dangerous symptoms.
At the moment there’s a “giant neon highlighter” on SARS-CoV-2 and covid, says Jeremy Kamil, a virologist based at Louisiana State University Health Shreveport. But everything being done to study the pandemic can also help us understand parasitic diseases like malaria, antimicrobial resistance, disease-causing fungi and molds, and viruses like RSV, a respiratory infection that can be fatal for infants. Surveillance systems for these pathogens could reduce their impact, saving both money and lives.
Predicting the behavior of bacteria from a genetic sequence could be even more complicated than doing so for a virus, says David Aanensen, director of the Global Health Research Unit on Genomic Surveillance of Antimicrobial Resistance, based at the University of Oxford. For one thing, bacteria are much more complex than viruses, and they have more complicated ways of replicating.
Still, the basic idea behind surveillance of bacterial genomes is the same as for viruses, he says—the goal is to identify concerning or drug-resistant variants, map their spread, and use genomic data to help us understand the cat-and-mouse game they play with the drugs we use to fight them. Perhaps drug-resistant germs could be spotted early enough to be eliminated before they become endemic. “We almost needed the pandemic to kick-start this work,” Aanensen says.
Today, though, nowhere near enough of it is being done, Kamil says. That’s not because it’s prohibitively expensive, he adds, but because it is held back—in the US at least—by a lack of suitable regulation, a reluctance to share data, and an absence of leadership. “It’s like we are wandering around in the dark,” he says.
Sharon Peacock, in the UK, thinks genomic sequencing should be used to inform vaccine development and rollout for all circulating viruses and harmful bacteria. She says this should be easier to do with all the new infrastructure put in place to deal with covid. She also hopes sequencing technology will move from labs into hospitals, where it could provide real-time data on things like how drug-resistant infections spread. That’s feasible thanks to advances in portable and easy-to-use sequencing machines, and it could improve infection control and patient care.
Even though the technology has become much cheaper, this will be expensive. But Oxford’s Aanensen believes some of the standard laboratory testing done routinely for bacteria like Streptococcus pneumoniae, which can cause pneumonia, meningitis, and sepsis, could be replaced by sequencing, offsetting some of the cost. “There’s a huge investment, but the value is immense,” he says. “You can get on top of and preempt new lineages, and guide the development of new interventions.”
In South Africa, de Oliveira plans to spend the first half of 2022 putting his new lab through its paces. Then he wants to raise $100 million for CERI, half of which he hopes will fund sequencing for research projects—anywhere in the world—that would not be able to afford it otherwise. Research that promises an immediate impact on health and policy will take precedence, he says.
“What we do well in South Africa, perhaps better than anywhere else in the world, is nonlinear sequencing. Things that have an impact on public health should always be sequenced first,” he says. And he’s optimistic that the momentum built up around sequencing will last well beyond the pandemic.
Correction: An earlier version of this article misstated Jeremy Kamil’s affiliation and title.