Our ever-expanding capacity for genome sequencing has allowed for the discovery of millions of novel viruses from metagenomic data. Yet the most important information about those viruses is lost during sequencing: which microbial populations do they infect? Without this information, our ability to describe and understand the impact of viruses on their microbial communities has been severely limited. To close this gap, we developed a model that can resolve virus-host infection networks in silico. We first collected and digitized from the published literature a total of 8,849 lab-verified virus-host interactions (infection or non-infection) at the species level, the most comprehensive host range dataset compiled to date. These data were used to assess the strength of previously described coevolutionary signals between viruses and their hosts. We demonstrated that, relative to their host(s), viruses tend to ameliorate their k-mer profiles while remaining AT-rich. We then used these host range data to train a machine learning model that can predict the complete network of virus-host interactions, in contrast with recently published models that predict the most likely taxa a virus can infect. Furthermore, our machine learning model has an accuracy of 87% in predicting infection and non-infection at the species level, surpassing both the accuracy and sensitivity of existing prediction models. With this model, we can start predicting virus-host infection networks, characterizing the properties of those networks, and describing how they correlate with environmental factors.
Primary Presenter: Gaylord Bastien, University of Michigan (gbastien@umich.edu)
Authors:
Gaylord Bastien, University of Michigan (gbastien@umich.edu)
Anthony Wing, University of Michigan (ajwing@umich.edu)
Rachel Cable, University of Michigan (cabler@umich.edu)
Luis Zaman, University of Michigan (zamanlh@umich.edu)
Melissa Duhaime, University of Michigan (duhaimem@umich.edu)
RECOVERING THE MISSING LINKS: THE MODELING OF VIRUS-HOST INFECTION NETWORKS IN SILICO
Category
Scientific Sessions > CS026 Microbial ecology and physiology
Description
Time: 06:15 PM
Date: 5/6/2023
Room: Sala Portixol 1