The vast majority of microbial sequences from the marine environment cannot be functionally annotated with standard bioinformatic approaches. This “microbial dark matter” is typically ignored in downstream analysis, introducing a severe bias at the very first preprocessing step. Deep learning approaches, in contrast, can learn useful representations of every sequence from the sequence context itself, without reference to external databases and without the need for metagenomic assembly. We are using deep learning approaches to link all microbial information from environmental metagenomes to the functional potential and ecosystem function of microbiomes, and are developing strategies to overcome the high-dimensionality, low-sample-size problem inherent to biology. We will present LookingGlass, a deep learning foundation model that embeds functional information for read-length prokaryotic DNA sequences. LookingGlass predicts functional annotation for all reads with 82% accuracy to the fourth level of the EC number. Applying LookingGlass to account for the full functional diversity of marine metagenomes, we observe geospatial patterns in functional diversity across latitude and depth that differ fundamentally from those obtained with standard bioinformatic approaches, suggesting that accounting for “microbial dark matter” is fundamental to our ability to link microbiomes to ecosystem function. Going forward, we are using LookingGlass as a stepping stone to model the relationships between microbial community members and to predict the complex community-level phenotypes that drive the marine carbon cycle.
Primary Presenter: Adrienne Hoarfrost, University of Georgia (adrienne.l.hoarfrost@gmail.com)
Authors:
DEEP LEARNING GATEWAYS TO ILLUMINATING THE FUNCTIONAL POTENTIAL AND ECOSYSTEM FUNCTION OF ‘MICROBIAL DARK MATTER’
Category
Scientific Sessions > SS063 Linking Ocean Microbiomes and Ecosystem Functions
Description
Time: 06:30 PM
Date: 7/6/2023
Room: Mezzanine