With ocean and freshwater ecosystems facing stresses like climate change, species declines, introduced species, pollution, and expanding resource extraction, there is a pressing need for ecological forecasting. Fields like harmful algal bloom (HAB) forecasting and real-time endangered marine mammal prediction are increasingly moving toward machine learning methods, which extract the quantitative relationships between predictors and response variables in a training dataset and use these relationships to provide predictions. Alternatively, process-based or simulation-based modelling, which is intended to be built from first principles, uses differential-equation systems to describe physical transport, environment-dependent growth and mortality, and so on, sometimes generating predictions of many ecosystem components at once. Ecological predictions need to be 1) accurate and feasible in the short term given the constraints of usually incomplete data, 2) robust in the face of emerging, novel combinations of environmental conditions under climate change, and 3) accountable to the stakeholders influenced by the predictions. Unfortunately, these requirements often pull in opposite directions, and tradeoffs are reflected in the choice of algorithm. Machine learning techniques are usually easier to implement than process models and often generate predictions that are much more accurate, but may fail in a shifting environment given their reliance on a constrained set of relationships. The problem of future-proofing (basing predictions on fundamental and enduring relationships, not circumstantial and ephemeral ones) encourages us to continue to explore and refine process-based models, although in practice the robustness and accuracy of these models are limited by missing processes and the difficulties of parameter tuning. There are also often tradeoffs between model interpretability and predictive skill.
In this session, we invite studies that are exploring novel approaches to combining machine learning and process-based methods in ecosystem and event prediction in ocean and freshwater systems, or finding ways to make machine learning approaches more biologically and oceanographically interpretable. Topics may include HAB risk, species range shifts, hypoxia events, heatwaves and extreme phenomena, or biogeochemical interactions.
Lead Organizer: Neil Banas, University of Strathclyde (neil.banas@strath.ac.uk)
Co-organizers:
Johnathan Evanilla, Bigelow Laboratory for Ocean Sciences (jevanilla@bigelow.org)
Bingzhang Chen, University of Strathclyde (bingzhang.chen@strath.ac.uk)
Clarissa Anderson, Scripps Institution of Oceanography / SCCOOS (cra002@ucsd.edu)
Rafael Marcé, Catalan Institute for Water Research (ICRA) (rmarce@icra.cat)
Presentations
03:00 PM
Inferring the Mechanisms and Trade-offs That Govern Ecological Dynamics from Time Series (6371)
Primary Presenter: Mridul Thomas, University of Geneva (mridul.thomas@unige.ch)
All ecophysiological processes depend strongly on the environment. This poses a major challenge to our attempts to model and forecast dynamics because (1) we need to incorporate the environmental dependence in process-based models, (2) the environment is multi-dimensional, and (3) the experiments needed to parameterise this multi-dimensional dependence are too large and complex. How do we solve this problem? I propose that time series datasets and flexible modelling approaches (including machine learning) can provide imperfect but usable parameterisations. I demonstrate using a 20-year plankton time series from Lake Constance that we can use interpretable machine learning methods to obtain realistic estimates of how growth varies across multiple dimensions: temperature, light, nutrients, predators. The 4-dimensional ‘growth response surfaces’ are consistent with lab experiments, and can be used as input in process-based models. Furthermore, they can be used to derive important ecological inferences. Comparing how all species perform across environmental parameter space, I find no evidence for any of the commonly hypothesized trade-offs believed to govern ecological dynamics, including gleaner vs. opportunist, growth rate vs. competitive ability, or nutrient competitive ability vs. light competitive ability. Instead, I find evidence for a more complex and unexplored co-existence mechanism. Trade-offs appear to be multidimensional, suggesting that ignoring any of these environmental dimensions will lead to incorrect inferences and forecasts of dynamics and biodiversity.
03:15 PM
Model Enabled Machine Learning: Predicting Non-linear dynamics with Theory and Data (4854)
Primary Presenter: Emerson Arehart, University of Pennsylvania (eejjaa@gmail.com)
If machine learning had been around a hundred years ago, would the Lotka-Volterra equations have been created to describe predator-prey interactions? Or would scientists have relied on black-box regression models to make their predictions and missed the simple rules underlying nature? Recent advances in computer science have made it possible for machines to learn the processes underpinning the complex dynamics that we observe in ecosystems. The “automated discovery” of the laws of nature through machine learning is an exciting and new area of growth in ecology and environmental science, but there are still many lessons to be learned. Here, I will discuss how recent advances in machine learning, including techniques such as symbolic regression, neural ordinary differential equations, and hybrid modeling, can be used to learn about and predict non-linear dynamics in ocean and lake ecosystems. Particular focus will be placed on predicting freshwater harmful algal blooms, to help inform water quality stakeholders tasked with minimizing their impacts, and on predicting climate-change-driven regime shifts in coral ecosystems.
03:30 PM
Environmental and ecological drivers of harmful algal blooms revealed by automated underwater microscopy (7290)
Primary Presenter: Kasia Kenitz, UC San Diego, Scripps Institution of Oceanography (kkenitz@ucsd.edu)
Harmful algal blooms (HABs) have increased in severity and extent in many parts of the world and pose serious threats to local aquaculture, fisheries, and public health. In many cases, the mechanisms triggering and regulating HAB events remain poorly understood. Using underwater microscopy and a Residual Neural Network (ResNet-18) to taxonomically classify imaged organisms, we developed a daily abundance record of four potentially harmful algae (Akashiwo sanguinea, Chattonella spp., Dinophysis spp., and Lingulodinium polyedra) and major grazer groups (ciliates, copepod nauplii, and copepods) from August 2017 to November 2020 at Scripps Pier, a coastal location in the Southern California Bight. Random Forest algorithms were used to identify the combination of environmental and ecological variables that produced the most accurate abundance predictions for each taxon. We developed models with high prediction accuracy for A. sanguinea, Chattonella spp., and L. polyedra, whereas models for Dinophysis spp. showed lower prediction accuracy. Offshore nutricline depth and indices of climate variability (including ENSO, PDO, and NPGO), which influence regional-scale ocean circulation patterns and environmental conditions, were key predictor variables for these HAB taxa. Ciliate abundance was an important predictor of Chattonella and Dinophysis spp., but not of A. sanguinea and L. polyedra. Our findings indicate that combining regional and local environmental factors with microzooplankton population dynamics can improve real-time HAB abundance forecasts.
03:45 PM
Comparing Process-Based and Machine Learning Ecological Prediction of Toxic Algae in Coastal Maine (6932)
Primary Presenter: Johnathan Evanilla, Bigelow Laboratory for Ocean Sciences (jevanilla@bigelow.org)
The annual occurrence of paralytic shellfish poison (PSP) along the coast of Maine poses a challenge for fishery managers, who implement regional harvesting closures to protect human health. Shellfish growers and harvesters must likewise make decisions about operating their businesses in the face of these closures. Two process-based predictive models exist that deliver seasonal and weekly Alexandrium catenella bloom potential forecasts. A more recently developed machine learning model predicts the probabilistic risk of PSP accumulation in shellfish at a site-specific, weekly timescale. The latter model was developed with shellfish industry members and managers to produce the most usable forecast possible. Through two seasons of delivering predictions in an experimental mode, the forecast has achieved high accuracy and received positive feedback from its users. Both the process-based and machine learning models provide important but different insights for their users. While the process-based models capture a suite of environmental conditions that may lead to blooms of A. catenella, high cell concentrations cannot always predict spikes in toxicity. On the other hand, the machine learning model excels at predicting toxicity at a finer (weekly) timescale, but loses skill at lead times longer than two weeks. Combining the two model types is being explored and will be discussed.
04:00 PM
Using machine learning (ML) techniques to predict the occurrence of diarrhetic shellfish poisoning in Irish-produced mussels (5734)
Primary Presenter: Xiyao Wang, University College Dublin (wang.xiyao@ucdconnect.ie)
Diarrhetic shellfish poisoning toxins (DSTs) are biotoxins produced by several species of the dinoflagellate genus Dinophysis. Their accumulation in shellfish, especially in mussels, may pose significant public health issues. To protect consumers from diarrhetic shellfish poisoning (DSP), when mussels with DST levels above regulatory limits are detected, production areas are typically closed temporarily and harvested stock needs to be discarded, causing a financial burden on the industry. Predicting the occurrence of DST contamination in mussels will not only help with harvest timing management but also improve public health safety. Previously, a Bayesian network (BN) model was developed that can provide short-term predictions of DSP toxin variation in mussels from Bantry Bay, Ireland (Wang et al., 2022). The model's inputs were weekly plankton density in seawater, DSP toxin concentration in mussels from ten production sites in Bantry Bay, and sea surface temperature. In this study, environmental factors (including wave period, sea temperature, and air temperature) were added to the BN model to improve prediction accuracy. The model was re-trained with data from 2014 to 2018 and validated with data from 2019. Validation showed that the updated model performed better than the previous BN model, with accuracy higher than 90%. For the class of DSP concentrations above the regulatory limits, prediction accuracy improved from 73% to 87%.
04:15 PM
Modeling the vertical distributions of Microcystis aeruginosa: from data to theory and back again (4774)
Primary Presenter: Jackie Opfer, Augustana College (jackieopfer@augustana.edu)
Predicting vertical distributions of the harmful algae Microcystis aeruginosa is an important and challenging task. There are complicating factors of various origin (biological and physical) working across time scales (sub-daily to seasonal) and length scales (micrometers to meters). To elucidate the drivers of Microcystis vertical distributions, research has been conducted in three parts: (i) an exploratory field study of Microcystis vertical distributions, (ii) a theoretical model explaining previous observations, and (iii) a targeted field study to calibrate theoretical model parameters. The exploratory field study utilized a long-duration, high-frequency research station in a stratified and eutrophic lake. Using a combination of dimensional analyses and machine learning techniques, results indicated that subsurface Microcystis concentration peak magnitude and location were significantly mediated by lake thermal structure. A novel theoretical model to explain these observations was subsequently derived, coupling lake hydrodynamics with Microcystis motility and colony dynamics in a one-dimensional advection-dispersion-aggregation model. Results demonstrated vertical transport is highly dependent on Microcystis colony size, which is in turn dependent on wind-induced mixing. To calibrate the theoretical model, a field study is underway to relate wind intensity to Microcystis vertical transport and colony formation rates. The iterative nature of this work—from data to theory and back again—will also be discussed in a broader context of ecological modeling.
SS121A Combining Machine Learning and Process-Based Models in Ecological Prediction
Time: 3:00 PM
Date: 6/6/2023
Room: Sala Menorca A