Which came first: the data or the decision? How AI/ML helps the USGS deliver decision-agnostic water data that informs every link in the decision quality chain

April 17, 2024 1:25pm - April 17, 2024 1:45pm

Speaker: Katrina Alger (US Geological Survey)


The USGS has long been synonymous with delivering accurate and reliable observed water data through its extensive national networks of stream gages and hydrologic experts. The bureau has no management authority and instead works with a wide variety of federal, state, and local partners to provide objective scientific research in the service of action. As a result, much of the data delivered by the Water Enterprise must be decision-agnostic, and yet the expectation is that its primary use will be in decision-making at multiple scales.

Congressional directives from the 2009 SECURE Water Act, combined with methodological and computational advancements, have driven USGS to expand water data delivery beyond gage measurements by providing modeled assessments of water availability for human and ecological needs, now and into the future. While some models are process-based, an increasing number are data-driven and use machine learning (ML) techniques to predict and forecast a variety of metrics related to water quality, quantity, and use. Compared to point observations, models have the advantage of providing spatially continuous data by interpolating between observations, and they can be used to understand trends and forecast future conditions. However, models constructed outside of a specific decision context can be challenging to use and interpret because embedded assumptions or limitations of training datasets are not always made explicit to decision-makers. Reconciling these issues with the tenets of decision quality (DQ) is challenging but imperative as ML methodology advances and becomes increasingly common in the production and delivery of data.

This talk will be both informational and aspirational: it will provide some examples of how the USGS is using ML models to advance understanding of the complete water cycle, describe how DQ intersects with internal efforts to establish best practices for ML model development, and offer some perspectives on how we can deliver data that empowers decision-makers by informing every link of the DQ chain.