FARR RCN hosts the FAIR in ML, AI Readiness, & Reproducibility (FARR) Workshop
April 28, 2026
Written by Lynne Schreiber and Julie Christopher
To share advances, best practices, and lessons learned across the FAIR in Machine Learning, AI Readiness, and Reproducibility Research Coordination Network (FARR RCN), researchers from the San Diego Supercomputer Center (SDSC), Scripps Institution of Oceanography (SIO), North Carolina State University (NCSU), and National Center for Supercomputing Applications (NCSA) hosted the FARR Workshop on April 8–9, 2026, in Washington, D.C., at the American Geophysical Union Conference Center. The workshop convened 75 participants from academia, national laboratories, and federal agencies to highlight progress in making data and AI systems more FAIR, AI-ready, and reproducible across domains.

In the “Advancement Across Domains” session, moderated by Karen Stocks, speakers showcased domain-specific advances. Eric Sokol (NEON, Battelle) presented outcomes from efforts to build AI-ready biodiversity data infrastructure, Michela Taufer (University of Tennessee, Knoxville) discussed extending FAIR principles to support reproducibility at inference time in geoscience, and David Elbert (MaRCN, Johns Hopkins University) described event-driven, AI-ready data infrastructures that enable autonomous laboratories.
The “Crosscutting Advances” session, led by Daniel S. Katz, focused on shared challenges and solutions. Wesley Brewer (Oak Ridge National Laboratory) addressed data readiness for scientific AI, Kate Keahey (REPETO, University of Chicago) highlighted practical approaches to reproducibility in computing, and Sean Wilkinson (Oak Ridge National Laboratory) emphasized the importance of machine-actionable metadata for autonomous AI workflows.
A cross-agency panel, moderated by Christine Kirkpatrick, brought together perspectives from five major U.S. science agencies on advancing FARR-related challenges. Panelists, including Steven Crawford (NASA), Susan Gregurick (NIH), Gretchen Greene (NIST), Alejandro Suarez (NSF), and David Rabson (DOE), outlined priorities to improve data access, integration, and AI readiness, underscoring the need for coordinated, cross-institutional efforts. The meeting agenda and presentation slides are available.

In conjunction with the workshop, the HDR ML Challenge program, led by Josephine Namayanja (iHARP), Elizabeth Campolongo (IMAGEOMICS), and Philip Harris (A3D3), showcased results from its second FAIR Challenge, which introduced three scientific benchmarks addressing out-of-distribution modeling in neural forecasting, climate prediction using ecological data, and coastal flooding over time. The session featured an overview by the organizers and presentations from winning teams highlighting their approaches, and concluded with an awards ceremony, with awards presented by Amy Walton (NSF).
Facilitated discussion groups advanced FARR community roadmaps by identifying key gaps and priorities across AI readiness, reproducibility, and FAIR practices. Discussions addressed defining and incentivizing AI-ready data; supporting geoscience repositories; tackling reproducibility challenges, including those posed by generative AI; and developing frameworks for FAIR in machine learning. These roadmaps will guide future community-led efforts.
A forward-looking session explored knowledge gaps, unmet technological needs, and lessons from prior funding initiatives. Participants emphasized the need for clearer definitions, particularly around what constitutes “AI ready” data, and examined distinctions between FAIR for ML and FAIR for AI. Concerns about sensitive data, governance, and alignment with principles such as FAIR, CARE, and TRUST highlighted ongoing risks and responsibilities.
Workforce development and AI literacy emerged as central themes, with participants calling for improved training, clearer skill pathways, and education aligned with real-world use cases. Infrastructure discussions emphasized the need for integrated, end-to-end platforms that combine data, tools, models, and benchmarking, alongside stronger multidisciplinary collaboration. While automation and agent-based workflows were seen as promising, participants also raised concerns about oversight, safety, and risk amplification.
AI reproducibility remained a key focus, with attention to evolving definitions, lifecycle evaluation, and the implications of increasingly autonomous systems.
Finally, participants reflected on structural and cultural challenges, including funding mechanisms, collaborative proposal development, and balancing rapid AI adoption with rigor and trust. Overall, the workshop highlighted both optimism about AI’s potential and the need for more coordinated approaches to standards, training, infrastructure, and governance. Input from the workshop will be included in upcoming community roadmaps.
Supported by NSF Awards #2226453 and #2612718.