2025 Workshop on Evaluating AI in Practice
Bridging Statistical Rigor, Sociotechnical Insights, and Ethical Boundaries
Key Details
Date: December 8, 2025
Location: UCSD, San Diego, California
Hosted by: EvalEval, UK AISI, and UC San Diego (UCSD)
About the Workshop
We’re excited to announce the upcoming Evaluating AI in Practice workshop, happening on December 8, 2025, in San Diego.
This full-day event will explore how to evaluate AI systems responsibly and effectively by bridging three essential dimensions:
Statistical Methods – Techniques to quantify uncertainty, aggregate evaluation data, and estimate latent model capabilities while ensuring reliability and validity.
Sociotechnical Perspectives – Understanding task selection, societal impacts, and the implications of evaluation results for downstream applications.
Evaluating Evaluation Results – Translating evaluation outcomes into meaningful insights about model capabilities, risks, and potential downstream impacts.
The workshop will feature a keynote by Stella Biderman, followed by interactive sessions designed to connect technical methods with broader ethical and social considerations in AI evaluation.
Due to limited space, attendance is by RSVP only, and registrations will be confirmed by the organizers based on availability. The exact location and confirmation will be provided two weeks before the workshop. RSVP at: https://luma.com/ngj395u2
Further details, including the full program and additional speakers, will be announced soon.
Note: This satellite event is not officially affiliated with NeurIPS.
Call for Extended Abstracts
We are pleased to invite you to participate in the 2025 Workshop on Evaluating AI in Practice: Bridging Statistical Rigor, Sociotechnical Insights, and Ethical Boundaries, held on the sidelines of NeurIPS and co-hosted by EvalEval, UK AISI, and UC San Diego (UCSD).
Submissions should highlight ongoing or proposed research related to the workshop theme and topics. This is an excellent opportunity for researchers, especially emerging scholars, to get involved, share early-stage work, and build new connections. Extended abstracts are invited from students and researchers across disciplines, including decision science, cognitive science, computer science, machine learning, and related fields. Topics include, but are not limited to:
- Statistical Methods – Techniques to quantify uncertainty, aggregate evaluation data, and estimate latent model capabilities while ensuring reliability and validity.
- Sociotechnical Perspectives – Understanding task selection, societal impacts, and the implications of evaluation results for downstream applications.
- AI Redlines and Ethical Boundaries – Identifying and assessing AI system risks to prevent harmful behavior, ensure safety, and align with global accountability frameworks.
- Evaluating Evaluation Results – Translating evaluation outcomes into meaningful insights about model capabilities, risks, and potential downstream impacts.
- Benchmarking and standardization of evaluation protocols
- Analysis of existing evaluation methods or new proposals
Abstracts must be submitted by November 20, 2025 (AoE). They should be at most 500 words and include the title, authors, and affiliations. All submissions will be evaluated on their technical content and relevance to the workshop. Selected abstracts will be presented as posters during an interactive session; the primary author will receive free registration and an invitation to attend the workshop in person. Submit your abstract here.
Abstract Submission Deadline: November 20, 2025 (AoE)
Notification Date: November 25, 2025
Workshop Date: December 8, 2025
Location: University of California, San Diego (UCSD)