We are a research community producing scientifically grounded research and robust deployment infrastructure for broader impact evaluations.
This project aims to investigate how to systematically characterize the complexity and behavior of AI benchmarks over time, with the overarching goal of informing more robust benchmark design. The ...
This project addresses the need for a structured and systematic approach to documenting AI model evaluations through the creation of "evaluation cards," focusing specifically on technical base syst...
The Eleuther Harness Tutorials project is designed to lower the barrier to entry for using the LM Evaluation Harness, making it easier for researchers and practitioners to onboard, evaluate, and co...
This past week, Anthropic and OpenAI drew attention with the release of their latest AI models, Claude Opus 4.1 and G...
Rigorous evaluations provide decision makers with detailed information about the capabilities, risks, and opportuniti...
Researchers, practitioners, and students are all welcome to contribute to our mission.
Send us an email to learn more about getting involved with our community and working groups.
Email Us