WattBot 2025: Estimating AI Emissions with RAG

Categories: Projects, ML Marathon, MLM25, RAG, Retrieval, LLM, NLP, Sustainability, Energy, GenAI

Author: Chris Endemann

Published: September 9, 2025

WattBot was an “Active” challenge in the 2025 Machine Learning Marathon (MLM25). Teams built retrieval-augmented generation (RAG) systems to extract credible, citation-backed emissions and cost estimates for AI workloads from a corpus of 35+ peer-reviewed papers and 300+ curated Q&A pairs. Systems were expected to return citation-grounded answers or explicitly abstain when evidence was missing, with the aim of promoting transparency and reproducibility in sustainability reporting.

Challenge design

  • Task: Given a natural-language question about AI energy use, water consumption, or carbon emissions, retrieve relevant passages from the provided corpus and generate a citation-backed answer.
  • Evaluation: Answers were scored on factual accuracy, proper citation, and appropriate abstention when evidence was insufficient.
  • Corpus: 35+ academic papers covering AI sustainability, energy benchmarking, and environmental impact reporting.
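
The retrieve, cite, or abstain flow described above can be sketched roughly as follows. This is a minimal illustration and not the challenge's reference implementation or any team's solution: the `Passage` type, the token-overlap scoring, the `min_score` threshold, and the toy corpus are all assumptions made for the example, and the generation step is stubbed by returning the best-matching passage instead of calling an LLM.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    doc_id: str  # hypothetical citation key, e.g. a paper identifier
    text: str


# Placeholder passages only; these are NOT taken from the WattBot corpus.
TOY_CORPUS = [
    Passage("paper_a", "Placeholder passage about GPU energy use during model training."),
    Passage("paper_b", "Placeholder passage about data center water consumption for cooling."),
]


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, corpus: list, k: int = 2) -> list:
    """Return the top-k (score, passage) pairs, best first.

    The score is the fraction of question tokens found in the passage,
    a stand-in for the embedding or BM25 retrieval a real system would use.
    """
    q_tokens = tokenize(question)
    scored = []
    for passage in corpus:
        overlap = len(q_tokens & tokenize(passage.text))
        score = overlap / len(q_tokens) if q_tokens else 0.0
        scored.append((score, passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def answer(question: str, corpus: list, min_score: float = 0.5) -> dict:
    """Answer with citations, or abstain when no passage clears min_score."""
    hits = [(s, p) for s, p in retrieve(question, corpus) if s >= min_score]
    if not hits:
        # Explicit abstention: no sufficiently relevant evidence was retrieved.
        return {"answer": None, "citations": [], "abstained": True}
    best = hits[0][1]
    return {
        "answer": best.text,  # a real pipeline would pass the hits to an LLM here
        "citations": [p.doc_id for _, p in hits],
        "abstained": False,
    }
```

Calling `answer("energy use during model training", TOY_CORPUS)` returns the `paper_a` passage with its citation key, while an off-topic question such as `"What is the price of coffee?"` produces an abstention. Keeping the abstention decision separate from generation makes the threshold easy to tune against held-out Q&A pairs.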

Winning approach

The winning solution, by KohakuBlueleaf, used a RAG pipeline that was later replicated and deployed both on AWS Bedrock and locally with open-source Hugging Face models. See the follow-up talk below for deployment details.

Questions

If you have any lingering questions about this project, please post them to the Nexus Q&A on GitHub. We will improve the materials on this website as additional questions come in.