Auditing Equity in Large Language Models: Insights from Dialogue and Image Classification Tasks

UW-Madison

LLM

Trustworthy AI

Science communication

Computer vision

NLP

Presenter

Kaiping Chen

Date

February 12, 2025

In this talk from LSC’s 2025 Science Communication Colloquium, Associate Professor Kaiping Chen presents findings from two studies examining how user identity influences large language model responses.

The first study recruited approximately 3,000 participants to have real-time dialogues with GPT-3 about climate change and Black Lives Matter. The research found that users holding minority opinions (e.g., climate skeptics) and those with lower education levels reported significantly worse user experiences than their counterparts, yet these same groups showed greater positive attitude change after the dialogue. The study also revealed that GPT-3 used more negative sentiment and was more likely to cite external evidence when conversing with opinion-minority users.

The second study examined GPT-4’s performance on image classification tasks (gender detection and emotion classification) using varied user personas. Notably, when users identified as transgender or non-binary in their prompts, GPT-4 refused to perform the task 30–40% of the time, compared to only 5% when users identified as male or female. The study also found that GPT-4 associated happiness more frequently with female-classified images and neutrality with male-classified images.

Chen proposes a framework for evaluating equity in dialogue systems based on three components: diversity in who audits these systems, comparability in user experience and learning across subpopulations, and comparability in deliberation style toward different groups.

Bio: Dr. Kaiping Chen is an Associate Professor of Computational Communication in the Department of Life Sciences Communication at UW-Madison. She received her PhD in Communication from Stanford University. Her research uses data science and machine learning methods to examine how digital media and technologies affect public discourse on science and social issues, with a focus on empowering vulnerable populations to engage in deliberation on complex policy topics.
