Unmasking Implicit Bias: Evaluating Persona-Prompted LLM Responses in Power-Disparate Social Scenarios

Large Language Models (LLMs) show remarkable abilities but can perpetuate societal biases. This interactive visualisation explores how LLM responses change with the demographic identities assigned to two personas (a Subject, 'Alex', and a Responder, 'Blake') across dual-persona social scenarios, especially under power imbalances.

Explore the heatmap below to see how demographic sensitivity (Cosine Distance) or response quality (Win Rate) varies across demographic combinations, the presence or absence of power disparity, and the model tested. Use the dropdowns to select the model, metric, and scenario.

Hover and click on heatmap cells for more details.

Based on the paper: "Unmasking Implicit Bias..." (arXiv:2503.01532) | Download Research Data

About This Visualisation & Study

This tool explores potential implicit biases reported in the paper. In simulated social scenarios, LLMs responded as a Responder ('Blake') to a Subject ('Alex'), with each persona assigned a demographic identity along a given axis (e.g., race).
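
As a rough sketch of this setup, the snippet below shows how such a dual-persona prompt might be assembled. The template wording, scenario text, and demographic labels are illustrative placeholders, not the paper's exact prompt.

```python
# Hypothetical sketch of dual-persona prompt assembly; the wording is
# illustrative and not the paper's exact template.
SCENARIO = "Alex asks Blake for an extension on a project deadline."

def build_prompt(subject_demo: str, responder_demo: str, scenario: str) -> str:
    # Assign demographic identities to the Subject (Alex) and the
    # Responder (Blake), then ask the model to reply in Blake's voice.
    return (
        f"Alex is {subject_demo}. Blake is {responder_demo}.\n"
        f"Scenario: {scenario}\n"
        "Write Blake's response to Alex."
    )

print(build_prompt("an elderly man", "a young woman", SCENARIO))
```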

Metrics Explored: demographic sensitivity, measured as the Cosine Distance between responses, and response quality, measured as Win Rate.
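
A minimal sketch of how these two metrics could be computed, assuming hypothetical response embeddings and judge verdicts (the embedding model and judging procedure are placeholders, not the paper's exact pipeline):

```python
import numpy as np

def cosine_distance(u, v):
    """1 - cosine similarity: higher means the two responses diverge more."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical embeddings of Blake's responses under two demographic
# variants of the same scenario (any sentence-embedding model could supply these).
resp_a = [0.12, -0.40, 0.88, 0.05]
resp_b = [0.10, -0.35, 0.90, 0.01]
print(f"cosine distance: {cosine_distance(resp_a, resp_b):.4f}")

# Win Rate: the share of pairwise judgements a response variant wins
# (1 = win, 0 = loss); these verdicts are made up for illustration.
judgements = [1, 0, 1, 1, 0]
print(f"win rate: {np.mean(judgements):.2f}")
```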

How to Use:

  1. Use dropdowns for Model, Metric, and Power Disparity.
  2. Optionally explore the Scenario Explorer, Demographic Axes and Identities, and Heatmap Interpretations sections.
  3. The heatmap shows the average metric for Subject (Y-axis) vs Responder (X-axis) demographics; a sketch of this aggregation follows the list.
  4. Hover over cells for values and counts; click a cell for a pop-up with detailed examples.
  5. The table above the controls dynamically summarises heatmap extremes.
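
The sketch below illustrates, under assumed column names (subject_demo, responder_demo, metric_value), how the heatmap cells, the extremes table, and the overall statistics could be derived from a tidy results table:

```python
import pandas as pd

# Hypothetical tidy results: one row per generated response.
df = pd.DataFrame({
    "subject_demo":   ["Asian", "Asian", "Black", "Black", "White", "White"],
    "responder_demo": ["White", "Black", "White", "Black", "White", "Black"],
    "metric_value":   [0.031, 0.024, 0.045, 0.019, 0.028, 0.037],
})

# Average metric per (Subject, Responder) pairing: Subject on Y, Responder on X.
heatmap = df.pivot_table(index="subject_demo", columns="responder_demo",
                         values="metric_value", aggfunc="mean")
print(heatmap)

# The extremes summarised in the table, plus the overall statistics.
print("max cell:", heatmap.stack().idxmax(), heatmap.values.max())
print("min cell:", heatmap.stack().idxmin(), heatmap.values.min())
print(f"overall mean: {heatmap.values.mean():.4f} | std dev: {heatmap.values.std():.4f}")
```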

Key Findings From The Paper

These findings are general trends observed across the models and scenarios studied.

Note: These are general findings based on the study's overall dataset. Explore the heatmap and specific interpretations below for model-specific details under different conditions.

Scenario Explorer

Browse the dual-persona social scenarios used in the study.

Demographic Axes and Identities

This section lists the demographic axes (e.g., race) and the identities assigned along each axis.

Heatmap Interpretations

Use the controls below to view interpretations for the selected model, metric, and power-disparity condition.
