Large Language Models (LLMs) show remarkable abilities but can perpetuate societal biases. This interactive visualisation explores how LLM responses change based on the demographic identities assigned to personas (a Subject, 'Alex', and a Responder, 'Blake') in dual-persona social scenarios, especially under power imbalances.
Explore the heatmap below to see how demographic sensitivity (Cosine Distance) or response quality (Win Rate) varies across demographic combinations, the presence or absence of power disparity, and the model tested. Use the dropdowns to select the model, metric, and scenario.
Hover and click on heatmap cells for more details.
Based on the paper: "Unmasking Implicit Bias..." (arXiv:2503.01532) | Download Research Data
This tool explores the potential implicit biases examined in the paper. LLMs responded in simulated social scenarios as a Responder ('Blake') to a Subject ('Alex'), with the personas assigned different demographic identities (e.g., Race).
The demographic sensitivity metric is Cosine Distance, computed as 1 - cosine_similarity(demog_prompt, non_demog_prompt). Higher values indicate a greater semantic difference from the response to the non-demographic prompt.
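As a concrete illustration, here is a minimal sketch of this metric in Python, assuming a sentence-transformers embedding model; the model name and example strings are illustrative choices, not necessarily those used in the study.

```python
# A minimal sketch of the Cosine Distance metric. "all-MiniLM-L6-v2" is an
# illustrative embedding model, not necessarily the one used in the study.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def cosine_distance(text_a: str, text_b: str) -> float:
    """Return 1 - cosine_similarity between the embeddings of two texts."""
    a, b = model.encode([text_a, text_b])
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical responses: one from a demographic prompt, one from the baseline.
demog_response = "Blake's reply when Alex is assigned a demographic identity"
baseline_response = "Blake's reply to the non-demographic prompt"
print(cosine_distance(demog_response, baseline_response))  # higher = more divergent
```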
These are general trends observed across the models and scenarios studied:
Note: These are general findings based on the study's overall dataset. Explore the heatmap and specific interpretations below for model-specific details under different conditions.