How BiasFlip Works

The Philosophy Behind the Flip

BiasFlip does not just swap words. It swaps social prejudices — and that distinction is everything.

The core idea

Bias is rarely about the words themselves. It lives in the assumption underneath — the stereotype that makes a sentence feel natural when aimed at one group and jarring when aimed at another. BiasFlip is designed to make that gap visible.

When you submit a piece of text, our AI does not simply replace "men" with "women" or "Black" with "White". Instead, it asks: what is the underlying social prejudice here? Then it finds the culturally equivalent prejudice for the contrasting group — and rewrites the text around that mirror stereotype.

If the two versions feel equally acceptable (or equally objectionable) to you, your reaction is to the content itself, not to the group it targets. If one version suddenly feels more offensive or unfair than the other, you have just located a bias.

Why not just swap the words?

A literal word swap produces a distorted mirror, not a true one. Consider: "Men are useless around the house." A word swap gives you "Women are useless around the house", but that is not the equivalent stereotype. The culturally embedded prejudice about women in domestic settings is not incompetence; it is being controlling and overbearing.

Moving the same insult onto a different group just creates a new insult. Swapping in the equivalent stereotype reveals whether the original sentiment was rooted in genuine observation or in demographic prejudice.
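To make the contrast concrete, here is a minimal Python sketch of the literal word swap that BiasFlip deliberately avoids. The function name `literal_swap` and the term map are hypothetical, chosen only for illustration; they are not part of BiasFlip.

```python
import re


def literal_swap(text: str, term_map: dict[str, str]) -> str:
    """Naively replace each group term with its counterpart (hypothetical helper).

    This is the approach BiasFlip avoids: the group changes, the stereotype stays.
    """
    # Match whole words only, case-insensitively, trying longer terms first.
    terms = sorted(term_map, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE
    )

    def replace(match: re.Match) -> str:
        word = match.group(0)
        swapped = term_map[word.lower()]
        # Preserve a leading capital so sentence starts still read correctly.
        return swapped.capitalize() if word[0].isupper() else swapped

    return pattern.sub(replace, text)


swapped = literal_swap(
    "Men are useless around the house.", {"men": "women", "women": "men"}
)
print(swapped)  # Women are useless around the house.
```

The output carries the incompetence stereotype over unchanged, which is exactly the distorted mirror described above. Finding the mirror prejudice requires interpreting the text, not substituting strings.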

Seeing it in action

Here are four examples, one for each bias category: gender, race/ethnicity, age, and nationality. Notice that the swapped version is never the same insult with a different name; it is the mirror prejudice.

Gender

Original

"Men are useless around the house. They can't find anything, and every time there's conflict they withdraw to their man cave."

After BiasFlip

"Women are controlling around the house. They micro-manage everything, and every time there's conflict they escalate it into a drama."

Incompetence → overbearing control. The mirror gender prejudice in domestic settings.

Race / Ethnicity

Original

"Black men are dangerous and threatening in public spaces."

After BiasFlip

"White men are entitled and untouchable in public spaces."

Feared/dangerous → privileged/unaccountable. The mirror racial power dynamic.

Age

Original

"Millennials are lazy and entitled — they expect everything handed to them."

After BiasFlip

"Boomers are rigid and out-of-touch — they refuse to adapt to anything new."

Entitlement → rigidity. The generational mirror prejudice.

Nationality

Original

"The French are arrogant and dismissive of other cultures."

After BiasFlip

"Americans are loud, brash, and culturally oblivious abroad."

Arrogance → cultural obliviousness. The equivalent national caricature.

What the bias scores mean

After the flip, BiasFlip scores both versions across three dimensions:

Emotional Tone: How charged or loaded is the language? Neutral reporting scores low; inflammatory language scores high.

Aggression: Does the text express hostility, contempt, or an intent to demean? Mild criticism scores low; direct attacks score high.

Offensiveness: How likely is this text to cause genuine offence to the demographic it describes? Context matters — satire is treated differently from sincere assertion.

The Bias Delta, the gap between the overall scores of the two versions, is the most revealing number. A large delta means the text scores very differently depending on which group it targets, which suggests the original content relies on demographic stereotyping. A small delta suggests any bias lies in the content itself, not in the group it targets.
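As a rough illustration of the arithmetic, the sketch below computes a Bias Delta in Python. The page does not specify the score scale or how the overall score is derived, so the 0-100 scale, the unweighted mean, the absolute difference, and the names `BiasScores` and `bias_delta` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiasScores:
    """Scores for one version of a text (assumed 0-100 scale)."""

    emotional_tone: float
    aggression: float
    offensiveness: float

    def overall(self) -> float:
        # Assumption: the overall score is the unweighted mean of the
        # three dimensions. BiasFlip may weight them differently.
        return (self.emotional_tone + self.aggression + self.offensiveness) / 3


def bias_delta(original: BiasScores, flipped: BiasScores) -> float:
    """Absolute gap between the two overall scores (assumed definition)."""
    return abs(original.overall() - flipped.overall())


# Made-up numbers, purely for illustration.
original = BiasScores(emotional_tone=60, aggression=30, offensiveness=30)
flipped = BiasScores(emotional_tone=90, aggression=60, offensiveness=60)
print(bias_delta(original, flipped))  # 30.0
```

With these invented scores the overall values are 40 and 70, giving a delta of 30. What counts as a "large" delta is not defined in the text above, so no threshold is assumed here.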

What BiasFlip is not

BiasFlip is not a fact-checker, a content moderator, or a moral arbiter. It does not tell you whether a piece of content is right or wrong. It simply holds up a mirror — and lets you decide what you see.

The AI is imperfect. It may occasionally misread the dominant stereotype in a piece of text, or produce a swap that feels off. When that happens, you can use the "Challenge this flip" button on any result page to flag it. Your feedback is reviewed and used to improve the system over time.

Ready to try it?

Paste any text — a joke, a news headline, a job description, a speech excerpt, a social media post — and see what the mirror reveals. The results are often surprising, occasionally uncomfortable, and always illuminating.