AI Bias Explained
AI doesn't invent bias. It inherits it from the data it was trained on — and then scales it. Here's how bias enters AI systems, real cases where it caused harm, how it's measured, and what "fixing it" actually requires.
00 — What AI bias actually means
AI bias occurs when a model produces systematically skewed outputs that disadvantage certain groups — usually reflecting patterns in training data that encode historical inequity. The AI didn't decide to discriminate. The data it learned from reflects societies that did.
This matters because AI systems are increasingly used to make or inform consequential decisions: who gets a job interview, who qualifies for a loan, who receives medical care, who gets flagged by law enforcement. When those systems carry bias, the consequences aren't statistical artefacts — they're real harms to real people. And because AI operates at scale, a biased algorithm can discriminate against thousands of people before anyone notices.
The "garbage in, garbage out" framing undersells the problem. It's not just that biased data produces biased models. It's that AI models can amplify biases beyond what was in the original data, making them worse over time — especially in feedback loop situations where model outputs shape the data that trains the next model.
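The feedback-loop amplification described above can be sketched as a toy simulation. Everything here is invented for illustration — two groups with different initial approval rates, and a crude update rule standing in for "retrain on the applicants you approved last round":

```python
def simulate_feedback_loop(rounds=5, rate_a=0.6, rate_b=0.4):
    """Toy feedback loop: each round the system 'retrains' only on the
    applicants it previously approved, so whichever group was approved
    more often becomes more over-represented, widening the gap."""
    history = [(rate_a, rate_b)]
    for _ in range(rounds):
        total = rate_a + rate_b
        # Nudge each group's rate toward its share of last round's
        # approvals -- a crude stand-in for retraining on model outputs.
        rate_a = min(1.0, rate_a * (1 + 0.1 * (rate_a / total - 0.5)))
        rate_b = max(0.0, rate_b * (1 + 0.1 * (rate_b / total - 0.5)))
        history.append((rate_a, rate_b))
    return history

history = simulate_feedback_loop()
start_gap = history[0][0] - history[0][1]
end_gap = history[-1][0] - history[-1][1]
print(f"approval-rate gap: {start_gap:.3f} -> {end_gap:.3f}")
```

The gap between the groups grows every round even though nothing in the loop "decides" to discriminate — the initial disparity alone is enough.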
01 — Four types of AI bias
Bias enters AI systems in different ways at different stages of development. Where it enters determines how it can be addressed.
Historical bias. The training data reflects historical inequities — not because it was chosen badly, but because history was unequal. A model trained on historical hiring decisions will learn that men were hired more often, because they were — but for reasons that have nothing to do with merit.
Example: Amazon's hiring AI trained on 10 years of hiring data in a male-dominated industry learned to downgrade women's resumes.
Representation bias. Some groups are underrepresented in training data. The model performs worse on them because it has seen fewer examples. Facial recognition trained predominantly on white faces has higher error rates for darker-skinned faces.
Example: NIST found false positive rates 10-100x higher for Black and East Asian faces in commercial facial recognition systems.
Measurement bias. The outcome being predicted is itself a flawed proxy for the underlying concept. "Credit risk" modelled on past defaults inherits historical barriers to credit that made certain groups look riskier than they were.
Example: Healthcare cost algorithms used "future cost" as a proxy for "health need" — but Black patients had lower costs because they received less care, not because they were healthier.
Aggregation bias. A model trained on aggregate data may not work well for subgroups with different patterns. A diabetes prediction model trained on general population data may perform worse for specific ethnic groups with different risk profiles.
Example: General medical AI models often have significantly reduced accuracy for minority populations who weren't adequately represented in training cohorts.
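The aggregation problem in particular is easy to see in a few lines. The sketch below is invented for illustration: a majority pattern (y = 2x, 45 samples) and a minority pattern (y = −x, 5 samples) pooled into one least-squares fit:

```python
def fit_slope(xs, ys):
    """Least-squares slope through the origin: sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def mse(xs, ys, slope):
    """Mean squared error of the fitted line on one group."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Majority group follows y = 2x; minority group follows y = -x.
xs_major = [1, 2, 3, 4, 5] * 9          # 45 samples
ys_major = [2 * x for x in xs_major]
xs_minor = [1, 2, 3, 4, 5]              # 5 samples
ys_minor = [-x for x in xs_minor]

# One pooled model serves the majority pattern and misses the minority's.
pooled_slope = fit_slope(xs_major + xs_minor, ys_major + ys_minor)
err_major = mse(xs_major, ys_major, pooled_slope)
err_minor = mse(xs_minor, ys_minor, pooled_slope)
print(f"pooled slope: {pooled_slope:.2f}")       # 1.70
print(f"error on majority: {err_major:.2f}")     # 0.99
print(f"error on minority: {err_minor:.2f}")     # 80.19
```

Aggregate accuracy looks fine, because the majority dominates the average — the failure is invisible unless error is reported per group.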
02 — Documented cases of AI bias causing harm
| System | Bias found | Outcome |
|---|---|---|
| Amazon hiring AI (2018) | Systematically downgraded resumes containing the word "women's" (e.g., "women's chess club") | System scrapped. |
| COMPAS recidivism algorithm | Black defendants roughly twice as likely as white defendants to be falsely flagged as high risk for reoffending | Ongoing use despite documented bias; legal challenges. |
| Healthcare cost algorithm (Optum) | Systematically assigned Black patients lower health-need scores than equally sick white patients | Bias discovered and partially corrected. 17,000 patients were under-served. |
| Facial recognition (multiple) | False positive rates 10-100x higher for Black and East Asian faces vs white faces | Several US cities banned government use. Detroit PD revised its policies after a wrongful arrest. |
| Apple Card (Goldman Sachs) credit algorithm | Women received significantly lower credit limits than men with similar credit profiles | NY DFS investigation concluded without a finding of unlawful discrimination. |
03 — Key questions answered
Is ChatGPT biased?
Yes, in ways that have been documented. Studies have found ChatGPT produces more negative associations for Arab names than English names, generates higher-income professional roles for men and lower-income roles for women in hypothetical scenarios, and produces responses that reflect cultural biases in its predominantly English-language training data. OpenAI acknowledges bias as an ongoing challenge and publishes research on its mitigation efforts. But any model trained on human-generated text will reflect human biases to some degree — the question is whether those biases are measured, disclosed, and actively reduced.
Can AI bias be fixed?
Partially. Bias can be reduced through careful data curation, diverse training sets, fairness constraints during training, post-processing adjustments, and regular auditing. But there are mathematical limits: some fairness criteria are provably incompatible with each other — you can't simultaneously satisfy all commonly used fairness definitions when groups have different base rates. Eliminating bias completely would require eliminating historical inequity from the training data, which would require rewriting history. The practical goal is measurable improvement, transparency about remaining bias, and avoiding deployment in contexts where the bias creates serious harm.
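The incompatibility of fairness criteria can be shown with arithmetic. The numbers below are invented: two groups get a classifier with identical recall (0.8) and identical precision (0.8), but because their base rates differ (50% vs 20% positives), the false positive rates that follow are necessarily unequal — one form of the impossibility result due to Chouldechova (2017):

```python
def false_positive_rate(positives, negatives, recall, precision):
    """Derive the FPR implied by a group's base rate, recall and precision."""
    tp = positives * recall
    fp = tp * (1 - precision) / precision   # from precision = tp / (tp + fp)
    return fp / negatives

# Same recall and precision in both groups; only the base rates differ.
fpr_a = false_positive_rate(positives=50, negatives=50, recall=0.8, precision=0.8)
fpr_b = false_positive_rate(positives=20, negatives=80, recall=0.8, precision=0.8)
print(f"FPR group A: {fpr_a:.2f}")  # 0.20
print(f"FPR group B: {fpr_b:.2f}")  # 0.05
```

Group A's members are four times as likely to be wrongly flagged, even though the classifier is equally "accurate" for both groups by precision and recall. You can equalize the error rates instead, but then precision diverges — there is no setting that satisfies both at once short of a perfect predictor.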
How do you test an AI system for bias?
Several approaches: counterfactual testing (changing protected attributes like race or gender and observing whether outputs change), disparity analysis (comparing error rates or outcome rates across demographic groups), audit studies (sending matched applications through AI hiring systems), and red-teaming (adversarially testing for biased outputs). The EU AI Act requires bias testing for high-risk AI systems. Third-party auditing firms like Parity AI and Fairly AI have emerged to provide independent bias assessment.
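The first two approaches can be sketched in a few lines. Everything below is hypothetical — the model, the field names, and the outcome data are invented; a real audit would run these checks against the deployed system:

```python
def counterfactual_test(model, application, attribute, values):
    """Flip one protected attribute, hold everything else fixed, and
    report the model's score under each value. A large spread suggests
    the model is sensitive to that attribute."""
    return {v: model(dict(application, **{attribute: v})) for v in values}

def disparity_ratio(outcomes_by_group):
    """Selection-rate ratio between the worst- and best-treated groups.
    The 'four-fifths rule' used in US hiring audits flags ratios below 0.8."""
    rates = {g: sum(o) / len(o) for g, o in outcomes_by_group.items()}
    return min(rates.values()) / max(rates.values())

# A deliberately biased toy model, purely for demonstration.
toy_model = lambda app: 0.9 if app["gender"] == "m" else 0.6

print(counterfactual_test(toy_model, {"gender": "m", "income": 40_000},
                          "gender", ["m", "f"]))   # {'m': 0.9, 'f': 0.6}
print(disparity_ratio({"m": [1, 1, 1, 0],          # selected 75% of the time
                       "f": [1, 0, 0, 0]}))        # selected 25% of the time
```

The second print is well below the 0.8 threshold — exactly the kind of number an auditor would escalate.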
The solution is measurement, not denial.
Every organisation deploying AI in consequential decisions should be able to answer three questions: What bias testing was done before deployment? What are the known error rate disparities across demographic groups? What is the process for identifying and correcting bias post-deployment? The organisations that can't answer these questions are deploying blind.
As AI moves deeper into healthcare, hiring, lending, and criminal justice, the stakes of getting this wrong increase. The good news is that the tooling for bias measurement has improved significantly. The question is whether the organisations deploying AI choose to use it.