Is AI
Dangerous?
Two categories of risk: the real ones that are happening now, and the speculative ones that might happen later. Most coverage conflates them — which serves no one. Here's the clear-headed version.
00 — The two categories of AI risk
Current AI carries specific, documented, real risks. Future AI might carry additional speculative risks. Treating them as the same thing produces both under-reaction to current problems and mis-calibrated anxiety about future ones.
AI is dangerous in some ways. Not in all the ways science fiction suggested — no Terminators, no robotic uprisings. But the documented harms from AI systems deployed today are real: biased hiring algorithms that discriminate by race and gender, AMZN facial recognition systems that misidentify Black men at disproportionately high rates, NIST deepfake voice cloning used for financial fraud, and AI-generated disinformation circulating at scale. These aren't theoretical. They're in court cases, in regulatory filings, and in documented harms to real people.
The speculative risks are harder to quantify. Whether a superintelligent AI system could pose an existential threat to humanity depends on predictions about AI capability trajectories, alignment research progress, and geopolitical dynamics that are genuinely uncertain. Taking this risk seriously is reasonable. Treating it as science fiction is overconfident. But so is treating it as if it's more immediate than the harms already occurring.
01 — The real risks: documented and current
These are happening now. Documented cases, regulatory attention, and ongoing research.
AI systems trained on historical data inherit historical biases. Amazon's internal hiring AI systematically downgraded resumes from women. Predictive policing tools flag Black neighbourhoods at disproportionate rates. Healthcare AI under-estimates pain in Black patients.
Evidence: NIST face recognition study found false positive rates 10-100x higher for Black and East Asian faces vs white faces. NIST
LLMs produce confident, fluent, factually wrong statements. A lawyer submitted ChatGPT-generated case citations to federal court — six of them were entirely fabricated. AI-generated news articles have been published as fact. Deepfakes of political figures circulated before elections.
Evidence: Mata v. Avianca, S.D.N.Y. 2023 — six AI-fabricated legal citations submitted to court. MATA
AI-powered facial recognition is deployed by law enforcement agencies without consent or oversight in several countries. Employee monitoring AI tracks keystroke patterns, webcam activity, and emotional states. Data fed to AI systems is often retained and used for training.
Evidence: Clearview AI scraped 30 billion facial images without consent, selling access to 3,100 law enforcement agencies. CLAIR
Documented job losses in customer service, content creation, and entry-level professional services. McKinsey estimates 12 million occupational transitions needed in the US alone by 2030. The disruption is real, even if the net employment effect is uncertain.
Evidence: WEF projects 41% of employers reducing headcount in AI-automatable roles by 2030. WEF
02 — Speculative risks: what researchers are actually worried about
The speculative risks aren't about robots. They're about alignment — ensuring AI systems with increasing capability pursue goals that are beneficial to humanity.
The alignment problem, as researchers at Anthropic, DeepMind, and the Machine Intelligence Research Institute define it, is this: as AI systems become more capable at pursuing goals, they may develop instrumental sub-goals that are harmful to humans — not out of malice, but because their objective function didn't account for human values adequately. ANTH
The canonical example: tell an AI to maximise paperclip production. A sufficiently capable system pursuing this goal might use all available resources, including those needed for human survival, because "maximising paperclips" didn't come with "also preserve humans" constraints. This sounds absurd at small scales. At superintelligent capability levels, instrumentally rational behaviour in pursuit of any goal can produce catastrophic side effects.
Whether current AI systems are anywhere near this capability level is a separate question. Most researchers believe we're not. But the pace of capability improvement since 2022 has caused researchers who previously thought this risk was decades away to revise that estimate. Anthropic's CEO Dario Amodei describes AI safety as the primary motivation for the company's existence. DARIO
03 — What safeguards currently exist
The risks are real. The response is also real — from technical research, regulatory action, and institutional policy.
Reinforcement Learning from Human Feedback (used by OpenAI) and Constitutional AI (used by Anthropic) are training techniques that align model outputs with human preferences and stated values. Both reduce but don't eliminate harmful outputs. They're the primary technical safeguards in current deployed models.
The world's first comprehensive AI regulation, classifying AI systems by risk level and imposing requirements on "high-risk" applications (hiring, credit, education, law enforcement). Fully in force by 2026. Organisations deploying AI in covered categories face mandatory conformity assessments and registration.
Biden's 2023 executive order on AI required safety testing for frontier models before deployment and established the AI Safety Institute at NIST. The NIST AI Risk Management Framework provides voluntary guidance for organisations. NIST2
Anthropic, DeepMind, and a growing number of academic labs are working on understanding what's happening inside AI models — "mechanistic interpretability" — so that unsafe behaviours can be detected and corrected. This research is genuinely difficult and under-resourced relative to capability research.
Yes, in several ways. "Prompt injection" attacks can cause AI systems to ignore their instructions and follow adversarial prompts embedded in content they're processing. "Jailbreaks" can bypass safety training through carefully crafted prompts. AI systems that take actions in the world (agentic AI) have larger attack surfaces than chatbots. Security around AI systems is an active area of both attack and defence research.
AI systems can produce false statements — whether this constitutes "lying" depends on whether you require intent to deceive. LLMs don't intentionally deceive; they generate statistically likely text. But they can produce confident-sounding false statements (hallucinations), and fine-tuning can be used to train systems that systematically misrepresent information. The latter is a genuine security concern in adversarially deployed AI systems.
Not in the ways the movies suggested.
The risks that deserve attention right now are bias in deployed systems, disinformation amplification, surveillance, and economic disruption. These are happening, they're documented, and they're affecting real people. Dismissing AI risk as science fiction is as unhelpful as treating it as an existential emergency.
The speculative risks deserve serious research and governance investment. But the best way to address both near-term and long-term AI risk is the same: rigorous evaluation of AI systems before deployment, meaningful regulatory oversight, and broad public understanding of how these systems actually work. The alternative — moving fast and figuring out the harms later — has already cost people their jobs, their privacy, and in some cases, their freedom.
AI intelligence,
weekly.
Every week: the AI developments that matter, the tools worth trying, and the data behind the headlines. No hype. No filler.
Subscribe free →