The App Works.
It's Just Not Safe.
Developers using AI tools in a randomised controlled trial took 19% longer to complete tasks than those without — and estimated they were 20% faster. That 39-point gap is the most important number in AI adoption.
01 — The Perception Gap
In the spring of 2025, METR — an AI safety research organisation — ran a rigorous randomised controlled trial with 16 experienced open-source developers across 246 real tasks. Each task was randomly assigned to either allow or forbid AI tools. The result contradicted almost everything in the AI productivity conversation.
"LLMs give the same feeling of achievement one would get from doing the work themselves, but without any of the heavy lifting. You remember the jackpots. You don't remember sitting there plugging tokens into the slot machine for two hours."
— Marcus Hutchins, security researcher / James Liu, Director of Software Engineering, Mediaocean — MIT Technology Review, December 2025 [MIT]

The Faros AI report, analysing telemetry from over 10,000 developers across 1,255 teams, confirmed the pattern at scale: teams using AI completed 21% more tasks and created 98% more pull requests. PR size grew 154%. Review time increased 91%. Bug counts rose. Company-level DORA metrics — deployment frequency, change failure rate, lead time — were flat. More code going in. No improvement coming out. [FAR]
The Stack Overflow Developer Survey 2025 backs this up from the developer side: 41.4% of developers say AI has little or no effect on their productivity. Only 16.3% say it makes them significantly more productive. [SO]
If you feel more productive with AI, you're in the majority. You might also be wrong.
The METR study found that experienced developers consistently overestimated their speed by 39 percentage points when using AI tools. The tools make the work feel effortless — instant suggestions, rapid generation — but the overhead of prompting, reviewing, and fixing adds up invisibly. Track your actual output metrics, not your gut feeling.
02 — Three Structural Reasons
The METR finding isn't a fluke. There are three structural reasons why AI coding tools consistently fail to deliver the productivity gains their users feel.
Optimising a step that isn't the bottleneck doesn't improve system throughput. You can make the fastest machine on the factory floor twice as fast — if it's feeding a queue that's already backed up, you've accomplished nothing at the output level. Writing code is not the bottleneck in software delivery. It never was. AI made the non-bottleneck faster. The bottleneck — understanding, reviewing, testing, securing, and maintaining — got worse. [PD]
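The factory-floor argument can be sketched numerically. This is a toy model, not data from any cited report — the stage rates are invented for illustration — but it shows the mechanism: a serial pipeline's throughput is capped by its slowest stage, so doubling a non-bottleneck stage changes nothing.

```python
def pipeline_throughput(stage_rates_per_week):
    """Throughput of a serial pipeline is capped by its slowest stage."""
    return min(stage_rates_per_week)

# Illustrative rates (tasks/week): writing code is already the fastest stage.
stages = {"write": 10, "review": 4, "test": 6}

baseline = pipeline_throughput(stages.values())  # review is the bottleneck: 4
stages["write"] *= 2                             # AI doubles coding speed
with_ai = pipeline_throughput(stages.values())   # still 4: nothing ships faster

print(baseline, with_ai)  # 4 4
```

Making `write` ten times faster gives the same answer; only raising the `review` rate moves the output number.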
AI made the fast part of software development faster. The slow parts — reviewing, testing, securing, understanding — got measurably worse.
This isn't a tool problem. It's a workflow problem. If your team adopted AI without changing how you review and test, you're likely shipping more code with more bugs at the same pace.
03 — What "It Works" Hides
An app that runs is not an app that's safe. The gap between "it works on my machine" and "it's production-ready" is where professional knowledge lives. AI optimises for syntactically correct, functionally passing code. It does not optimise for the things that matter when real users, real attackers, and real regulators interact with your product.
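Here is a concrete instance of the gap, using a hypothetical comment-rendering helper (the function names are invented for illustration). Both versions run, both pass a happy-path demo, and only one survives a hostile user — exactly the kind of difference "it works" hides.

```python
from html import escape

def render_comment_unsafe(user_input: str) -> str:
    # Runs fine in every demo, but interpolates attacker input verbatim (stored XSS).
    return f"<p>{user_input}</p>"

def render_comment_safe(user_input: str) -> str:
    # Escapes <, >, & and quotes before interpolation.
    return f"<p>{escape(user_input)}</p>"

payload = "<script>alert(1)</script>"
unsafe = render_comment_unsafe(payload)  # script tag survives intact
safe = render_comment_safe(payload)      # → <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

A functional test on either version passes; only a security review, or a scanner, catches the first one.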
The Base44 platform — where the founder publicly celebrated that 100% of code was written by Cursor AI with "zero hand-written code" — was found full of basic security flaws allowing anyone to access paid features or alter data. It was shut down days after launch. [KAS] A separate incident: the Moltbook AI social network exposed 1.5 million API keys and 35,000 user email addresses because vibe-coded infrastructure left a Supabase database misconfigured and public. [DEV]
An app that runs is not an app that's safe. 86% of AI-generated code fails basic XSS protection. 19.7% of suggested dependencies don't even exist — and attackers are actively exploiting this.
If you're shipping AI-generated code without security review, you're not moving fast. You're accumulating exposure. The European Accessibility Act and GDPR mean this exposure now has legal teeth.
Four concrete steps to ship AI-assisted code safely.
Measure your actual cycle time, not your feelings. Set up a DORA metrics dashboard using LinearB, Sleuth, or even a simple spreadsheet tracking deployment frequency and change failure rate. The METR study proved perceived productivity and measured productivity are 39 points apart. Track the real number.
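If a dashboard tool is overkill, the two metrics named above can be computed straight from a deploy log. A minimal sketch — the record format and dates are made up for illustration:

```python
from datetime import date

# Hypothetical deploy log: (deploy date, did it cause an incident/rollback?)
deploys = [
    (date(2025, 6, 2), False),
    (date(2025, 6, 5), True),
    (date(2025, 6, 9), False),
    (date(2025, 6, 12), False),
    (date(2025, 6, 16), True),
]

weeks_observed = 3  # span covered by the log above

deployment_frequency = len(deploys) / weeks_observed              # deploys per week
change_failure_rate = sum(bad for _, bad in deploys) / len(deploys)

print(f"{deployment_frequency:.1f} deploys/week, "
      f"{change_failure_rate:.0%} change failure rate")
```

Even a spreadsheet exporting two columns gets you this far; the point is that the number is measured, not felt.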
Run a security scanner on your last AI-generated project. Install Semgrep or Snyk (both have free tiers) and scan your most recent AI-assisted codebase. The Veracode report found 45% of AI code contains vulnerabilities. You need 20 minutes and a terminal to find out if yours does too.
Verify every dependency before you install it. Before running any AI-suggested npm install or pip install, check the package exists on the official registry with real maintainers. Use Socket.dev or npm audit to flag known risks. 19.7% of AI-suggested packages are hallucinated — and attackers register them.
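The registry check above can be scripted. A sketch assuming the public npm registry's URL scheme (`https://registry.npmjs.org/<name>` returns 404 for unknown packages); the status lookup is injectable so the decision logic can be exercised without network access.

```python
from urllib.request import urlopen
from urllib.error import HTTPError

def npm_status(name: str) -> int:
    """HTTP status of a package's page on the public npm registry."""
    try:
        with urlopen(f"https://registry.npmjs.org/{name}") as resp:
            return resp.status
    except HTTPError as err:
        return err.code

def package_exists(name: str, status_for=npm_status) -> bool:
    # 200 → registered; 404 → likely hallucinated. `status_for` is injectable
    # so this check can be tested offline with a stub.
    return status_for(name) == 200
```

This only proves the name is registered — it says nothing about maintainer legitimacy, which is why the source also recommends Socket.dev or `npm audit` for known-risk flagging.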
Add one accessibility check to your workflow. Install axe DevTools (free Chrome extension) and run it on your latest project. The European Accessibility Act is now in force. 20 minutes per page catches 30-40% of WCAG violations automatically. Claude can help you fix what it finds.
04 — Stay Ahead of the Data
Weekly data-backed analysis of AI adoption, tools, and what actually works. No hype. No jargon. Just the numbers that matter.
Weekly, every Tuesday · No spam · Privacy policy · Unsubscribe anytime