April 6, 2026

What Is DeepSeek AI? The Chinese Model That Shook the AI Industry

How the Chinese lab built frontier AI models at a fraction of US costs, what DeepSeek R1 achieved, the privacy concerns it raises, and what it means for the AI industry.

AI Tools / Models

What Is DeepSeek AI?

In January 2025, a Chinese AI lab released a model that matched GPT-4o's performance — and claimed to have trained it for $6 million. Here's what actually happened and what it means.

$6M
Claimed cost of the final training run for DeepSeek-V3, the base model behind R1 — vs $100M+ for comparable US frontier models. Even accounting for underreporting, the efficiency gains are real [DeepSeek technical report]
$590B
Nearly $590 billion wiped from Nvidia's market cap in a single day when DeepSeek R1 launched — the largest one-day loss in US market history, as markets feared lower GPU demand if training was this cheap [Bloomberg]
#1
DeepSeek's app hit number one on the US App Store within 72 hours of the R1 release — unprecedented adoption speed for a non-US AI product [App Store charts, Jan 2025]

DeepSeek is a Chinese AI research lab founded in 2023, backed by hedge fund High-Flyer. In January 2025, it released DeepSeek R1 — a reasoning model that matched or exceeded GPT-4o and Claude 3.5 on multiple benchmarks, at a claimed training cost that was a fraction of what US labs spend. The release triggered genuine soul-searching across the AI industry about whether the US compute advantage was as durable as believed.

DeepSeek-V2
May 2024
A 236-billion-parameter mixture-of-experts model that activated only 21B parameters per token — dramatically reducing inference cost without sacrificing capability. Released as open source, it demonstrated the MoE efficiency approach that would define DeepSeek's subsequent models.
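The core idea of mixture-of-experts is that a router picks a small subset of expert networks for each token, so only a fraction of the total parameters do any work. A toy sketch of top-k routing (illustrative only — the expert count, dimensions, and routing details here are simplified stand-ins, not DeepSeek's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS = 8   # toy scale; production MoE models use many more routed experts
TOP_K = 2       # experts activated per token
D_MODEL = 16

# Each expert is a tiny feed-forward layer (one weight matrix here).
expert_weights = rng.standard_normal((N_EXPERTS, D_MODEL, D_MODEL))
# The router scores every expert for each token.
router = rng.standard_normal((D_MODEL, N_EXPERTS))

def moe_forward(token: np.ndarray) -> tuple[np.ndarray, list[int]]:
    """Route one token through its top-k experts only."""
    logits = token @ router                  # score all experts: (N_EXPERTS,)
    top_k = np.argsort(logits)[-TOP_K:]      # keep only the k highest-scoring
    gates = np.exp(logits[top_k])
    gates /= gates.sum()                     # softmax over the chosen experts
    out = np.zeros_like(token)
    for gate, idx in zip(gates, top_k):      # only k of N experts run at all
        out += gate * (token @ expert_weights[idx])
    return out, sorted(top_k.tolist())

token = rng.standard_normal(D_MODEL)
out, used = moe_forward(token)
print(f"experts used: {used} of {N_EXPERTS}")
print(f"active parameter fraction: {TOP_K / N_EXPERTS:.0%}")
```

Scaling the same principle up is how a 236B-parameter model can run with only 21B parameters active per token: the other experts' weights sit idle for that token, so inference cost tracks the active count, not the total.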
DeepSeek-V3
December 2024
671B-parameter MoE model (roughly 37B activated per token) trained on 14.8T tokens. Matched GPT-4o and Claude 3.5 Sonnet on coding, math, and reasoning benchmarks. Training cost claimed at approximately $5.6M — using H800 chips (export-limited, lower spec than H100s). The efficiency engineering was genuinely innovative: multi-head latent attention, auxiliary-loss-free load balancing, and FP8 mixed-precision training.
DeepSeek-R1
January 2025
DeepSeek's reasoning model — trained to think step-by-step before answering, similar to OpenAI's o1. Matched o1 on AIME and MATH benchmarks. Open-source release with permissive MIT licence. This is the model that caused the market disruption. Its chain-of-thought reasoning traces are visible to users, which some researchers found more instructive than o1's hidden reasoning.
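In open-weight deployments of R1 (for example via Ollama), the visible chain of thought typically arrives wrapped in `<think>` tags before the final answer. A minimal parser, assuming that output format — the sample string below is illustrative, not real model output:

```python
def split_reasoning(response: str) -> tuple[str, str]:
    """Separate an R1-style <think>...</think> trace from the final answer.

    Assumes the open-weight chat template, which emits the chain of
    thought inside a single <think> block before the answer text.
    """
    open_tag, close_tag = "<think>", "</think>"
    start = response.find(open_tag)
    end = response.find(close_tag)
    if start == -1 or end == -1:
        return "", response.strip()  # no visible trace in this response
    reasoning = response[start + len(open_tag):end].strip()
    answer = response[end + len(close_tag):].strip()
    return reasoning, answer

raw = "<think>17 + 25 = 42.</think>The answer is 42."
reasoning, answer = split_reasoning(raw)
```

Separating the trace like this is useful when you want to log or study the reasoning but show users only the final answer.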
DeepSeek-V3 0324 / R2
2025 updates
Subsequent releases continued improving on the base models. V3 updates showed consistent benchmark improvements. DeepSeek has maintained a rapid release cadence, continuing to compress the performance gap with US frontier models while maintaining open-source releases.
Important privacy considerations before using DeepSeek
DeepSeek's privacy policy explicitly states that user data is stored on servers in China, subject to Chinese law — including the National Intelligence Law, which requires organisations to cooperate with state intelligence work. Several governments have already acted: Italy, Taiwan, South Korea, and multiple US government agencies have prohibited or restricted DeepSeek on government devices.

For enterprise and professional use involving sensitive information, this is a material concern. Using DeepSeek's app or web interface means your prompts and conversations are stored in China with no contractual data protection equivalent to GDPR or US enterprise SLAs.

The open-source models, by contrast, can be run locally — removing the data-transfer concern entirely. For sensitive work, self-hosting the open weights via Ollama or another local deployment carries a significantly different risk profile from using DeepSeek's web interface or app.
What DeepSeek actually proved
The most important thing DeepSeek demonstrated wasn't that China can match US AI capability (though that's significant). It's that frontier-quality AI can be built with dramatically less compute than previously assumed — through better architecture, more efficient training techniques, and careful engineering rather than raw scale. This has two major implications: AI training costs will likely continue falling, making deployment accessible to more players. And the assumption that US export controls on high-end chips would maintain a durable AI performance gap has been undermined. DeepSeek's open-source releases have been adopted widely — including as base models by other labs building on top of the open weights. The privacy concerns with the consumer app are real; the technical achievements are also real.
Should I use DeepSeek for my work?
It depends on your risk tolerance and the sensitivity of your data. For non-sensitive personal productivity tasks where you're comfortable with data being stored in China: DeepSeek R1 and V3 are genuinely capable models — particularly strong at coding and mathematical reasoning. For any work involving client data, confidential business information, healthcare, legal, or financial information: do not use DeepSeek's web interface or app. For using the model capability without the data risk: run DeepSeek's open models locally via Ollama (deepseek-r1:7b, a distilled variant, runs on a modern laptop).
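Self-hosting via Ollama exposes a local HTTP API (by default at `http://localhost:11434`), so prompts never leave your machine. A minimal sketch of calling a locally served distilled R1 model — the actual request only works if Ollama is running, so it is guarded here:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(prompt: str, model: str = "deepseek-r1:7b") -> urllib.request.Request:
    """Build a request against the local Ollama API — no data leaves the machine."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("Summarise the key risks in this clause: ...")

if __name__ == "__main__":
    try:
        with urllib.request.urlopen(req, timeout=120) as resp:
            print(json.loads(resp.read())["response"])
    except OSError:
        print("Ollama is not running locally; start it with `ollama serve`.")
```

First pull the model with `ollama pull deepseek-r1:7b`; after that, the same weights answer every request offline.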
How does DeepSeek compare to ChatGPT and Claude?
On raw benchmark performance, DeepSeek R1 and V3 are genuinely competitive with GPT-4o and Claude 3.5/3.7 — particularly for coding, maths, and structured reasoning. Real-world performance is broadly comparable. Where they differ: DeepSeek applies censorship on politically sensitive topics (Tiananmen, Taiwan independence, Xinjiang) and refuses those queries. It's also more likely to comply with requests that Western models refuse, which is either a feature or a bug depending on your use case. The open-source availability is genuinely valuable for researchers and developers who want to fine-tune or study the model.

Get AI insights every week

The AI Briefing covers what actually matters in AI — no hype, no jargon, just what you need to stay ahead.

Subscribe free
Written by Luke Madden, founder of Veltrix Collective. Data synthesis and analysis by Vel.