An independent case study · 深度求索

DeepSeek: cheap, open, and contested

An evidence-first reading of the Chinese lab whose low-cost open models rattled the global AI market — assembled from English and Chinese primary sources, with each decisive question weighed rather than left open.

78 sources · 35% Chinese-language · As of 2 June 2026 · 10 analysis sections

In January 2025 a startup spun out of a Chinese quant fund released an open-weight reasoning model that matched OpenAI's on key benchmarks — and wiped nearly $600 billion off Nvidia in a single day.

DeepSeek's capability is not seriously disputed. What is contested is almost everything around it: how cheaply it was really built, whether an open-weight lab with no super-app can stay ahead, how heavily to weigh the security and IP concerns, and whether a compute ceiling now caps it. The evidence does not cut evenly on any of them. This study lays out both cases, then weighs each: the efficiency was real but the headline cost narrow, the moat is eroding faster than the capability, the documented security findings outweigh the contested ones, and the compute ceiling is — for now — binding[3][6].

The decisive questions

Each links to the section that lays out the evidence on both sides.

Was the model really built for ~$5.6M?

DeepSeek's own report says V3's final training run cost $5.576M — but it excludes prior research, and SemiAnalysis estimates ~$1.6B in true server CapEx. Both can be true; the gap is the whole debate.

Is an open-weight, no-moat lab defensible?

Liang Wenfeng says the team and culture are the moat and that 'closed-source moats are fleeting.' Skeptics see give-away weights, no consumer distribution, and rivals poaching its researchers.

How much should the security and IP concerns weigh?

Bans across Italy, South Korea and US agencies, an exposed database, jailbreak findings and distillation allegations sit against DeepSeek's responses and a still-contested evidence base.

Can it stay at the frontier under a compute ceiling?

Export controls and an unstable Huawei-Ascend fallback delayed R2; V4 reportedly trails the frontier by 3–6 months while consumer usage has already plateaued behind ByteDance's Doubao.

The arc that frames the debate

DeepSeek's China app reached an estimated ~194M MAU in February 2025, then plateaued and slipped behind ByteDance's Doubao through 2025–26. The speed of the rise and the speed of the slowdown are the bull case and the bear case at once (reported figures from Chinese trackers; methodologies differ).

DeepSeek China AI-native-app MAU (millions, reported/estimated)

⚖️

Where this study lands

On the $5.6M figure: both real and misleading, a true final-run cost atop a ~$1.6B estimated hardware base (high confidence).
On the open-weight moat: genuinely innovative, but eroding. Doubao leads at home and rivals poach the researchers who are the moat (medium confidence).
On security: the documented findings (exposed database, jailbreak rates, Korea's PIPC ruling) justify caution about the hosted service, while the smuggling and China-Mobile claims remain unproven (high confidence on the core).
On the compute ceiling: binding for now. V4 trails the frontier and no full Ascend training run has succeeded (medium confidence).

The full weighing, with tripwires, closes the Forward View.

How to read this

Ten sections, each built the same way: a neutral synthesis, framework visuals, a two-sided case-for / case-against ledger, dated quotes (with the original Chinese shown alongside any translation), and the sources used. Start with the question that interests you, or read in order from Overview.

🔍

Independent research artifact, not affiliated with or endorsed by DeepSeek. All quotes link to primary sources; private-company figures are reported estimates and labeled as such. Where the research could not verify a claim, the relevant section says so. See Methodology & Limits.

Section 01

Overview & Timeline

From a quant fund's GPU hoard to a global AI flashpoint in under three years.

11 sources · 3 Chinese-language · As of 2 Jun 2026

DeepSeek is the AI lab that quant fund High-Flyer (幻方量化) incubated and spun out in July 2023. Its open-weight V3 and R1 models reached the frontier on a fraction of the usual reported budget — and its January 2025 launch became a market-moving global event. The achievement is broadly accepted; the cost, compute and security questions are not.

What DeepSeek is

DeepSeek (深度求索) is a Hangzhou-based research lab building large language models. Founder Liang Wenfeng (梁文锋) entered Zhejiang University with the province's top entrance score, co-founded High-Flyer in 2015, and declared an AGI pivot in May 2023, formally founding DeepSeek that July[4]. Crucially, the lab inherited High-Flyer's compute: a GPU reserve that scaled from 100 cards (2015) to ~10,000 Nvidia A100s by 2021, ahead of US export controls[10]. Per Tencent News, an early model was built by a ~139-person team of domestic fresh grads and PhD interns — no overseas returnees[11].

The model lineage

DeepSeek ships fast and open. Since R1 (January 2025), its models have been released under the permissive MIT License, with weights downloadable for commercial use[2]. The cadence below is faster than most frontier labs', and each release has pushed price and efficiency rather than consumer polish.

2015

High-Flyer (幻方量化) founded[4]

Liang Wenfeng and two Zhejiang University alumni start a quant fund that builds large GPU clusters for trading.

2021

10,000+ A100s stockpiled[10]

High-Flyer's ~1B-RMB Fire-Flyer 2 cluster lands ~10,000 Nvidia A100s — before US export controls.

Jul 2023

DeepSeek spun out[1]

Hangzhou DeepSeek is founded as an independent AGI lab; Liang holds ~84% and is CEO of both firms.

May 2024

V2 and the price war[7]

DeepSeek-V2 (236B MoE) ships at 1 RMB (input) / 2 RMB (output) per million tokens, about 1% of GPT-4 Turbo's price, earning the 'Pinduoduo of AI' label.

Dec 2024

V3 released[5]

A 671B/37B MoE model trained for a reported 2.788M H800 GPU-hours (~$5.576M for the final run).

20 Jan 2025

R1 + the 'DeepSeek moment'[3]

The open-weight R1 reasoning model matches OpenAI o1; the app hits #1 and Nvidia loses ~$600B on 27 Jan.

May 2025

R1-0528[8]

A deeper-reasoning R1 refresh (AIME-2025 70% → 87.5%); a standalone R2 is delayed.

Aug 2025

V3.1 hybrid[9]

Adds combined thinking / non-thinking modes and stronger tool-use, MIT-licensed.

Apr 2026

V4 preview[66]

V4-Pro (1.6T params, 1M-token context) co-engineered for Huawei Ascend; reportedly trails the frontier ~3–6 months.

Why it mattered immediately

R1's release coincided with China's Spring Festival and went viral worldwide. By 27 January 2025 the DeepSeek app was the most-downloaded free app on the US App Store, and Nvidia fell ~16.9% — losing nearly $600 billion, the largest single-company one-day loss in US market history[3]. Chinese coverage framed it as a national-pride milestone: a small lab matching OpenAI at a reported tenth of the cost[69]. Skeptics countered that the reaction was "obsessive hype" and that the $6M figure described only one training run[70].

Both sides of the ledger

The rest of the study expands on each side. Weighed together, the ledger reads: the capability and compute base are real, but the headline cost was narrow, the home-market lead has been lost to Doubao, and the frontier gap is now visible. The bull case survives on efficiency, not on momentum.

The case for

Genuine frontier capability: R1 matched OpenAI o1 on math/reasoning and was released openly[6].
Built on a real, pre-existing compute base and an unusually lean, research-first team[11].
Fast open cadence (V2 → V3 → R1 → V3.1 → V4) kept it culturally central among developers[2].

The case against

The headline $5.6M cost excludes prior research and the broader GPU fleet[70].
The 2025 surge plateaued; ByteDance's Doubao overtook it on home turf[34].
R2 slipped and V4 reportedly trails the frontier by 3–6 months[66].

In their words

“China's AI cannot forever remain in a following position. China inevitably needs someone to stand at the technological frontier.”

original · zh · “中国的AI不可能永远处在跟随的位置……中国必然需要有人站到技术的前沿。”

Liang Wenfeng (梁文锋) · Founder & CEO, DeepSeek · Jan 2025 interview · English is a translation from zh · source

Sources for this section

13 sources · en, zh · 4 Chinese-language · full bibliography on the Sources list.

T2DeepSeek — Wikipedia· en · neutral
T2DeepSeek — Wikipedia· en · neutral
T2Nvidia drops $600bn off its market cap amid the rise of DeepSeek — TechCrunch· en · neutral
T2梁文锋 — 维基百科· zh · neutral
T1DeepSeek-V3 — GitHub (deepseek-ai)· en · neutral
T1DeepSeek-R1 — GitHub (deepseek-ai)· en · neutral
T1DeepSeek-V2 — GitHub (deepseek-ai)· en · neutral
T1DeepSeek-R1-0528 — Hugging Face· en · neutral
T1DeepSeek-V3.1 — Hugging Face· en · neutral
T2疯狂的幻方：一家隐形AI巨头的大模型之路 — 36氪· zh · neutral
T2Z Waves｜梁文锋，DeepSeek缔造者 — 腾讯新闻· zh · neutral
T2Z Waves｜梁文锋，DeepSeek缔造者 — 腾讯新闻· zh · supporting
T2DeepSeek Debates — SemiAnalysis· en · critical

Section 02

Market & Industry Structure

A large, fast-growing market — and a highly competitive, price-deflationary, compute-constrained one.

11 sources · 4 Chinese-language · As of 2 Jun 2026

DeepSeek sits in two overlapping markets: the global frontier-model race (OpenAI, Anthropic, Google, Meta) and China's LLM market (Alibaba, ByteDance, Baidu, Tencent, plus the AI "tigers"). Both reward capability but make value hard to capture: open weights, free tiers and a price war push prices toward zero, while US export controls on advanced chips set a hard ceiling on the key input.

The price war DeepSeek started

In May 2024, DeepSeek-V2 priced its API at 1 RMB per million input tokens and 2 RMB output — about 1/100th of GPT-4 Turbo — earning the nickname "AI界拼多多" (the Pinduoduo of AI)[12]. Within the same month Zhipu, ByteDance, Alibaba, Baidu, iFlytek and Tencent followed with cuts of roughly 80–97%[13]. The deflation never stopped: in 2026 DeepSeek cut V4-Pro another ~75%, with cache-hit input as low as 0.025 RMB per million tokens — a level Chinese press called a global low[15]. Published V4-Pro output still lists near $0.87 per 1M tokens[14].

Compute is the binding constraint

The decisive scarce input is advanced compute. US export controls cap China's access to Nvidia's best chips; bandwidth-limited H800/H20 parts force Chinese labs to use roughly 2–4x the compute for the same result, and Liang has said that chip bans, not money, are the problem[16]. DeepSeek's edge here is partly historical: High-Flyer had 10,000+ A100s by 2021, before the controls took effect, and SemiAnalysis estimates ~$1.63B of GPU-server spend across the group[20].

Distribution and adoption

DeepSeek's open release spread fast through infrastructure. Within days of R1, Microsoft Azure, AWS and Nvidia all hosted it[17]. In China, Tencent wired the full R1 into WeChat search and Yuanbao, while Baidu added it alongside Ernie; DeepSeek's site traffic rose more than twentyfold to ~278M visits in January 2025[18]. Yet by usage it is one strong player among several: H1-2025 model-invocation share put Alibaba's Qwen first at 17.7%, Doubao second at 14.1% and DeepSeek third at 10.3%[19]. It competes in a field already crowded with well-capitalized giants[73].

Five Forces: a structurally hard market

The picture is an industry where capability is achievable but value is hard to capture.

Frontier & open LLM market

Competitive rivalry — High. DeepSeek faces OpenAI, Anthropic and Google globally and Alibaba, ByteDance, Baidu, Tencent plus the 'AI tiger' startups at home; benchmark leadership rotates quickly and a 2024–26 price war keeps margins thin.

⚠️

Where this may mislead

Usage-share and traffic figures come from third-party trackers (Frost & Sullivan, Similarweb-style data) with differing methods and dates; treat them as directional, not precise.

Why the market favors DeepSeek

Open weights + low prices made it the default for cost-sensitive developers and cloud catalogs[17].
A pre-existing compute hoard softened the export-control squeeze relative to peers[20].
Big-tech integrations (Tencent, Baidu) extended its reach without its own super-app[18].

Why the market works against it

The price war it started compresses everyone's economics, including its own[13].
Open substitutes (Qwen, Llama, GLM) mean near-zero switching costs for buyers[15].
Export controls cap the scarce input and Huawei's alternative is immature[16].

Sources for this section

10 sources · zh, en · 4 Chinese-language · full bibliography on the Sources list.

T2梁文锋与「AI界拼多多」 — 澎湃新闻 (The Paper)· zh · supporting
T3大模型价格战：17家厂商定价梳理 — 知乎· zh · supporting
T1Models & Pricing — DeepSeek API Docs· en · neutral
T2创大模型价格新低！DeepSeek API缓存价降至首发十分之一 — 证券时报· zh · neutral
T2DeepSeek, Huawei, Export Controls, and the Future of the U.S.-China AI Race — CSIS· en · neutral
T2AWS, Microsoft, Google, Others Make DeepSeek-R1 Available — Campus Technology· en · supporting
T2Tencent, Baidu Look to Capitalize on DeepSeek's Stunning Rise — Caixin Global· en · supporting
T2Who leads China's AI cloud race? — KrASIA· en · critical
T2DeepSeek, Huawei, Export Controls — CSIS· en · critical
T2疯狂的幻方 — 36氪· zh · neutral

Section 03

Business Model & Economics

Funded by a quant fund, priced near cost, given away as open weights — the most contested numbers in AI.

12 sources · 2 Chinese-language · As of 2 Jun 2026

DeepSeek is unusual: bankrolled from High-Flyer's quant profits rather than VC, it has prioritized research over commercialization, given its models away under MIT, and priced its API near cost. The famous $5.576M training figure is real but narrow; the true all-in investment is far larger, and DeepSeek's own 545% "margin" is explicitly theoretical.

The $5.6M figure — what it does and doesn't say

DeepSeek's V3 technical report states the model's full training consumed 2.788M H800 GPU-hours, or about $5.576M at an assumed $2/GPU-hour rental[21]. The same report adds the crucial caveat that this covers only the final official run — excluding prior research and ablation experiments[22]. SemiAnalysis estimates DeepSeek's true hardware footprint at ~$1.6B in server CapEx and ~$944M in operating costs, with the $6M representing just the pre-training GPU run[23]. Ben Thompson splits the difference: the number is plausible for the run itself, but "you can't replicate DeepSeek the company for $5.576 million"[24].

🧮

Both numbers are true

The $5.6M (marginal cost of one run) and the ~$1.6B (cumulative cluster investment) are not in conflict — they measure different things. The error is quoting either as "what DeepSeek spent."

What 'DeepSeek's cost' means, depending on what you count (US$ millions)

What is counted	Amount
V3 final training run	$5.576M
Cluster operating cost	$944M
Total server CapEx	$1,600M

The same lab, three honest answers: the $5.576M final-run figure DeepSeek published[21] sits against SemiAnalysis's estimate of ~$944M in cluster operating cost and ~$1.6B in total server CapEx[23]. The headline is ~290× smaller than the all-in hardware base — which is exactly why both the "genuine efficiency breakthrough" and the "misleading headline" readings can each cite a real number.
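The arithmetic behind these figures can be checked directly. The following is a minimal sketch that uses only numbers quoted in this section (the 2.788M H800 GPU-hours, the $2/GPU-hour rental rate assumed in the V3 report, and SemiAnalysis's ~$1.6B CapEx estimate). The variable names are illustrative and do not come from any DeepSeek or SemiAnalysis publication.

```python
# Figures quoted in this section; variable names are illustrative.
GPU_HOURS = 2.788e6          # H800 GPU-hours for the V3 final run [21]
RENTAL_USD_PER_HOUR = 2.0    # rental rate assumed in the V3 report [21]
SERVER_CAPEX_USD = 1.6e9     # SemiAnalysis estimate, not a disclosure [23]

# Marginal cost of the final run: GPU-hours times the assumed hourly rate.
final_run_cost = GPU_HOURS * RENTAL_USD_PER_HOUR

# How many times larger the estimated hardware base is than that one run.
capex_multiple = SERVER_CAPEX_USD / final_run_cost

print(f"Final-run cost: ${final_run_cost / 1e6:.3f}M")
print(f"CapEx estimate / final-run cost: {capex_multiple:.0f}x")
```

The multiple comes out slightly under the rounded ~290× quoted above. Because the CapEx input is itself an estimate, the ratio should be read as an order of magnitude rather than a precise figure.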

How the money works (and doesn't)

DeepSeek's R&D came from High-Flyer's budget, not outside capital — VCs were reluctant because a quick exit looked unlikely — and Liang holds 84.29% with near-total voting control[27][29]. He long pledged not to raise or chase commercialization, though Chinese press reported in April 2026 that DeepSeek may pursue a first external round at a $10B+ valuation (unconfirmed)[28]. Revenue comes from API usage; web and app access remain free[26].

The 545% margin claim

During its February 2025 Open Source Week, DeepSeek published a striking figure: a theoretical 545% cost-profit margin, with $562,027 in notional daily revenue against $87,072 of daily cost[25]. But DeepSeek itself disclaimed it: actual revenue is "substantially lower" because V3 is priced far below R1, web/app access is free, and off-peak discounts apply[26]. Read carefully, it demonstrates that inference can be highly profitable at scale — not that DeepSeek is currently capturing it. Wikipedia notes DeepSeek was nonetheless reported profitable relative to money-losing rivals[30].
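The 545% figure is a profit-over-cost ratio, which is easy to confuse with a conventional margin on revenue. A minimal sketch, using only the two daily figures DeepSeek disclosed; the function names are illustrative, and the revenue-based margin is a derived figure that DeepSeek did not publish.

```python
DAILY_REVENUE_USD = 562_027   # notional daily revenue, per the disclosure [25]
DAILY_COST_USD = 87_072       # daily serving cost, same disclosure [25]

def cost_profit_margin(revenue: float, cost: float) -> float:
    """Profit as a fraction of cost (the basis of the 545% figure)."""
    return (revenue - cost) / cost

def gross_margin_on_revenue(revenue: float, cost: float) -> float:
    """Profit as a fraction of revenue (the more conventional margin)."""
    return (revenue - cost) / revenue

margin_on_cost = cost_profit_margin(DAILY_REVENUE_USD, DAILY_COST_USD)
margin_on_revenue = gross_margin_on_revenue(DAILY_REVENUE_USD, DAILY_COST_USD)

print(f"Cost-profit margin: {margin_on_cost:.0%}")
print(f"Equivalent margin on revenue: {margin_on_revenue:.0%}")
```

Both ratios describe the same notional snapshot, so DeepSeek's caveat that actual revenue is "substantially lower" applies equally to each.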

Running the number: annualizing DeepSeek's own disclosure

A calculation this study runs (DeepSeek did not publish it): annualize the disclosed snapshot. The $562,027 notional daily revenue and $87,072 daily cost[25], multiplied across 365 days, give a theoretical ceiling of roughly $205M in annual inference revenue against about $32M of serving cost, or about $173M of theoretical annual gross profit. Set against SemiAnalysis's ~$1.6B server-CapEx estimate[23], even that best case would take ~9 years of gross profit to pay back the hardware base. And DeepSeek itself says actual revenue is "substantially lower" than the notional figure, since web/app access is free and V3 is priced below R1[26]. All derived figures are illustrative, but the direction is clear: even DeepSeek's most flattering disclosed number does not make the economics self-funding at the estimated hardware scale.
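The annualization can be reproduced in a few lines. This is a sketch under the same assumptions as the paragraph above: the one-day snapshot is treated as if it held for all 365 days, and the ~$1.6B CapEx figure is SemiAnalysis's estimate. Variable names are illustrative.

```python
DAILY_REVENUE_USD = 562_027   # notional, per DeepSeek's disclosure [25]
DAILY_COST_USD = 87_072       # daily serving cost, same disclosure [25]
SERVER_CAPEX_USD = 1.6e9      # SemiAnalysis estimate [23]
DAYS_PER_YEAR = 365           # assumes the one-day snapshot holds all year

annual_revenue = DAILY_REVENUE_USD * DAYS_PER_YEAR
annual_cost = DAILY_COST_USD * DAYS_PER_YEAR
annual_gross_profit = annual_revenue - annual_cost

# Years of theoretical gross profit needed to cover the estimated hardware base.
payback_years = SERVER_CAPEX_USD / annual_gross_profit

print(f"Annual revenue ceiling: ${annual_revenue / 1e6:.0f}M")
print(f"Annual serving cost:    ${annual_cost / 1e6:.0f}M")
print(f"Annual gross profit:    ${annual_gross_profit / 1e6:.0f}M")
print(f"Payback on CapEx:       {payback_years:.1f} years")
```

The payback figure ignores depreciation, training spend and every other cost, so it is a floor on the payback period under these assumptions, not a forecast.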

“We just work at our own pace, then cost it out and set the price; we don't subsidize at a loss, nor do we seek exorbitant profit.”

original · zh · “我们只是按照自己的步调来做事，然后核算成本定价……不贴钱，也不赚取暴利。”

Liang Wenfeng (梁文锋) · Founder & CEO, DeepSeek · Jan 2025 · English is a translation from zh · source

📏

The expectations bar

The only valuation marker on record is the reported, unconfirmed first external round at $10B+[28]. Against the ~$205M theoretical-ceiling annual inference revenue derived above from DeepSeek's own disclosure[25] — with actual revenue "substantially lower"[26] — $10B implies ~49x a revenue ceiling DeepSeek says it does not reach (illustrative; every input is an estimate). A round at that mark must believe one of two things: that open-weight mindshare converts into paid API and enterprise revenue at several multiples of today's, or that strategic, national-champion value — not revenue — sets the price. That is the bar both the bull and the bear case are measured against, not a view on whether it should be paid.
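The implied multiple is sensitive to how much of the theoretical ceiling is actually realized. The sketch below recomputes the ~49x figure and then shows how it changes under two realization shares. The 50% and 25% shares are hypothetical placeholders chosen for illustration; DeepSeek has said only that actual revenue is "substantially lower", not by how much.

```python
VALUATION_USD = 10e9               # reported, unconfirmed round [28]
NOTIONAL_DAILY_REVENUE = 562_027   # DeepSeek's disclosed snapshot [25]
DAYS_PER_YEAR = 365

# Theoretical annual revenue ceiling derived from the one-day snapshot.
revenue_ceiling = NOTIONAL_DAILY_REVENUE * DAYS_PER_YEAR

def implied_multiple(realized_share: float) -> float:
    """Valuation divided by revenue, if only `realized_share` of the ceiling is earned."""
    return VALUATION_USD / (revenue_ceiling * realized_share)

# 1.0 reproduces the figure in the text; 0.5 and 0.25 are hypothetical shares.
for share in (1.0, 0.5, 0.25):
    print(f"{share:.0%} of ceiling -> {implied_multiple(share):.0f}x revenue")
```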

The model is sound

Self-funded from quant profits, so no burn-rate pressure or investor exit clock[27].
Real architectural efficiency (MLA, MoE, FP8) makes low prices economically rational, not just subsidized[21].
Published inference economics show genuine profit potential at scale[25].

The model is unproven

Free web/app and rock-bottom API pricing leave little realized revenue[26].
Estimated true investment is ~$1.6B, far above the headline; efficiency is relative, not absolute[23].
Open weights mean it captures little of the value it creates for others[24].

Sources for this section

11 sources · en, zh · 3 Chinese-language · full bibliography on the Sources list.

T1DeepSeek-V3 Technical Report — arXiv (DeepSeek-AI)· en · supporting
T1DeepSeek-V3 Technical Report — arXiv (DeepSeek-AI)· en · neutral
T2DeepSeek Debates: Chinese Leadership On Cost, True Training Cost — SemiAnalysis· en · critical
T2DeepSeek FAQ — Stratechery (Ben Thompson)· en · neutral
T1DeepSeek-V3/R1 Inference System Overview — deepseek-ai/open-infra-index (GitHub)· en · supporting
T1DeepSeek-V3/R1 Inference System Overview — deepseek-ai/open-infra-index (GitHub)· en · critical
T2梁文锋从「量化养AI」到拥抱资本 — 新浪财经· zh · supporting
T2DeepSeek或首次融资 — 新浪财经· zh · neutral
T2DeepSeek — Wikipedia· en · supporting
T2DeepSeek — Wikipedia· en · supporting
T2DeepSeek创始人专访：中国的AI不可能永远跟随 — 新浪财经· zh · neutral

Section 04

Competitive Landscape

A benchmark leader on a budget — that does not lead its own home market on users.

3 sources · 2 Chinese-language · As of 2 Jun 2026

On capability, DeepSeek competes with the global frontier (OpenAI o-series, Claude, Gemini) and leads on math/reasoning at a fraction of the cost. On distribution it does not lead at home: ByteDance's Doubao has roughly 1.7x its China MAU, and Alibaba's Qwen rivals it on open-weight mindshare. Capability and distribution point in opposite directions — the tension this study returns to.

Capability: strong, especially on reasoning

DeepSeek-R1 posts frontier-level math and reasoning scores. One third-party comparison puts R1 at ~90.2% on MATH-500, ahead of Claude-3.5-Sonnet (78.3%) and GPT-4o (74.6%) — at a small fraction of Claude's per-token price[32]. Treat any single benchmark cautiously (leaderboards rotate and this is one Tier-3 comparison), but the direction is consistent with DeepSeek's own reported results.

MATH-500 Pass@1 (reported; one comparison, indicative)

Model	MATH-500 Pass@1
DeepSeek-R1	90.2%
Claude-3.5-Sonnet	78.3%
GPT-4o	74.6%

Distribution: second at home by MAU, far behind Doubao

The picture inverts on reach. In China's consumer app market (Feb 2026), Doubao led at ~226.7M MAU — equal to the sum of the #2–#5 apps — with DeepSeek second at ~135.6M, ahead of Tencent Yuanbao (~40.7M) and Alibaba Qwen (~25.2M)[34]. On developer-facing quality, Chinese rankings rate Alibaba's Qwen2.5-Max (pretrained on >20T tokens) the top non-reasoning Chinese model, with DeepSeek-V3 standing out for throughput and data scale[33].

📊

See the Peer Comparison section for a side-by-side table and a capability-vs-distribution positioning map across DeepSeek, OpenAI, Anthropic, ByteDance, Alibaba, Meta and Moonshot.

Where DeepSeek wins

Frontier-level reasoning/math benchmarks, openly verifiable[32].
Radical cost-performance: comparable quality at a fraction of closed-model API prices[32].
Strong developer/open-weight mindshare relative to its size.

Where it lags

Only #2 in China by consumer MAU, far behind Doubao's distribution[34].
Qwen contests open-weight leadership with Alibaba's scale and cloud[33].
Benchmark leads are narrow and rotate as rivals ship monthly[32].

Sources for this section

3 sources · en, zh · 2 Chinese-language · full bibliography on the Sources list.

T3DeepSeek R1 vs. Claude (2026 Comparison) — Elephas· en · supporting
T3AI模型排行榜横评：通义千问、DeepSeek、Kimi — LearnKu· zh · neutral
T22.26亿月活！豆包一家独大是第2-5名之和 — 新浪科技· zh · critical

Section 05 · Benchmarking

Peer Comparison

DeepSeek against the global frontier labs and its Chinese rivals. Figures are reported estimates; private firms are unaudited.

7 peers · As of 2 Jun 2026

⚠️

Read these as estimates

Consumer-scale figures for private firms are press/secondary reports, not disclosures; cells are for relative comparison. See each company's cited sources in the relevant sections.

Company	Backing	Weights	Consumer scale	Principal edge
DeepSeek	High-Flyer (self-funded)	Open-weight (MIT)	~136M China MAU (#2)	Cost-efficient frontier reasoning
OpenAI	Microsoft + investors	Closed	ChatGPT global leader	Frontier models + distribution
Anthropic	Amazon, Google + investors	Closed	Enterprise-weighted	Frontier + safety positioning
Doubao (ByteDance)	ByteDance	Mostly closed	~227M China MAU (#1)	Unmatched consumer distribution
Qwen (Alibaba)	Alibaba	Open-weight family	Leading open downloads	Scale + cloud ecosystem
Llama (Meta)	Meta	Open-weight	Ecosystem distribution	Western open-weight default
Kimi (Moonshot)	Alibaba, Tencent, etc.	Open-weight K2	Smaller consumer base	Long-context heritage

China consumer scale (MAU, Feb 2026)

ByteDance's Doubao roughly equals the sum of apps #2–#5; DeepSeek is a clear second but lacks a super-app of its own[34].

China AI-app monthly active users (millions, reported)

App	MAU
Doubao (ByteDance)	227M
DeepSeek	136M
Tencent Yuanbao	41M
Ant / Lingguang	27M
Alibaba Qwen	25M
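The two comparisons drawn from these figures (Doubao against the sum of apps #2–#5, and Doubao against DeepSeek) can be checked with a short sketch. The numbers are the rounded, reported MAU values listed above; the dictionary and variable names are illustrative.

```python
# Reported Feb 2026 MAU in millions, rounded as listed in this section [34].
mau_millions = {
    "Doubao (ByteDance)": 227,
    "DeepSeek": 136,
    "Tencent Yuanbao": 41,
    "Ant / Lingguang": 27,
    "Alibaba Qwen": 25,
}

# Rank apps by MAU, largest first.
ranked = sorted(mau_millions.items(), key=lambda item: item[1], reverse=True)
leader_name, leader_mau = ranked[0]

# Combined MAU of the apps ranked #2 through #5.
next_four_total = sum(mau for _, mau in ranked[1:5])

# How many times DeepSeek's MAU the leader has.
leader_vs_deepseek = leader_mau / mau_millions["DeepSeek"]

print(f"{leader_name}: {leader_mau}M vs apps #2-#5 combined: {next_four_total}M")
print(f"Leader / DeepSeek: {leader_vs_deepseek:.1f}x")
```

On the rounded figures the leader sits within a few million of the next four combined, which is consistent with "roughly equals" rather than an exact match.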

The same picture, mapped

Capability & openness against distribution & scale. DeepSeek's top-left position (frontier and open, but without mass distribution) is exactly the tension the rest of this study examines.


Detailed, sourced competitive evidence is in Competitive Landscape; the moat analysis is in Strategy & Moats.

Section 06

Strategy, Moats & Talent

Open the weights, publish the papers, bet the moat on the team — a deliberate inversion of the closed-lab playbook.

10 sources · 6 Chinese-language · As of 2 Jun 2026

DeepSeek's stated strategy is to compete on efficiency and openness, not secrecy or distribution. Liang Wenfeng argues "closed-source moats are fleeting" and that the real moat is the team and culture. The technical record (MLA, MoE, GRPO, FP8) supports the efficiency claim; the durability of a talent-based moat is the open question, as rivals recruit its researchers.

The revealed strategy: efficiency as a weapon

DeepSeek's edge is engineering under constraint. V3 pairs Multi-head Latent Attention (MLA) with DeepSeekMoE and FP8 training to cut compute while matching leading closed models[36]. R1 was trained with GRPO reinforcement learning (which drops the critic model) and reached 79.8% on AIME 2024, slightly above OpenAI o1, while its precursor R1-Zero displayed a spontaneous "aha moment" during training[35]. Forced to do more with constrained chips, DeepSeek turned efficiency into its differentiator.
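What "dropping the critic" means can be illustrated with a small sketch. In GRPO as described in DeepSeek's published papers, several answers are sampled for the same prompt, and each answer's advantage is its reward standardized against the group's own mean and standard deviation, so no separately trained value (critic) model is needed. The function name, the epsilon guard, the use of the population standard deviation and the toy rewards are assumptions made for illustration. The sketch covers only the advantage computation, not the full policy update with clipping and a KL penalty.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Standardize each sampled answer's reward against its own group.

    The group mean plays the role a learned critic's baseline would
    otherwise play. `eps` (an assumption) avoids division by zero when
    every answer in the group receives the same reward.
    """
    group_mean = mean(rewards)
    group_std = pstdev(rewards)  # population std; an illustrative choice
    return [(r - group_mean) / (group_std + eps) for r in rewards]

# Toy example: four sampled answers to one prompt, rewarded 1.0 if correct.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Correct answers receive a positive advantage and incorrect ones a negative advantage, relative to the group. The memory and compute that a critic network would consume is what this baseline avoids.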

The open-source thesis

Liang's case for giving the models away is explicit: open-sourcing "loses nothing," the value compounds in the team, and openness is cultural and attracts talent[37][72]. In the English rendering, he argues even OpenAI's closed model "can't prevent others from catching up"[38]. The deeper framing is national: China should "become a contributor instead of a free-rider," and the real gap is originality versus imitation[39].

“Open-sourcing and publishing papers, we actually lose nothing. We deposit value in the team — that is our moat. The real gap is between originality and imitation.”

original · zh · “开源，发论文，其实并没有失去什么……我们把价值沉淀在团队上……就是我们的护城河。真实的gap是原创和模仿之差。”

Liang Wenfeng (梁文锋) · Founder & CEO, DeepSeek · 2024 interview (BAAI / 暗涌) · English is a translation from zh · source

The culture that produced it

DeepSeek hires on "passion and curiosity," runs without formal KPIs, and lets division of labor form bottom-up; Liang says staff's desire to do research "far exceeds their concern for money"[40]. The early team was deliberately all-domestic — top-university grads and unfinished-PhD interns, no overseas returnees — a bet that home-grown talent can do frontier work[71].

🎯

The moat's soft spot

If the moat is the team, talent retention is existential. Per Sina, ByteDance's Seed group lost ~70 people to rivals in a year and Tencent ~30, while Xiaomi recruited former DeepSeek researcher Luo Fuli on a multi-million-yuan package — and DeepSeek's own options have no marketized valuation to counter with[41].

A real, defensible moat

Demonstrated efficiency know-how (MLA, MoE, GRPO, FP8) that rivals must reverse-engineer[36].
Open source builds a developer ecosystem and recruiting magnet around the brand[72].
A coherent research-first culture that has shipped repeatedly under constraint[40].

A moat that may not hold

The weights themselves are open — no IP moat on the core asset[38].
Better-capitalized rivals poach the very researchers who are the moat[41].
No consumer distribution to lock in users the way Doubao or ChatGPT do[37].

Sources for this section

9 sources · en, zh · 6 Chinese-language · full bibliography on the Sources list.

T1DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL — arXiv· en · neutral
T1DeepSeek-V3 Technical Report — arXiv (DeepSeek-AI)· en · neutral
T2DeepSeek创始人专访：中国的AI不可能永远跟随 — 智源社区 (BAAI)· zh · supporting
T2Interview with DeepSeek Founder: We're Done Following — The China Academy· en · supporting
T2DeepSeek创始人专访 — 新浪财经· zh · neutral
T2我所见过的梁文锋 — 腾讯新闻· zh · supporting
T2字节回应「亿元年薪挖DeepSeek员工」 — 新浪财经· zh · critical
T2DeepSeek创始人专访 — 新浪财经· zh · neutral
T2DeepSeek创始人专访 — 智源社区 (BAAI)· zh · supporting

Section 07

Risks & Controversies

Distillation allegations, security and privacy bans, censorship and jailbreak findings — each presented with the response and its confidence level.

14 sources · 2 Chinese-language · As of 2 Jun 2026

DeepSeek attracts more controversy than any peer, spanning four areas: IP/distillation, data security and government bans, censorship, and model safety. Some findings are firmly documented (an exposed database; jailbreak tests); others are contested or geopolitically charged (smuggled-chip and China-Mobile claims). Each is attributed below, with DeepSeek's response where one exists.

1. Distillation / IP (contested)

In January 2025 Microsoft and OpenAI investigated whether a DeepSeek-linked group exfiltrated data via OpenAI's API in late 2024 to distill models, which would breach OpenAI's terms[42]. DeepSeek's response (via its Nature paper) was that V3-Base data came only from web pages and e-books, with no intentionally added OpenAI synthetic data — while conceding some scraped pages contained OpenAI-generated answers[43]. Industry reaction was mixed: Anthropic's Dario Amodei called the threat exaggerated, and OpenAI's Mark Chen credited DeepSeek with "independently discovering" core o1 ideas[44].

2. Data security & government bans (mostly documented)

Wiz Research found a publicly accessible, unauthenticated DeepSeek database exposing over a million log entries, including plaintext chats and API keys; DeepSeek secured it promptly after disclosure[47]. Regulators acted: Italy's Garante ordered an immediate processing limitation (30 Jan 2025)[51]; South Korea's PIPC found unlawful overseas transfer of user data — including prompt content — to ByteDance subsidiary Volcano, suspending downloads (DeepSeek later complied)[52]. US states Texas, New York and Virginia banned it on state devices[53], as did NASA, the US Navy, Australia and Taiwan[54]. A US House report and Feroot Security alleged hardcoded links to state-owned China Mobile — claims that are contested and that DeepSeek has not addressed[45][55].

3. Censorship (documented for the hosted model)

Tests of DeepSeek's China-hosted service show alignment with government positions — deflecting on Tiananmen, echoing Beijing's line that Taiwan is "an inalienable part of Chinese territory," and avoiding naming Xi Jinping; the House report claimed ~85% of politically sensitive responses were censored or contained misinformation[50][45]. Because the weights are open, self-hosted deployments can differ from the hosted app.

4. Model safety (documented)

Cisco (with UPenn) reported R1 had a 100% attack-success rate against 50 HarmBench prompts — blocking none — versus far higher resistance for o1, Claude and GPT-4o, tying it to efficiency-first training[48]. Anthropic's Amodei said DeepSeek scored worst of any model his team had tested on a bioweapons-data test, while adding it is "not literally dangerous" today and praising its "talented engineers"[49].

5. Export-control circumvention (speculative)

SemiAnalysis estimates DeepSeek/High-Flyer has access to ~50,000 Hopper GPUs (including ~10,000 H100s) — which, if accurate, would sit uneasily with US export controls. But the figure is an estimate, High-Flyer bought A100s in 2021 before controls, and DeepSeek has not confirmed an illegal fleet, so this remains a speculative allegation[46].

⚖️

Where the weight falls

The documented findings — the database leak, the jailbreak rates, Korea's PIPC ruling — are independently verified and sufficient on their own to justify enterprise and government caution about the hosted service (high confidence). The contested claims — China-Mobile links, smuggled H100s — rest on unconfirmed estimates and add charge but not proof; treating them as established overstates the record. DeepSeek's near-silence on most of it leaves the documented core unanswered.

Why the concerns are serious

A real, verified data exposure of chats and keys, however quickly fixed[47].
Independent safety tests show unusually weak guardrails[48].
Multiple democratic regulators independently found data-handling problems[52].

Why some are overstated or contextual

Some flagship claims (smuggled H100s, China-Mobile links) rest on estimates DeepSeek hasn't confirmed[46].
Rivals concede the "threat" framing is exaggerated and credit genuine independent research[44].
Open weights let users self-host outside DeepSeek's servers and censorship[43].

Sources for this section

14 sources · en, zh · 2 Chinese-language · full bibliography on the Sources list.

T2 · Microsoft probing whether DeepSeek improperly used OpenAI's API — TechCrunch · en · critical
T2 · DeepSeek首次回应「蒸馏OpenAI」质疑 — 第一财经 · zh · supporting
T2 · 硅谷掀桌！DeepSeek遭OpenAI和Anthropic围剿 — 量子位 · zh · supporting
T2 · US House Select Committee Report Accuses DeepSeek of Spying and Circumventing Export Controls — Tech Policy Press · en · critical
T2 · DeepSeek Debates — SemiAnalysis · en · critical
T1 · Wiz Research Uncovers Exposed DeepSeek Database — Wiz · en · critical
T1 · Evaluating Security Risk in DeepSeek and Other Frontier Reasoning Models — Cisco · en · critical
T2 · Anthropic CEO says DeepSeek was the worst on a critical bioweapons data safety test — TechCrunch · en · critical
T3 · DeepSeek and Chinese censorship — EurasiaTimes · en · critical
T2 · Italy's data protection authority Garante blocked DeepSeek — Security Affairs · en · critical
T2 · Gov't confirms DeepSeek's unauthorized data transfer abroad — The Korea Times · en · critical
T2 · Three States Ban DeepSeek Use on State Devices and Networks — National Law Review · en · critical
T2 · Which countries have banned DeepSeek and why? — Al Jazeera · en · critical
T2 · US lawmakers move to ban China's DeepSeek from government devices — NBC News · en · critical

Section 08

Reception & Adoption

A 'Sputnik moment' abroad and a national-pride story at home — followed by a measurable plateau.

11 sources · 3 Chinese-language · As of 2 Jun 2026

The reaction split three ways: alarm (Andreessen's "Sputnik moment"), reframing (Nadella's Jevons paradox; Huang's "the market got it wrong"), and skepticism (Amodei: the threat is "greatly overstated"). In China it was a national triumph. But the adoption surge of early 2025 did not hold: downloads fell and Doubao overtook it.

The Western reaction: alarm, reframing, skepticism

Marc Andreessen called R1 "AI's Sputnik moment"[56]. Satya Nadella called the model "super impressive," urged the West to take China's progress "very, very seriously," and reframed cheap AI as bullish via the Jevons paradox[57]. Jensen Huang argued the market misread it — reasoning models increase compute demand[59]. President Trump called it a "wake-up call" but "a positive"[58]. The most pointed skeptic was Anthropic's Dario Amodei, who argued the threat was "greatly overstated" and that V3 was "an expected point on an ongoing cost reduction curve" — the difference being that a Chinese firm got there first[60].

“DeepSeek-V3 is an expected point on an ongoing cost reduction curve. What's different this time is that the company that was first to demonstrate the expected cost reductions was Chinese.”

Dario Amodei · CEO, Anthropic · Jan 2025

The home reaction: national pride

In China the launch — timed to Spring Festival — was received as proof the country could innovate, not merely follow; the low-cost, open approach turned Liang into a domestic symbol[74]. A widely shared (and possibly fabricated, per its host) New Year's Eve post had him answering "national-fortune" praise with deliberate humility — "standing on the shoulders of open-source giants"[68].

Adoption: a sharp rise, then a plateau

The app hit #1 free in the US (and 50+ countries) in late January 2025, with downloads doubling to ~2.6M in days[61]. Chinese trackers put DeepSeek atop China's AI-native apps at ~194M MAU in February 2025[62]. But momentum faded: SCMP reported Q2-2025 chatbot downloads fell 72% to 22.6M with MAU down ~9% QoQ, as Doubao's downloads rose[63]; and by early 2026 ByteDance's Doubao led China's AI-app market outright (~227M MAU) with DeepSeek a clear but distant #2[34].
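The size of the cooling can be made concrete with a back-calculation. The Python sketch below uses only figures reported in this study (a 72% fall to 22.6M downloads[63], and Feb 2026 MAU of ~226.7M for Doubao against ~135.6M for DeepSeek[34]). The implied prior-period download figure is derived here, not reported by any source, and it assumes the 72% decline is measured against the immediately preceding period; the function and variable names are illustrative.

```python
def implied_prior(current: float, pct_decline: float) -> float:
    """Back out a starting value from an ending value and a fractional decline."""
    return current / (1.0 - pct_decline)


# Reported figures, in millions (tracker estimates, not audited numbers).
q2_downloads_m = 22.6      # Q2 2025 downloads after the decline [63]
decline = 0.72             # reported fall; comparison period is an assumption
doubao_mau_m = 226.7       # Feb 2026 MAU [34]
deepseek_mau_m = 135.6     # Feb 2026 MAU [34]

prior_m = implied_prior(q2_downloads_m, decline)
mau_ratio = doubao_mau_m / deepseek_mau_m

print(f"Implied prior-period downloads: ~{prior_m:.1f}M")
print(f"Doubao MAU is ~{mau_ratio:.2f}x DeepSeek's")
```

Because the trackers' methods and dates differ, both outputs are best read as orders of magnitude rather than precise values.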

📉

Sentiment vs. fact

The national-pride and "Sputnik" narratives are sentiment (clearly labeled); the download and MAU figures are reported tracker estimates whose methods differ. Both point the same way: a rapid launch surge followed by a real cooling.

The optimistic read

It genuinely shifted the global conversation on AI cost and openness[56].
Record-fast adoption and deep cloud/enterprise integration in weeks[61].
A durable brand and national-champion status at home[74].

The skeptical read

Even admirers called the threat framing overstated[60].
Consumer usage plateaued and then declined within two quarters[63].
ByteDance's Doubao sits atop the home market at ~227M MAU, with DeepSeek #2[34].

Sources for this section

10 sources · en, zh · 3 Chinese-language · full bibliography on the Sources list.

T2 · Marc Andreessen warns DeepSeek is AI's Sputnik moment — Fortune · en · supporting
T2 · What is Jevons paradox? Why Satya Nadella says DeepSeek is good news — Fortune · en · neutral
T2 · Trump calls DeepSeek a wake-up call for U.S. tech — Fortune · en · neutral
T2 · Jensen Huang says market got it wrong about DeepSeek's impact — TechCrunch · en · supporting
T1 · On DeepSeek and Export Controls — Dario Amodei · en · critical
T2 · DeepSeek displaces ChatGPT as the App Store's top app — TechCrunch · en · supporting
T2 · 中国AI原生App月活突破2.4亿：DeepSeek居榜首 — 新浪财经 · zh · supporting
T2 · DeepSeek's namesake chatbot sees a drop in downloads — South China Morning Post · en · critical
T3 · 网传梁文锋回应冯骥国运论 — 腾讯新闻 · zh · supporting
T2 · 梁文锋与「AI界拼多多」 — 澎湃新闻 · zh · supporting

Section 09

Forward View

The next chapter turns on one question: can DeepSeek keep matching the frontier under a hard compute ceiling?

5 sources · 1 Chinese-language · As of 2 Jun 2026

The R2 saga crystallized DeepSeek's central tension. A standalone R2 (targeted for around May 2025) slipped, reportedly because of both Liang's dissatisfaction with its performance and chip shortages after the April 2025 H20 curbs, and an attempt to train on Huawei Ascend reportedly failed. DeepSeek instead previewed V4 (April 2026), co-engineered for Ascend but trailing the frontier by ~3–6 months. Talent and ideas are not the constraint; compute is.

The R2 / Huawei episode

The Information reported R2's delay reflected CEO Liang being unsatisfied with performance, compounded by a China-side Nvidia shortage after new H20 export curbs[64]. FT-sourced reporting added that Chinese authorities urged DeepSeek to train on Huawei Ascend, but persistent instability, slow interconnects and immature CANN software meant it never completed a fully successful Ascend training run — forcing a revert to Nvidia for training while using Ascend for inference[65]. Against that backdrop, Chinese press reported DeepSeek may pursue its first-ever external financing — read by some as building a compute war chest for the next generation[77].

What V4 tells us

DeepSeek effectively skipped a standalone R2 and previewed V4 on 24 April 2026 — V4-Pro (1.6T params, 1M-token context) co-engineered for Huawei chips — with reporting that it "falls marginally short of GPT-5.4 and Gemini 3.1 Pro"[66]. It continues to lead on price, with another permanent ~75% cut defending developer mindshare[76]. CSIS frames the stakes bluntly: the team would be formidable with more compute, but each Ascend 910C is only ~60% of an H100 for inference and its software is "difficult and unstable"[67].

Three scenarios to weigh

Not predictions — conditions to watch. Which one plays out hinges almost entirely on the compute question.

Bull

Domestic compute clicks

DeepSeek makes Huawei Ascend (or Cambricon) work at training scale, escaping the export-control ceiling. Its lean team + open ecosystem compound, and it re-takes the efficiency frontier on home silicon.[75]

Base

A strong fast-follower

DeepSeek stays ~3–6 months behind the frontier, leading on cost-efficiency and open weights but not dominating its home market. V4-class models keep it relevant via aggressive pricing.[66]

Bear

The compute ceiling bites

Constrained chips and an unstable Ascend stack cap the next scale-ups; talent keeps leaving; consumer usage keeps fading behind Doubao. DeepSeek becomes one capable lab among many.[67]

Weighed together, the three scenarios are not equally likely. The base case — a strong, cheap fast-follower — is the one the current evidence supports: the bull case requires an Ascend training breakthrough that has demonstrably not happened yet[65], while the bear case requires the price leadership and the release cadence to fail at the same time, and neither has[76].

The weighing

On whether the model was really built for ~$5.6M: the evidence leans true but narrow — a genuine final-run cost sitting on a far larger hardware base (high confidence). The controlling evidence is DeepSeek's own caveat that the figure excludes prior research and ablations[22] and SemiAnalysis's ~$1.6B server-CapEx estimate[23], which outweighs the "fabricated number" counter because the two figures come from opposing narratives yet are arithmetically compatible — they measure different things. The strongest surviving counter-argument: the ~50,000-Hopper-GPU estimate, which if confirmed would mean the efficiency story rests partly on undisclosed compute[46]. What would flip this reading: a US enforcement finding confirming a smuggled fleet above ~10,000 H100s, or a future technical report disclosing all-in training cost within ~2x of the headline. Pre-mortem: if this looks wrong in two years, the most likely reason is that the all-in estimates were too high and the efficiency even more real than credited — or, on the other side, that confirmed covert compute reframed the whole episode.
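That the two cost figures are compatible can be checked with simple arithmetic. The Python sketch below uses only numbers reported in this study: 2.788M H800 GPU-hours at an assumed $2 per GPU-hour[21], and SemiAnalysis's ~$1.6B server-CapEx estimate[23]. The names are illustrative, and the ratio is a rough comparison of two differently scoped figures, not a cost model.

```python
# Figures as reported in this study; names are illustrative only.
GPU_HOURS_FINAL_RUN = 2_788_000   # H800 GPU-hours for the final V3 run [21]
RENTAL_RATE_USD = 2.0             # assumed $/GPU-hour, per the same report
SERVER_CAPEX_USD = 1.6e9          # SemiAnalysis estimate [23], approximate


def final_run_cost(gpu_hours: float, rate: float) -> float:
    """Rental-equivalent cost of a single training run."""
    return gpu_hours * rate


headline = final_run_cost(GPU_HOURS_FINAL_RUN, RENTAL_RATE_USD)
capex_multiple = SERVER_CAPEX_USD / headline

print(f"Final-run cost: ${headline / 1e6:.3f}M")
print(f"Estimated server CapEx is roughly {capex_multiple:.0f}x the headline figure")
```

The first figure prices one run at a rental rate; the second estimates the hardware base that made that run (and every experiment before it) possible, which is why a multiple in the hundreds does not by itself contradict the headline.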

On whether an open-weight, no-moat lab is defensible: the evidence leans eroding (medium confidence). The controlling evidence is Doubao leading China at ~227M MAU to DeepSeek's ~136M[34] and documented poaching of the researchers Liang calls the moat[41], which outweighs the team-and-culture thesis because the moat's carrier — people — is exactly what better-capitalized rivals are buying. The strongest surviving counter-argument: distribution through others has worked — Azure, AWS and Nvidia hosted R1 within days, and Tencent wired it into WeChat[17][18]. What would flip this reading: DeepSeek's China model-invocation share (10.3% in H1 2025, third behind Qwen and Doubao[19]) reaching #1 in the next full-year ranking; or a closed external round that lets it counter nine-figure offers[77]. Pre-mortem: if this looks wrong in two years, the most likely reason is that ecosystem distribution substituted for owned distribution better than consumer MAU implied — or, on the other side, that a hollowed-out research bench quietly ended the release cadence.

On how much the security and IP concerns should weigh: the evidence leans documented core, inflated periphery (high confidence on the core). The controlling evidence is the verified database exposing over a million log entries[47] and Korea PIPC's finding of unlawful overseas data transfer[52], alongside Cisco's reported 100% attack-success rate on HarmBench prompts[48]. Together these outweigh the "it's just geopolitics" counter because the findings are technical and regulatory, not political. The strongest surviving counter-argument: even rivals call the threat framing overstated and credit genuine independent research[44], while the China-Mobile and smuggled-chip claims rest on unconfirmed allegations and estimates[45][46]. What would flip this reading: the Microsoft/OpenAI distillation probe producing public evidence[42] or a second verified data exposure; or, the other way, a clean re-audit from a regulator that previously sanctioned it. Pre-mortem: if this looks wrong in two years, the most likely reason is that hosted-service problems were fixed and self-hosting made them moot; or, on the other side, that a confirmed state-linked data channel proved the hawks right.

On whether it can stay at the frontier under a compute ceiling: the evidence leans fast-follower, not frontier-setter, while the ceiling holds (medium confidence). The controlling evidence is the failed full-scale Ascend training run[65] and V4 shipping ~3–6 months behind GPT-5.4 and Gemini 3.1 Pro[66], which outweighs the domestic-compute bull case because that case requires a capability — stable frontier training on Chinese silicon — that has not yet been demonstrated. The strongest surviving counter-argument: CSIS's judgment that the team would be formidable with more compute, on a track record of converting constraint into algorithmic gains[67][36]. What would flip this reading: a confirmed, fully successful Ascend or Cambricon training run for a V5-class model; or the reported first external round — a $10B+ valuation was floated in April 2026 — closing and funding a compute war chest[28]. Pre-mortem: if this looks wrong in two years, the most likely reason is that domestic silicon matured faster than 2026 reporting suggested — or, on the other side, that the gap widened past six months and "fast-follower" was itself the optimistic reading.

Sources for this section

7 sources · en, zh · 2 Chinese-language · full bibliography on the Sources list.

T2 · DeepSeek R2 launch stalled as CEO balks at progress — Reuters / Investing.com · en · critical
T2 · DeepSeek reportedly urged to train on Huawei hardware after multiple failures — Tom's Hardware (citing FT) · en · critical
T2 · DeepSeek unveils V4 model, with rock-bottom prices and close Huawei-chip integration — Fortune · en · neutral
T1 · DeepSeek, Huawei, Export Controls, and the Future of the U.S.-China AI Race — CSIS · en · neutral
T1 · DeepSeek, Huawei, Export Controls — CSIS · en · supporting
T2 · DeepSeek API价格新低 — 证券时报 · zh · supporting
T2 · DeepSeek或首次融资 — 新浪财经 · zh · neutral

How this was made

Methodology & Limitations

What this study is, how it was researched, and — importantly — where it could be wrong.

As of 2 June 2026

Method

Research proceeded by fan-out web search and direct fetching of primary and reputable secondary sources across nine question areas (overview, market, business model, competition, peers, strategy, risks, reception, forward view); every URL cited was opened and read, and an automated link checker validated each one. Claims were transcribed into a structured manifest that tags every source with a tier, a confidence level and a stance, and the build was bilingual by design: because DeepSeek (深度求索) is a Chinese company, a substantial share of research was conducted in Chinese — 26 of 77 sources (34%) are Chinese-language, including domestic press (新浪/Sina, 36氪, 第一财经/Yicai, 量子位, 澎湃/The Paper) and the founder's Chinese-language interviews, with translated quotes showing the original Chinese alongside the English. The load-bearing figures for this company are the disputed $5.6M V3 training cost, the SemiAnalysis ~$1.6B all-in infrastructure estimate, the reported 545% theoretical inference margin, the GPU-count claims, and the third-party China-MAU/DAU series — every one of which is an estimate rather than an audited disclosure.

Tier 1: 17 · Tier 2: 55 · Tier 3: 5 · Supporting: 25 · Critical: 23 · Neutral: 29

Frameworks used

The analysis applies Porter's Five Forces to characterize the structure of the frontier and open-LLM market, a capability-vs-distribution positioning map to place DeepSeek against its peers, peer benchmarking on comparable reported metrics, scenario analysis for the forward view, and a case-for / case-against ledger in every section so the bull and bear arguments are stated side by side. Frameworks are used to organize evidence even-handedly, and the Forward View then weighs each decisive question explicitly — stating the lean, the confidence, and the tripwires that would change it; a formal unit-economics or DCF build was deliberately skipped because DeepSeek is private and unaudited and the underlying cost, revenue and compute figures are not disclosed at the granularity such a model would require.

Disclosed vs. estimated

DeepSeek is a private company that publishes no audited financials, so almost nothing here is a company-disclosed figure in the sense a listed issuer's filings would be. What the company itself has put on the record — model releases, technical reports and the headline $5.6M V3 training-run figure — is treated as disclosed-but-self-reported; comparable-basis numbers such as the ~$1.6B all-in infrastructure estimate and the 545% inference margin are directional reconstructions by third parties (notably SemiAnalysis) built on stated assumptions; and usage figures (China MAU/DAU, downloads) are third-party tracker estimates from sources such as Sina/QuestMobile, SCMP and KrASIA whose methods and dates differ. Where these conflict, the study shows the range rather than picking a single number.
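The 545% figure is a cost-profit margin, which is easy to misread as a share of revenue. A minimal Python sketch of the arithmetic follows, using only the daily revenue and cost figures DeepSeek published[25]. The function names are illustrative, and the result stays theoretical for the reasons DeepSeek itself gives[26].

```python
# Daily figures from DeepSeek's V3/R1 Inference System Overview [25].
DAILY_REVENUE_USD = 562_027   # theoretical daily revenue
DAILY_COST_USD = 87_072       # daily cost, assuming $2/hour per H800


def cost_profit_margin(revenue: float, cost: float) -> float:
    """Profit as a fraction of cost (the basis DeepSeek used)."""
    return (revenue - cost) / cost


def margin_on_revenue(revenue: float, cost: float) -> float:
    """Profit as a fraction of revenue (the more conventional basis)."""
    return (revenue - cost) / revenue


on_cost = cost_profit_margin(DAILY_REVENUE_USD, DAILY_COST_USD)
on_revenue = margin_on_revenue(DAILY_REVENUE_USD, DAILY_COST_USD)

print(f"Cost-profit margin: {on_cost:.0%}")
print(f"Same profit as a share of revenue: {on_revenue:.1%}")
```

On the same inputs, profit as a share of revenue is about 85%, so the headline 545% describes profit relative to cost rather than a conventional net margin.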

⚠️

Where this case study may be wrong

All financial/compute figures are estimates. DeepSeek is private and unaudited. The $5.6M training cost, the ~$1.6B SemiAnalysis estimate, the 545% margin and the GPU-count claims are reported figures, not disclosures — and several conflict.
Usage numbers are tracker estimates. MAU/DAU/download figures come from third-party trackers (Sina/QuestMobile, SCMP, KrASIA) whose methods and dates differ; the China-MAU series mixes such sources.
Some allegations are contested or speculative. The smuggled-H100 and China-Mobile claims rest on estimates DeepSeek has not confirmed; we flag them as Medium/Speculative.
Recency. The R2/V4 and Huawei-chip accounts are fast-moving and rest partly on single-outlet reporting (The Information, FT), corroborated where possible.
One viral quote is unverified. The New Year's Eve "national-fortune" post attributed to Liang may be fabricated; it is labeled Tier-3 sentiment, not fact.

Neutrality & independence

This is a compilation, not an argument: each section pairs the case for and the case against, critical claims are attributed to their source and positive claims are held to the same standard. The study is not affiliated with or endorsed by DeepSeek, Nvidia, or any other company named here. It is not investment advice: it carries no rating, price target, or recommendation to buy or sell any security. It is a point-in-time artifact dated 2 June 2026; the AI market moves fast, the figures will age, and corrections are welcome.

Bibliography

Sources

Every cited source was fetched during the research run. Tiers: 1 = primary/official, 2 = reputable press, 3 = forums/sentiment.

77 sources · 26 Chinese · 34%

Tier 1: 17 · Tier 2: 55 · Tier 3: 5 · Supporting: 25 · Critical: 23 · Neutral: 29

Overview & Timeline

[1]DeepSeek — Wikipedia T2 neutral en
DeepSeek was spun off as an independent company on 17 July 2023 with quant fund High-Flyer as principal backer; founder Liang Wenfeng is CEO of both and held ~84% as of May 2024.
[2]DeepSeek — Wikipedia T2 neutral en
Release timeline: DeepSeek Coder (Nov 2023), LLM (Nov 2023), V2 (May 2024), V3 (Dec 2024), R1 (20 Jan 2025), R1-0528 (May 2025), V3.1 (Aug 2025); since R1, models ship under open-source licenses, primarily MIT.
[3]Nvidia drops $600bn off its market cap amid the rise of DeepSeek — TechCrunch T2 neutral en
On 27 January 2025 Nvidia fell ~16.9% and lost nearly $600B in market value — the largest single-company one-day decline in US stock-market history — attributed to DeepSeek's rise.
[4]梁文锋 — 维基百科 T2 neutral zh
Liang Wenfeng (b. 1985) entered Zhejiang University's electronic-information program with the top entrance score in 2002, took a master's there, co-founded High-Flyer (幻方量化) with two ZJU alumni in 2015, declared entry into AGI in May 2023, and formally founded DeepSeek that July.
[5]DeepSeek-V3 — GitHub (deepseek-ai) T1 neutral en
DeepSeek-V3 is a 671B-total / 37B-active Mixture-of-Experts model pre-trained on 14.8T tokens, released open-weight under the MIT License.
[6]DeepSeek-R1 — GitHub (deepseek-ai) T1 neutral en
DeepSeek-R1 achieves performance comparable to OpenAI-o1 on math, code and reasoning; R1-Zero was trained by large-scale RL without supervised fine-tuning; weights and six distilled dense models are MIT-licensed.
[7]DeepSeek-V2 — GitHub (deepseek-ai) T1 neutral en
DeepSeek-V2 (May 2024) is a 236B/21B MoE model that cut training cost 42.5% and KV cache 93.3% versus DeepSeek 67B — the efficiency basis for its very low API price.
[8]DeepSeek-R1-0528 — Hugging Face T1 neutral en
DeepSeek-R1-0528 (May 2025) deepened reasoning — AIME-2025 accuracy rose from 70% to 87.5%, with average tokens per question up from 12K to 23K — and is MIT-licensed.
[9]DeepSeek-V3.1 — Hugging Face T1 neutral en
DeepSeek-V3.1 (Aug 2025) is a hybrid model supporting both thinking and non-thinking modes with improved tool-use/agent performance, MIT-licensed.
[10]疯狂的幻方：一家隐形AI巨头的大模型之路 — 36氪 T2 neutral zh
High-Flyer's compute scaled from 1 GPU to 100 (2015), 1,000 (2019) and then ~10,000, culminating in the ~1B-RMB Fire-Flyer 2 (萤火二号) cluster of ~10,000 Nvidia A100s; it was among the first in Asia-Pacific to obtain A100s in 2021.
[11]Z Waves｜梁文锋，DeepSeek缔造者 — 腾讯新闻 T2 neutral zh
Per Tencent News, DeepSeek's V2 was built entirely by domestic talent (no overseas returnees) — a ~139-person team of top-university fresh grads, unfinished-PhD interns and recent grads — running ~10,000 A100s.
[69]Z Waves｜梁文锋，DeepSeek缔造者 — 腾讯新闻 T2 supporting zh
Chinese coverage framed DeepSeek as a phenomenon — a small lab that, at roughly a tenth of the cost, built a model rivalling OpenAI's, drawing global attention to its founder.
[70]DeepSeek Debates — SemiAnalysis T2 critical en
SemiAnalysis cautioned that the global reaction was 'obsessive hype that doesn't reflect reality,' arguing the $6M figure covered only one pre-training run rather than DeepSeek's true investment.

Market & Industry

[12]梁文锋与「AI界拼多多」 — 澎湃新闻 (The Paper) T2 supporting zh
DeepSeek-V2 (May 2024) priced its API at 1 RMB per million input tokens / 2 RMB output — about 1/100th of GPT-4 Turbo — earning the label 'AI界拼多多' (the Pinduoduo of AI) and triggering price cuts by ByteDance, Alibaba and Baidu.
[13]大模型价格战：17家厂商定价梳理 — 知乎 T3 supporting zh
Within May 2024 Zhipu, ByteDance, Alibaba, Baidu, iFlytek and Tencent followed DeepSeek with API cuts of roughly 80–97%, some offering lightweight models free — a price war that ran for months.
[14]Models & Pricing — DeepSeek API Docs T1 neutral en
DeepSeek's published API pricing remains far below Western frontier APIs: V4-Pro lists roughly $0.435 per 1M cache-miss input tokens and $0.87 per 1M output (with cache-hit input near $0.0036).
[15]创大模型价格新低！DeepSeek API缓存价降至首发十分之一 — 证券时报 T2 neutral zh
In 2026 DeepSeek cut V4-Pro API prices ~75%, with cache-hit input dropping to one-tenth of launch price (as low as 0.025 RMB/M tokens) — described by Chinese press as a global low for large models.
[16]DeepSeek, Huawei, Export Controls, and the Future of the U.S.-China AI Race — CSIS T2 neutral en
US export controls are the binding constraint: bandwidth-limited H800/H20 chips force Chinese labs to use roughly 2–4x the compute for the same result; Liang has said money is not the problem, chip bans are.
[17]AWS, Microsoft, Google, Others Make DeepSeek-R1 Available — Campus Technology T2 supporting en
Within days of R1's January 2025 release, Microsoft (Azure AI Foundry), AWS (Bedrock/SageMaker) and Nvidia all made the open model available on their platforms.
[18]Tencent, Baidu Look to Capitalize on DeepSeek's Stunning Rise — Caixin Global T2 supporting en
In February 2025 Tencent integrated the full DeepSeek-R1 into WeChat search and its Yuanbao app, and Baidu moved to add DeepSeek alongside Ernie; DeepSeek's site traffic rose more than twentyfold to ~278M visits in January 2025.
[19]Who leads China's AI cloud race? — KrASIA T2 critical en
In H1 2025 daily model-invocation share (Frost & Sullivan), Alibaba Qwen led at 17.7%, ByteDance Doubao 14.1% and DeepSeek 10.3% — placing DeepSeek third among domestic models by usage.
[20]DeepSeek, Huawei, Export Controls — CSIS T2 critical en
Per CSIS citing SemiAnalysis, High-Flyer had acquired 10,000+ Nvidia A100s by mid-2022 (before US controls), and SemiAnalysis estimates DeepSeek/High-Flyer spent ~$1.63B on GPU servers — a structural compute advantage.
[73]疯狂的幻方 — 36氪 T2 neutral zh
DeepSeek competes in a crowded field of well-funded Chinese model makers — ByteDance, Alibaba, Baidu, Tencent and the 'AI tiger' startups — where benchmark leadership rotates quickly and capital is concentrated in big tech.

Business Model & Economics

[21]DeepSeek-V3 Technical Report — arXiv (DeepSeek-AI) T1 supporting en
DeepSeek's own V3 technical report states full training consumed 2.788M H800 GPU-hours, translating to ~$5.576M at an assumed $2/GPU-hour rental.
[22]DeepSeek-V3 Technical Report — arXiv (DeepSeek-AI) T1 neutral en
DeepSeek explicitly caveats that the $5.576M figure covers only the final official training run, excluding prior research and ablation experiments on architectures, algorithms or data.
[23]DeepSeek Debates: Chinese Leadership On Cost, True Training Cost — SemiAnalysis T2 critical en
SemiAnalysis estimates DeepSeek's true hardware spend is far higher than the $5.6M headline — ~$1.6B in total server CapEx and ~$944M in cluster operating costs, with access to ~50,000 Hopper GPUs; the $6M is only the pre-training GPU run.
[24]DeepSeek FAQ — Stratechery (Ben Thompson) T2 neutral en
Ben Thompson (Stratechery) argues the $5.6M figure is plausible for the final run given V3's architecture, but that the same sum could not replicate DeepSeek the company; H800s are Hopper GPUs with bandwidth limited to comply with US export controls.
[25]DeepSeek-V3/R1 Inference System Overview — deepseek-ai/open-infra-index (GitHub) T1 supporting en
In its February 2025 V3/R1 Inference System Overview, DeepSeek published a theoretical 545% cost-profit margin — $562,027 daily revenue against $87,072 daily cost, assuming $2/hr per H800.
[26]DeepSeek-V3/R1 Inference System Overview — deepseek-ai/open-infra-index (GitHub) T1 critical en
DeepSeek itself disclaims the 545% margin as theoretical: actual revenue is substantially lower because V3 is priced far below R1, web and app access remain free, and off-peak nighttime discounts apply.
[27]梁文锋从「量化养AI」到拥抱资本 — 新浪财经 T2 supporting zh
DeepSeek's R&D was funded from quant fund High-Flyer's budget (not VC); Liang Wenfeng holds 84.29% of DeepSeek and near-100% voting rights, and long said he would 'temporarily not seek financing.'
[28]DeepSeek或首次融资 — 新浪财经 T2 neutral zh
As of April 2026, Chinese press reported DeepSeek may be launching its first external financing — a target valuation of at least $10B and a raise of at least $300M — a potential reversal of its 'no financing' stance (unconfirmed).
[29]DeepSeek — Wikipedia T2 supporting en
DeepSeek has said it focuses on research without immediate commercialization plans; VCs were reluctant to fund it because a quick exit looked unlikely; since R1, it releases models under open-source licenses, primarily MIT.
[30]DeepSeek — Wikipedia T2 supporting en
After R1, Chinese rivals cut prices and DeepSeek was reported profitable relative to money-losing competitors — reinforcing the 'Pinduoduo of AI' framing (reported).
[31]DeepSeek创始人专访：中国的AI不可能永远跟随 — 新浪财经 T2 neutral zh
On the price war, Liang said DeepSeek simply works at its own pace, then costs it out and sets the price — neither subsidizing at a loss nor seeking exorbitant profit.

Competitive Landscape

[32]DeepSeek R1 vs. Claude (2026 Comparison) — Elephas T3 supporting en
Third-party comparisons credit DeepSeek-R1 with leading math/reasoning benchmarks (e.g. ~90.2% on MATH-500 vs Claude 78.3% and GPT-4o 74.6%) at a fraction of Claude's per-token price (estimate).
[33]AI模型排行榜横评：通义千问、DeepSeek、Kimi — LearnKu T3 neutral zh
Among Chinese peers, Alibaba's Qwen2.5-Max (pretrained on >20T tokens) was rated the top non-reasoning Chinese model and ranked highly globally; DeepSeek V3 stands out on throughput and data scale.
[34]2.26亿月活！豆包一家独大是第2-5名之和 — 新浪科技 T2 critical zh
In China's consumer AI-app market (Feb 2026 data), ByteDance's Doubao led at ~226.7M MAU — equal to the sum of apps #2–#5 — with DeepSeek #2 at ~135.6M, ahead of Tencent Yuanbao (~40.7M) and Alibaba Qwen (~25.2M).

Strategy, Moats & Talent

[35]DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL — arXiv T1 neutral en
DeepSeek-R1 uses Group Relative Policy Optimization (GRPO) RL, dropping the critic model; R1-Zero displayed a spontaneous 'aha moment' of re-evaluating its approach; R1 reached 79.8% on AIME 2024 (slightly above o1-1217) and 97.3% on MATH-500.
[36]DeepSeek-V3 Technical Report — arXiv (DeepSeek-AI) T1 neutral en
DeepSeek-V3 pairs Multi-head Latent Attention (MLA) with DeepSeekMoE and FP8 mixed-precision training to cut compute, claiming performance comparable to leading closed-source models while beating open-source peers.
[37]DeepSeek创始人专访：中国的AI不可能永远跟随 — 智源社区 (BAAI) T2 supporting zh
Liang Wenfeng's stated strategy: open-sourcing and publishing 'lose nothing'; the team is the real moat; the genuine gap is originality versus imitation; China must become a contributor rather than a free-rider.
[38]Interview with DeepSeek Founder: We're Done Following — The China Academy T2 supporting en
In the English translation of the same interview, Liang argues closed-source moats are fleeting — even OpenAI's closed model cannot prevent others catching up — and that open source is cultural and attracts talent.
[39]DeepSeek创始人专访 — 新浪财经 T2 neutral zh
Liang has argued China's AI cannot remain a perpetual follower — someone must stand at the technological frontier — and that DeepSeek's goal is research and exploration, not vertical apps.
[40]我所见过的梁文锋 — 腾讯新闻 T2 supporting zh
Liang describes a flat, bottom-up, no-KPI culture — hiring on 'passion and curiosity,' with natural rather than pre-assigned division of labor, and staff whose desire to do research exceeds their concern for money.
[41]字节回应「亿元年薪挖DeepSeek员工」 — 新浪财经 T2 critical zh
DeepSeek faces aggressive talent poaching: per Sina, ByteDance's Seed team lost ~70 people to rivals over a year and Tencent took ~30, while Xiaomi recruited former DeepSeek researcher Luo Fuli on a multi-million-yuan package — a sign its retention moat is contested.
[71]DeepSeek创始人专访 — 新浪财经 T2 neutral zh
Liang has said the V2 team had no overseas returnees — all domestic, drawn from top-university fresh grads and unfinished-PhD interns — reflecting a bet that home-grown talent can do frontier research.
[72]DeepSeek创始人专访 — 智源社区 (BAAI) T2 supporting zh
Liang argues being followed is itself rewarding for technical staff and that giving back via open source attracts talent — positioning openness as a recruiting and culture moat rather than a giveaway.

Risks & Controversies

[42]Microsoft probing whether DeepSeek improperly used OpenAI's API — TechCrunch T2 critical en
In January 2025 Microsoft and OpenAI investigated whether a DeepSeek-linked group exfiltrated large amounts of data via OpenAI's API in late 2024 to distill competing models — which would breach OpenAI's terms.
[43]DeepSeek首次回应「蒸馏OpenAI」质疑 — 第一财经 T2 supporting zh
DeepSeek's response (via its Nature paper, reported by Yicai): V3-Base training data came only from web pages and e-books with no intentionally added OpenAI synthetic data, though some scraped pages contained OpenAI-generated answers.
[44]硅谷掀桌！DeepSeek遭OpenAI和Anthropic围剿 — 量子位 T2 supporting zh
Reaction was mixed: OpenAI claimed evidence of distillation, but Anthropic's Dario Amodei called the threat exaggerated ('our level 7–10 months ago') and OpenAI's Mark Chen credited DeepSeek with 'independently discovering' core o1 ideas.
[45]US House Select Committee Report Accuses DeepSeek of Spying and Circumventing Export Controls — Tech Policy Press T2 critical en
An April 2025 US House Select Committee on the CCP report alleged DeepSeek funnels US user data via a China Mobile back-end, used distillation through false-pretense accounts, used restricted Nvidia chips, and censored ~85% of politically sensitive responses; DeepSeek did not respond to comment requests.
[46]DeepSeek Debates — SemiAnalysis T2 critical en
SemiAnalysis estimates DeepSeek/High-Flyer has access to ~50,000 Hopper GPUs (incl. ~10,000 H100s) — holdings that, if accurate, would sit uneasily with US export controls; the firm notes High-Flyer bought A100s in 2021 before controls. The smuggling claim rests on estimates and is unconfirmed.
[47]Wiz Research Uncovers Exposed DeepSeek Database — Wiz T1 critical en
In January 2025 Wiz Research found a publicly accessible, unauthenticated DeepSeek ClickHouse database exposing over a million log entries including plaintext chat history and API keys; DeepSeek secured it promptly after responsible disclosure.
[48]Evaluating Security Risk in DeepSeek and Other Frontier Reasoning Models — Cisco T1 critical en
Cisco (with UPenn) reported DeepSeek-R1 had a 100% attack-success rate against 50 HarmBench prompts — blocking none — versus much higher resistance for o1, Claude-3.5-Sonnet and GPT-4o, suggesting efficiency-first training weakened safety guardrails.
[49]Anthropic CEO says DeepSeek was the worst on a critical bioweapons data safety test — TechCrunch T2 critical en
Anthropic CEO Dario Amodei said DeepSeek scored worst of any model his team had tested on a bioweapons-data safety test, with 'no blocks whatsoever' — while adding it is not literally dangerous today and praising DeepSeek's 'talented engineers.'
[50]DeepSeek and Chinese censorship — EurasiaTimes T3 critical en
Documented tests show DeepSeek's hosted model aligns with Chinese government positions — deflecting on Tiananmen, echoing Beijing's line that Taiwan is 'an inalienable part of Chinese territory,' and avoiding naming Xi Jinping.
[51]Italy's data protection authority Garante blocked DeepSeek — Security Affairs T2 critical en
On 30 January 2025 Italy's Garante ordered the DeepSeek entities to immediately limit processing of Italian users' data and opened an investigation, after deeming their responses (including on whether data is stored in China) inadequate.
[52]Gov't confirms DeepSeek's unauthorized data transfer abroad — The Korea Times T2 critical en
South Korea's PIPC found DeepSeek transferred Korean users' data — including prompt content — overseas without consent (to ByteDance subsidiary Volcano among others); new downloads were suspended on 15 February 2025, and DeepSeek later complied and resumed service.
[53]Three States Ban DeepSeek Use on State Devices and Networks — National Law Review T2 critical en
Three US states, Texas (31 Jan), New York (10 Feb) and Virginia (11 Feb), banned DeepSeek on government devices and networks in early 2025, citing CCP data-harvesting and surveillance risks.
[54]Which countries have banned DeepSeek and why? — Al Jazeera T2 critical en
Multiple governments and agencies restricted DeepSeek in early 2025, including NASA and the US Navy, Australia (all government systems), Taiwan, and South Korean ministries, citing data-handling and security concerns.
[55]US lawmakers move to ban China's DeepSeek from government devices — NBC News T2 critical en
Feroot Security found code hard-coded into DeepSeek's web login that links it to state-owned China Mobile; academics at the University of Calgary and UC Berkeley confirmed the links for the AP. The claim is contested, and DeepSeek has not addressed it.

Reception & Adoption

[56]Marc Andreessen warns DeepSeek is AI's Sputnik moment — Fortune T2 supporting en
Marc Andreessen framed DeepSeek-R1's launch as 'AI's Sputnik moment' — the phrase that crystallized Western alarm and excitement.
[57]What is Jevons paradox? Why Satya Nadella says DeepSeek is good news — Fortune T2 neutral en
Microsoft CEO Satya Nadella called the model 'super impressive,' urged the West to take China's AI progress 'very, very seriously,' and reframed cheap AI as bullish via the Jevons paradox.
[58]Trump calls DeepSeek a wake-up call for U.S. tech — Fortune T2 neutral en
President Trump called DeepSeek a 'wake-up call' for US tech but framed China's cheaper method as 'a positive,' saying the US could reach the same solution without spending 'billions and billions.'
[59]Jensen Huang says market got it wrong about DeepSeek's impact — TechCrunch T2 supporting en
Nvidia CEO Jensen Huang argued the market misread DeepSeek — reasoning models like R1 increase, not decrease, demand for compute.
[60]On DeepSeek and Export Controls — Dario Amodei T1 critical en
Anthropic CEO Dario Amodei argued DeepSeek's threat to US AI leadership is 'greatly overstated' and that V3 is an expected point on the cost-reduction curve — the difference being a Chinese firm got there first — and that export controls had not 'failed.'
[61]DeepSeek displaces ChatGPT as the App Store's top app — TechCrunch T2 supporting en
DeepSeek displaced ChatGPT as the #1 free US App Store app on 26 January 2025, was #1 in 51 other countries and top-10 in 111, with downloads more than doubling to ~2.6M in roughly three days (reported).
[62]中国AI原生App月活突破2.4亿：DeepSeek居榜首 — 新浪财经 T2 supporting zh
Per Sina, as of February 2025 China's AI-native apps reached 240M MAU, with DeepSeek leading at ~194M (1.94亿), Doubao second at ~116M (1.16亿) and Tencent Yuanbao third at ~41.6M (reported/estimated).
[63]DeepSeek's namesake chatbot sees a drop in downloads — South China Morning Post T2 critical en
DeepSeek's chatbot downloads fell 72% to 22.6M in Q2 2025 and MAU dipped ~9% QoQ to ~170M, while Doubao overtook it on downloads (29.8M, +9.5%) — the clearest evidence of a plateau.
[68]网传梁文锋回应冯骥国运论 — 腾讯新闻 T3 supporting zh
SENTIMENT (Tier-3, virality unverified): a widely circulated New Year's Eve post attributed to Liang answered a game designer's 'national-fortune' (国运级) praise with deliberate humility — 'standing on the shoulders of open-source giants' — capturing China's national-pride narrative; the hosting article itself flags the post may be fabricated.
[74]梁文锋与「AI界拼多多」 — 澎湃新闻 T2 supporting zh
DeepSeek's rise was received in China as a point of national pride — its low-cost, open approach turned founder Liang Wenfeng into a domestic symbol of home-grown innovation.

Forward View

[64]DeepSeek R2 launch stalled as CEO balks at progress — Reuters / Investing.com T2 critical en
R2, planned for around May 2025, stalled: The Information reported that CEO Liang Wenfeng was unsatisfied with its performance, a problem compounded by a shortage of Nvidia chips in China after the April 2025 H20 export curbs.
[65]DeepSeek reportedly urged to train on Huawei hardware after multiple failures — Tom's Hardware (citing FT) T2 critical en
Per FT-sourced reporting, Chinese authorities urged DeepSeek to train on Huawei Ascend after R1, but persistent failures — instability, slow interconnects, immature CANN software — meant it never completed a fully successful Ascend training run and reverted to Nvidia for training while using Ascend for inference.
[66]DeepSeek unveils V4 model, with rock-bottom prices and close Huawei-chip integration — Fortune T2 neutral en
DeepSeek previewed V4 on 24 April 2026, co-engineered to run on Huawei Ascend — V4-Pro (1.6T params, 1M-token context) and a cheaper V4-Flash — with reporting that V4 'falls marginally short of GPT-5.4 and Gemini 3.1 Pro,' trailing the frontier by roughly 3–6 months.
[67]DeepSeek, Huawei, Export Controls, and the Future of the U.S.-China AI Race — CSIS T1 neutral en
CSIS frames the open question: DeepSeek's team is elite and would be formidable with more compute, but each Huawei Ascend 910C delivers only ~60% of an H100's inference performance and its CANN software is 'difficult and unstable', so the compute ceiling, not talent, is the binding constraint.
[75]DeepSeek, Huawei, Export Controls — CSIS T1 supporting en
Analysts note a bull case: if DeepSeek's elite team can make domestic Huawei/Ascend silicon work at scale, escaping the export-control ceiling could matter more for its long-run trajectory than any single model release.
[76]DeepSeek API价格新低 — 证券时报 T2 supporting zh
DeepSeek has kept using aggressive, permanent price cuts (V4-Pro down ~75% in 2026) to defend developer mindshare and API relevance even as consumer usage plateaus — a deliberate low-cost-leadership play.
[77]DeepSeek或首次融资 — 新浪财经 T2 neutral zh
Chinese press reported DeepSeek may pursue its first external financing (target valuation $10B+), which some read as building a war chest for the compute-heavy next model generation (unconfirmed).

Links were cross-checked at build time by an automated link checker; a few primary sources that may be paywalled or bot-walled were verified manually. See Methodology & Limits.