The AI Chip War: TSMC, NVIDIA, and Huawei Explained

 


The Chip War: How TSMC, NVIDIA, and Huawei Are Competing for AI's Hardware Future

The B200 GPU that trains your favorite AI model costs between $5,700 and $7,300 to manufacture and sells for $30,000 to $40,000, which works out to roughly an 81% gross margin on a single piece of silicon at the midpoint of both ranges. That number, buried in an Epoch AI component analysis, explains more about the global semiconductor competition than any press release ever has. Three companies (a Taiwanese foundry, an American GPU designer, and a Chinese national champion) are fighting over who gets to manufacture and profit from the brains of an AI economy that is spending its way toward what Deloitte estimates will be a $975 billion global semiconductor market in 2026.

Most coverage of the chip war treats it as a clean narrative: the US restricts exports, China falls behind, NVIDIA profits. The actual picture is more fractured. Export controls leak. Huawei shipped a rack-scale system that outperforms NVIDIA's flagship cluster on certain metrics while consuming four times the power. TSMC's Arizona fab lost nearly an entire quarter's profit to a single gas supply outage. DeepSeek, the Chinese AI lab that panicked global markets in January 2025, reportedly trained its most celebrated model on smuggled chips while publicly claiming to have done it cheaply on restricted hardware. The narrative everyone agreed on two years ago is not the narrative that the data supports today.

This article delivers three things: a precise accounting of where each company actually stands as of mid-2026, a clear-eyed reading of the export control regime's real effect versus its intended one, and a verdict on what the hardware competition means for the developers, enterprises, and governments that depend on these chips. The cheerleading on both sides is noise. The numbers are not.


  1. The Foundry That Won Before the War Started: TSMC's Manufacturing Hold
  2. NVIDIA's Margin Machine: The Economics of Artificial Scarcity
  3. Huawei's Counter: Real Progress, Real Ceiling
  4. The Software Layer Nobody Wants to Talk About
  5. Export Controls: What They Actually Accomplished
  6. Who This Is For
  7. Verdict
  8. FAQ

The Foundry That Won Before the War Started: TSMC's Manufacturing Hold

TSMC controlled 70% of global semiconductor foundry revenue as of Q3 2025, according to TrendForce data compiled by The Motley Fool. Its nearest competitor, Samsung, sits at 7.1%, roughly a tenth of TSMC's share. That gap did not happen by accident; it accumulated over three decades of compounding reinvestment, and it explains why every other story in the chip war eventually runs through Hsinchu.

The financial profile in 2026 is staggering. TSMC's Q1 2026 earnings show revenue of $35.90 billion — up 40.6% year-over-year — with a net profit margin of 50.5% and a gross margin of 66.2%. Full-year 2025 revenue was $121.4 billion, a 37.56% jump from 2024. Advanced technologies — 7nm and below — account for 74% of total wafer revenue, and the newly launched 2nm process entered volume production at the end of 2025, with initial yields reportedly exceeding expectations. High-performance computing, the category that encompasses AI accelerators, nearly doubled over two years to NT$2.19 trillion. North America now generates 75% of TSMC's revenue. The customers are not buying commodity chips. They are buying access to manufacturing capability that no one else on Earth can replicate at scale.
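The growth percentages above can be turned back into the prior-period figures they imply. The sketch below does only that arithmetic; it assumes the growth rates were computed in the same currency as the quoted revenue figures, which the text does not state, and the helper name `implied_prior` is illustrative.

```python
def implied_prior(current: float, growth_pct: float) -> float:
    """Back out the prior-period value from a current value and its growth rate."""
    return current / (1 + growth_pct / 100)

# Figures quoted in the text, in billions of US dollars.
Q1_2026_REVENUE = 35.90
FY_2025_REVENUE = 121.4

q1_2025 = implied_prior(Q1_2026_REVENUE, 40.6)    # implied Q1 2025 revenue
fy_2024 = implied_prior(FY_2025_REVENUE, 37.56)   # implied full-year 2024 revenue

# Margins applied to the quarter's revenue give the implied profit figures.
q1_gross_profit = Q1_2026_REVENUE * 0.662
q1_net_profit = Q1_2026_REVENUE * 0.505

print(f"Implied Q1 2025 revenue: ${q1_2025:.2f}B")
print(f"Implied FY 2024 revenue: ${fy_2024:.2f}B")
print(f"Implied Q1 2026 gross / net profit: ${q1_gross_profit:.2f}B / ${q1_net_profit:.2f}B")
```

On these inputs the quarter's revenue grew by roughly $10 billion in a year, and about half of every revenue dollar reached net profit.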

The Arizona reality check: A gas supplier outage in Q3 2025 caused several hours of downtime at TSMC's Phoenix fab, scrapping thousands of wafers and reducing quarterly profits at that facility by 99%. The Taiwan operations have never experienced anything comparable. That gap — between what TSMC can do in Hsinchu and what it can do in Phoenix — is the one number the CHIPS Act boosters consistently omit.

The geopolitical architecture around TSMC has been shifting faster than the manufacturing roadmap. TSMC has committed $165 billion to building up to twelve fabs in Arizona, with 4nm production already running at Fab 1, and Fabs 3 and 4 designated for 2nm and the A16 (1.6nm) process. Commerce Secretary Lutnick has stated publicly that 95% of the advanced chips the US depends on are made in Taiwan, and that he wants to raise America's share of global output from 2% to 40%. The gap between those two numbers is the gap TSMC is being paid to close.

What the geopolitical framing misses is that TSMC's dominance is not merely about geography. SMIC — China's most advanced domestic foundry — has roughly 45,000 wafer starts per month of advanced-node capacity, expanding toward 60,000 by end of 2026. TSMC operates at a different order of magnitude with superior yields, better packaging integration, and exclusive access to ASML's EUV machines, which China remains blocked from purchasing. The manufacturing gap is not narrowing fast enough to matter for AI chips in this decade.

"The competitive gap is most visible at the leading edge, where TSMC's 2nm node is already in volume production while rivals remain at least 12–18 months behind — and both plants are completely sold out for the year."

NVIDIA's Margin Machine: The Economics of Artificial Scarcity

NVIDIA's data center business generated $41.1 billion in revenue in a single quarter (Q2 fiscal 2026, ending July 2025), according to the company's SEC filing. That figure is up 56% from a year earlier and represents more than NVIDIA made across its entire business in fiscal 2022. The Blackwell architecture, specifically the B200 GPU fabricated on TSMC's 4NP process, is responsible for most of the acceleration. Blackwell data center revenue grew 17% sequentially in that quarter alone.

The economics underneath these numbers are worth examining without the reverence they usually receive. Epoch AI estimated the bill of materials for a single B200 at $5,700 to $7,300, based on TSMC wafer costs, HBM3e packaging, and component prices. The street price is $30,000 to $40,000. At a midpoint of $35,000 sale price and $6,500 production cost, NVIDIA earns roughly $28,500 gross profit per B200 shipped — an 81% hardware gross margin. This number, which the typical market analysis never includes, explains why NVIDIA's blended company gross margin was 72.4% for the quarter despite consumer GPU lines and gaming revenue pulling it down. The B200 is not a product. It is a toll booth.
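The margin arithmetic is easy to reproduce, and it is worth seeing how far the answer moves across the quoted ranges. This is a minimal sketch using only the cost and price figures in the text; the function and variable names are illustrative.

```python
def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of the sale price."""
    return (price - cost) / price

# Figures quoted in the text: Epoch AI cost estimate and reported street prices.
COST_LOW, COST_HIGH = 5_700, 7_300
PRICE_LOW, PRICE_HIGH = 30_000, 40_000

midpoint_price = (PRICE_LOW + PRICE_HIGH) / 2   # 35,000
midpoint_cost = (COST_LOW + COST_HIGH) / 2      # 6,500

midpoint_margin = gross_margin(midpoint_price, midpoint_cost)
worst_case = gross_margin(PRICE_LOW, COST_HIGH)   # cheapest sale, costliest build
best_case = gross_margin(PRICE_HIGH, COST_LOW)    # priciest sale, cheapest build

print(f"Gross profit at midpoint: ${midpoint_price - midpoint_cost:,.0f}")
print(f"Margin at midpoint: {midpoint_margin:.1%}")
print(f"Margin range: {worst_case:.1%} to {best_case:.1%}")
```

Even the least favorable pairing of price and cost leaves the margin above 75%, so the 81% headline is not an artifact of choosing the midpoint.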

The supply side enforces the pricing. Backlogs reached 3.6 million units by late 2025, and new buyers in 2026 face 12 to 18-month wait times for volume B200 orders. Every AI lab paying premium prices for early allocation is subsidizing NVIDIA's moat. Jensen Huang has described data centers equipped with Blackwell as "AI factories" — a framing that positions electricity-in, intelligence-out as the defining industrial metaphor of the decade.

The Performance Lead That Compounds

Blackwell delivers roughly 5x the AI inference performance of the H100 per GPU, according to NVIDIA's own architecture documentation. A single GB200 "Superchip" — which pairs two B200 GPUs with a Grace ARM CPU — can exceed 1,200W peak power draw. A hypothetical cluster of one million such GPUs would consume between 1.0 and 1.4 gigawatts, enough to power a mid-sized city. This is why Microsoft and Meta have begun investing directly in small modular reactors and fusion energy research: the Blackwell performance advantage comes with an energy bill that forces infrastructure decisions that were, until recently, the exclusive concern of utility companies.
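The gigawatt figure follows from simple multiplication. The sketch below assumes a per-GPU draw of 1,000 to 1,400 W, which is the band the 1.0 to 1.4 GW range implies rather than a published specification. The `pue` parameter (power usage effectiveness, a standard data center overhead multiplier) is an addition not discussed in the text and defaults to no overhead.

```python
GPU_COUNT = 1_000_000

# Assumed per-GPU draw in watts, inferred from the 1.0-1.4 GW range in the text.
WATTS_LOW, WATTS_HIGH = 1_000, 1_400

def cluster_gigawatts(gpus: int, watts_per_gpu: float, pue: float = 1.0) -> float:
    """Total draw in gigawatts; pue > 1.0 adds cooling and facility overhead."""
    return gpus * watts_per_gpu * pue / 1e9

low = cluster_gigawatts(GPU_COUNT, WATTS_LOW)
high = cluster_gigawatts(GPU_COUNT, WATTS_HIGH)
print(f"GPU draw alone: {low:.1f} to {high:.1f} GW")

# Hypothetical facility overhead of 30%, to show how the total scales.
print(f"With pue=1.3: {cluster_gigawatts(GPU_COUNT, WATTS_HIGH, pue=1.3):.2f} GW")
```

Any facility overhead pushes the total above the chip-only range, which is part of why the power question ends up on the desks of people who plan generation capacity.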

CUDA is the other compounding variable. Launched in 2006, NVIDIA's parallel computing platform has accumulated nearly two decades of library development, tooling, and embedded workflow assumptions. Virtually every AI framework of consequence — PyTorch foremost among them — is built with CUDA integration at its core. A developer who wants to leave NVIDIA hardware does not just buy different chips; they rewrite infrastructure and lose access to years of ecosystem code. This lock-in is NVIDIA's actual moat. The chips are how it is maintained.

Huawei's Counter: Real Progress, Real Ceiling

Huawei's Ascend 910C delivers an estimated 800 TFLOPS of FP16 compute, built on SMIC's 7nm DUV process, with 96GB of HBM2e and approximately 1,800 GB/s of memory bandwidth. On paper, those specifications approach the NVIDIA H100. In production, the reality is more constrained. DeepSeek, the lab that became China's most credible AI story in 2025, acknowledged publicly that the Ascend 910C performs at roughly 60% of the level of the H100, even though its spec sheet reaches about 80% of that chip's theoretical peak performance. That 20-point translation loss, from spec to deployed efficiency, is not a marketing problem. It is a manufacturing and software problem that Huawei has not yet solved.

The hardware story gets stranger at the system level. Huawei's CloudMatrix 384 — a rack-scale supernode connecting 384 Ascend 910C chips in an all-optical mesh — reportedly delivers 300 petaFLOPs of BF16 compute, roughly double the NVIDIA GB200 NVL72's approximately 150 petaFLOPs, with 49.2 terabytes of HBM versus the GB200's 13.8 terabytes. Tom's Hardware analysis noted that achieving this required running the CloudMatrix at approximately four times the power consumption of the NVIDIA system. Winning on compute by spending four times the electricity is not a scalable strategy for global deployment. It is a workaround.
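Put side by side, the system-level numbers show what the extra electricity buys. The sketch below uses only the figures quoted above, with power expressed relative to the NVL72 because the text gives a ratio rather than absolute wattage; the `RackSystem` class is illustrative.

```python
from dataclasses import dataclass

@dataclass
class RackSystem:
    name: str
    bf16_pflops: float      # reported BF16 compute, in petaFLOPs
    hbm_tb: float           # total HBM capacity, in terabytes
    relative_power: float   # power draw relative to the GB200 NVL72

cloudmatrix = RackSystem("CloudMatrix 384", 300.0, 49.2, 4.0)
nvl72 = RackSystem("GB200 NVL72", 150.0, 13.8, 1.0)

compute_ratio = cloudmatrix.bf16_pflops / nvl72.bf16_pflops
memory_ratio = cloudmatrix.hbm_tb / nvl72.hbm_tb
power_ratio = cloudmatrix.relative_power / nvl72.relative_power

# Compute delivered per unit of power, relative to the NVIDIA system.
efficiency_ratio = compute_ratio / power_ratio

# Cross-check: 384 chips at the estimated 800 TFLOPS each, in petaFLOPs.
per_chip_total = 384 * 800 / 1000

print(f"Compute: {compute_ratio:.1f}x, memory: {memory_ratio:.2f}x, power: {power_ratio:.1f}x")
print(f"Compute per watt vs NVL72: {efficiency_ratio:.2f}x")
print(f"384 x 800 TFLOPS = {per_chip_total:.1f} PFLOPs")
```

Twice the compute at four times the power is half the compute per watt, and the per-chip cross-check lands close to the reported 300 petaFLOPs.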

DeepSeek R2 was delayed in part because of issues training on Huawei Ascend hardware at scale, according to reporting from August 2025. The lab returned to NVIDIA H800 chips — technically banned from sale in China — for the critical training runs. That detail is the one that advocates for Chinese AI chip independence have been most careful to avoid discussing. The 910C works for inference workloads and smaller fine-tuning runs. Pre-training frontier models on it, at scale, is where the gap with NVIDIA remains substantial.

The figure behind the figure: A Georgetown CSET analysis found that when the Ascend 910 moved to the 910B, only 75% of the advertised theoretical maximum performance increase reflected actual measured hardware gains. The remaining 25% was specification arithmetic that did not survive contact with benchmarks. That gap — between what a chip is rated at and what it delivers under real workloads — has been a consistent pattern in the Ascend line, and it compresses the real-world competition even further than the top-line specs suggest.

Supply constraints are structural, not temporary. Huawei plans to manufacture approximately 600,000 Ascend 910C units in 2026, nearly double 2025 output, according to industry reporting. But the SemiAnalysis production ramp analysis identifies HBM (high-bandwidth memory) as the binding constraint: Huawei received over 2.9 million Ascend dies from TSMC before export controls tightened, and domestic HBM production from CXMT can support only 250,000 to 300,000 complete Ascend 910C packages at current yields. Without access to foreign HBM, the dies accumulate without the memory needed to turn them into functioning accelerators.
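A binding constraint is whichever input runs out first, which can be sketched as a minimum over the inputs. The code below assumes two logic dies per 910C package; that reflects the commonly reported dual-die design but is not stated in the text, so treat it as an assumption. All other numbers are the ones quoted above.

```python
DIES_RECEIVED = 2_900_000          # "over 2.9 million", so this is a floor
DIES_PER_PACKAGE = 2               # assumption: dual-die 910C package
HBM_PACKAGES = (250_000, 300_000)  # packages domestic HBM can support
PLANNED_2026 = 600_000             # planned 2026 output

def buildable_packages(dies: int, dies_per_package: int,
                       hbm_packages: int) -> tuple[int, str]:
    """Return how many packages can be built and which input limits them."""
    by_dies = dies // dies_per_package
    if hbm_packages < by_dies:
        return hbm_packages, "HBM"
    return by_dies, "dies"

for hbm in HBM_PACKAGES:
    units, limit = buildable_packages(DIES_RECEIVED, DIES_PER_PACKAGE, hbm)
    shortfall = PLANNED_2026 - units
    print(f"HBM for {hbm:,}: build {units:,} (limited by {limit}), "
          f"{shortfall:,} short of plan")
```

Under these assumptions the banked dies could supply well over a million packages, so memory sets the ceiling at roughly half of the planned output.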

That is the ceiling. SMIC is stuck at 7nm DUV because China cannot access EUV equipment. HBM from domestic sources is not yet at the volumes or specifications the Ascend line requires. And according to CFR analysis of Huawei's own public roadmap, the chips Huawei plans to release in 2026 — the Ascend 950PR and 950DT — reportedly have lower theoretical peak performance than the current 910C. That apparent regression may reflect SMIC's limits more than any strategic design choice.

Every Huawei chip is a domestic victory and a global footnote.

The Software Layer Nobody Wants to Talk About

Hardware specs are the language of press releases. The real competition is in software, and that race is more complicated than the chip counts suggest. CUDA's advantages are not primarily about the platform itself — they are about the two decades of libraries, tutorials, tooling, and workflow assumptions built on top of it. A Chinese developer who uses PyTorch is already using a framework whose deepest performance paths run through CUDA abstractions.

Huawei's response has been methodical. In August 2025, the company open-sourced its CANN toolkit — a direct attempt to compete with CUDA's proprietary model — and began recruiting Chinese AI labs and universities to build out an Ascend developer community. The torch_npu backend plugin now allows standard PyTorch code to run on Ascend processors without rewriting the framework layer. That single move lowered the switching cost more than any chip specification improvement could. Bruegel analysis notes that Huawei's software strategy mirrors China's hardware playbook: open up what the incumbent keeps closed, subsidize adoption, and build a captive generation of developers around the alternative.
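In practice, the lower switching cost looks like a device string rather than a rewrite. The sketch below is a hedged illustration, not Huawei's documented workflow: it assumes that importing `torch_npu` registers an `npu` device type and exposes `torch.npu.is_available()`, as the Ascend PyTorch adapter is generally described as doing, and it falls back to the CPU when neither PyTorch nor an accelerator is present. The helper names are illustrative.

```python
def available_backends() -> list[str]:
    """List the device types usable in this environment, always including 'cpu'."""
    found = ["cpu"]
    try:
        import torch
    except ImportError:
        return found  # PyTorch not installed; only the CPU label applies

    if torch.cuda.is_available():
        found.append("cuda")

    try:
        # Assumption: importing the plugin registers the "npu" device type.
        import torch_npu  # noqa: F401
    except (ImportError, OSError, RuntimeError):
        return found  # plugin or its driver stack is missing

    if torch.npu.is_available():
        found.append("npu")
    return found

def pick_device(preference: tuple[str, ...] = ("npu", "cuda", "cpu")) -> str:
    """Return the first preferred device type that is actually available."""
    found = available_backends()
    for name in preference:
        if name in found:
            return name
    return "cpu"

device = pick_device()
print(f"Selected device: {device}")
```

With PyTorch installed, the returned string can be passed to `torch.device(...)` and then to `model.to(...)`, so the model code itself stays the same across NVIDIA and Ascend hardware.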

The limitation is real: Huawei's CANN framework is newer, less stable, and has a smaller real-world feedback loop than NVIDIA's ecosystem. With fewer major customers training frontier models on Ascend hardware, Huawei lacks the high-volume data needed to rapidly surface and fix edge cases. Bugs and overheating issues reported by early adopters have slowed enterprise adoption. Chinese developers who can get access to NVIDIA chips (through secondary markets, overseas servers, or stockpiled inventory) still overwhelmingly prefer them.

DeepSeek as a Strategic Paradox

DeepSeek occupies the strangest position in this argument. Its models are optimized for Huawei Ascend inference; its V4 was trained — according to CFR reporting — at least in part on NVIDIA Blackwell chips that are banned from sale in China. The public framing and the private infrastructure point in opposite directions. DeepSeek itself acknowledged in its V4 technical paper that the model trails leading US frontier models by approximately three to six months. That gap is broadly consistent with estimates of a seven-month US lead. The surprise of January 2025 was real; the permanent closing of the frontier gap has not followed.

Export Controls: What They Actually Accomplished

The US export control regime targeting advanced chips began in October 2022 and has been revised, expanded, and partially reversed several times since. The April 2025 ban on the NVIDIA H20 — the chip specifically engineered to fall below earlier thresholds — triggered a $4.5 billion charge against NVIDIA's earnings, as the company wrote down H20 inventory and purchase obligations that suddenly had no legal buyer. The May 2025 BIS guidance went further, prohibiting US and non-US persons from using, selling, financing, or servicing Huawei's Ascend 910B, 910C, and 910D chips globally — not just in China.

Then in December 2025, the Trump administration partially reversed course, announcing that the H200 and similar chips could be shipped to approved customers in China, subject to security certifications. BIS formally implemented this in January 2026, shifting the review posture for H200-class chips from "presumption of denial" to "case-by-case review." The policy moved 180 degrees in eight months.

Two serious analysts reach genuinely conflicting conclusions about what this all means. The Council on Foreign Relations analysis argues that export controls have been effective, that China's best AI chips are declining in capability relative to the roadmap, and that the US will hold a 17x advantage in total compute capacity over China by 2027. CSIS responds that the controls have limits, that constraint has spurred genuine algorithmic innovation in China, and that in March 2025 a Peking University team announced a 2D transistor that could operate 40% faster than TSMC's 3nm devices while consuming 10% less energy. Both readings are accurate on their own terms. The controls slowed Huawei without stopping it. The algorithmic innovations are real but do not yet close the compute gap. Neither side wants to acknowledge what the other has right.

Everyone who assumed export controls would work cleanly has been surprised. Everyone who assumed they would accomplish nothing has also been surprised.

Who This Is For

  • Enterprise AI teams planning infrastructure in 2026–2027: If your training workloads run at scale and you can accept 12-to-18-month lead times, Blackwell B200 clusters remain the best-performing option. If you cannot wait and do not need frontier training capability, H100 supply is more accessible. The B200's 5x inference advantage over H100 translates directly to cost per token for inference-heavy production workloads.
  • Chinese AI developers and enterprises: The Huawei Ascend 910C handles inference workloads and smaller fine-tuning competently. Full pre-training of frontier models on Ascend hardware at scale remains technically problematic, and DeepSeek's own R2 training delay provides the most credible public data point. Expect a hybrid model: Ascend for inference, whatever NVIDIA hardware can be obtained for training.
  • Policymakers and defense analysts tracking the compute gap: The 10x US advantage in total compute capacity identified by RAND is real and currently widening. The more nuanced question — whether algorithmic efficiency can compensate for compute deficits in specific domains — is where Chinese AI development has produced its most credible results. The answer is probably domain-specific.
  • TSMC customers and supply chain planners: TSMC's 2nm capacity is sold out through 2026. Advanced packaging (CoWoS) remains the binding constraint for HBM-integrated AI chips. Any company that did not secure allocation in 2024 or early 2025 is competing for scraps in a market that is not adding meaningful capacity before 2027.
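For the inference-heavy case in the first bullet, the 5x throughput claim translates into cost per token only once hourly cost is included. The sketch below uses hypothetical throughput and hourly prices, chosen purely for illustration and not taken from the text or from any vendor; only the 5x ratio comes from the article.

```python
def cost_per_million_tokens(hourly_cost: float, tokens_per_second: float) -> float:
    """Cost in dollars to generate one million tokens at a steady throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost / tokens_per_hour * 1_000_000

# Hypothetical inputs: baseline throughput and hourly prices are illustrative.
H100_TPS = 1_000.0
B200_TPS = 5 * H100_TPS   # the roughly 5x inference figure from the text
H100_HOURLY = 2.00        # hypothetical dollars per GPU-hour
B200_HOURLY = 6.00        # hypothetical dollars per GPU-hour

h100 = cost_per_million_tokens(H100_HOURLY, H100_TPS)
b200 = cost_per_million_tokens(B200_HOURLY, B200_TPS)

# The B200 stays cheaper per token until its hourly cost reaches 5x the H100's.
break_even_hourly = H100_HOURLY * (B200_TPS / H100_TPS)

print(f"H100: ${h100:.3f} per million tokens")
print(f"B200: ${b200:.3f} per million tokens")
print(f"B200 break-even hourly cost: ${break_even_hourly:.2f}")
```

The useful number for planning is the break-even: as long as a B200 hour costs less than five H100 hours, the newer chip wins on cost per token, whatever the absolute prices turn out to be.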

Verdict

TSMC wins the foundry war regardless of who wins anything else — there is no plausible scenario in this decade where its manufacturing lead is erased. NVIDIA wins the AI accelerator market as long as CUDA remains the path of least resistance for AI development, and nothing in 2026 threatens that. Huawei has demonstrated genuine engineering capability, particularly at the system architecture level, and genuine manufacturing constraints, particularly at the transistor and memory level. The export control regime has compressed Huawei's roadmap without stopping it, and the policy reversals since April 2025 have introduced enough uncertainty that no one can credibly predict where the lines will be drawn in 2027.

The chip war is not a war with a finish line. It is a compounding race in which the leading side has to keep running fast enough that the gap stays meaningful, and the following side has to keep finding architectural workarounds that reduce the cost of the gap. Both are succeeding. Neither is winning decisively.

What no one in this debate has adequately answered is whether the compute gap will still determine AI outcomes once inference efficiency improvements — the category where Chinese researchers have genuinely closed ground — make the training compute advantage less determinative of deployed capability. The benchmarks say the US leads. The question of whether benchmarks capture the right thing is still open.

FAQ


What is the chip war between the US and China about?

The US-China chip war is a competition for control over the design, manufacturing, and deployment of advanced semiconductors that power artificial intelligence systems. The US has imposed export controls restricting China's access to cutting-edge chips and the manufacturing equipment needed to produce them domestically. China, led by Huawei in AI chips and SMIC in manufacturing, is working to build domestic alternatives. The foundational issue is that advanced AI requires specific classes of chips that only a small number of companies — primarily NVIDIA and TSMC — currently produce at scale and with competitive performance.

Is Huawei's Ascend chip as good as NVIDIA's H100?

Huawei's Ascend 910C delivers roughly 60% of the H100's real-world performance on AI workloads, according to DeepSeek's own published findings, despite reaching approximately 80% of its theoretical peak performance on paper. The gap between specification and deployed performance reflects SMIC's 7nm DUV manufacturing limits and software ecosystem immaturity compared to NVIDIA's CUDA. At the system level, Huawei's CloudMatrix 384 can match or exceed NVIDIA's GB200 NVL72 cluster on some metrics, but requires approximately four times the power consumption to do so.

Why is TSMC so dominant in semiconductor manufacturing?

TSMC controls roughly 70% of global foundry revenue as of Q3 2025, a position built over three decades of compounding reinvestment in process technology and manufacturing discipline. It is the exclusive producer for NVIDIA's Blackwell GPUs, Apple's chips, and most leading AI accelerators. Its 2nm process entered volume production at end of 2025 while rivals remain 12 to 18 months behind. No competitor has access to the combination of ASML EUV equipment, advanced packaging capability, and sustained capital investment that defines TSMC's manufacturing position.

What did US export controls on chips actually accomplish?

Export controls imposed since 2022 significantly constrained Huawei's ability to access advanced chip manufacturing equipment and TSMC's advanced process nodes. They also limited NVIDIA's ability to sell its most powerful chips in China, costing the company a $4.5 billion charge when the H20 ban took effect in April 2025. However, controls have been partially reversed, actively circumvented through chip smuggling and stockpiling, and have produced unintended consequences by spurring Chinese algorithmic innovation. Independent analyses from CFR and CSIS reach different conclusions about their overall effectiveness.

How much does an NVIDIA B200 GPU cost, and why is it so expensive?

The NVIDIA B200 GPU sells for $30,000 to $40,000 per unit, with a bill of materials estimated at $5,700 to $7,300, producing a gross margin of roughly 81% per chip at the midpoints. Supply constraints reinforce the pricing: the B200 order backlog reached 3.6 million units by late 2025, and new buyers face 12 to 18-month wait times for volume orders. The B200 delivers approximately 5x the inference performance of the H100, which justifies the pricing for AI workloads where throughput directly determines cost per token at scale.

Can DeepSeek train its models on Huawei chips instead of NVIDIA?

DeepSeek's V4 model runs inference on Huawei Ascend processors, but training — the more compute-intensive and supply-constrained part of the process — reportedly still depends on NVIDIA hardware, including chips obtained via channels that bypass export restrictions. DeepSeek R2's development was reportedly delayed due to training difficulties at scale on Ascend hardware. The pattern across Chinese AI labs is consistent: Huawei chips are viable for inference and fine-tuning, but pre-training frontier models at scale on Ascend hardware remains technically problematic in 2026.

TSMC vs. SMIC — what is the real manufacturing gap?

TSMC's 2nm process is in volume production as of end-2025, while SMIC remains constrained to 7nm DUV manufacturing because China lacks access to ASML's EUV lithography equipment. SMIC's estimated advanced-node capacity in 2025 was approximately 45,000 wafer starts per month, expanding toward 60,000 by end-2026 — a fraction of TSMC's scale. The equipment gap, not the design or investment gap, is the primary constraint limiting China's ability to manufacture competitive AI chips domestically, and it cannot be bridged without access to EUV technology that export controls currently block.

Is the US losing the AI chip war to China?

No — the US holds roughly a 10-fold advantage in total AI compute capacity over China, a gap that RAND research suggests may widen through 2027. TSMC and NVIDIA hold dominant positions in manufacturing and AI accelerator hardware that China has not yet replicated. However, Chinese AI labs have produced genuine algorithmic innovations — particularly in training efficiency, mixture-of-experts architectures, and inference optimization — that partially compensate for compute deficits in specific domains. The US leads on hardware; China has closed ground on algorithmic efficiency. Whether that efficiency gain can substitute for compute in future AI development is the genuinely unresolved question.
