The fuel of the AI revolution. Why the shortage of GPUs and CoWoS packaging is structural, not cyclical. The most crowded trade on Earth — or just getting started?
The bottleneck is NOT wafer capacity. TSMC can print the chips; they cannot package them fast enough.
Demand growing faster than capacity additions. Deficit persists through at least 2027.
Confirmed by $300B+ hyperscaler capex guidance (MSFT, META, GOOG, AMZN) for 2025–2026.
TSMC adding CoWoS capacity aggressively — but demand is growing ~70–80% YoY.
The popular narrative is simple: "We don't have enough GPUs." This is technically true but deeply misleading. The real bottleneck is not the GPU die itself — it is the advanced packaging that turns a bare die into a functional AI accelerator. Understanding this distinction is critical because it changes which companies benefit most and how long the shortage lasts.
Here is the pipeline: TSMC can manufacture the Blackwell B200 GPU die on its N4P process node in approximately 8–10 weeks. That is not the problem. The problem is what happens after the die comes off the wafer:
| Metric | 2023 | 2024 | 2025E | 2026E | 2027E | 2028E |
|---|---|---|---|---|---|---|
| CoWoS Supply (K wafers/mo) | 12 | 15 | 40 | 60 | 80 | 100 |
| CoWoS Demand (K wafers/mo) | 15 | 28 | 55 | 85 | 110 | 130 |
| Deficit (K wafers/mo) | -3 | -13 | -15 | -25 | -30 | -30 |
| Utilization Rate | 100% | 100% | 100% | 100% | 100% | 100% |
Sources: TSMC Investor Relations, Gartner, TrendForce, Market Watch estimates.
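To see how the table's deficit and growth figures relate, here is a quick recomputation in Python (a sketch; the inputs are simply the estimates from the table above):

```python
# Recompute the deficit and year-over-year demand growth from the
# table above. All figures are the table's own estimates (K wafers/mo).
years  = [2023, 2024, 2025, 2026, 2027, 2028]
supply = [12, 15, 40, 60, 80, 100]
demand = [15, 28, 55, 85, 110, 130]

for i, year in enumerate(years):
    line = f"{year}: deficit {supply[i] - demand[i]:+d}K wafers/mo"
    if i > 0:
        growth = (demand[i] / demand[i - 1] - 1) * 100
        line += f", demand +{growth:.0f}% YoY"
    print(line)
```

The implied demand growth runs near 90–96% in 2024–2025 and decelerates toward ~18% by 2028, yet the deficit never narrows, because supply additions merely keep pace with a much larger base.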
Chip-on-Wafer-on-Substrate (CoWoS) is TSMC's advanced 2.5D packaging technology. It allows multiple silicon dies (GPU compute dies + HBM memory stacks) to be placed side-by-side on a large silicon interposer, then bonded to an organic substrate.
Why it is the bottleneck: An NVIDIA B200 GPU requires a CoWoS interposer that is roughly 2.5x the size of the compute die itself. This interposer must be fabricated on a 300mm wafer with extreme precision, and then the GPU die and 8 stacks of HBM3e must be aligned and bonded at micron-level accuracy. Each B200 consumes ~2.5x more CoWoS area than an H100 did.
The math: TSMC can print millions of GPU dies per month. But they can only package ~40,000 CoWoS wafers per month (as of early 2025). Each wafer yields a limited number of packaged GPUs depending on interposer size. You can have all the GPU dies in the world, but without CoWoS packaging, they are useless for AI training.
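To make that concrete, here is a back-of-the-envelope estimator. The interposer footprints are illustrative assumptions, not published specs; only the ~2.5x area ratio and the ~40K wafers/month figure come from the text above.

```python
import math

def packages_per_wafer(interposer_mm2: float,
                       wafer_diameter_mm: float = 300.0,
                       edge_exclusion_mm: float = 3.0,
                       area_yield: float = 0.85) -> int:
    """Crude gross-die estimate: usable wafer area divided by interposer
    area, with a haircut for edge loss, scribe lanes, and defects.
    Real placement is rectangular, so this slightly overestimates."""
    radius = wafer_diameter_mm / 2 - edge_exclusion_mm
    usable_area = math.pi * radius ** 2
    return int(usable_area / interposer_mm2 * area_yield)

# Interposer footprints are illustrative assumptions; the ~2.5x
# ratio between generations comes from the text above.
wafers_per_month = 40_000  # TSMC CoWoS capacity, early 2025 (from text)
for name, area_mm2 in [("H100-class", 2_000), ("B200-class", 5_000)]:
    n = packages_per_wafer(area_mm2)
    print(f"{name}: {n} packages/wafer, "
          f"~{n * wafers_per_month / 1e6:.1f}M packages/month")
```

At these assumptions, the same wafer capacity yields well under half as many B200-class packages as H100-class ones, which is why capacity measured in wafers understates the crunch as interposers grow.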
The chart below tells the entire story. Despite TSMC's aggressive capacity expansion (nearly tripling CoWoS output in 2025, then adding roughly 20K wafers/month per year thereafter), demand is growing even faster. Every major hyperscaler (Microsoft, Google, Meta, Amazon, Oracle, ByteDance) wants more capacity than TSMC can deliver. The deficit is not closing; it is widening.
[Chart: CoWoS supply vs. demand, 2023–2028E. Sources: TSMC IR, Gartner, TrendForce, SemiAnalysis, Market Watch estimates]
Three demand drivers are compounding simultaneously:
1. **Frontier-model training.** Each new frontier model (GPT-5, Gemini Ultra 2, Claude 4) requires 5–10x more compute than its predecessor. The training compute doubling time is ~6 months, far faster than Moore's Law. This is the most insatiable demand in semiconductor history.
2. **Inference at scale.** Inference demand scales with users, not with model size. As AI is embedded in every application (search, email, coding, customer service), inference compute is growing exponentially and now represents >60% of AI chip demand. A back-of-the-envelope sketch follows this list.
3. **Sovereign AI.** Every major nation (UAE, Saudi Arabia, Japan, India, France, UK) is building sovereign AI infrastructure. This is new demand that did not exist 18 months ago. Jensen Huang estimates sovereign AI as a $100B+ TAM by 2027.
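For a sense of why inference overtakes training, the standard first-order scaling rules (training compute ≈ 6 × parameters × training tokens; inference ≈ 2 × parameters per generated token) can be plugged into a toy workload. Every workload number below is an illustrative assumption:

```python
# First-order scaling rules widely used for transformer models:
#   training FLOPs  ~ 6 * params * training_tokens   (one-time per model)
#   inference FLOPs ~ 2 * params * tokens_generated  (recurring, per user)
# All workload numbers below are illustrative assumptions.

params = 1e12                    # hypothetical 1T-parameter model
train_tokens = 15e12             # hypothetical 15T-token training run
train_flops = 6 * params * train_tokens

users = 500e6                    # hypothetical daily users
tokens_per_user_per_day = 5_000  # hypothetical usage intensity
daily_inference_flops = 2 * params * users * tokens_per_user_per_day

days_to_match_training = train_flops / daily_inference_flops
print(f"Inference matches the full training run every "
      f"{days_to_match_training:.0f} days")  # ~18 days at these numbers
```

At these placeholder numbers, the deployed model consumes its entire training budget roughly every 18 days, and that interval shrinks linearly as users grow: that is the mechanism behind driver 2.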
When supply cannot meet demand, TSMC becomes the most powerful allocator in the technology industry. Allocation priority goes to the largest customers first (Apple, NVIDIA, AMD, Qualcomm), leaving smaller customers with extended lead times. This creates a "have vs. have-not" dynamic where companies with guaranteed TSMC capacity have a structural competitive advantage. TSMC's allocation decisions are shaping which AI companies survive.
The AI chip supply chain is the most complex manufacturing process ever devised by humanity. A single advanced GPU requires over 1,000 processing steps across 3 continents. Here are the 7 critical chokepoints, ranked by bottleneck severity:
| # | Chokepoint | Bottleneck Owner | Market Share | Capacity / Constraint | Lead Time | Expansion Plans |
|---|---|---|---|---|---|---|
| 1 | CoWoS Advanced Packaging | TSMC (85%), ASE (10%), Amkor (5%) | TSMC 85% | ~40K wafers/mo (2025) | 24+ weeks | 60K by end 2026; new fab in Kumamoto (JP) |
| 2 | HBM (High Bandwidth Memory) | SK Hynix (53%), Samsung (38%), Micron (9%) | 3 suppliers | HBM3e: ~90% allocated through H2 2026 | 20+ weeks | HBM4 qualification in H2 2026; 16-Hi stacking |
| 3 | EUV Lithography | ASML (100% monopoly) | 100% | ~50 EUV systems/year; High-NA ramp starting | 18+ months | Target: 90 EUV/yr by 2027; High-NA at 20/yr |
| 4 | ABF Substrates | Ibiden (35%), Shinko (30%), Unimicron (20%) | Tight oligopoly | Panel size increasing for larger interposers | 16+ weeks | Ibiden investing $2B in new capacity; 2026 online |
| 5 | Advanced DRAM (for HBM) | Samsung, SK Hynix, Micron | 3 suppliers | HBM consumes ~5x more DRAM wafers per bit vs DDR5 | 12+ weeks | Converting DDR capacity to HBM; cannibalizing PC/mobile |
| 6 | Etch & Deposition Equipment | Lam Research (45%), TEL (25%), AMAT (30%) | Concentrated | Scaling with fab expansions globally | 12–16 wk | New tools for GAA (Gate-All-Around) at 2nm |
| 7 | Test & Inspection | KLA (55%), Advantest (30%), Teradyne (15%) | Oligopoly | CoWoS requires 3x more inspection steps than standard packaging | 8–12 wk | KLA investing in advanced packaging inspection |
Think of the AI compute stack as a pyramid with six layers, each depending on the one below it. The bottleneck is always at the narrowest layer, as the toy model below illustrates.
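A minimal sketch of that idea, with placeholder layer names and capacities standing in for the pyramid graphic (assumptions, not industry data):

```python
# Toy model of the stack: end-to-end output is capped by the narrowest
# layer. Layer names and monthly capacities (in GPU-equivalents) are
# placeholders, not industry data.
stack = {
    "wafer fab (dies)":    3_000_000,
    "CoWoS packaging":       450_000,
    "HBM stacks":            600_000,
    "ABF substrates":        900_000,
    "test & inspection":   1_200_000,
    "server integration":  1_000_000,
}

bottleneck = min(stack, key=stack.get)
print(f"Output: {stack[bottleneck]:,}/mo, set by {bottleneck}")

# Doubling fab capacity changes nothing until packaging expands:
stack["wafer fab (dies)"] *= 2
print(f"After doubling fab capacity: {min(stack.values()):,}/mo")
```

Raising capacity anywhere except the narrowest layer leaves output unchanged, which is exactly why packaging, not wafer supply, governs GPU availability today.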
Investment implication: Invest in the narrowest point of the stack. Today that is CoWoS packaging (TSMC), HBM memory (SK Hynix, Micron), and EUV lithography (ASML). These are the companies with the most pricing power.
Four primary trade setups, each targeting a different layer of the AI chip supply chain. All entries are designed around technical support levels with defined risk. Positions should be scaled into over 3–5 tranches.
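For readers who want to mechanize the tranche sizing, a minimal sketch (the prices, sizes, and even spacing are placeholders, not recommendations):

```python
# Illustrative tranche ladder: split a target position into equal
# tranches spaced evenly between a first entry and a support level.
# All prices and sizes are placeholders, not recommendations.
def tranche_ladder(target_shares: int, first_entry: float,
                   support: float, tranches: int = 4):
    step = (first_entry - support) / (tranches - 1)
    size = target_shares // tranches
    return [(round(first_entry - i * step, 2), size)
            for i in range(tranches)]

# Example: 400 shares, first entry 100.00, support at 88.00, 4 tranches
for price, size in tranche_ladder(400, 100.0, 88.0):
    print(f"buy {size} @ {price}")
# -> buy 100 @ 100.0 / 96.0 / 92.0 / 88.0
```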
Thesis: NVIDIA controls ~90% of the AI training GPU market and ~80% of inference. The CUDA software ecosystem creates an almost insurmountable moat — switching costs are measured in years of engineering effort, not dollars. The Blackwell architecture (B200/GB200) represents a generational leap in performance/watt, and every unit is pre-sold. NVIDIA's data center revenue is on track to exceed $115B in FY2026 (ending January 2026), with gross margins sustained above 73%. The "AI factory" narrative — where NVIDIA sells not chips but entire compute platforms — extends the TAM from $300B to $1T+.
Thesis: TSMC manufactures 100% of the world's most advanced AI chips (NVIDIA, AMD, Apple, Broadcom custom). There is no alternative foundry capable of producing at 3nm or 5nm at scale. TSMC's pricing power is increasing — CoWoS prices have risen 20%+ in 2025 with zero customer pushback because there is simply nowhere else to go. The Arizona fab (Fab 21) blunts the Taiwan concentration risk and qualifies TSMC for US government subsidies. Revenue is growing 25%+ YoY with operating margins expanding toward 48%.
Thesis: Broadcom is the leading designer of custom AI accelerators (XPUs) for hyperscalers. Google's TPU and Meta's MTIA are designed with Broadcom silicon, and Apple and Microsoft custom chips are potentially next. As hyperscalers diversify away from NVIDIA dependency, Broadcom's custom chip business is growing 50%+ YoY. Additionally, VMware integration is driving 40%+ infrastructure software growth. AI-related revenue is on a $12B+ annual run rate and accelerating.
Thesis: ASML is the sole manufacturer of EUV lithography machines, the most complex device ever built. Every advanced chip (3nm, 5nm) in the world requires ASML EUV exposure. There is no alternative supplier and no realistic path to creating one within the next decade. ASML's High-NA EUV machines ($350M+ each) are essential for 2nm and beyond. The installed base of 200+ EUV systems generates $5B+/year in services revenue. The Dutch export controls on China matter less than feared: ASML has never been licensed to ship EUV systems there, so the controls remove no existing EUV revenue. Backlog exceeds $40B.
Beyond the primary bottleneck owners, a constellation of companies benefits from the AI chip shortage. These "ecosystem plays" offer exposure to the same theme, often at more attractive valuations, because they receive less consensus coverage.
| Ticker | Company | Role in Stack | Scarcity Exposure | AI Revenue % | Key Catalyst |
|---|---|---|---|---|---|
| ANET | Arista Networks | Data center networking (400G/800G switches) | Every GPU cluster needs high-bandwidth networking. 800G transition drives ASP uplift. | ~40% | 800G deployment ramp with MSFT/META in H2 2026 |
| VRT | Vertiv Holdings | Data center cooling & power infrastructure | AI servers need 3–5x more cooling than traditional servers. Liquid cooling is the new standard. | ~55% | Liquid cooling orders backlog +200% YoY; GB200 racks |
| MU | Micron Technology | HBM3e memory (only US-based HBM supplier) | HBM is the second-biggest bottleneck after CoWoS. Micron's HBM3e is qualified with NVIDIA. | ~25% | HBM revenue hitting $2B+/quarter by mid-2026 |
| AMAT | Applied Materials | Semiconductor equipment (deposition, etch, CMP) | Every new fab and CoWoS expansion line requires AMAT equipment. GAA transition at 2nm drives upgrade cycle. | ~30% | Gate-All-Around equipment revenue in 2026; TSMC N2 ramp |
| KLAC | KLA Corporation | Inspection & metrology (defect detection) | Advanced packaging requires 3x more inspection steps. KLA's tools are essential for CoWoS yield management. | ~20% | Advanced packaging inspection revenue growing 40%+ YoY |
The AI compute market has two distinct segments, training and inference, with very different economics.
Portfolio implication: Overweight NVDA for the training cycle (2024–2026 peak). Overweight AVGO + ANET for the inference cycle (2026–2028+). Own TSM and ASML for both — they are the toll roads that benefit regardless of which chip architecture wins.
The semiconductor thesis is high-conviction, but no thesis is permanent. Here is the comprehensive framework for monitoring whether the shortage is intensifying (bullish) or resolving (bearish). Track these signals monthly.
| Date | Event | Why It Matters | Impact |
|---|---|---|---|
| Feb 26, 2026 | NVDA Q4 FY2026 Earnings | Blackwell revenue ramp, gross margin trajectory, FY2027 guidance | VERY HIGH |
| Mar 10, 2026 | TSMC Feb Monthly Revenue | First read on post-Lunar New Year demand; CoWoS ramp trajectory | HIGH |
| Mar 17–20, 2026 | NVIDIA GTC Conference | Next-gen architecture reveal (Rubin?), AI roadmap, sovereign AI partnerships | VERY HIGH |
| Apr 2026 | TSMC Q1 2026 Earnings | CoWoS capacity update, N2 status, 2026 capex guidance, pricing outlook | VERY HIGH |
| Apr 2026 | ASML Q1 2026 Earnings | High-NA EUV order book, China revenue impact, 2026 backlog update | HIGH |
| May 2026 | NVDA Q1 FY2027 Earnings | First full quarter of Blackwell at scale; inference revenue mix shift | VERY HIGH |
| Jun 2026 | AVGO Q2 FY2026 Earnings | Custom XPU design win pipeline, AI revenue run rate, VMware integration update | HIGH |
| H2 2026 | TSMC N2 Volume Production | Next-gen node ramp; validates continued EUV demand + ASML High-NA adoption | HIGH |
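For anyone tracking these dates programmatically, a minimal sketch of the calendar as data; dates are approximated where the table gives only a month:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Catalyst:
    when: date    # approximate where the table gives only a month
    event: str
    impact: str

# A few entries from the calendar above:
calendar = [
    Catalyst(date(2026, 2, 26), "NVDA Q4 FY2026 earnings",  "VERY HIGH"),
    Catalyst(date(2026, 3, 10), "TSMC Feb monthly revenue", "HIGH"),
    Catalyst(date(2026, 3, 17), "NVIDIA GTC conference",    "VERY HIGH"),
]

def upcoming(events, today: date, horizon_days: int = 30):
    """Events within the next `horizon_days`, soonest first."""
    return sorted((c for c in events
                   if 0 <= (c.when - today).days <= horizon_days),
                  key=lambda c: c.when)

for c in upcoming(calendar, today=date(2026, 2, 20)):
    print(f"{c.when}  {c.event}  [{c.impact}]")
```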
In Part 3, we go deeper into the second-biggest bottleneck in the AI supply chain: High Bandwidth Memory (HBM). We will dissect the oligopoly (SK Hynix, Samsung, Micron), examine the physics of 12-Hi and 16-Hi stacking, explain why HBM consumes 5x more DRAM wafers per bit than DDR5, and present trade setups for all three players, each with a very different risk/reward profile.