The fuel of the AI revolution. Why the shortage of GPUs and CoWoS packaging is structural, not cyclical. The most crowded trade on Earth — or just getting started?
The bottleneck is NOT wafer capacity. TSMC can print the chips; they cannot package them fast enough.
Demand growing faster than capacity additions. Deficit persists through at least 2027.
Confirmed by $300B+ hyperscaler capex guidance (MSFT, META, GOOG, AMZN) for 2025–2026.
TSMC adding CoWoS capacity aggressively — but demand is growing ~70–80% YoY.
The popular narrative is simple: "We don't have enough GPUs." This is technically true but deeply misleading. The real bottleneck is not the GPU die itself — it is the advanced packaging that turns a bare die into a functional AI accelerator. Understanding this distinction is critical because it changes which companies benefit most and how long the shortage lasts.
Here is the pipeline: TSMC can manufacture the Blackwell B200 GPU die on its N4P process node in approximately 8–10 weeks. That is not the problem. The problem is what happens after the die comes off the wafer:
| Metric | 2023 | 2024 | 2025E | 2026E | 2027E | 2028E |
|---|---|---|---|---|---|---|
| CoWoS Supply (K wafers/mo) | 12 | 15 | 40 | 60 | 80 | 100 |
| CoWoS Demand (K wafers/mo) | 15 | 28 | 55 | 85 | 110 | 130 |
| Deficit (K wafers/mo) | -3 | -13 | -15 | -25 | -30 | -30 |
| Utilization Rate | 100% | 100% | 100% | 100% | 100% | 100% |
Sources: TSMC Investor Relations, Gartner, TrendForce, Market Watch estimates.
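To see how the table's deficit and growth figures relate, here is a quick recomputation in Python (a sketch; the inputs are simply the estimates from the table above):

```python
# Recompute the deficit and year-over-year demand growth from the
# table above. All figures are the table's own estimates (K wafers/mo).
years  = [2023, 2024, 2025, 2026, 2027, 2028]
supply = [12, 15, 40, 60, 80, 100]
demand = [15, 28, 55, 85, 110, 130]

for i, year in enumerate(years):
    line = f"{year}: deficit {supply[i] - demand[i]:+d}K wafers/mo"
    if i > 0:
        growth = (demand[i] / demand[i - 1] - 1) * 100
        line += f", demand +{growth:.0f}% YoY"
    print(line)
```

The implied demand growth runs near 90–96% in 2024–2025 and decelerates toward ~18% by 2028, yet the deficit never narrows, because supply additions merely keep pace with a much larger base.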
Chip-on-Wafer-on-Substrate (CoWoS) is TSMC's advanced 2.5D packaging technology. It allows multiple silicon dies (GPU compute dies + HBM memory stacks) to be placed side-by-side on a large silicon interposer, then bonded to an organic substrate.
Why it is the bottleneck: An NVIDIA B200 GPU requires a CoWoS interposer that is roughly 2.5x the size of the compute die itself. This interposer must be fabricated on a 300mm wafer with extreme precision, and then the GPU die and 8 stacks of HBM3e must be aligned and bonded at micron-level accuracy. Each B200 consumes ~2.5x more CoWoS area than an H100 did.
The math: TSMC can print millions of GPU dies per month. But they can only package ~40,000 CoWoS wafers per month (as of early 2025). Each wafer yields a limited number of packaged GPUs depending on interposer size. You can have all the GPU dies in the world, but without CoWoS packaging, they are useless for AI training.
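To make that concrete, here is a back-of-the-envelope estimator. The interposer footprints are illustrative assumptions, not published specs; only the ~2.5x area ratio and the ~40K wafers/month figure come from the text above.

```python
import math

def packages_per_wafer(interposer_mm2: float,
                       wafer_diameter_mm: float = 300.0,
                       edge_exclusion_mm: float = 3.0,
                       area_yield: float = 0.85) -> int:
    """Crude gross-die estimate: usable wafer area divided by interposer
    area, with a haircut for edge loss, scribe lanes, and defects.
    Real placement is rectangular, so this slightly overestimates."""
    radius = wafer_diameter_mm / 2 - edge_exclusion_mm
    usable_area = math.pi * radius ** 2
    return int(usable_area / interposer_mm2 * area_yield)

# Interposer footprints are illustrative assumptions; the ~2.5x
# ratio between generations comes from the text above.
wafers_per_month = 40_000  # TSMC CoWoS capacity, early 2025 (from text)
for name, area_mm2 in [("H100-class", 2_000), ("B200-class", 5_000)]:
    n = packages_per_wafer(area_mm2)
    print(f"{name}: {n} packages/wafer, "
          f"~{n * wafers_per_month / 1e6:.1f}M packages/month")
```

At these assumptions, the same wafer capacity yields well under half as many B200-class packages as H100-class ones, which is why capacity measured in wafers understates the crunch as interposers grow.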
The chart below tells the entire story. Despite TSMC's aggressive capacity expansion (nearly tripling CoWoS output in 2025, then adding roughly 20K wafers/month per year thereafter), demand is growing even faster. Every major hyperscaler (Microsoft, Google, Meta, Amazon, Oracle, ByteDance) wants more capacity than TSMC can deliver. The deficit is not closing; it is widening.
[Chart: CoWoS supply vs. demand, 2023–2028E. Sources: TSMC IR, Gartner, TrendForce, SemiAnalysis, Market Watch estimates]
Three demand drivers are compounding simultaneously:
1. **Frontier-model training.** Each new frontier model (GPT-5, Gemini Ultra 2, Claude 4) requires 5–10x more compute than its predecessor. The training compute doubling time is ~6 months, far faster than Moore's Law. This is the most insatiable demand in semiconductor history.
2. **Inference at scale.** Inference demand scales with users, not with model size. As AI is embedded in every application (search, email, coding, customer service), inference compute is growing exponentially and now represents >60% of AI chip demand. A back-of-the-envelope sketch follows this list.
3. **Sovereign AI.** Every major nation (UAE, Saudi Arabia, Japan, India, France, UK) is building sovereign AI infrastructure. This is new demand that did not exist 18 months ago. Jensen Huang estimates sovereign AI as a $100B+ TAM by 2027.
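For a sense of why inference overtakes training, the standard first-order scaling rules (training compute ≈ 6 × parameters × training tokens; inference ≈ 2 × parameters per generated token) can be plugged into a toy workload. Every workload number below is an illustrative assumption:

```python
# First-order scaling rules widely used for transformer models:
#   training FLOPs  ~ 6 * params * training_tokens   (one-time per model)
#   inference FLOPs ~ 2 * params * tokens_generated  (recurring, per user)
# All workload numbers below are illustrative assumptions.

params = 1e12                    # hypothetical 1T-parameter model
train_tokens = 15e12             # hypothetical 15T-token training run
train_flops = 6 * params * train_tokens

users = 500e6                    # hypothetical daily users
tokens_per_user_per_day = 5_000  # hypothetical usage intensity
daily_inference_flops = 2 * params * users * tokens_per_user_per_day

days_to_match_training = train_flops / daily_inference_flops
print(f"Inference matches the full training run every "
      f"{days_to_match_training:.0f} days")  # ~18 days at these numbers
```

At these placeholder numbers, the deployed model consumes its entire training budget roughly every 18 days, and that interval shrinks linearly as users grow: that is the mechanism behind driver 2.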
When supply cannot meet demand, TSMC becomes the most powerful allocator in the technology industry. Allocation priority goes to the largest customers first (Apple, NVIDIA, AMD, Qualcomm), leaving smaller customers with extended lead times. This creates a "have vs. have-not" dynamic where companies with guaranteed TSMC capacity have a structural competitive advantage. TSMC's allocation decisions are shaping which AI companies survive.
The AI chip supply chain is the most complex manufacturing process ever devised by humanity. A single advanced GPU requires over 1,000 processing steps across 3 continents. Here are the 7 critical chokepoints, ranked by bottleneck severity:
| # | Chokepoint | Bottleneck Owner | Market Share | Capacity / Constraint | Lead Time | Expansion Plans |
|---|---|---|---|---|---|---|
| 1 | CoWoS Advanced Packaging | TSMC (85%), ASE (10%), Amkor (5%) | TSMC 85% | ~40K wafers/mo (2025) | 24+ weeks | 60K by end 2026; new fab in Kumamoto (JP) |
| 2 | HBM (High Bandwidth Memory) | SK Hynix (53%), Samsung (38%), Micron (9%) | 3 suppliers | HBM3e: ~90% allocated through H2 2026 | 20+ weeks | HBM4 qualification in H2 2026; 16-Hi stacking |
| 3 | EUV Lithography | ASML (100% monopoly) | 100% | ~50 EUV systems/year; High-NA ramp starting | 18+ months | Target: 90 EUV/yr by 2027; High-NA at 20/yr |
| 4 | ABF Substrates | Ibiden (35%), Shinko (30%), Unimicron (20%) | Tight oligopoly | Panel size increasing for larger interposers | 16+ weeks | Ibiden investing $2B in new capacity; 2026 online |
| 5 | Advanced DRAM (for HBM) | Samsung, SK Hynix, Micron | 3 suppliers | HBM consumes ~5x more DRAM wafers per bit vs DDR5 | 12+ weeks | Converting DDR capacity to HBM; cannibalizing PC/mobile |
| 6 | Etch & Deposition Equipment | Lam Research (45%), TEL (25%), AMAT (30%) | Concentrated | Scaling with fab expansions globally | 12–16 wk | New tools for GAA (Gate-All-Around) at 2nm |
| 7 | Test & Inspection | KLA (55%), Advantest (30%), Teradyne (15%) | Oligopoly | CoWoS requires 3x more inspection steps than standard packaging | 8–12 wk | KLA investing in advanced packaging inspection |
Think of the AI compute stack as a pyramid with six layers, each depending on the one below it. The bottleneck is always at the narrowest layer, as the toy model below illustrates.
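A minimal sketch of that idea, with placeholder layer names and capacities standing in for the pyramid graphic (assumptions, not industry data):

```python
# Toy model of the stack: end-to-end output is capped by the narrowest
# layer. Layer names and monthly capacities (in GPU-equivalents) are
# placeholders, not industry data.
stack = {
    "wafer fab (dies)":    3_000_000,
    "CoWoS packaging":       450_000,
    "HBM stacks":            600_000,
    "ABF substrates":        900_000,
    "test & inspection":   1_200_000,
    "server integration":  1_000_000,
}

bottleneck = min(stack, key=stack.get)
print(f"Output: {stack[bottleneck]:,}/mo, set by {bottleneck}")

# Doubling fab capacity changes nothing until packaging expands:
stack["wafer fab (dies)"] *= 2
print(f"After doubling fab capacity: {min(stack.values()):,}/mo")
```

Raising capacity anywhere except the narrowest layer leaves output unchanged, which is exactly why packaging, not wafer supply, governs GPU availability today.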
Investment implication: Invest in the narrowest point of the stack. Today that is CoWoS packaging (TSMC), HBM memory (SK Hynix, Micron), and EUV lithography (ASML). These are the companies with the most pricing power.
Four primary trade setups, each targeting a different layer of the AI chip supply chain. All entries are designed around technical support levels with defined risk. Positions should be scaled into over 3–5 tranches.
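For readers who want to mechanize the tranche sizing, a minimal sketch (the prices, sizes, and even spacing are placeholders, not recommendations):

```python
# Illustrative tranche ladder: split a target position into equal
# tranches spaced evenly between a first entry and a support level.
# All prices and sizes are placeholders, not recommendations.
def tranche_ladder(target_shares: int, first_entry: float,
                   support: float, tranches: int = 4):
    step = (first_entry - support) / (tranches - 1)
    size = target_shares // tranches
    return [(round(first_entry - i * step, 2), size)
            for i in range(tranches)]

# Example: 400 shares, first entry 100.00, support at 88.00, 4 tranches
for price, size in tranche_ladder(400, 100.0, 88.0):
    print(f"buy {size} @ {price}")
# -> buy 100 @ 100.0 / 96.0 / 92.0 / 88.0
```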
Thesis: NVIDIA controls ~90% of the AI training GPU market and ~80% of inference. The CUDA software ecosystem creates an almost insurmountable moat — switching costs are measured in years of engineering effort, not dollars. The Blackwell architecture (B200/GB200) represents a generational leap in performance/watt, and every unit is pre-sold. NVIDIA's data center revenue is on track to exceed $115B in FY2026 (ending January 2026), with gross margins sustained above 73%. The "AI factory" narrative — where NVIDIA sells not chips but entire compute platforms — extends the TAM from $300B to $1T+.
Thesis: TSMC manufactures 100% of the world's most advanced AI chips (NVIDIA, AMD, Apple, Broadcom custom). There is no alternative foundry capable of producing at 3nm or 5nm at scale. TSMC's pricing power is increasing — CoWoS prices have risen 20%+ in 2025 with zero customer pushback because there is simply nowhere else to go. The Arizona fab (Fab 21) blunts the Taiwan concentration risk and qualifies TSMC for US government subsidies. Revenue is growing 25%+ YoY with operating margins expanding toward 48%.
Thesis: Broadcom is the leading designer of custom AI accelerators (XPUs) for hyperscalers. Google's TPU and Meta's MTIA are designed with Broadcom silicon, and Apple and Microsoft custom chips are potentially next. As hyperscalers diversify away from NVIDIA dependency, Broadcom's custom chip business is growing 50%+ YoY. Additionally, VMware integration is driving 40%+ infrastructure software growth. AI-related revenue is on a $12B+ annual run rate and accelerating.
Thesis: ASML is the sole manufacturer of EUV lithography machines, the most complex device ever built. Every advanced chip (3nm, 5nm) in the world requires ASML EUV exposure. There is no alternative supplier and no realistic path to creating one within the next decade. ASML's High-NA EUV machines ($350M+ each) are essential for 2nm and beyond. The installed base of 200+ EUV systems generates $5B+/year in services revenue. The Dutch export controls on China matter less than feared: ASML has never been licensed to ship EUV systems there, so the controls remove no existing EUV revenue. Backlog exceeds $40B.
Beyond the primary bottleneck owners, a constellation of companies benefits from the AI chip shortage. These "ecosystem plays" offer exposure to the same theme, often at more attractive valuations, because they receive less consensus coverage.
| Ticker | Company | Role in Stack | Scarcity Exposure | AI Revenue % | Key Catalyst |
|---|---|---|---|---|---|
| ANET | Arista Networks | Data center networking (400G/800G switches) | Every GPU cluster needs high-bandwidth networking. 800G transition drives ASP uplift. | ~40% | 800G deployment ramp with MSFT/META in H2 2026 |
| VRT | Vertiv Holdings | Data center cooling & power infrastructure | AI servers need 3–5x more cooling than traditional servers. Liquid cooling is the new standard. | ~55% | Liquid cooling orders backlog +200% YoY; GB200 racks |
| MU | Micron Technology | HBM3e memory (only US-based HBM supplier) | HBM is the second-biggest bottleneck after CoWoS. Micron's HBM3e is qualified with NVIDIA. | ~25% | HBM revenue hitting $2B+/quarter by mid-2026 |
| AMAT | Applied Materials | Semiconductor equipment (deposition, etch, CMP) | Every new fab and CoWoS expansion line requires AMAT equipment. GAA transition at 2nm drives upgrade cycle. | ~30% | Gate-All-Around equipment revenue in 2026; TSMC N2 ramp |
| KLAC | KLA Corporation | Inspection & metrology (defect detection) | Advanced packaging requires 3x more inspection steps. KLA's tools are essential for CoWoS yield management. | ~20% | Advanced packaging inspection revenue growing 40%+ YoY |
The AI compute market has two distinct segments, training and inference, with very different economics.
Portfolio implication: Overweight NVDA for the training cycle (2024–2026 peak). Overweight AVGO + ANET for the inference cycle (2026–2028+). Own TSM and ASML for both — they are the toll roads that benefit regardless of which chip architecture wins.
The semiconductor thesis is high-conviction, but no thesis is permanent. Here is the comprehensive framework for monitoring whether the shortage is intensifying (bullish) or resolving (bearish). Track these signals monthly.
| Date | Event | Why It Matters | Impact |
|---|---|---|---|
| Feb 26, 2026 | NVDA Q4 FY2026 Earnings | Blackwell revenue ramp, gross margin trajectory, FY2027 guidance | VERY HIGH |
| Mar 10, 2026 | TSMC Feb Monthly Revenue | First read on post-Lunar New Year demand; CoWoS ramp trajectory | HIGH |
| Mar 17–20, 2026 | NVIDIA GTC Conference | Next-gen architecture reveal (Rubin?), AI roadmap, sovereign AI partnerships | VERY HIGH |
| Apr 2026 | TSMC Q1 2026 Earnings | CoWoS capacity update, N2 status, 2026 capex guidance, pricing outlook | VERY HIGH |
| Apr 2026 | ASML Q1 2026 Earnings | High-NA EUV order book, China revenue impact, 2026 backlog update | HIGH |
| May 2026 | NVDA Q1 FY2027 Earnings | First full quarter of Blackwell at scale; inference revenue mix shift | VERY HIGH |
| Jun 2026 | AVGO Q2 FY2026 Earnings | Custom XPU design win pipeline, AI revenue run rate, VMware integration update | HIGH |
| H2 2026 | TSMC N2 Volume Production | Next-gen node ramp; validates continued EUV demand + ASML High-NA adoption | HIGH |
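For anyone tracking these dates programmatically, a minimal sketch of the calendar as data; dates are approximated where the table gives only a month:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Catalyst:
    when: date    # approximate where the table gives only a month
    event: str
    impact: str

# A few entries from the calendar above:
calendar = [
    Catalyst(date(2026, 2, 26), "NVDA Q4 FY2026 earnings",  "VERY HIGH"),
    Catalyst(date(2026, 3, 10), "TSMC Feb monthly revenue", "HIGH"),
    Catalyst(date(2026, 3, 17), "NVIDIA GTC conference",    "VERY HIGH"),
]

def upcoming(events, today: date, horizon_days: int = 30):
    """Events within the next `horizon_days`, soonest first."""
    return sorted((c for c in events
                   if 0 <= (c.when - today).days <= horizon_days),
                  key=lambda c: c.when)

for c in upcoming(calendar, today=date(2026, 2, 20)):
    print(f"{c.when}  {c.event}  [{c.impact}]")
```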
In Part 3, we go deeper into the second-biggest bottleneck in the AI supply chain: High Bandwidth Memory (HBM). We will dissect the oligopoly (SK Hynix, Samsung, Micron), examine the physics of 12-Hi and 16-Hi stacking, explain why HBM consumes 5x more DRAM wafers per bit than DDR5, and present trade setups for all three players, each with a very different risk/reward profile.