NVIDIA Rubin SOCAMM Reduction & More (0605)

1. NVIDIA Rubin SOCAMM Capacity Reduction

• Core Source

“NVDA is reducing the per-rack SOCAMM DRAM capacity of the Rubin NVL72 from approximately 55TB to approximately 28TB, with most Rubin systems expected to use 96GB SOCAMM modules instead of 192GB modules. This change lowers the estimated rack cost from $7.6 million to $6.8 million, and reduces total cost of ownership (TCO) from $4.16 to $3.90 per GPU-hour.”

“NVIDIA recently changed the mix of its SOCAMM2 module orders, and total order volume actually increased: 192GB modules were cut in half, while 96GB modules increased nearly 6x to become the primary product (64GB modules increased by 50%). As a result, total SOCAMM2 LPDDR demand has increased by 10–20% compared to prior forecasts.” (Korea Investment Securities, Meritz Securities)

“The total volume of SoCAMM2 that can be supplied in 2026 is already fixed at approximately 30 billion Gb across all three DRAM makers. Further increases are practically impossible due to the limits of 1c nm capacity expansion.”

“Vera Rubin-bound SO-CAMM capacity is estimated to have decreased from 1,536GB (192GB × 8) to 768GB (96GB × 8). Korean memory makers’ combined supply of LPDDR5x discrete and SO-CAMM is forecast to grow from 18 billion Gb in 2026 to 29 billion Gb in 2027.”

“According to SemiAnalysis, SOCAMM contract pricing as of Q1 2026 has risen to approximately $8 per GB.”

• Expected Impact

This issue began when content from a SemiAnalysis paid institutional letter dated June 3 was misrepresented and spread through social media. The headline “SOCAMM capacity halved” triggered a sharp selloff in memory-related stocks including Samsung Electronics, SK Hynix, and Micron, but the underlying numbers point in the opposite direction.

The spec change reflects an LPDDR supply shortage, not weak demand. NVIDIA’s Rubin NVL72 consists of 72 Rubin GPUs and 36 Vera CPUs. HBM4, which directly determines AI performance, remains unchanged at 288GB per GPU in an 8-stack configuration. SOCAMM is an auxiliary, LPDDR5X-based memory subsystem that handles data buffering and system memory on the Vera CPU side, and the capacity adjustment was made to this component, not to HBM4.

The real purpose of the reduction is to lower TCO and accelerate Rubin rack deployment. With rack cost falling from $7.6M to $6.8M and TCO dropping from $4.16 to $3.90 per GPU-hour, hyperscalers building data centers with thousands of racks save roughly $0.8M per rack, or about $800 million for every 1,000 racks. This is a strategic design change by NVIDIA to maximize system deployment within constrained supply, not a sign of softening demand.
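The capacity and cost figures in this section can be cross-checked with a few lines of arithmetic. The sketch below assumes 8 SOCAMM modules per Vera CPU (taken from the "192GB × 8" breakdown quoted in the Core Source) and decimal units (1TB = 1,000GB). The function names and the 1,000-rack fleet size are illustrative assumptions, not figures from the source.

```python
# Back-of-the-envelope check of the Rubin NVL72 figures quoted in this post.
# Assumptions (not from the source): decimal units (1 TB = 1000 GB) and a
# hypothetical 1,000-rack deployment.
VERA_CPUS_PER_RACK = 36   # per the NVL72 configuration described above
MODULES_PER_CPU = 8       # from the "192GB x 8" per-CPU breakdown


def rack_socamm_tb(module_gb: int) -> float:
    """Per-rack SOCAMM capacity in decimal terabytes."""
    return VERA_CPUS_PER_RACK * MODULES_PER_CPU * module_gb / 1000


def percent_drop(old: float, new: float) -> float:
    """Relative reduction from old to new, in percent."""
    return (old - new) / old * 100


old_tb = rack_socamm_tb(192)
new_tb = rack_socamm_tb(96)
rack_saving_usd = 7_600_000 - 6_800_000
fleet_saving_usd = 1_000 * rack_saving_usd   # hypothetical 1,000-rack fleet
tco_drop_pct = percent_drop(4.16, 3.90)

print(f"SOCAMM per rack: {old_tb:.1f} TB -> {new_tb:.1f} TB")
print(f"Saving per rack: ${rack_saving_usd:,}; per 1,000 racks: ${fleet_saving_usd:,}")
print(f"TCO per GPU-hour falls by about {tco_drop_pct:.2f}%")
```

Under these assumptions the per-rack capacity works out to about 55.3TB and 27.6TB, which lines up with the "approximately 55TB to approximately 28TB" figures in the Core Source.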

On total bit demand, there is no real deterioration. The 2026 SOCAMM2 supply pool across all three DRAM makers is fixed at approximately 30 billion Gb, with virtually no room for additional volume given 1c nm capacity constraints. NVIDIA must allocate this fixed pool across both NVL72 and Vera CPU Rack configurations, making lower per-module capacity a necessary choice to maximize set shipments. As a result, 96GB module orders surged nearly 6x, and total SOCAMM2 LPDDR demand has actually increased 10–20% versus prior forecasts. (Korea Investment Securities, Meritz Securities)

SOCAMM contract pricing rising to approximately $8 per GB as of Q1 2026 is direct evidence of the extreme LPDDR5X supply shortage. Korean memory makers’ combined LPDDR5X discrete and SOCAMM supply is forecast to grow from 18 billion Gb in 2026 to 29 billion Gb in 2027. Even with per-unit capacity reduced, an unchanged TAM implies more GPU shipments than the market previously expected. Furthermore, the 2026–2027 memory upside is a function of P (price), not Q (volume): SOCAMM2 adoption will drive LPDDR5 ASP higher in the second half, as server-oriented applications command a premium over mobile LPDDR5X. (Daeshin Securities, Korea Investment Securities)

For reference, NVIDIA itself has previously stated that the TAM for SOCAMM is $300 billion, making this episode better interpreted as a supply-constrained demand adjustment at the very beginning of market formation rather than ecosystem disruption.

2. JP Morgan, Years of Bearish Stance Reversed with Tesla Upgrade

• Core Source

“Tesla is preparing to expand its physical AI business and grow the market. Tesla owns both physical AI hardware and software, and its vertical integration makes it well-positioned to scale physical AI operations.”

“This strength in physical AI scalability is still underappreciated by the market. In particular, the EV and battery production lines can also be applied to the humanoid robot Optimus.”

“We estimate Optimus’s total addressable market (TAM) will reach 5 million units in the US and 30 million units globally by 2040.”

“Humanoid robots and AI-driven automation could become Tesla’s new growth engine over the next decade, and we have started evaluating Tesla as a technology platform company rather than a traditional automaker.”

• Expected Impact

J.P. Morgan upgraded Tesla’s rating from Underweight to Neutral and raised its price target dramatically from $145 to $475. The analyst leading the report, Rajat Gupta, took over Tesla coverage last month. This upgrade is symbolically significant as J.P. Morgan’s formal withdrawal of its longstanding bearish thesis and a fundamental shift in its valuation framework.

The core change is a redefinition of Tesla as a physical AI platform company rather than a traditional automaker. J.P. Morgan highlights Tesla’s vertical integration across hardware and software as a starting-point advantage that the market still underappreciates. It notes that existing EV and battery production lines can directly lower manufacturing costs for Optimus and serve as validation for enterprise customers, while vast driving data feeds both robotaxi and Optimus development in a flywheel effect.

On earnings, J.P. Morgan expects Tesla’s EPS to potentially inflect beyond 2028, rising from approximately $1.95 in 2026 to roughly $7.50 by 2030, nearly a fourfold increase. Revenue is projected to more than double from approximately $95 billion in 2025 to roughly $203 billion by 2030, with nearly half of that growth projected to come from services and newer businesses tied to autonomy and robotics.
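The growth multiple and the compound annual growth rate (CAGR) implied by these endpoints can be computed directly. This is a minimal sketch using only the EPS and revenue figures quoted above; the helper names are illustrative, and smooth compounding between the endpoints is an assumption, since the source describes an inflection after 2028 rather than steady growth.

```python
def growth_multiple(start: float, end: float) -> float:
    """End value expressed as a multiple of the start value."""
    return end / start


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year).

    Assumes smooth compounding between the two endpoints.
    """
    return (end / start) ** (1 / years) - 1


eps_multiple = growth_multiple(1.95, 7.50)
eps_cagr = cagr(1.95, 7.50, years=2030 - 2026)
rev_multiple = growth_multiple(95, 203)          # revenue in $ billions
rev_cagr = cagr(95, 203, years=2030 - 2025)

print(f"EPS:     {eps_multiple:.2f}x, implied CAGR {eps_cagr:.1%}")
print(f"Revenue: {rev_multiple:.2f}x, implied CAGR {rev_cagr:.1%}")
```

On these inputs the EPS endpoints imply a multiple of about 3.8x (roughly 40% per year over four years), and the revenue endpoints imply about 2.1x (roughly 16% per year over five years).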

In terms of market sizing, J.P. Morgan values Tesla across five interlinked markets — automotive, energy storage, robotaxis, humanoid robots, and infrastructure licensing — with a combined potential addressable market of approximately $3.9 trillion by 2035. For Optimus alone, the TAM is estimated at 5 million units in the US and 30 million units globally by 2040. The firm does, however, flag high execution risks around regulatory approvals, safety validation, and scaling new technologies.

The fact that a major Wall Street institution like J.P. Morgan has formally reversed its bearish stance and reframed Tesla through a physical AI platform lens signals that the market narrative around Tesla is shifting in earnest — from EV sales metrics to AI and robotics platform value.

3. Anthropic: “The Era of AI Building AI Is Coming Faster Than Expected”

• Core Source

“More than 80% of the code merged into Anthropic’s codebase is written by Claude. Many researchers have already gone months without writing code directly, handling their work through Claude instead.”

“Engineer-level code productivity has increased approximately 8x compared to 2024. Quarterly code deployment volume has expanded 8x compared to the 2021–2025 average.”

“The success rate on the most open-ended and difficult engineering tasks rose from approximately 26% to 76% in just six months. The proportion of cases where the AI makes better research direction decisions than humans has risen to 64%.”

“In research optimization experiments, Mythos Preview recorded approximately 52x performance improvement. In AI safety research, it achieved 97% of human researcher (23%) performance levels.”

“The length of tasks AI can perform independently is doubling approximately every 4 months, accelerating from every 7 months previously.”
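To make the doubling-time claim concrete, a doubling period can be converted into an annual growth factor with the standard formula 2^(12 / doubling period in months). The sketch below applies it to the 7-month and 4-month figures quoted above. The function names and the 100x target are illustrative assumptions, not from the report.

```python
import math


def annual_growth_factor(doubling_months: float) -> float:
    """How many times a quantity multiplies in 12 months for a given doubling time."""
    return 2 ** (12 / doubling_months)


def months_to_multiply(factor: float, doubling_months: float) -> float:
    """Months needed to grow by `factor` at a constant doubling time."""
    return doubling_months * math.log2(factor)


previous_pace = annual_growth_factor(7)   # earlier pace: doubling every 7 months
current_pace = annual_growth_factor(4)    # current pace: doubling every 4 months

print(f"7-month doubling: {previous_pace:.2f}x per year")
print(f"4-month doubling: {current_pace:.2f}x per year")
# Hypothetical target: time to reach tasks 100x longer than today's.
print(f"Months to 100x at 4-month doubling: {months_to_multiply(100, 4):.1f}")
```

If the pace held constant, a 4-month doubling time would mean task length grows 8x per year, versus roughly 3.3x per year at the earlier 7-month pace.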

• Expected Impact

The Anthropic Institute published a report titled “When AI Builds Itself” on June 4, 2026. The core message is a warning: the ‘recursive self-improvement (RSI)’ cycle — in which AI accelerates its own development — is materializing faster than most institutions have anticipated.

The numbers are striking. Before the introduction of Claude Code in February 2025, Claude-authored code represented only a low single-digit percentage of Anthropic’s codebase. By May 2026, that share exceeded 80%. Code quality, which was somewhat below human level in late 2025, has reached parity today, and Anthropic expects it to surpass human performance within the year. Progress in research judgment is equally rapid: at the point just before a human researcher would take a wrong turn, Mythos Preview now picks a better next step 64% of the time, up from 51% for Opus 4.5 in November 2025.

The significance of this report goes beyond a performance disclosure — it is a declaration that the unit of competition in the AI industry is itself changing. Anthropic states explicitly that the next competitive frontier is not “better models” but “how fast AI can accelerate AI development.” The bottleneck has already shifted from writing code to human code review and research direction-setting, and the feedback loop in which AI accelerates AI development has been effectively formalized.

Anthropic presents three future scenarios: ① slowdown (S-curve, least likely), ② human-directed with AI executing most R&D (most likely), and ③ recursive self-improvement (RSI), in which AI designs, improves, and develops its own successors. In the RSI scenario, the pace of progress is determined not by human research headcount, but by compute and algorithmic improvement rates — with alignment remaining the greatest uncertainty.

From an investment standpoint, the implication is clear. As the RSI cycle accelerates, AI model development speed becomes directly tied to compute investment, which structurally expands demand for GPUs, HBM, and broader AI infrastructure. While Anthropic argues that a global verification framework should allow for the option to slow or pause frontier AI development, it simultaneously acknowledges that a unilateral pause by a single company would merely change who leads — meaning the industry-wide AI infrastructure investment race is likely to intensify further.

4. AI Agent Token Costs Emerge as “Huge Issue”

• Core Source

“OpenAI CEO Sam Altman stated that AI cost issues were barely discussed at the start of 2026, but have suddenly become a ‘huge issue.'”

“Uber: 5,000 engineers adopted Claude Code, and the company burned through its entire annual AI budget in just 4 months (spending $500–$2,000 per person per month).”

“Context retransmission waste: 62% of billed costs come not from actual work, but from repeatedly sending existing code context each time an agent is called.”

“GitHub Copilot (June 1): Switched from a flat-rate model to usage-based billing (UBB) with an organization-wide credit pool, making heavy users’ costs visible.”

“A new market is being born: a ‘Cost Discipline / FinOps solution’ layer that measures, predicts, and optimizes AI agent costs is emerging as the most urgent and valuable new AI infrastructure business.”

• Expected Impact

A fundamental shift in the economics of AI agents is underway as of mid-2026. Sam Altman’s public acknowledgment of AI token costs as a “huge issue” is a symbolic signal that the industry is transitioning from a ‘deployment expansion’ phase into a ‘cost discipline’ phase. AI agent pricing is rapidly shifting from unlimited flat-rate (subsidized) models to actual usage-based (per-token) billing, and the impact is already visible in concrete numbers.

The enterprise shock is widespread. Microsoft canceled its internal Claude Code licenses at the end of its fiscal year (June 30) and forced a migration to GitHub Copilot, reportedly due to excessive cost burden. Uber burned through its entire 2026 AI budget in just four months after deploying Claude Code to 5,000 engineers ($500–$2,000 per person per month). An unnamed healthcare company unknowingly consumed one trillion tokens over six months, incurring $6 million in unplanned spending.

Structural root causes have also been identified. According to a LynOps audit, token consumption per person varies by up to 20x depending on usage habits even with the same tool. Most critically, 62% of billed costs stem not from actual work but from repeatedly transmitting code context every time an agent is called — a structural inefficiency, not a user error.
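The scale of these numbers is easier to see with some simple arithmetic. The sketch below uses the figures quoted above (5,000 engineers at $500–$2,000 per person per month over four months, one trillion tokens for $6 million, and the 62% context-retransmission share). The function names and the $100,000 monthly bill are hypothetical, and the spend range assumes every engineer sits at the low or the high end of the quoted per-person range.

```python
CONTEXT_RESEND_SHARE = 0.62   # share of billed cost attributed to re-sent context


def spend_range_usd(engineers: int, low: float, high: float, months: int) -> tuple:
    """Total spend if everyone were at the low or the high per-person monthly cost."""
    return engineers * low * months, engineers * high * months


def usd_per_million_tokens(total_usd: float, tokens: float) -> float:
    """Blended price paid per million tokens."""
    return total_usd / tokens * 1_000_000


def split_bill(total_usd: float, context_share: float = CONTEXT_RESEND_SHARE) -> tuple:
    """Split a bill into (context retransmission, actual work)."""
    context_usd = total_usd * context_share
    return context_usd, total_usd - context_usd


uber_low, uber_high = spend_range_usd(5_000, 500, 2_000, months=4)
blended_rate = usd_per_million_tokens(6_000_000, 1_000_000_000_000)
resend_usd, work_usd = split_bill(100_000)   # hypothetical $100,000 monthly bill

print(f"Implied four-month spend: ${uber_low:,} to ${uber_high:,}")
print(f"Blended rate: ${blended_rate:.2f} per million tokens")
print(f"Of $100,000 billed: ${resend_usd:,.0f} context, ${work_usd:,.0f} work")
```

Read this way, the per-person range implies $10 million to $40 million over four months, and the healthcare case works out to about $6 per million tokens on a blended basis.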

Pricing model transitions are accelerating across the industry. GitHub Copilot moved to usage-based billing (UBB) on June 1; Anthropic introduces credit caps for programmatic usage on June 15. These transitions make the real cost of AI tool adoption visible in enterprise financial statements for the first time, meaning AI can no longer be treated as a vague IT budget line item. Deloitte’s publication of a “Token Economics Guide for CFOs” signals that AI cost management has moved beyond the technology department to become a core executive priority.

The new market this dynamic creates deserves attention. A ‘FinOps solution’ layer for AI — measuring, predicting, and optimizing agent costs — is rapidly emerging as the most urgent and valuable new business within AI infrastructure, with Reuters noting that token consumption volume, not user count, is becoming the primary metric for actual AI adoption scale and revenue in the industry.

5. Samsung, SK Hynix, Micron — All Three Qualify for NVIDIA HBM4

• Core Source

“All three suppliers have completed certification and have all entered production.”

“They are all competing to supply the Vera Rubin platform.”

“We will be using a very large amount of high-speed memory. Memory supply is currently constrained, and we are working with Korean partners to secure as much supply as possible while using it efficiently.”

“The second half of this year will be significantly larger than the first half, and next year will be even bigger.”

• Expected Impact

Jensen Huang, CEO of NVIDIA, confirmed immediately upon arrival in Korea on June 5 that all three memory makers — Samsung Electronics, SK Hynix, and Micron — have completed HBM4 qualification testing and entered production. Unlike the HBM3E generation, where SK Hynix held a near-dominant supply position, the HBM4 generation marks the entry of all three players into the Vera Rubin supply chain, establishing a full competitive landscape.

HBM4 is the sixth-generation high-bandwidth memory to be mounted in NVIDIA’s next-generation AI accelerator Vera Rubin, slated for H2 2026 launch. Samsung became the world’s first to begin HBM4 mass production in February and began supplying Vera Rubin units in June. SK Hynix is estimated by UBS to hold approximately 70% of initial Rubin volume. Micron joined in Q2.

The key implication of all three qualifying simultaneously is a reshaping of the HBM competitive landscape. In the HBM3E generation, market share was distributed as SK Hynix (64%), Micron (21%), and Samsung (15%). With Samsung making a serious comeback in HBM4 and Micron joining as well, that distribution is likely to shift. Jensen Huang’s explicit acknowledgment of competition among suppliers signals that NVIDIA is actively pursuing supply chain diversification rather than dependence on any single vendor.

On the supply-demand balance, constraint remains the central variable. Huang directly noted that “memory supply is currently constrained,” and SK Hynix told investors that HBM pricing strength is expected to persist through next year — with HBM supply shortage likely continuing through 2027. The fact that supply shortages persist even with all three makers in full production is itself evidence that AI accelerator demand is expanding at an explosive pace, providing a structural foundation for sustained HBM ASP appreciation.

6. DeepSeek Tops US Corporate Spending Index in June

• Core Source

“Chinese AI startup DeepSeek took the top spot on a major US business spending index in June, as companies look for alternatives to OpenAI and Anthropic.”

“US firms were making direct payments to DeepSeek, suggesting they were sending and receiving data directly through DeepSeek rather than hosting its open-source models on their own internal servers.”

“DeepSeek’s rise is part of a broader move toward open-source models. Fireworks AI and Fal AI were also included in June’s trending vendor list, as open-source models are proving competitive at a fraction of the cost of premium proprietary models.”

“Price cut of 75% followed by top intelligence-per-dollar ranking… Tencent, CATL and others investing $7.4 billion.”

• Expected Impact

Ramp, a New York-based corporate spending platform tracking payments from more than 50,000 US businesses, publishes a “trending software vendors” index measuring when companies pay a software vendor for the first time. DeepSeek’s ascent to the top of this index in June is the strongest market signal yet that enterprises are actively searching for affordable alternatives to OpenAI and Anthropic.

The immediate driver is price competitiveness. DeepSeek made its 75% price cut on the V4 Pro model API permanent. The new rate stands at $0.87 per million output tokens, roughly 29 times cheaper than Claude Opus 4.7 at $25 per million tokens and about 34 times cheaper than GPT-5.5 at $30. Benchmark firm Artificial Analysis rated DeepSeek V4 Pro among the world’s best on an intelligence-per-dollar basis after the cut, and it ranked just below GPT-5.5 on legal AI benchmarks, confirming its viability for professional workloads.
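The price gap can be checked by dividing the quoted rates. This sketch assumes all three quoted prices are per million output tokens (the source states this explicitly only for DeepSeek), and the 500-million-token monthly workload is a hypothetical figure chosen for illustration.

```python
# Quoted prices in USD per million tokens. Assumption: all three are
# output-token prices; the source says so explicitly only for DeepSeek.
PRICE_USD_PER_MILLION = {
    "DeepSeek V4 Pro": 0.87,
    "Claude Opus 4.7": 25.00,
    "GPT-5.5": 30.00,
}


def price_ratio(model: str, baseline: str = "DeepSeek V4 Pro") -> float:
    """How many times more expensive `model` is than `baseline`."""
    return PRICE_USD_PER_MILLION[model] / PRICE_USD_PER_MILLION[baseline]


def monthly_cost_usd(model: str, million_tokens: float) -> float:
    """Cost of a workload measured in millions of output tokens."""
    return PRICE_USD_PER_MILLION[model] * million_tokens


opus_ratio = price_ratio("Claude Opus 4.7")
gpt_ratio = price_ratio("GPT-5.5")

for name in PRICE_USD_PER_MILLION:
    # Hypothetical workload: 500 million output tokens per month.
    print(f"{name}: ${monthly_cost_usd(name, 500):,.0f} per month")
print(f"Opus 4.7 vs DeepSeek: {opus_ratio:.1f}x; GPT-5.5 vs DeepSeek: {gpt_ratio:.1f}x")
```

The ratios come out to about 28.7x for Claude Opus 4.7 and about 34.5x for GPT-5.5, so the "34 times" figure corresponds to the GPT-5.5 comparison.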

Particularly noteworthy is that US firms are not self-hosting DeepSeek’s open-source models — they are sending real business data directly to DeepSeek’s servers in China. Ramp Economics Lab lead economist Ara Kharazian called this “probably the biggest sign that companies are looking for cheaper alternatives to OpenAI and Anthropic, some willing to use cheaper, Chinese models, sending US data back and forth from China-hosted servers.” This reflects the depth of enterprise cost pressure — companies are accepting data sovereignty and geopolitical risks in exchange for cost relief.

The competitive dynamics are structurally significant. Anthropic confidentially filed for an IPO at approximately a $965 billion valuation on June 1, while OpenAI closed a $122 billion funding round in March at an $852 billion valuation. At those valuation levels, neither company can realistically compete on price with a startup that just permanently slashed its rates by 75%. The very moment that AI unicorns preparing for IPOs are compelled to maintain premium pricing under valuation pressure is the moment DeepSeek permanently reset the market price floor.

DeepSeek is simultaneously closing its first-ever external funding round of approximately 50 billion yuan (~$7.4 billion), led by Tencent and CATL. China’s state-backed National AI Industry Investment Fund is also participating, which amounts to effective state backing. DeepSeek’s rise is not merely the emergence of a low-cost competitor. It represents a combination of open-source models, state capital, and a structural pricing advantage that has begun to challenge the competitive moats of the global AI model market.
