NVIDIA Just Validated AMD’s Original Strategy — Here’s the Proof (Q225).
This is why I believe AMD holds the strongest hand - or at least a very strong one - over the coming decades.
Introduction:
In short, NVIDIA delivered another strong quarter. The bigger picture is clear: demand for compute is accelerating and will only continue to rise.
Yet Wall Street remains distracted by two narrow concerns:
The assumption that compute demand is limited to CPUs and GPUs.
The belief that China represents a make-or-break market for NVIDIA and AMD.
Both views are misguided, symptoms of the short-term thinking that dominates the Street.
My conviction in AMD rests on a broader, more fundamental perspective:
First, the demand for compute is, in principle, unlimited.
Second, the trajectory of the compute market is inherently unpredictable.
Third, in a landscape defined by unpredictability, the companies that succeed will be those with the greatest agility - not only culturally but architecturally. Agility allows them to adapt to shifting market needs at marginal cost.
This is why I believe AMD holds the strongest hand - or at least a very strong one - over the coming decades.
Their bold adoption of chiplet architecture, made more than a decade ago, positions them uniquely to thrive in a future where adaptability, not size alone, will determine winners.
The question is, why is the demand for compute infinite in nature?
Everything in the real world — from atoms and genes to traffic, markets, and geopolitical conflict — operates according to patterns. These patterns may be complex, but they are not chaotic. They follow rules, and anything that follows rules can be modeled.
Simulated. Predicted. Optimized. Controlled.
That’s what it means to say reality is computable. And once you accept that, the thesis becomes obvious: the companies building the hardware and software infrastructure to compute reality will (likely) capture an extraordinary share of future economic value.
This is where AMD and Palantir come in.
AMD is building the silicon — the chips and edge devices embedded in factories, vehicles, infrastructure, and biological systems — that allow us to extract and process raw data from the world in real time.
Palantir is building the ontology and simulation layers — the software that takes raw edge signals and transforms them into meaning, models, and action. Their platforms structure data into digital twins: live, computable models of real-world systems that can be queried, forecasted, and optimized.
Put simply: AMD lets us compute the world. Palantir lets us visualise and control it.
Together, they form the foundation of a new economic paradigm: one where reality is not just observed, but rendered in software — and governed through simulation.
Most investors haven’t caught up to this yet. But those who understand it early stand to benefit disproportionately from the transformation ahead.
In other words, computing reality requires three foundational stacks, which together will create the future economy:
→ Patterned Reality: The world runs on rules and structures — from physics to finance — and those patterns can be mathematically modeled.
→ Silicon: Chips, sensors, and edge devices that capture the raw data, turning the physical world into digital signals.
→ Ontology: The software layer that gives structure and meaning to that data — transforming signals into models, frameworks, and usable insight.
Thus, to avoid getting caught up in these short-term debates, which can push many investors into selling at a low price, you need to learn how to think abstractly and from first principles.
That is exactly what I teach in my course.
Lock in your spot today with code PLTR at checkout. The next version of the course drops next week with huge upgrades in production and content quality. Prices are going up with the relaunch, so this is the last chance to grab it at the lowest rate.
Instead of asking what percentage share of the compute market AMD has, ask: what actually is compute, and why is the demand for it infinite in nature?
Most investors get trapped in short-term debates — China sales, quarterly earnings, market share battles. They think they’re analyzing, but really they’re reacting. That’s why they miss the structural forces driving the next decade of compounding.
The solution is to step back and think from first principles. That’s exactly why I built 30 Days, 30 Insights.
One short email a day, each designed to rewire how you analyze companies — not by chasing headlines, but by spotting the deeper traits that actually matter.
👉 Join free here. Thirty lessons, thirty lenses. Learn how to think in a way that gives you conviction before Wall Street catches up.
From Training to Inference: The Data Center Inflection Point
NVIDIA’s CEO highlighted the accelerating momentum behind their data center business and the central role of AI workloads:
“Customers continue to accelerate their Hopper architecture purchases while gearing up to adopt Blackwell. Key workloads driving our Data Center growth include generative AI model training and inferencing; video, image, and text data pre- and post-processing with CUDA and AI workloads; synthetic data generation; AI-powered recommender systems; SQL and Vector database processing as well.
Next-generation models will require 10 to 20 times more compute to train with significantly more data. The trend is expected to continue.”
He went further, underscoring the importance of inference as a growth driver:
“Over the trailing four quarters, we estimate that inference drove more than 40% of our Data Center revenue. CSPs, consumer Internet companies, and enterprises benefit from the incredible throughput and efficiency of NVIDIA's inference platform.”
There are two important takeaways here.
First, the compute requirements for training next-generation models are not simply growing linearly but exponentially, with estimates of 10 to 20 times more compute needed per cycle. Second, inference has quietly become a massive business in its own right, accounting for more than 40% of NVIDIA’s data center revenues over the past year.
Both points tie directly into my broader thesis.
The demand for compute is structurally upward-trending, regardless of short-term bottlenecks or potential “AI walls.” Even if research progress slows or models temporarily plateau, the long-run trajectory is clear: the world will continue to require more computational power, at greater scale and efficiency, across every industry.
What is especially notable here is NVIDIA’s rapid advancement in inference. This validates what I have argued for some time: inference is not only the practical endpoint of AI adoption, it is also the domain where cost efficiency, scalability, and architecture matter most.
And this is precisely where AMD is best positioned to compete. Their chiplet-based design offers an unmatched ability to deliver compute flexibly and at marginal cost, aligning perfectly with the economics of inference as AI applications scale from research labs into mass deployment across enterprises, healthcare, robotics, and beyond.
This is a point we will return to momentarily.
AMD’s Recent Quarter Represents A True Transition:
As I noted in my recent piece on AMD, this quarter represents a true transition phase. The company is shifting from the MI300 and MI325 series into the next-generation MI350 family — a critical inflection point.
The earlier chips established AMD’s foothold in AI, but it is the MI350 series that signals their full-scale entry into competitive inference. This matters profoundly because AI workloads are now moving decisively from training to inference, and inference is exactly where AMD has chosen to compete most aggressively.
What stands out is the speed of this ramp.
The MI350 series is already in volume production ahead of schedule, a clear sign of both demand and execution. CEO Lisa Su highlighted that the MI355, part of this lineup, matches or even surpasses NVIDIA’s B200 in both training and inference.
More importantly, it delivers up to 40% more tokens per dollar for inference, establishing leadership not just in performance, but in total cost of ownership. Oracle’s recent decision to build a 27,000-node AI cluster using MI355X GPUs paired with fifth-generation EPYC CPUs underlines the seriousness with which customers are betting on AMD’s platform.
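To make that total-cost-of-ownership point concrete, here is a minimal back-of-the-envelope sketch of what a 40% tokens-per-dollar advantage means for a large inference fleet. The baseline tokens-per-dollar figure and the monthly token volume are hypothetical placeholders, not AMD or Oracle numbers.

```python
# Back-of-the-envelope sketch: what "40% more tokens per dollar" means in practice.
# All figures below are hypothetical placeholders, not vendor pricing.

baseline_tokens_per_dollar = 1_000_000                      # assumed: 1M tokens per $1 of TCO
amd_tokens_per_dollar = baseline_tokens_per_dollar * 1.4    # the claimed 40% advantage

workload_tokens = 1_000_000_000_000                         # assumed: 1 trillion tokens served per month

cost_baseline = workload_tokens / baseline_tokens_per_dollar
cost_amd = workload_tokens / amd_tokens_per_dollar

print(f"Baseline monthly cost:  ${cost_baseline:,.0f}")     # $1,000,000
print(f"With 40% more tokens/$: ${cost_amd:,.0f}")          # ~$714,286
print(f"Savings on the same workload: {1 - cost_amd / cost_baseline:.0%}")  # ~29%
```

In other words, a 40% efficiency gain shows up as roughly a 29% cut in the bill for serving the exact same workload, which is what total-cost-of-ownership leadership looks like at fleet scale.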
Why does inference matter so much?
Because it is where AI moves from theoretical capability to real-world impact. Training large models may capture headlines, but inference is what drives adoption. It is the point where models leave the lab and begin making decisions in real time — in hospitals, factories, vehicles, financial systems, and across critical infrastructure.
AMD’s MI355 is a genuine breakthrough.
Delivering 40% more tokens per dollar than NVIDIA’s competing solutions is not a marginal improvement; it is a fundamental shift in the economics of AI deployment. By making inference cheaper and more scalable, AMD lowers the barriers to embedding AI everywhere.
Picture a factory floor where every robot, conveyor, and machine is embedded with an edge device capable of live inference: detecting anomalies, optimizing workflows, and making split-second decisions that improve efficiency and safety.
This is not a distant vision — it is the trajectory we are already on. AMD’s architectural bet on chiplets makes this future economically viable and technically scalable.
By enabling high-performance inference at the edge, AMD is effectively bringing real-time AI into the physical world, one silicon layer at a time. This is where the abstract promise of AI converges with the tangible machinery of industry — and AMD is positioning itself as the company that makes that convergence possible.
Lisa Su said it best:
“Inference will be the primary driver of AI — and now we’re seeing that inflection point.”
But this moment represents more than just design wins — it marks the beginning of AMD’s second act.
After disrupting Intel in CPUs with its chiplet-based architecture, AMD is now stepping onto the AI stage — not just to compete with NVIDIA, but to redefine the economics of inference. As AI shifts from training to deployment, from the cloud to the edge, AMD’s hardware is emerging as the foundation for real-world, real-time intelligence across industries.
When comparing products, the question is often framed around incremental performance differences — which chip trains a model a few percent faster, or which GPU squeezes out slightly higher throughput.
But this misses the bigger picture.
At this point, AMD and NVIDIA are on relatively equal footing in terms of raw capability. The real differentiator is not a marginal spec battle, but who can deliver scalability and cost efficiency as AI moves from research into mass adoption.
On that front, AMD’s chiplet-driven approach provides a structural advantage that positions it to thrive as inference becomes the economic core of the AI era.
That said, it goes without saying that NVIDIA is an exceptional company, and it will not be easy for AMD to capture significant market share from such a dominant player.
This is a point I will return to shortly.
Still, my thesis for AMD has always been more abstract, grounded in a first-principles approach to investing rather than in short-term debates over which company currently has the better chips. Technical performance matters, of course, but it is only one part of a deeper and more enduring story.
At the heart of my thesis is a core belief: the demand for compute is, in principle, unlimited. In such an environment, the companies that win will be those with the greatest architectural and cultural agility.
On this front, I believe AMD holds a structural edge — particularly due to its leadership in chiplet-based architecture. NVIDIA has only recently begun exploring chiplet integration (not a pure “chiplet” play), while AMD has spent years refining and operationalizing it at scale.
Here’s how I break down the thesis:
→ The demand for compute is effectively infinite
→ The evolution of compute workloads is highly unpredictable
→ Chiplet architecture offers superior flexibility and long-term agility
→ AMD’s early and continued investment in this design positions it to outperform over the coming decades
In sum, because the demand for compute is effectively infinite and the evolution of the market is highly unpredictable, it is likely that the company with the greatest architectural agility will be best positioned to capitalize on these shifting trends.
Such a company will be able to adapt quickly, extend into new use cases, and deliver compute at marginal cost where others cannot.
Most investors think about compute the wrong way. They treat it like oil or steel — something finite, cyclical, capped.
But from first principles, compute is nothing like a commodity. Reality itself runs on patterns, and anything that runs on rules can be modeled, simulated, and optimized. Which means compute demand is not just big, it is infinite.
Once you see compute this way, you stop panicking during selloffs. You understand why long-term conviction matters — and why adaptability, not quarterly market share, is what determines the winners.
The question is: how do you train yourself to think this way?
That’s what I teach inside my programme.
It’s a complete system for analyzing companies from first principles, spotting structural edges, and holding through volatility with clarity.
Chinese Market & Why This Debate Misses The Point:
NVIDIA’s CEO acknowledged ongoing challenges in China, noting:
“As a percentage of total Data Center revenue, it remains below levels seen prior to the imposition of export controls. We continue to expect the China market to be very competitive going forward.
The latest round of MLPerf inference benchmarks highlighted NVIDIA's inference leadership, with both NVIDIA Hopper and Blackwell platforms combining to win gold medals on all tasks. At Computex, NVIDIA, with the top computer manufacturers, unveiled an array of Blackwell architecture-powered systems and NVIDIA networking for building AI factories and data centers.”
On this point, I am personally not overly concerned with short-term fluctuations in China. Yes, it is a significant market, but the obsession with quarterly Chinese demand misses the forest for the trees.
Compute demand is, in principle, unlimited. Even excluding China, I expect global demand to expand multiples beyond today’s levels.
This is where first-principles thinking matters — a lens I emphasize constantly in my own framework. By abstracting away from headlines and short-term narratives, one can see the structural forces at work. That clarity has been invaluable for holding stocks over years, rather than being shaken out by noise or misleading comparisons.
Once again, the reason is that I understand what compute actually is at its most fundamental level:
Everything around you — atoms, genes, traffic, markets, wars — follows patterns. And patterns can be modeled. Simulated. Predicted. Controlled.
That’s what makes reality computable.
“Computation is how we interact with information—and reality itself is made of information.”
Atoms. Genes. Languages. Markets. Behaviors.
They can all be modeled computationally.
At its core, it’s the recognition that the world around us — from the movement of atoms to the behavior of global markets — operates according to underlying patterns. These patterns aren’t random; they follow rules, structures, and regularities that can be expressed mathematically.
Whether it’s the laws of physics governing motion, the genetic code driving biological processes, or the flow of information across social networks and supply chains, all of these phenomena exhibit structure. And if there’s structure, there’s the potential for modeling.
To say that reality is "computable" is to claim that these patterns can be translated into algorithms — that the logic of the universe can be rendered into code. If something can be measured, tracked, and predicted, it can also be simulated. And if it can be simulated, we can test it, optimize it, and eventually act upon it — not through trial and error in the physical world, but within a controlled digital environment.
Because the universe is intelligible - a fact that is nothing short of miraculous - human beings are able to perceive patterns in reality, translate those patterns into mathematical structures, and then generate insights from those structures.
Through silicon chips, those insights can be processed and applied to an ever-expanding range of use cases.
Seen at an even more abstract level, the human mind itself exhibits computational properties. In a sense, the mind is applied computation: it organizes, interprets, and generates order from the raw complexity of experience.
Today, we are creating silicon-based forms of computation that mirror, at least in part, the very same properties we find in our own minds. Computation, whether biological or artificial, is the process by which order is drawn out of chaos.
And what is the demand for that? In principle, unlimited.
The value of abstraction lies in two key functions. First, it pulls us out of short-term debates that so often distract investors from what truly matters. Second, it reveals the deeper forces at play, the larger structure of the problem we are dealing with.
It is only by stepping back in this way that one can uncover insights that are both unique and genuinely novel.
AMD vs. NVIDIA: Two Paths to Modularity
The CEO also highlighted NVIDIA’s modular approach with MGX:
“With the NVIDIA MGX modular reference architecture, our OEM and ODM partners are building more than 100 Blackwell-based systems designed quickly and cost-effectively.
The NVIDIA Blackwell platform brings together multiple GPUs, CPUs, DPUs, NVLink, and Link Switches, plus networking chips, systems, and the NVIDIA CUDA software stack, to power the next generation of AI across use cases, industries, and countries. The NVIDIA GB200 NVL72 system with the fifth-generation NVLink enables all 72 GPUs to act as a single GPU and deliver up to 30x faster inference for LLM workloads, unlocking the ability to run trillion-parameter models in real time. Hopper demand is strong, and Blackwell is already widely sampling.”
It is striking that NVIDIA itself now emphasizes “modular” design. That word carries enormous weight.
For AMD, modularity has long been the heart of their chiplet architecture — the decision that allowed them to leapfrog Intel in CPUs and now gives them an economic edge in AI.
NVIDIA’s pivot toward modularity at the system level validates this broader architectural shift.
The question is which layer of modularity — chiplet silicon or rack-level system design — will prove the more scalable and cost-efficient foundation for the next decade of compute.
Inside Modularity: Why the Future Isn’t Monolithic
“Modular” in NVIDIA’s MGX reference architecture and in AMD’s chiplet design both point toward flexibility and scalability, but they operate at very different layers of the computing stack.
AMD pioneered modularity at the silicon level.
Instead of producing one massive monolithic die, AMD broke the processor into smaller dies, or chiplets, each handling a specific role such as compute, I/O, or cache.
These chiplets can even be manufactured on different process nodes, which means AMD can place compute dies on the most advanced lithography while keeping I/O or memory controllers on cheaper, mature nodes.
Infinity Fabric ties everything together, ensuring these disparate pieces function seamlessly as a single processor. This design choice was pivotal: it reduced costs, improved yields, and gave AMD a lasting structural edge over Intel’s monolithic approach.
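To see why splitting one big die into chiplets improves yields, here is a minimal sketch using the classic Poisson yield approximation, yield ≈ exp(-defect density × die area). The defect density and die areas are illustrative assumptions, not actual AMD or TSMC figures.

```python
# Minimal sketch of chiplet yield economics using the simple Poisson model:
#   yield ≈ exp(-defect_density * die_area)
# Numbers are illustrative assumptions, not real foundry data.
import math

defect_density = 0.1      # assumed defects per cm^2
monolithic_area = 6.0     # cm^2: one large monolithic die
chiplet_area = 1.5        # cm^2: the same silicon split into four smaller dies

yield_monolithic = math.exp(-defect_density * monolithic_area)
yield_chiplet = math.exp(-defect_density * chiplet_area)

print(f"Monolithic die yield:     {yield_monolithic:.0%}")  # ~55%
print(f"Individual chiplet yield: {yield_chiplet:.0%}")     # ~86%
```

A defect on the big die scraps all six square centimeters; a defect on a chiplet scraps only a quarter of that silicon. Under these assumed numbers, that is the difference between throwing away nearly half your wafers and throwing away a small fraction of them.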
In GPUs, the same philosophy underpins the MI300 series, which integrates CPU, GPU, and memory into a single package.
NVIDIA’s MGX, by contrast, brings modularity up a layer, to the level of the system and the server rack. MGX does not solve silicon yield or wafer economics.
Instead, it provides a standardized framework that allows CPUs, GPUs, DPUs, NVLink switches, and networking cards to be slotted into a common chassis.
This accelerates OEM design cycles, enabling Blackwell-based systems to come to market in months rather than years, a major advantage for hyperscale deployment.
If AMD’s chiplet design is like constructing a city block from individual apartment units, NVIDIA’s MGX is like planning the entire neighborhood, with standardized streets, utilities, and infrastructure so that builders can quickly assemble a wide variety of structures.
Both approaches are manifestations of the same deeper trend: computation is moving toward modularity at every scale, from transistor layouts, to chiplets, to processor packages, and finally to rack-level systems.
In this sense, NVIDIA’s adoption of modularity at the system level indirectly validates AMD’s original bet on modularity at the silicon level. The industry’s pivot toward modular design — across every layer of compute — is, in many ways, an endorsement of the chiplet philosophy that Lisa Su committed to more than a decade ago.
That decision is now paying dividends, not just for AMD, but for the entire direction of modern compute.
At its core, modularity is about decoupling functions so they can be upgraded, swapped, and scaled more efficiently than if everything were locked into one giant block.
AMD applies this principle at the silicon level with chiplets, while NVIDIA’s MGX applies it at the system level. The underlying logic is the same: monolithic design becomes too slow, too costly, and too fragile in the face of exponential compute demand.
The fact that NVIDIA now emphasizes MGX so heavily is, in many ways, a validation by analogy. Blackwell GPUs are still essentially large monolithic dies with some advanced packaging, but MGX exists for the same reason AMD pioneered chiplets more than a decade ago: innovation cycles are accelerating, manufacturing costs are rising, and workloads are becoming too diverse for one-size-fits-all design.
In spirit, MGX is an endorsement of the chiplet principle.
It signals that the winning strategy in compute is modular scaling, not monolithic scaling. AMD simply brought this idea to silicon first. Over time, it would be surprising if NVIDIA did not move toward true chiplet GPUs, because otherwise they are stuck fighting Moore’s Law with brute-force die sizes while the rest of the industry builds with modular “LEGO blocks.”
Seen this way, system-level modularity reinforces the long-term inevitability of chiplets at the silicon level.
It is the same philosophy playing out across different scales of the compute hierarchy.
AMD chiplets = modularity inside the chip. One giant processor is broken into smaller dies for compute, I/O, and cache, which can be mixed, matched, and manufactured more efficiently.
NVIDIA MGX = modularity outside the chip. CPUs, GPUs, DPUs, and networking parts slot into a standardized chassis so OEMs can bring new systems to market faster.
What MGX demonstrates is that the future of compute is modular at every level.
If modularity is the right solution for servers and systems, it only strengthens the case that chiplets are the right solution for silicon.
NVIDIA’s embrace of modularity at the system level is, intentionally or not, an endorsement of AMD’s earlier decision to bring modularity to the heart of the processor.
Most investors see NVIDIA’s MGX and AMD’s chiplets as two separate engineering choices. They argue endlessly over which one “wins” in the next quarter. That’s the mistake.
From first principles, both are expressions of the same deeper truth: monolithic design eventually collapses under its own weight. The only sustainable path forward is modularity, applied at every layer of compute — from transistors to racks.
Once you think this way, you stop getting caught up in surface-level debates and start asking the right question: which company is structurally positioned to compound adaptability over decades?
That kind of thinking doesn’t come from headlines.
It comes from training your mind to analyze from first principles. And that’s exactly what I teach inside my course. It’s a complete system for breaking down problems at the root, spotting structural inevitabilities like modularity, and building conviction before Wall Street does.
👉 If you want to learn how to think this way — to see the logic beneath the noise — join here.
AMD vs NVIDIA: The Real Divide
AMD’s chiplet modularity operates at the silicon level. The company’s breakthrough was recognizing that Moore’s Law economics were breaking down for ever-larger monolithic dies.
By splitting processors into smaller chiplets, AMD solved yield problems, cut costs, and effectively created a Lego block system within the chip itself.
This approach allows compute cores to be fabricated on cutting-edge nodes while memory or I/O can remain on cheaper, mature nodes. The result is agility at the atomic level of compute, delivering both cost efficiency and design flexibility.
As workloads evolve, AMD can mix and match building blocks without redesigning the entire processor.
NVIDIA’s MGX modularity operates at the system level.
MGX does not address the economics of wafers or transistor scaling. Instead, it provides hyperscalers and OEMs with a standardized chassis where CPUs, GPUs, DPUs, NVLink switches, and networking hardware can be slotted in interchangeably.
While this doesn’t make the GPU itself cheaper, it does accelerate design cycles and allows new configurations to reach market much faster.
The agility here lies in deployment: hyperscalers can assemble new Blackwell systems in months rather than years.
Viewed through the lens of infinite and unpredictable demand for compute, the distinction becomes critical.
Silicon-level modularity, as pioneered by AMD, represents agility at the very atoms of compute.
It directly improves long-term economics by lowering marginal costs and enabling adaptation within the processor itself.
If workloads shift toward greater efficiency per watt, per dollar, or per unit of area, chiplets offer structural advantages. System-level modularity, by contrast, is agility in assembly.
It does not alter the fundamental economics of silicon, but it does give hyperscalers the ability to reconfigure infrastructure quickly for new demands. For example, if inference workloads suddenly dominate, MGX enables faster deployment of racks optimized for that purpose without redesigning hardware from scratch.
In this sense, NVIDIA is also building agility, though at a different layer.
MGX provides valuable system-level flexibility in a world where deployment speed and customization matter. Yet it does not remove the fragility of relying on ever-larger monolithic GPUs.
If die yields deteriorate or manufacturing costs rise, MGX cannot solve the problem at its root. The deeper implication is that if compute demand is both infinite and unpredictable, then agility at every layer will matter.
AMD’s chiplet strategy delivers adaptability at the silicon level, while NVIDIA’s MGX buys time by making monolithic GPUs more deployable in diverse configurations.
Over the long run, however, system-level modularity points back toward silicon-level modularity: if modularity is the right solution at the rack, it is almost certainly the right solution inside the chip itself.
Which means: NVIDIA’s MGX is not a refutation of AMD’s chiplet thesis, it’s a tacit endorsement. It shows the industry is converging on modularity at every level, but AMD still holds the deepest advantage because they solved the problem at the most fundamental layer.
AMD’s chiplet philosophy gives it adaptability at the very heart of the processor. Each chiplet is designed for a specific function—compute, I/O, cache, GPU cores, or memory stacks—and these are tied together by the Infinity Fabric.
Because of this modular structure, AMD can upgrade, swap, or expand individual functions without having to redesign one massive monolithic die. This creates adaptability at the component level of the chip itself.
Another advantage lies in process-node flexibility. AMD can manufacture compute dies on TSMC’s most advanced 3nm or 4nm nodes, while leaving I/O or memory controllers on older, cheaper 6nm or 12nm nodes. This gives them significant economic agility, since the cost per transistor can be optimized for each function. NVIDIA’s monolithic approach lacks this freedom, forcing all components to be built on the same, often most expensive, node.
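Here is a rough sketch of that cost-per-function flexibility, assuming hypothetical per-square-centimeter silicon prices; actual foundry pricing is confidential and varies widely by node and volume.

```python
# Rough sketch of mixed-node economics. The $/cm^2 figures are hypothetical
# placeholders, not actual foundry pricing.

cost_per_cm2 = {"3nm": 300.0, "6nm": 120.0}   # assumed cost of good silicon per cm^2

compute_area = 3.0   # cm^2 of compute dies (assumed)
io_area = 3.0        # cm^2 of I/O and memory-controller dies (assumed)

monolithic_cost = (compute_area + io_area) * cost_per_cm2["3nm"]                   # everything on 3nm
chiplet_cost = compute_area * cost_per_cm2["3nm"] + io_area * cost_per_cm2["6nm"]  # mix of nodes

print(f"All-leading-edge silicon cost: ${monolithic_cost:,.0f}")  # $1,800
print(f"Mixed-node chiplet cost:       ${chiplet_cost:,.0f}")     # $1,260
print(f"Savings: {1 - chiplet_cost / monolithic_cost:.0%}")       # 30%
```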
Chiplets also enable faster iteration. Because the components are modular, AMD can redesign and release specific chiplets in response to changing workloads.
For example, if AI inference suddenly requires much larger caches, AMD can build and integrate a new cache die without needing to overhaul the compute cores. In this way, adaptability is built into their development process.
Finally, AMD’s MI300 shows how this philosophy scales.
By integrating CPU, GPU, and memory dies into a single package, AMD has created hybrid processors tuned for highly specific workloads.
This is not just mixing parts, but demonstrating architectural agility at the silicon level itself, a direct response to the unpredictability of compute demand.
This is why AMD already possesses adaptability inside the chip. It is a structural edge. NVIDIA’s MGX provides modularity at the system level, which helps with deployment speed and configuration. But AMD’s chiplet architecture embeds modularity directly into the processor, giving it a deeper and more fundamental ability to evolve as workloads shift.
Put simply, NVIDIA’s MGX is about modularity in how you assemble computers, while AMD’s chiplets are about modularity in how you create processors. Both approaches are valuable, but adaptability at the silicon level compounds over decades, because it lowers marginal costs and accelerates innovation where it matters most: at the foundation of compute itself.
AMD’s approach to modularity begins at the silicon level with its chiplet architecture. By breaking processors into smaller dies, each dedicated to a specific role such as compute, I/O, or memory, AMD can optimize every function separately.
Compute dies can be manufactured on advanced nodes, while less critical functions are placed on cheaper, mature nodes. This flexibility reduces wafer costs, improves yields, and allows AMD to adapt its designs quickly as workloads evolve.
The result is a structural advantage in both cost and innovation. Products like the MI300 demonstrate this in practice, integrating CPU, GPU, and memory dies into a single package tuned for emerging AI demands.
Adaptability is built directly into the chip.
NVIDIA, by contrast, applies modularity at the system level through its MGX architecture.
MGX creates a standardized framework that allows GPUs, CPUs, DPUs, NVLink switches, and networking components to be slotted into a common chassis. This does not solve the economic challenges of monolithic GPU design, but it dramatically accelerates deployment.
OEMs and hyperscalers can design and roll out new Blackwell-based systems in months rather than years, giving NVIDIA’s customers the ability to adapt infrastructure to new workloads with remarkable speed. MGX therefore provides agility in deployment rather than in chip creation.
AMD and NVIDIA have both embraced modularity, but at very different layers of the computing stack. AMD’s chiplet architecture embeds modularity directly into the processor. This strengthens long-term economics by improving yields and lowering wafer costs, while also making it easier to reconfigure silicon as workloads evolve in unpredictable ways.
NVIDIA’s MGX, by contrast, introduces modularity at the system level. It gives data centers the flexibility to assemble and scale systems quickly, allowing OEMs and hyperscalers to mix and match GPUs, CPUs, DPUs, and networking components within a standardized chassis.
This delivers clear advantages in deployment speed and customer choice, but it does not address the fundamental economics of silicon production. Because demand for compute is effectively infinite and inherently unpredictable, adaptability at the silicon level compounds more profoundly over the long run, giving AMD a structural edge that reaches beyond immediate deployment cycles.
NVIDIA formally introduced MGX at its Computex keynote in Taipei on May 29, 2023, describing it as a modular server specification that would allow system makers to build more than 100 different configurations for AI, high-performance computing, and Omniverse applications.
Industry outlets confirmed the announcement the following day, emphasizing how MGX could dramatically reduce development costs and shorten time-to-market for new systems. By August 2023, the first MGX-based systems were shipping, with partners such as Supermicro and QCT unveiling designs like the ARS-221GL-NR and S74G-2U.
This contrast highlights just how forward-looking Lisa Su’s strategy was. Nearly a decade earlier, in 2014, AMD committed to the chiplet architecture when much of the industry still clung to monolithic dies.
That contrarian bet positioned AMD ahead of the curve, embedding agility into silicon itself at a time when few appreciated how essential that would become.
The problem is simple: if you can’t think from first principles, you can’t understand this.
You’ll get stuck comparing spec sheets, arguing over benchmarks, or chasing headlines about quarterly GPU sales. That kind of thinking blinds investors to the deeper structural edges that actually compound over decades.
The solution is to learn how to abstract away from surface-level debates and analyze at the root.
That’s exactly what I teach inside my programme. It’s a step-by-step system for breaking down markets, architectures, and business models from first principles — so you can see inevitabilities like modularity years before they show up in the consensus view.
The Payoff Question: Is AI Really Delivering Returns?
During the Q&A, one analyst pressed on whether customers are really seeing good returns on their AI investments. The CEO’s response was telling, as he framed the discussion in terms of two simultaneous platform transitions reshaping the entire industry.
First, he pointed to the shift from general-purpose computing to accelerated computing. For years, CPU scaling has slowed to a crawl, even as computing demand has continued to grow at near-exponential rates — arguably doubling every year.
Without a new approach, he noted, this imbalance would create “computing inflation,” driving up costs for enterprises while pushing energy consumption in data centers to unsustainable levels.
Accelerated computing, by contrast, not only speeds up applications but also cuts both cost and energy. In some cases, he emphasized, companies adopting accelerated computing have saved as much as 90% of their computing costs, precisely because running workloads 50x faster naturally collapses the underlying cost structure.
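The arithmetic behind that claim is simple. The sketch below uses hypothetical hourly prices and a 50x speedup to show how an accelerated node that costs several times more per hour can still cut the cost of a job by roughly 90%.

```python
# Sketch of the "50x faster, ~90% cheaper" argument. Hourly prices and the
# speedup factor are hypothetical placeholders, not NVIDIA figures.

cpu_cost_per_hour = 1.0    # assumed $/hour for a general-purpose node
gpu_cost_per_hour = 5.0    # assumed $/hour for an accelerated node
speedup = 50               # the workload runs 50x faster when accelerated

job_hours_on_cpu = 100
cpu_job_cost = job_hours_on_cpu * cpu_cost_per_hour                # $100
gpu_job_cost = (job_hours_on_cpu / speedup) * gpu_cost_per_hour    # $10

print(f"Cost per job on CPUs:     ${cpu_job_cost:.0f}")
print(f"Cost per job accelerated: ${gpu_job_cost:.0f}")
print(f"Savings: {1 - gpu_job_cost / cpu_job_cost:.0%}")           # 90%
```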
The second transition builds on the first: accelerated computing has reduced the cost of training AI models so dramatically that training multitrillion-parameter models has become feasible.
Models can now be pretrained on vast datasets approaching the entire corpus of human knowledge. This breakthrough is what triggered the generative AI revolution.
But as the CEO explained, generative AI is not simply a new feature — it is a fundamental shift in how software is created. Where traditional software was built through human-engineered algorithms, generative AI replaces those with machine-learned functions. Data, rather than human design, defines the rules.
In his words, AI acts as a universal function approximator: given enough structured examples, it can learn the function of almost anything.
This transition has profound consequences. Every layer of computing is being reshaped, from CPUs to GPUs, from hand-coded algorithms to models that learn their own rules, and from small-scale applications to entirely new categories of software.
At the cutting edge, frontier models are growing in scale with no end in sight. Each doubling of model size requires more than a doubling of data, which in turn drives compute requirements quadratically higher.
This is why, as the CEO emphasized, next-generation models may require 10, 20, or even 40 times more compute than their predecessors.
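A quick sketch makes that scaling explicit. It uses the common rule of thumb that training compute is roughly 6 × parameters × training tokens; the rule of thumb and the model sizes below are approximations for illustration, not figures from the call.

```python
# Sketch of how compute scales when model size and data grow together.
# Uses the rough approximation: training FLOPs ≈ 6 * parameters * tokens.
# Model sizes and token counts are illustrative assumptions.

def training_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

base = training_flops(params=100e9, tokens=2e12)           # e.g. 100B params on 2T tokens
one_doubling = training_flops(params=200e9, tokens=4e12)   # double both params and data
two_doublings = training_flops(params=400e9, tokens=8e12)

print(f"One doubling of params and data:  {one_doubling / base:.0f}x compute")   # 4x
print(f"Two doublings of params and data: {two_doublings / base:.0f}x compute")  # 16x
```

Stack a couple of generations of that on top of each other and the 10x to 40x figures stop looking aggressive.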
Beneath the surface, beyond headline applications like ChatGPT or image generators, generative AI is already embedding itself into workflows.
At NVIDIA, for instance, AI is now heavily used for coding tasks. This illustrates an important dynamic: compute demand is recursive. More compute enables the development of better AI models, which in turn are deployed to design and optimize new computing systems, feeding back into the cycle.
In effect, compute is being used to create more compute.
The implications are enormous. Far from peaking, demand is self-reinforcing and structurally upward. Each platform transition accelerates the next, and every new generation of AI models pushes the boundaries of both hardware and software simultaneously.
The CEO was emphatic about the ROI case for NVIDIA’s infrastructure. He noted that the H200 remains state-of-the-art, and that if a business today had to choose between investing in new CPU infrastructure or building on Hopper, “that decision is relatively clear.”
In his view, companies are now clamoring to transition the trillion dollars of existing infrastructure into a modern, accelerated stack — and Hopper sits at the center of that shift.
This ties into a broader truth: industry dynamics in compute are built on trust. Lisa Su has often emphasized that companies must deliver consistent performance, scalability, and cost-efficiency if they want customers to bet their business on their hardware.
NVIDIA’s brand has become synonymous with that trust, which is why enterprises are moving decisively in their direction.
As the CEO explained further:
“The people who are investing in NVIDIA infrastructure are getting returns on it right away. It's the best ROI computing infrastructure investment you can make today. One way to think about it is just from first principles: your capacity gets rented immediately, so the return is strong.
And then there’s the impact on your own business. Do you want to build the next frontier yourself? Or do you want your Internet services to benefit from a next-generation ad system, a next-generation recommender, or a next-generation search system? For e-commerce, for user-generated content, for social platforms — generative AI is also a fast ROI.”
In other words, the argument is simple: compute is not just a cost center, it is a revenue driver. Meta, for example, is leveraging AI to transform its advertising business, which directly translates into higher revenues. Without the infrastructure, this would not be possible. For companies like Meta, NVIDIA is not an expense line, it is a growth catalyst.
The CEO went further in setting expectations for the Blackwell generation:
“Blackwell is going to be a complete game changer for the industry. And Blackwell is going to carry into the following year. And as I mentioned earlier, working backwards from first principles, remember that computing is going through two platform transitions at the same time: from general-purpose computing to accelerated computing, and from human-engineered software to generative AI or machine-learned software.”
He closed with a sweeping vision:
“It’s growing, so we want to continue to increase its scale. And we believe that by continuing to scale AI models, we’ll reach a level of extraordinary usefulness and realize the next industrial revolution. We believe it. And so, we’re going to drive ourselves really hard to continue to go up that scale.”
Taken together, the message is clear: compute is becoming the highest-ROI infrastructure investment a company can make, because it directly enables new revenue streams. At the same time, NVIDIA’s framing of the industry as undergoing two platform transitions — from CPUs to accelerated computing, and from engineered algorithms to learned algorithms — captures the scale of what is happening. It is not simply a new product cycle, but a redefinition of the foundations of computing itself.
The Big Question: Does My AMD Thesis Still Stand Strong?
The problem is that most investors never make it this far in their thinking. They stop at surface comparisons — quarterly earnings, benchmark tests, China headlines — and miss the structural forces shaping the next decade of compute. That’s why they buy late, sell early, and never build conviction.
The solution is to train yourself to think from first principles. To see the inevitabilities — like infinite compute demand, the shift from training to inference, and the compounding power of chiplet architecture — before consensus catches on.
That’s exactly what I teach inside The Asymmetric Investing Blueprint. It’s a complete system for analyzing companies through abstraction, spotting hidden structural edges, and holding with conviction while the market panics.
When viewed as a whole, the picture is clear: compute demand is accelerating at a structural level, and both AMD and NVIDIA are responding with different strategies.
NVIDIA continues to dominate through its ecosystem strength, its CUDA moat, and its rapid deployment of new architectures like Hopper and Blackwell. Its MGX system-level modularity has given hyperscalers faster ways to bring AI infrastructure online, reducing development costs and enabling massive clusters to be built at speed.
The company’s positioning is formidable, and its leadership in training workloads remains undisputed.
AMD, however, plays a different game. Rather than focusing on system-level adaptability, it has embedded modularity directly into the processor itself through its chiplet architecture.
This choice, made a decade ago, allows AMD to solve the economics of silicon production, giving it the ability to lower costs, improve yields, and adapt its processors to unpredictable workloads.
With products like the MI350 family, AMD is showing how this design philosophy can translate into concrete advantages for inference, the stage where AI transitions from lab experiments into real-world applications.
Inference is particularly important because it is not just a technical benchmark — it is the economic engine of AI adoption. Training may define the frontier, but inference defines scale. Hospitals, factories, financial systems, vehicles, and cloud platforms all depend on inference to operationalize AI in real time.
This is where efficiency and cost per token matter more than raw peak performance. AMD’s chiplet-driven architecture positions it to thrive in this environment by delivering flexibility and scalability at marginal cost.
NVIDIA’s MGX is a powerful tool for speeding up deployment and configuration, but it does not resolve the fragility of monolithic GPU design. If wafer yields decline or costs rise further, MGX cannot change the underlying economics. In contrast, AMD’s silicon-level adaptability compounds over decades, creating a foundation for long-term leadership in a world where compute demand is both infinite and unpredictable.
The conclusion is not that AMD will replace NVIDIA, or that NVIDIA’s achievements are less impressive. NVIDIA remains an exceptional company with unmatched customer trust and ecosystem depth. But from a first-principles perspective, AMD has built a structural advantage. Its decision to embrace chiplets in 2014 — a contrarian bet at the time — has positioned it to capitalize on the realities of an era defined by infinite compute demand and architectural unpredictability.
The broader thesis holds: compute demand is not finite, it is unbounded. The trajectory of workloads is inherently uncertain. In such an environment, the companies best positioned to win are those that exhibit the greatest architectural agility. AMD’s leadership in chiplets, combined with its growing presence in AI inference, gives it precisely that edge.
Does the thesis still hold?
Yes — and it has never been more relevant.
While NVIDIA continues to lead in training and system-level deployment, AMD’s structural adaptability at the silicon level positions it to thrive over the coming decades.
For investors willing to think abstractly and focus on first principles, AMD represents not just a competitor to NVIDIA, but a company uniquely aligned with the deepest forces driving the future of compute.