NVIDIA GTC 2026: Our key takeaways
So what strategic points did we take away from the NVIDIA conference last week? Besides the extraordinary level of production, showmanship, and sheer energy? Next week, we’ll get into some of the issues for us in European data center development. But first, here’s our take at GARBE.DC on the strategic highlights:
NVIDIA is shifting to providing five layers of comprehensive infrastructure for AI factories, agentic systems, and physical AI. This strategy means collaborating with a vast network of developers, startups, and hyperscalers. And NVIDIA is building a vertically integrated but horizontally open ecosystem.
It’s really worth taking a couple of minutes on their explanation of AI’s ‘five-layer cake’.
Energy is the foundation
At the foundation is energy. Intelligence generated in real time requires power generated in real time. Every token produced is the result of electrons moving, heat being managed, and energy being converted into computation. There is no abstraction layer beneath this. Energy is the first principle of AI infrastructure and the binding constraint on how much intelligence the system can produce.
Chips transform energy into computation
Above energy are the chips. These are processors designed to transform energy into computation efficiently at massive scale. AI workloads require enormous parallelism, high-bandwidth memory, and fast interconnects. Progress at the chip layer determines how fast AI can scale and how affordable intelligence becomes.
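To make the bandwidth point concrete, here is a back-of-the-envelope sketch in Python. All numbers are our own illustrative assumptions, not vendor specifications: during autoregressive decoding, each generated token requires streaming the model’s weights through the GPU’s memory, so peak memory bandwidth puts a hard ceiling on tokens per second.

```python
# Back-of-the-envelope: why memory bandwidth caps inference speed.
# All numbers are illustrative assumptions, not vendor specs.

HBM_BANDWIDTH_GBS = 8_000   # assumed peak memory bandwidth, GB/s
MODEL_PARAMS_B = 70         # assumed model size, billions of parameters
BYTES_PER_PARAM = 2         # 16-bit weights

# Each decoded token must stream the full weight set from memory
# (ignoring KV-cache traffic and batching, which change the picture).
bytes_per_token = MODEL_PARAMS_B * 1e9 * BYTES_PER_PARAM      # ~140 GB
max_tokens_per_s = HBM_BANDWIDTH_GBS * 1e9 / bytes_per_token  # upper bound

print(f"Weight traffic per token: {bytes_per_token / 1e9:.0f} GB")
print(f"Bandwidth-bound ceiling:  {max_tokens_per_s:.0f} tokens/s per GPU")
```

Batching spreads that weight traffic across many concurrent users, which is exactly why the parallelism and fast interconnects mentioned above matter so much.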
Infrastructure is the orchestration of processors into an AI factory
Above chips is infrastructure. This includes land, power delivery, cooling, construction, networking, and the systems that orchestrate tens of thousands of processors into one machine. These systems are AI factories. They are not designed to store information. They are designed to manufacture intelligence.
Models sit above infrastructure – there are many kinds, not just language models
Above infrastructure are the models. AI models understand many kinds of information: language, biology, chemistry, physics, finance, medicine, and the physical world itself. Language models are only one category. Some of the most transformative work is happening in protein AI, chemical AI, physical simulation, robotics, and autonomous systems.
Applications at the top are where economic value happens
At the top are applications, where economic value is created: drug discovery platforms, industrial robotics, legal copilots, self-driving cars. A self-driving car is an AI application embodied in a machine. A humanoid robot is an AI application embodied in a body. Same stack, different outcomes.
At GTC 2026, NVIDIA CEO Jensen Huang announced an unprecedented demand backlog, forecasting $1 trillion in revenue through 2027 from the company’s AI chip and infrastructure products. That backlog consists primarily of high-performance AI chips and infrastructure, and it marks a jump from the earlier $500 billion projection toward an accelerated buildup of AI infrastructure, driven largely by the transition from model training to large-scale, real-time AI inference and agentic AI.
The Inference Boom is upon us
NVIDIA says data center capacity in the future will be 70% inference and 30% LLM training – a huge shift, and one that shows AI moving deeper and more broadly into the real world. Inference focuses on low-latency, high-throughput delivery of tokens to users, which requires massive memory bandwidth and high-speed networking to handle concurrent user requests. Of course, training remains essential for developing new “reasoning models,” “agentic AI” (AI that takes action), and “physical AI” (robotics).
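To illustrate that latency/throughput tension, here is a toy serving model in Python, with numbers we have invented for the sketch: each decode step either streams the weights once (memory-bound) or is limited by compute as the batch grows, so bigger batches raise total tokens per second but also the per-user wait between tokens.

```python
# Toy model of the inference serving trade-off: batching raises
# throughput but adds per-user latency. All numbers are assumptions.

WEIGHT_STREAM_S = 0.020    # assumed time to stream weights once (memory-bound)
COMPUTE_S_PER_SEQ = 0.002  # assumed extra compute time per sequence in batch

def decode_step_time(batch_size: int) -> float:
    """A decode step is memory-bound until compute starts to dominate."""
    return max(WEIGHT_STREAM_S, batch_size * COMPUTE_S_PER_SEQ)

for batch in (1, 8, 32, 128):
    step = decode_step_time(batch)
    throughput = batch / step  # tokens/s across all users
    latency_ms = step * 1000   # wait between tokens for each user
    print(f"batch={batch:>3}: {throughput:6.0f} tok/s total, "
          f"{latency_ms:5.1f} ms/token per user")
```

In this toy model, throughput saturates once compute dominates while per-user latency keeps climbing – the core dilemma every inference operator has to balance.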
“The Inference Boom” will be enabled by the next-generation chip, Vera Rubin, which is designed to deliver up to 10 times lower inference token costs than Blackwell while requiring fewer GPUs to train mixture-of-experts models. What that will look like in practice is a series of structural advances in the work AI does:
- Production Inference (Real-time AI) will move AI out of chat boxes into real-time applications (e.g., agents making API calls, autonomous vehicles).
- Agentic AI & Specialized Models will change too. Instead of general models, the focus shifts to specialized, fine-tuned agents (using technologies like NVIDIA NeMo) that interact with other agents to solve complex tasks.
- Real-time Interaction: New infrastructure (e.g., the Groq 3 LPU) offers 35x higher inference throughput per megawatt, significantly reducing the “time to first token” for interactive AI.
All of this enables entirely new AI business models. Tokens per watt will define who succeeds in the future – a metric that captures both the cost of AI in money and its cost in power.
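As a rough illustration of why tokens per watt matters commercially, here is a minimal Python sketch. The figures are invented placeholders, not Blackwell or Vera Rubin specifications: the same power envelope at a higher tokens-per-watt rating translates directly into a lower energy bill per million tokens.

```python
# Tokens per watt as a business metric. All figures are invented
# placeholders, not Blackwell or Vera Rubin specifications.

ELECTRICITY_EUR_PER_KWH = 0.20  # assumed industrial power price

def eur_per_million_tokens(tokens_per_s: float, power_w: float) -> float:
    """Energy cost of generating one million tokens."""
    seconds = 1e6 / tokens_per_s
    kwh = power_w * seconds / 3.6e6  # watt-seconds -> kWh
    return kwh * ELECTRICITY_EUR_PER_KWH

systems = {
    "current generation (assumed)": (10_000, 100_000),  # tokens/s, watts
    "next generation (assumed)":    (40_000, 100_000),  # 4x tokens per watt
}

for name, (tps, watts) in systems.items():
    print(f"{name}: {tps / watts:.2f} tokens/W, "
          f"{eur_per_million_tokens(tps, watts):.2f} EUR per 1M tokens")
```

Quadrupling tokens per watt cuts the energy cost per million tokens by the same factor, which is why efficiency claims like those above translate directly into business-model headroom.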
And the robots are on the march!
NVIDIA GTC focused on AI and robotics in a big way because “physical AI is on the move.” GTC showcased how AI-powered simulation, digital twins, and advanced hardware are used to train, deploy, and operate autonomous machines and humanoid robots to address labor shortages, improve simulation accuracy, and accelerate industrial automation.
Written by Christian Kallenbach
Head of Business Development, Sales & Marketing