Jensen Huang's latest CES speech: AI Agents may become the next robotics industry, worth trillions of dollars

Reprinted from chaincatcher
01/07/2025
At CES 2025, which opened this morning, NVIDIA founder and CEO Jensen Huang delivered a landmark keynote revealing the future of AI and computing. From the token concept at the core of generative AI, to the release of the new Blackwell-architecture GPUs, to an AI-driven digital future, this wide-ranging speech will profoundly affect the entire industry.
1) From generative AI to agentic AI: the beginning of a new era
- The birth of the token: As the core driving force of generative AI, tokens transform text into knowledge, inject life into images, and open up a new way of digital expression.
- AI evolution path: From perceptual AI and generative AI to agentic AI capable of reasoning, planning, and action, AI technology continues to reach new heights.
- The Transformer revolution: Since its launch in 2018, this technology has redefined computing and completely disrupted the traditional technology stack.
2) Blackwell GPU: Breaking through the performance limits
- The new-generation GeForce RTX 50 series: Based on the Blackwell architecture, it packs 92 billion transistors, 4,000 TOPS of AI performance, and 4 PetaFLOPS of AI compute, three times the previous generation.
- Integration of AI and graphics: For the first time, programmable shaders are combined with neural networks, introducing neural texture compression and neural material shading for stunning rendering effects.
- Accessible high performance: The RTX 5070 notebook delivers RTX 4090-class performance at $1,299, bringing high-performance computing to the mass market.
3) Multi-field expansion of AI applications
- Enterprise AI Agents: NVIDIA provides tools such as NeMo and Llama Nemotron to help enterprises build digital employees capable of autonomous reasoning, enabling intelligent management and services.
- Physical AI: Through the Omniverse and Cosmos platforms, AI is being integrated into industry, autonomous driving, and robotics, redefining global manufacturing and logistics.
- Future computing scenarios: NVIDIA is bringing AI from the cloud to personal devices and into enterprises, covering all computing needs from developers to ordinary users.
The following is the main content of Jensen Huang's speech:
This is the birthplace of intelligence, a brand-new factory: a generator of tokens. Tokens are the building blocks of AI, opening a new realm and the first step into an extraordinary world. Tokens turn words into knowledge and breathe life into images; they turn ideas into videos and help us safely navigate any environment; they teach robots to move like masters and inspire us to celebrate victories in new ways. Tokens can bring peace of mind when we need it most. They give meaning to numbers, helping us better understand the world, predict potential dangers, and find cures for the threats within it. They can make our visions come true and restore what we have lost.
All of this began in 1993, when NVIDIA launched its first product, the NV1. We wanted to build computers that could do things ordinary computers could not, and the NV1 made it possible to have a game console inside a PC. Then, in 1999, NVIDIA invented the programmable GPU, ushering in more than 20 years of technological progress that made modern computer graphics possible. Six years later, we launched CUDA to expose the GPU's programmability to rich algorithms. The technology was difficult to explain at first, but by 2012 the success of AlexNet had validated CUDA's potential and propelled AI's breakthrough.
Since then, AI has advanced at an astonishing pace. From perceptual AI to generative AI, and on to agentic AI that can perceive, reason, plan, and act, AI's capabilities keep improving. In 2018, Google launched the Transformer, and the world of AI truly took off. The Transformer not only revolutionized the AI landscape, it redefined the entire computing field. We realized that machine learning is not just a new application or business opportunity but a fundamental reinvention of computing: from hand-written instructions to machine-learned neural networks, every layer of the technology stack has changed dramatically.
Today, AI applications are everywhere. Whether it is understanding text, images, or sound, or translating amino acids or physics, AI can get it done. Almost every AI application boils down to three questions: what modality of information does it learn from, what modality does it translate, and what modality does it generate? This fundamental framing drives every AI application.
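To make that framing concrete, here is a tiny illustrative sketch in Python; the applications and modality labels are our own examples, not a list from the speech:

```python
# A minimal illustration of the "three questions" framing described above.
# Each entry is a hypothetical example: (modality learned, modality translated
# from, modality generated).
applications = {
    "chatbot":            ("text",        "text",        "text"),
    "image generation":   ("text+images", "text",        "images"),
    "speech recognition": ("audio+text",  "audio",       "text"),
    "protein design":     ("amino acids", "amino acids", "3D structure"),
}

for app, (learned, source, generated) in applications.items():
    print(f"{app}: learns {learned}, translates {source} -> generates {generated}")
```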
None of these achievements would have been possible without GeForce. GeForce brought AI to the masses, and now AI is coming home to GeForce. With real-time ray tracing we can render graphics with stunning fidelity, and through DLSS, AI goes even further with frame generation, predicting future frames. Of 33 million pixels, only 2 million are actually computed; the rest are generated by AI prediction. This seemingly miraculous technique shows AI's power: it makes computing far more efficient and points to what is possible in the future.
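The arithmetic behind that claim can be checked on the back of an envelope. The sketch below assumes (our assumption, not a figure from the speech) four 4K frames per fully rendered frame, as in multi-frame generation:

```python
# Back-of-envelope check of the "2 million of 33 million pixels" figure.
# Assumption (ours): four 4K frames per rendered frame via multi-frame generation.
width, height = 3840, 2160           # 4K resolution
pixels_per_frame = width * height    # ~8.29 million pixels
frames = 4                           # 1 rendered + 3 AI-generated (assumed)
total_pixels = pixels_per_frame * frames
computed_pixels = 2_000_000          # figure quoted in the speech

print(f"total pixels across {frames} frames: {total_pixels / 1e6:.1f} M")   # ~33.2 M
print(f"fraction actually computed: {computed_pixels / total_pixels:.1%}")  # ~6%
```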
This is why so many amazing things are happening right now. We pushed the boundaries of AI with GeForce, and now, AI is revolutionizing GeForce. Today, we’re announcing the next generation of our products – the RTX Blackwell family. Let's take a look.
This is the new GeForce RTX 50 series, based on the Blackwell architecture. This GPU is a performance monster, with 92 billion transistors, 4,000 TOPS of AI performance, and 4 PetaFLOPS of AI compute, three times that of the previous-generation Ada architecture. It is all about generating the amazing pixels I just showed you. It also delivers 380 ray-tracing TFLOPS, for the most beautiful possible image quality on the pixels that must be computed, along with 125 shader TFLOPS. The card uses Micron's GDDR7 memory at 1.8 TB/s, twice the bandwidth of the previous generation.
We can now combine AI workloads with computer graphics workloads, and a remarkable feature of this generation is that the programmable shaders can also process neural networks. This led us to invent neural texture compression and neural material shading. These techniques use AI to learn textures and their compression, producing stunning image effects that only AI can achieve.
Even in terms of mechanical design, this graphics card is a marvel. It uses a dual-fan design, the entire graphics card is like a huge fan, and the internal voltage regulation module is the most advanced. Such excellent design is entirely due to the efforts of the engineering team.
Next, the performance comparison. The familiar RTX 4090, priced at $1,599, is the core investment in a home PC entertainment center. Now the RTX 50 series starts at just $549, spanning the RTX 5070 up to the RTX 5090, which offers twice the performance of the RTX 4090.
Even more amazing is that we put this high-performance GPU into a notebook. The RTX 5070 notebook costs $1,299, but has the performance of the RTX 4090. This design combines AI and computer graphics technology to achieve high energy efficiency and high performance.
The future of computer graphics is neural rendering, the fusion of AI and computer graphics. Blackwell even fits into notebooks just 14.9 mm thick, and the entire lineup from RTX 5070 to RTX 5090 works in ultra-thin laptops.
GeForce helped popularize AI, and now AI is revolutionizing GeForce. Technology and intelligence are advancing each other, carrying us to a higher level.
The Three Scaling Laws of AI
Next, let's talk about the development direction of AI.
1) Pre-training Scaling Law
The AI industry is expanding at an accelerating pace, driven by a powerful empirical rule known as the "Scaling Law". Repeatedly verified by researchers and industry, it says that the larger the training dataset, the larger the model, and the more compute invested, the more capable the resulting model becomes.
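The speech states the rule only qualitatively. For illustration, a widely used published formulation of the same idea is the parametric loss from Hoffmann et al. (2022), sketched below; the constants are those reported in that paper, not NVIDIA figures:

```python
# A common published form of the pre-training scaling law (Chinchilla-style),
# added here for illustration. Constants roughly as fitted by Hoffmann et al.
# (2022); they are not from the speech.
def loss(n_params: float, n_tokens: float,
         E: float = 1.69, A: float = 406.4, B: float = 410.7,
         alpha: float = 0.34, beta: float = 0.28) -> float:
    """Predicted loss falls as model size (n_params) and data (n_tokens) grow."""
    return E + A / n_params**alpha + B / n_tokens**beta

for scale in (1e9, 1e10, 1e11):  # 1B, 10B, 100B parameters
    print(f"{scale:.0e} params, 20x tokens: predicted loss ~= {loss(scale, 20 * scale):.3f}")
```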
The growth rate of data is accelerating exponentially. It is estimated that in the next few years, the amount of data produced every year by humans will exceed the total amount produced in all previous human history. This data is becoming multimodal, including video, images, and sounds. These massive data can be used to train the basic knowledge system of AI and lay a solid knowledge foundation for AI.
2) Post-training Scaling Law
In addition, two other Scaling Laws are on the rise.
The second is the "post-training Scaling Law", which involves techniques such as reinforcement learning from human feedback: AI generates answers to human queries and continuously improves from that feedback. Reinforcement learning of this kind, driven by high-quality prompts, helps AI sharpen its skills in specific domains, such as getting better at solving math problems or performing complex reasoning.
The future of AI is not just perception and generation but a process of continuous self-improvement and boundary-breaking. It is like having a mentor or coach who gives you feedback after each attempt: through testing, feedback, and self-refinement, AI improves via reinforcement learning and feedback mechanisms. This post-training phase, combined with synthetic-data generation, resembles self-practice: AI can take on complex, verifiable puzzles, such as proving theorems or solving geometry problems, and continuously optimize its answers through reinforcement learning. Post-training demands enormous computing power, but it can ultimately produce extraordinary models.
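As a toy illustration of learning from verifiable feedback, the sketch below reinforces whichever answering strategy passes a checkable test most often; it is our simplified stand-in, not an actual RLHF implementation:

```python
# Toy sketch of reinforcement from verifiable feedback, as described above.
# Strategies, success rates, and the update rule are illustrative stand-ins.
import random

true_success = {"guess": 0.2, "step_by_step": 0.7, "verify_each_step": 0.9}
preference = {name: 1.0 for name in true_success}  # learner's strategy weights

def pick() -> str:
    """Sample a strategy in proportion to its current weight."""
    r = random.uniform(0, sum(preference.values()))
    for name, w in preference.items():
        r -= w
        if r <= 0:
            return name
    return name

for _ in range(2000):
    strategy = pick()
    solved = random.random() < true_success[strategy]  # verifiable check, e.g. a proof checker
    if solved:
        preference[strategy] *= 1.01  # reinforce what worked

print("preferred strategy:", max(preference, key=preference.get))
```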
3) Test-time Scaling Law
A third Scaling Law has gradually emerged at test time, showing its unique potential when AI is actually used. At inference, AI can dynamically allocate resources: no longer limited by parameter optimization, it focuses compute on producing the high-quality answers required.
This process resembles deliberate reasoning rather than direct, one-shot answers. AI can break a problem into multiple steps, generate and evaluate multiple solution paths, and finally select the best one. This long-horizon reasoning markedly improves model capability.
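A minimal sketch of this idea is best-of-N sampling: spend more inference compute by generating several candidates and keeping the best. The generator and scorer below are hypothetical stand-ins for model calls:

```python
# Sketch of test-time scaling as described above: allocate more inference
# compute by generating several candidate solutions, scoring each, and
# keeping the best one.
import random

def generate_candidate(problem: str) -> str:
    """Stand-in for a model proposing one solution."""
    return f"plan-{random.randint(0, 9999)} for {problem!r}"

def score(candidate: str) -> float:
    """Stand-in for an evaluator model or verifier ranking a candidate."""
    return random.random()

def solve(problem: str, budget: int) -> str:
    """More budget (candidates) raises the expected quality of the answer."""
    candidates = [generate_candidate(problem) for _ in range(budget)]
    return max(candidates, key=score)

print(solve("route planning", budget=1))   # one-shot answer
print(solve("route planning", budget=32))  # 32x test-time compute
```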
We have seen this technology evolve from ChatGPT to GPT-4 and now Gemini Pro, with all of these systems progressing through pre-training, post-training, and test-time scaling. Achieving these breakthroughs takes tremendous computing power, and that is the core value of NVIDIA's Blackwell architecture.
The latest on the Blackwell architecture
The Blackwell system is in full production and its performance is amazing. Today, every cloud service provider is deploying these systems, which are manufactured in 45 factories around the world and support up to 200 configurations, including liquid cooling, air cooling, x86 architecture and NVIDIA Grace CPU versions.
Its core component, the NVLink system, weighs as much as 1.5 tons and contains 600,000 parts, comparable in complexity to 20 cars, all connected by 2 miles of copper wire across 5,000 cables. The manufacturing process is extremely complex, but the goal is to meet the ever-expanding demand for computing.
Compared to the previous generation architecture, Blackwell delivers a 4x improvement in performance per watt and a 3x improvement in performance per dollar. This means that the size of the trained model can be increased by 3 times at the same cost, and the key behind these improvements is the generation of AI tokens. These tokens are widely used in ChatGPT, Gemini and various AI services, and are the basis for future computing.
On this basis, NVIDIA is promoting a new computing model, neural rendering, which merges AI and computer graphics. Under the Blackwell architecture, 72 GPUs form what behaves like the world's largest single chip, delivering up to 1.4 ExaFLOPS of AI floating-point performance and an astonishing 1.2 PB/s of memory bandwidth, equivalent to the entire world's Internet traffic. This computing power lets AI handle more complex reasoning tasks while significantly reducing cost, laying the foundation for more efficient computing.
AI Agent system and ecosystem
Looking to the future, AI's reasoning process is no longer a simple single-step response, but closer to an "internal dialogue." The AI of the future will not only generate answers, but also reflect, reason, and continuously optimize. As the AI token generation rate increases and the cost decreases, AI service quality will be significantly improved to meet a wider range of application needs.
To help enterprises build AI systems capable of autonomous reasoning, NVIDIA provides three key layers of tooling: NVIDIA NeMo, NIM AI microservices, and acceleration libraries. By packaging complex CUDA software and deep-learning models into containerized services, enterprises can deploy these AI models on any cloud platform and quickly develop domain-specific AI Agents, such as service tools for enterprise management or digital employees for user interaction.
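As a concrete example of the containerized-service idea, NVIDIA's NIM microservices expose an OpenAI-compatible API. A minimal call might look like the sketch below; the endpoint URL and model name reflect NVIDIA's hosted catalog at the time of writing and should be treated as assumptions that may differ for a self-hosted deployment:

```python
# Minimal sketch of calling a containerized NVIDIA NIM microservice.
# NIM endpoints are OpenAI-compatible; base_url and model name below are
# assumptions based on NVIDIA's hosted catalog, not guaranteed values.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # or your own NIM container URL
    api_key="YOUR_NVIDIA_API_KEY",
)

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Summarize this week's support tickets."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```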
These models open up new possibilities for enterprises, lowering the barrier to developing AI applications and moving the whole industry a solid step toward agentic AI (autonomous AI). In the future, AI will become digital employees that integrate easily into enterprise tools such as SAP and ServiceNow, providing intelligent services to customers in different environments. This is the next milestone in AI's expansion and the core vision of NVIDIA's technology ecosystem.
Then there is the training and evaluation system. In the future, these AI Agents will essentially be a digital workforce working side by side with your employees to complete tasks for you. Introducing these professional agents into your company is therefore like onboarding new employees. We provide different tool libraries to help these AI Agents learn your company's unique language, vocabulary, business processes, and ways of working. You give them examples of the work products they should produce, then provide feedback, run evaluations, and so on. You also set guardrails, such as what they cannot do or say, and control what information they can access. This entire digital-workforce pipeline is called NeMo. In a sense, every company's IT department will become the HR department for AI Agents.
Today, IT departments manage and maintain large amounts of software; in the future, they will manage, train, onboard, and improve large numbers of digital agents that serve the company. The IT department will gradually evolve into an HR department for AI Agents.
In addition, we provide many open-source blueprints for the ecosystem, which users are free to modify, covering all the different types of agents. Today, we also announced something very cool and smart: a new model family based on Llama, the NVIDIA Llama Nemotron language foundation model series.
Llama 3.1 is a phenomenal model. Meta's Llama 3.1 has been downloaded roughly 650,000 times and has spawned roughly 60,000 derivative models. It is one of the core reasons almost every company and industry has started working on AI. We realized that the Llama models could be fine-tuned to serve enterprise use cases even better. Leveraging our expertise and capabilities, we fine-tuned them into the Llama Nemotron suite of open models.
The models come in different sizes: the small Nano model responds quickly; the mainstream Llama Nemotron Super is a general-purpose model; and the ultra-large Ultra model can serve as a teacher model that evaluates other models, generates answers and judges their quality, or acts as a model for knowledge distillation. All of these models are now available online.
These models perform extremely well, topping leaderboards in areas such as dialogue, instruction following, and information retrieval, making them ideal building blocks for AI Agent functions worldwide.
We also work very closely with the ecosystem: with ServiceNow, with SAP, and with Siemens on industrial AI. Companies like Cadence and Perplexity are also doing great work. Perplexity is disrupting search, and Codium serves software engineers; there are 30 million software engineers worldwide, and AI assistants will greatly improve their productivity. This is the next huge application area for AI services. There are 1 billion knowledge workers in the world, and AI Agents may be the next robotics industry, with trillion-dollar potential.
AI Agent Blueprint
Next, we show some AI Agent blueprints completed with partners.
AI Agents are the new digital workforce that can assist or replace humans in completing tasks. NVIDIA's agentic AI building blocks, NIM pre-trained models, and the NeMo framework help organizations easily develop and deploy AI Agents. These agents can be trained to become domain-specific task experts.
Here are four examples:
- Research assistant Agent: reads complex documents such as lectures, journals, and financial reports, and generates interactive podcasts for easy learning;
- Software security AI Agent: helps developers continuously scan software for vulnerabilities and prompts them to take corresponding measures;
- Virtual laboratory AI Agent: accelerates compound design and screening to quickly find potential drug candidates;
- Video analysis AI Agent: built on the NVIDIA Metropolis blueprint, analyzes data from billions of cameras, generating interactive search, summaries, and reports, for example monitoring traffic flow and facility processes and suggesting improvements.
The arrival of the physical AI era
We hope to bring AI from the cloud to every corner, including enterprises and personal PCs. NVIDIA is working to turn Windows WSL 2 (Windows Subsystem for Linux 2) into the preferred platform for AI, making it easier for developers and engineers to use NVIDIA's AI technology stack, including language models, image models, animation models, and more.
In addition, NVIDIA launched Cosmos, the first foundation-model development platform for the physical world, focused on understanding its dynamic characteristics: gravity, friction, inertia, spatial relationships, causality, and so on. It can generate videos and scenes that obey physical laws and is widely applicable to the training and validation of robots, industrial AI, and multimodal language models.
Cosmos provides physical simulation by connecting to NVIDIA Omniverse to generate realistic and credible simulation results. This combination is a core technology for the development of robotics and industrial applications.
NVIDIA's industrial strategy is based on three computing systems:
- the DGX system, for training AI;
- the AGX system, for deploying AI;
- digital twin systems, for reinforcement learning and AI optimization.
Through the collaborative work of these three systems, NVIDIA promotes the development of robots and industrial AI and builds the future digital world. Rather than saying this is a three-body problem, we have a "three-computer" solution.
NVIDIA's Robotics Vision
Let me show you three examples.
1) Industrial visualization applications
There are currently millions of factories and hundreds of thousands of warehouses around the world, forming the backbone of a $50 trillion manufacturing industry. In the future, all of this will need to be software-defined and automated, with robotics woven in. We are working with Keon, the world's leading provider of warehouse automation solutions, and Accenture, the world's largest professional services provider with a focus on digital manufacturing, to create some very special solutions. Our go-to-market approach is the same as for our other software and technology platforms: through developers and ecosystem partners, and more and more of those partners are connecting to the Omniverse platform, because everyone wants to visualize the future of industry. Within this $50 trillion of global GDP there is enormous waste, and enormous opportunity for automation.
Check out this example of Keon and Accenture working with us:
Keon (a supply chain solutions company), Accenture (a global leader in professional services), and NVIDIA are bringing physics AI to the trillion-dollar warehouse and distribution center market. Managing efficient warehouse logistics requires grappling with a complex web of decisions influenced by ever-changing variables such as daily and seasonal demand changes, space constraints, labor supply, and the integration of diverse robotics and automation systems. Today, predicting operational key performance indicators (KPIs) for a physical warehouse is nearly impossible.
To address these issues, Keon is using Mega, an NVIDIA Omniverse blueprint, to build industrial digital twins to test and optimize fleets of robots. First, Keon’s warehouse management solution assigns tasks to industrial AI brains in digital twins, such as moving goods from buffer locations to shuttle storage solutions. The robot fleet performs tasks through perception and reasoning in the physical warehouse simulation environment in Omniverse, plans its next move and takes action. The digital twin environment uses sensor simulation so that the robot brain can see the status after the task is performed and decide on the next action. With Mega's precise tracking, the entire cycle continues while measuring operational KPIs such as throughput, efficiency and utilization, all before changes are made to the physical warehouse.
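Schematically, the loop described above can be pictured as follows. The classes are illustrative stand-ins only and do not reflect the actual Mega or Omniverse APIs:

```python
# Schematic of the digital-twin loop described above: perceive -> plan ->
# act -> observe, measured against KPIs before touching the real warehouse.
# These classes are hypothetical stand-ins, not a real Mega/Omniverse API.
class WarehouseTwin:
    """Simulated warehouse: executes an action, returns simulated sensor data."""
    def step(self, action: str) -> dict:
        return {"action": action, "throughput": 0.93, "utilization": 0.81}

class RobotBrain:
    """Industrial AI brain: perceives simulated sensors, plans the next move."""
    def plan(self, observation) -> str:
        return "move pallet: buffer -> shuttle storage"

twin, brain = WarehouseTwin(), RobotBrain()
observation = None
for tick in range(3):
    action = brain.plan(observation)   # reason about the current state
    observation = twin.step(action)    # sensor simulation closes the loop
    print(f"tick {tick}: KPIs = {observation}")
```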
With NVIDIA, Keon and Accenture are redefining the future of industrial autonomy.
In the future, every factory will have a digital twin that is fully synchronized with the actual factory. You can use Omniverse and Cosmos to generate a large number of future scenarios, and AI will determine the optimal KPI scenario and use it as constraints and AI programming logic for actual factory deployment.
2) Self-driving cars
The autonomous driving revolution has arrived. After years of development, the success of both Waymo and Tesla has proven that autonomous driving technology is maturing. Our solutions provide this industry with three types of computer systems: systems for training AI (such as the DGX system), systems for simulation testing and synthetic data generation (such as Omniverse and Cosmos), and in-vehicle computer systems (such as the AGX system). Almost all major car companies in the world are working with us, including Waymo, Zoox, Tesla, and BYD, the world's largest electric vehicle company. Companies such as Mercedes, Lucid, Rivian, Xiaomi, and Volvo are about to launch innovative models, and Aurora is using NVIDIA technology to develop self-driving trucks.
100 million cars are manufactured every year, a billion cars are on the world's roads, and together they drive trillions of miles annually. All of this will gradually become highly or fully autonomous. This industry is expected to become the first robotics industry worth multiple trillions of dollars.
Today, we're announcing Thor, our next-generation in-vehicle computer. It is a general-purpose robotics computer capable of processing massive amounts of data from cameras, high-resolution radar, lidar, and other sensors. Thor is the successor to Orin, today's industry standard, with 20 times the compute, and it is now in full production. At the same time, NVIDIA's Drive OS is the first AI computer operating system certified to the highest functional-safety standard, ISO 26262 ASIL D.
Autonomous Driving Data Factory
NVIDIA uses Omniverse and the Cosmos platform to create an autonomous-driving data factory, significantly expanding training data with synthetic driving scenarios. This includes:
- OmniMap: fuses map and geospatial data to build drivable 3D environments;
- Neural reconstruction engine: uses sensor logs to generate high-fidelity 4D simulation environments and scene variants for training data;
- Edify 3DS: searches or generates assets from asset libraries to build scenes for simulation.
Through these technologies, we expand thousands of driving scenarios into billions of miles of data for the development of safer and more advanced autonomous driving systems.
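A back-of-envelope sketch shows how this multiplication works; all numbers below are illustrative assumptions, not NVIDIA figures:

```python
# Back-of-envelope illustration of the scenario expansion described above.
# Every number here is an assumption chosen for illustration.
recorded_scenarios = 5_000        # real driving logs captured by a fleet
variants_per_scenario = 200_000   # synthetic variations (weather, traffic, lighting)
miles_per_variant = 1.0           # average simulated miles per variant

synthetic_miles = recorded_scenarios * variants_per_scenario * miles_per_variant
print(f"synthetic training miles: {synthetic_miles:,.0f}")  # 1,000,000,000
```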
3) General-purpose robots
The era of general-purpose robots is coming. The key to breakthroughs in this area is training. For humanoid robots, imitation data is relatively hard to obtain, but NVIDIA's Isaac GR00T provides a solution: it generates massive datasets through simulation and combines the Omniverse and Cosmos multiverse simulation engines for policy training, validation, and deployment.
For example, with Apple Vision Pro, developers can teleoperate robots and capture demonstration data without a physical robot, teaching task motions in a risk-free environment. Through Omniverse's domain randomization and 3D-to-real scene expansion, exponentially growing datasets are generated, providing massive resources for robot learning.
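Domain randomization itself is simple to picture: each training episode draws fresh simulation parameters so that policies generalize to the real world. The parameter ranges below are illustrative assumptions:

```python
# Minimal sketch of domain randomization as described above: every episode
# trains the policy in a slightly different simulated world. The ranges are
# illustrative assumptions, not values from any NVIDIA tool.
import random

def randomized_sim_params() -> dict:
    return {
        "friction":       random.uniform(0.4, 1.2),   # floor surface variation
        "object_mass_kg": random.uniform(0.1, 3.0),   # payload variation
        "light_lux":      random.uniform(100, 2000),  # lighting variation
        "camera_jitter":  random.gauss(0.0, 0.02),    # sensor noise
    }

for episode in range(3):
    print(f"episode {episode}: {randomized_sim_params()}")
```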
In short, whether it is industrial visualization, autonomous driving, or general robots, NVIDIA's technology is leading future changes in the fields of physical AI and robotics.
Finally, I have one more important thing to show you. All of this is inseparable from a project we started inside the company ten years ago, called Project DIGITS, short for Deep Learning GPU Intelligence Training System.
Before the official launch, we condensed the name to DGX to harmonize with the company's RTX, AGX, OVX, and other product lines. The advent of DGX1 truly changed the trajectory of AI development and was a milestone in NVIDIA's own AI journey.
The DGX1 Revolution
The original intention of DGX1 was to give researchers and startups an out-of-the-box AI supercomputer. Supercomputers of the past required users to build dedicated facilities and design and construct complex infrastructure. DGX1 was a supercomputer built specifically for AI development, requiring no complicated setup and usable right out of the box.
I still remember delivering the first DGX1 in 2016 to a startup called OpenAI. Elon Musk, Ilya Sutskever, and many NVIDIA engineers were there, and we celebrated its arrival together. That machine significantly advanced AI computing.
Today, AI is everywhere, no longer confined to research institutions and startup labs. As I said at the beginning, AI has become an entirely new way of computing and of building software. Every software engineer, every creative artist, indeed everyone who uses a computer as a tool, needs an AI supercomputer. But I always wished DGX1 could be smaller.
The latest AI supercomputer
Here is NVIDIA's latest AI supercomputer. It is still part of Project Digits, and we are still looking for a better name; suggestions are welcome. It is a truly amazing piece of equipment.
This supercomputer runs NVIDIA's complete AI software stack, including DGX Cloud. It can serve as a cloud supercomputer, a high-performance workstation, or even a desktop analysis workstation. Best of all, it is based on a new chip we developed in secret, codenamed GB110, the smallest Grace Blackwell we have ever made.
I have a chip here to show you its internal design. It was developed in cooperation with MediaTek, the world's leading SoC company: a CPU SoC customized for NVIDIA, connected to the Blackwell GPU over NVLink chip-to-chip interconnect. This tiny chip is now in full production. We expect the supercomputer to be officially available around May.
We even offer a "double compute" configuration: two of these devices can be linked together via ConnectX, with GPUDirect support. It is a complete supercomputing solution for AI development, analytics, and industrial applications.
In addition, NVIDIA announced that three new Blackwell system chips are in volume production, unveiled the world's first physical-AI foundation model, and reported breakthroughs in three robotics fields: agentic AI robots, humanoid robots, and self-driving cars.