Vera Arrives: NVIDIA’s Agent-First CPU Signals Shift to Autonomous Reasoning
NVIDIA's new Vera CPU is designed for agentic AI workloads, with the company claiming faster agent execution and lower cost per token for autonomous reasoning systems at top labs such as OpenAI and Anthropic.
The dawn of agentic AI—systems capable of reasoning, planning, and executing complex multi-step tasks independently—has arrived, and it requires a fundamentally different computing architecture. NVIDIA has officially entered this space with the release of 'Vera,' the company's first CPU specifically designed to handle the unique demands of high-level AI agents. This week, the first units reached the world's most prominent AI research facilities, including OpenAI, Anthropic, and SpaceXAI.
Traditional CPUs often struggle with the latency requirements of agentic inference, and Vera is built to close that gap. According to NVIDIA CEO Jensen Huang, demand for this specialized silicon is "parabolic." NVIDIA says that in the Vera Rubin NVL72 rack system, agent sandboxes run up to 50% faster than on traditional general-purpose CPUs. Perhaps more important for the realization of Physical AI, the company says the architecture cuts the cost per token to one-tenth that of previous generations, making it economically viable to deploy these 'thinking' systems in real-world environments.
The shift from Large Language Models (LLMs) to Large Agentic Models (LAMs) represents the next frontier of Physical AI. While LLMs process text, agents interact with environments, manipulate software tools, and potentially control physical hardware. By optimizing for agentic inference, NVIDIA is providing the 'brain' necessary for robots and autonomous systems to operate with a level of situational awareness that was previously computationally prohibitive.
Source: NVIDIA Blog