NVIDIA's RTX Spark: The True AI PC Has Arrived

June 4, 2026

For decades, the Windows PC ecosystem has been defined by a relatively stable division of labor: Microsoft on software, Intel/AMD on silicon, and NVIDIA on graphics. That equilibrium may have just shifted.

At Jensen Huang's COMPUTEX 2026 keynote, NVIDIA announced RTX Spark — a processor built specifically to bring true AI PC capability to Windows machines. And for the first time, Windows users have a credible path to running powerful local AI agents without relying on the cloud.

What Is RTX Spark?

RTX Spark is NVIDIA's answer to a question the industry has been circling for two years: what should an AI PC actually be?

The chip is a custom silicon package combining:

  • Blackwell RTX GPU with 1 petaflop of FP4 AI compute
  • 20-core Grace CPU (built in partnership with MediaTek)
  • 128 GB unified memory with 600 GB/s NVLink C2C bandwidth
  • Full NVIDIA software stack: CUDA, TensorRT, NVFP4, DLSS, Reflex, G-SYNC

Laptops built around RTX Spark can be as thin as 14 mm and weigh about 3 lbs, in 14–16 inch sizes. That form factor was previously out of reach for this class of AI hardware. The display is a tandem OLED with color accuracy suited to creative work and NVIDIA G-SYNC for gaming.

RTX Spark Specs (laptop config):
- FP4 AI Performance: 1 petaflop
- CPU: 20-core Grace
- Unified Memory: 128 GB
- Memory Bandwidth: 600 GB/s (NVLink C2C)
- Thickness: 14 mm
- Weight: ~3 lbs

Running Local LLMs on RTX Spark

Jensen's demo made the use case concrete: given a site, a sketch, a style reference, and a list of requirements, an AI agent running on RTX Spark invoked Rhino (the 3D modeling application) to generate architectural layouts, then imported them into Blender and used Flux 2 for multi-angle renders. The user could modify the output at any step.

Adobe Photoshop and Premiere are already being optimized for RTX Spark and integrated via MCP into local AI agent workflows.
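
To make "integrated via MCP" a little more concrete: MCP (Model Context Protocol) messages are JSON-RPC 2.0, and an agent asks a server to run a tool with a `tools/call` request. The sketch below only builds such a message. The tool name and arguments in the usage note are hypothetical, since the post does not say which tools the Adobe integrations expose.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC uses the id to pair a response with its request.
_request_ids = count(1)

def mcp_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })
```

An agent might call something like `mcp_tool_call("apply_filter", {"layer": "Background"})` and send the resulting string over whatever transport the MCP server uses; `apply_filter` is an invented name used only for illustration.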

In terms of LLM support, RTX Spark can run:

  • Nemotron 3 Ultra (NVIDIA's own open model, announced at the same keynote)
  • Local models via Ollama or LM Studio using the CUDA/TensorRT stack
  • Cloud-hosted models as a fallback when a task needs more memory than the machine has
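
The local-versus-cloud decision in the list above comes down to arithmetic on model size. The following is a minimal sketch, not NVIDIA's routing logic: it estimates the weight footprint from parameter count and precision, compares it with the 128 GB of unified memory, and, if the model fits, sends a request to a local Ollama server through its `/api/generate` endpoint. The 24 GB headroom figure and the function names are assumptions chosen for illustration.

```python
import json
import urllib.request

UNIFIED_MEMORY_GB = 128  # from the spec list in this post
HEADROOM_GB = 24         # assumption: reserved for OS, KV cache, activations

def weights_gb(params_billions: float, bits_per_weight: int = 4) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8

def choose_backend(params_billions: float, bits_per_weight: int = 4) -> str:
    """Return "local" if the weights fit in the memory budget, else "cloud"."""
    budget_gb = UNIFIED_MEMORY_GB - HEADROOM_GB
    fits = weights_gb(params_billions, bits_per_weight) <= budget_gb
    return "local" if fits else "cloud"

def generate_local(model: str, prompt: str,
                   host: str = "http://localhost:11434") -> str:
    """Send one non-streaming generation request to a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Under these assumptions a 70B model at 4 bits needs about 35 GB and stays local, while the same model at 16 bits needs about 140 GB and would be routed to the cloud.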

This is a meaningful shift away from the Mac-centric narrative that has dominated local AI computing. If you need serious GPU compute and large memory for local model serving, RTX Spark gives Windows users a legitimate alternative to Apple Silicon.

Three Product Forms

NVIDIA showed RTX Spark across three form factors:

| Form Factor | Target User | Key Feature |
|---|---|---|
| Laptop | Mobile professional, developer, gamer | Portable 1-petaflop AI compute |
| Desktop | Home AI hub | 24/7 agent operation, connects peripherals |
| Workstation | Model developer, agent builder | DGX Station for Windows: 748 GB RAM, 20 petaflops, runs trillion-parameter models locally |

The Broader Picture: Agentic AI on the Desktop

RTX Spark is part of NVIDIA's larger vision for the "AI PC era." Jensen drew a direct parallel to the smartphone — ten years from now, he argued, a PC will be as fundamentally different from today's machines as an iPhone is from a Nokia.

The shift he's describing:

Old model: Human opens app → clicks → types → gets result

New model: AI agent receives goal → understands intent → plans → calls tools → retrieves context → executes → saves memory

In this framework, the PC becomes a personal AI supercomputer. It can run agents 24/7, connect to cameras and home devices, handle model development locally, and still run traditional Windows applications.
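
The "new model" flow can be sketched as a small loop. This is a toy illustration rather than any shipping agent framework: the `Agent` class, its string-splitting planner, and the two example tools are all hypothetical, and a real agent would ask an LLM to do the planning step.

```python
from dataclasses import dataclass, field
from typing import Callable

# A tool takes an argument and the recent context, and returns a result string.
Tool = Callable[[str, list[str]], str]

@dataclass
class Agent:
    tools: dict[str, Tool]
    memory: list[str] = field(default_factory=list)

    def plan(self, goal: str) -> list[tuple[str, str]]:
        # Hypothetical planner: each "tool: argument" clause becomes one step.
        steps = []
        for clause in goal.split(";"):
            name, _, arg = clause.partition(":")
            steps.append((name.strip(), arg.strip()))
        return steps

    def run(self, goal: str) -> list[str]:
        results = []
        for tool_name, arg in self.plan(goal):          # plan
            context = self.memory[-3:]                  # retrieve context
            output = self.tools[tool_name](arg, context)  # call tool, execute
            results.append(output)
            self.memory.append(f"{tool_name}({arg}) -> {output}")  # save memory
        return results

# Usage with two stand-in tools.
agent = Agent(tools={
    "upper": lambda arg, ctx: arg.upper(),
    "count": lambda arg, ctx: str(len(ctx)),
})
print(agent.run("upper: hello; count: x"))  # prints ['HELLO', '1']
```

The second tool sees one memory entry because the first step has already been recorded, which is the point of the "saves memory" stage: later steps can build on earlier ones.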

Nemotron 3 Ultra: NVIDIA's Open Agent Model

Also announced at COMPUTEX: Nemotron 3 Ultra, NVIDIA's new open-weight model for agentic workflows.

Nemotron 3 Ultra uses an SSM (state-space model) + MoE hybrid architecture. NVIDIA claims it's 5× faster and 30% cheaper to run than comparable open models like Kimi K2.6, Qwen 3.5, and GLM 5.1 on agentic tasks. The model, training scripts, and training data will all be released for enterprise fine-tuning.

This gives enterprises a credible path to building proprietary agents without depending on OpenAI or Anthropic APIs.

How RTX Spark Compares to Apple Silicon for Local AI

For developers running local LLMs today, the comparison is inevitable:

| | RTX Spark (laptop) | Apple Silicon (M4 Max) |
|---|---|---|
| AI Performance | 1 PFLOPS (FP4) | ~40-50 TOPS (Neural Engine) |
| Unified Memory | 128 GB | Up to 128 GB |
| Memory Bandwidth | 600 GB/s | ~500 GB/s |
| CUDA Support | Native | Not available (Metal / MLX instead) |
| Form Factor | 14 mm laptop | 14" or 16" MacBook Pro |
| Agentic OS Integration | Windows + NVIDIA stack | macOS + Apple Intelligence |
| Local Model Options | Ollama, LM Studio, TensorRT | Ollama, LM Studio, MLX-native |

Apple Silicon still leads on efficiency and developer experience for local AI. But RTX Spark matches it on memory capacity, edges ahead on memory bandwidth, and introduces native CUDA support, which remains the standard for production AI deployments.
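
Memory bandwidth matters for local LLMs because token generation is usually bandwidth-bound: producing each token requires reading the model weights once, so bandwidth divided by weight size gives a rough ceiling on tokens per second. The sketch below applies that rule of thumb to the bandwidth figures quoted in this post. It assumes a dense model at batch size 1 and ignores KV-cache traffic and compute limits, so treat the output as an upper bound, not a benchmark.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float,
                            params_billions: float,
                            bits_per_weight: int = 4) -> float:
    """Upper bound on decode speed when every weight is read once per token."""
    weights_gb = params_billions * bits_per_weight / 8
    return bandwidth_gb_s / weights_gb

# A 70B dense model quantized to 4 bits is about 35 GB of weights.
for name, bandwidth in [("RTX Spark", 600), ("M4 Max", 500)]:
    ceiling = max_decode_tokens_per_s(bandwidth, 70)
    print(f"{name}: at most ~{ceiling:.1f} tokens/s")
```

With these inputs the ceilings come out to roughly 17 and 14 tokens per second. A mixture-of-experts model reads only its active experts per token, so its ceiling would be higher than this dense-model estimate.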

Should You Wait for an RTX Spark PC?

If you're already deep in the local LLM ecosystem with a capable GPU (RTX 4090, etc.), RTX Spark may not dramatically change your setup. Ollama and LM Studio work well today.

But if you're in the market for a new machine, want the highest possible local AI throughput, need to run large models (70B+), or are building agentic workflows that run 24/7, RTX Spark is worth waiting for.

The era of the true AI PC is arriving. And for the first time, Windows users have a seat at the table.