NVIDIA's RTX Spark: The True AI PC Has Arrived
June 4, 2026
For decades, the Windows PC ecosystem has been defined by a relatively stable division of labor: Microsoft on software, Intel/AMD on silicon, and NVIDIA on graphics. That equilibrium may have just shifted.
At Jensen Huang's COMPUTEX 2026 keynote, NVIDIA announced RTX Spark — a processor built specifically to bring true AI PC capability to Windows machines. And for the first time, Windows users have a credible path to running powerful local AI agents without relying on the cloud.
What Is RTX Spark?
RTX Spark is NVIDIA's answer to a question the industry has been circling for two years: what should an AI PC actually be?
The chip is a custom silicon package combining:
- Blackwell RTX GPU with 1 petaflop of FP4 AI compute
- 20-core Grace CPU (built in partnership with MediaTek)
- 128 GB unified memory with 600 GB/s NVLink C2C bandwidth
- Full NVIDIA software stack: CUDA, TensorRT, NVFP4, DLSS, Reflex, G-SYNC
Laptops built around the chip measure 14 mm thick and weigh ~3 lbs in 14–16 inch sizes, a form factor that was previously unthinkable for this class of AI hardware. The display is a tandem OLED, color-accurate enough for creative work, with NVIDIA G-SYNC for gaming.
RTX Spark Specs (laptop config):
- FP4 AI Performance: 1 petaflop
- CPU: 20-core Grace
- Unified Memory: 128 GB
- Memory Bandwidth: 600 GB/s (NVLink C2C)
- Thickness: 14 mm
- Weight: ~3 lbs
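To put the memory figures in perspective, here is a rough back-of-envelope sketch in Python. It is not an NVIDIA benchmark. It assumes a dense model with 4-bit weights, ignores KV cache and activations, and treats decoding as purely bandwidth-bound (every weight read once per token), so the token rate is an optimistic ceiling rather than a prediction. The function names are made up for illustration.

```python
# Back-of-envelope sizing. Every formula here is a simplifying assumption.

GB = 1e9  # decimal gigabytes, matching how the specs are quoted


def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone (no KV cache, no activations)."""
    return params_billion * 1e9 * bits_per_weight / 8 / GB


def decode_tokens_per_sec_bound(footprint_gb: float, bandwidth_gb_s: float) -> float:
    """Ceiling for a dense model: each generated token reads every weight once."""
    return bandwidth_gb_s / footprint_gb


UNIFIED_MEMORY_GB = 128  # from the spec list
BANDWIDTH_GB_S = 600     # from the spec list

for params_b in (8, 70, 120):
    size = weight_footprint_gb(params_b, bits_per_weight=4)
    fits = size < UNIFIED_MEMORY_GB
    bound = decode_tokens_per_sec_bound(size, BANDWIDTH_GB_S)
    print(f"{params_b}B @ 4-bit: {size:.0f} GB, fits={fits}, <= {bound:.0f} tok/s")
```

Under these assumptions a 70B model needs about 35 GB for weights and tops out near 17 tokens per second. Mixture-of-experts models read only a fraction of their weights per token, so this bound does not apply to them directly.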
Running Local LLMs on RTX Spark
Jensen's demo made the use case concrete: given a site, a sketch, a style reference, and a list of requirements, an AI agent running on RTX Spark invoked Rhino to generate architectural layouts, then imported them into Blender, with Flux 2 producing multi-angle renders. The user could modify the output at any step.
Adobe Photoshop and Premiere are already being optimized for RTX Spark and integrated via MCP into local AI agent workflows.
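For readers unfamiliar with MCP (Model Context Protocol): it is built on JSON-RPC 2.0, and an agent invokes a tool by sending a `tools/call` request. The sketch below shows the shape of such a message. The tool name and arguments are hypothetical; the keynote did not say which tools the Adobe integrations expose.

```python
import json
from typing import Any, Dict


def make_tool_call(request_id: int, tool_name: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and arguments, for illustration only.
request = make_tool_call(1, "apply_filter", {"layer": "Background", "filter": "sharpen"})
print(request)
```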
In terms of LLM support, RTX Spark can run:
- Nemotron 3 Ultra (NVIDIA's own open model, announced at the same keynote)
- Local models via Ollama or LM Studio using the CUDA/TensorRT stack
- Cloud models when GPU memory is insufficient for the task at hand
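As a concrete example of the local path, here is a minimal sketch of calling a locally running Ollama server from Python through its `/api/generate` endpoint on the default port 11434. Nothing in it is specific to RTX Spark, and the helper names are illustrative. It assumes the server is already running and the requested model has been pulled.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_generate_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Calling `generate("llama3.1", "Summarize this floor plan brief.")` would return the model's reply as a string, where `llama3.1` stands in for whatever model tag you have pulled locally.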
This broadens the Mac-centric narrative that has dominated local AI computing. If you need serious GPU compute and large memory for local model serving, RTX Spark gives Windows users a legitimate alternative to Apple Silicon.
Three Product Forms
NVIDIA showed RTX Spark across three form factors:
| Form Factor | Target User | Key Feature |
|---|---|---|
| Laptop | Mobile professional, developer, gamer | Portable 1-petaflop AI compute |
| Desktop | Home AI hub | 24/7 agent operation, connects peripherals |
| Workstation | Model developer, agent builder | DGX Station for Windows: 748 GB RAM, 20 petaflops, runs trillion-parameter models locally |
The Broader Picture: Agentic AI on the Desktop
RTX Spark is part of NVIDIA's larger vision for the "AI PC era." Jensen drew a direct parallel to the smartphone — ten years from now, he argued, a PC will be as fundamentally different from today's machines as an iPhone is from a Nokia.
The shift he's describing:
Old model: Human opens app → clicks → types → gets result
New model: AI agent receives goal → understands intent → plans → calls tools → retrieves context → executes → saves memory
In this framework, the PC becomes a personal AI supercomputer. It can run agents 24/7, connect to cameras and home devices, handle model development locally, and still run traditional Windows applications.
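The agent flow described above can be sketched as a small loop. This is a toy illustration, not NVIDIA's agent runtime: the planner is a placeholder where a real agent would call an LLM, retrieval is a naive word-overlap match, and the `Agent` class and its method names are invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A tool takes the task input plus retrieved context and returns a result string.
Tool = Callable[[str, List[str]], str]


@dataclass
class Agent:
    tools: Dict[str, Tool]
    memory: List[str] = field(default_factory=list)

    def plan(self, goal: str) -> List[Tuple[str, str]]:
        # Placeholder planner: run every tool on the goal, in order.
        # A real agent would ask a model to choose and order the steps.
        return [(name, goal) for name in self.tools]

    def retrieve_context(self, goal: str) -> List[str]:
        # Naive retrieval: memory entries sharing at least one word with the goal.
        words = set(goal.lower().split())
        return [m for m in self.memory if words & set(m.lower().split())]

    def run(self, goal: str) -> List[str]:
        context = self.retrieve_context(goal)             # retrieve context
        results = [self.tools[name](task, context)        # call tools / execute
                   for name, task in self.plan(goal)]     # plan
        self.memory.append(f"{goal} -> {results}")        # save memory
        return results


agent = Agent(tools={"layout": lambda task, ctx: f"layout for: {task}"})
print(agent.run("two story house"))
```

The point of the structure is that the human supplies only the goal. Planning, tool calls, and memory all happen inside `run`.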
Nemotron 3 Ultra: NVIDIA's Open Agent Model
Also announced at COMPUTEX: Nemotron 3 Ultra, NVIDIA's new open-weight model for agentic workflows.
Nemotron 3 Ultra uses an SSM (state-space model) + MoE hybrid architecture. NVIDIA claims it's 5× faster and 30% cheaper to run than comparable open models like Kimi K2.6, Qwen 3.5, and GLM 5.1 on agentic tasks. The model, training scripts, and training data will all be released for enterprise fine-tuning.
This gives enterprises a credible path to building proprietary agents without depending on OpenAI or Anthropic APIs.
How RTX Spark Compares to Apple Silicon for Local AI
For developers running local LLMs today, the comparison is inevitable:
| | RTX Spark (laptop) | Apple Silicon (M4 Max) |
|---|---|---|
| AI Performance | 1 PFLOPS (FP4) | ~40-50 TOPS (Neural Engine) |
| Unified Memory | 128 GB | Up to 128 GB |
| Memory Bandwidth | 600 GB/s | ~500 GB/s |
| CUDA Support | Native | None (Metal / MLX instead) |
| Form Factor | 14 mm laptop | 14" or 16" MacBook Pro |
| Agentic OS Integration | Windows + NVIDIA stack | macOS + Apple Intelligence |
| Local Model Options | Ollama, LM Studio, TensorRT | Ollama, LM Studio, MLX-native |
Apple Silicon still leads on efficiency and developer experience for local AI. But RTX Spark matches it on memory capacity at 128 GB, edges ahead on memory bandwidth, and introduces native CUDA support, which remains the standard for production AI deployments.
Should You Wait for an RTX Spark PC?
If you're already deep in the local LLM ecosystem with a capable GPU (RTX 4090, etc.), RTX Spark may not dramatically change your setup. Ollama and LM Studio work well today.
But if you're in the market for a new machine, want the highest possible local AI throughput, need to run large models (70B+), or are building agentic workflows that run 24/7, RTX Spark is worth waiting for.
The era of the true AI PC is arriving. And for the first time, Windows users have a seat at the table.