After months of speculation and anticipation, NVIDIA has finally unveiled the full specifications of its DGX Spark workstation (formerly known as Project DIGITS), aimed at AI developers and enthusiasts who want to run large language models locally. With a starting price of $2,999 (the Founders Edition is listed at $3,999), the DGX Spark promises to democratize access to high-memory compute for local AI inference.

Hardware Specifications

The DGX Spark is built around NVIDIA’s Grace Blackwell architecture, featuring:

  • 20-core ARM CPU (10 Cortex-X925 + 10 Cortex-A725)
  • Blackwell GPU architecture with 5th generation Tensor Cores
  • 128GB LPDDR5x unified system memory
  • 256-bit memory interface
  • 273 GB/s memory bandwidth
  • 1TB or 4TB NVMe M.2 storage
  • Connectivity: 4x USB4 Type-C, 10GbE Ethernet, Wi-Fi 7, Bluetooth 5.3
  • 170W power consumption
  • Compact form factor: 150mm x 150mm x 50.5mm
  • NVIDIA DGX OS (specialized Linux-based operating system)

The standout feature is the unified memory architecture, which allows a large portion of system memory to be allocated to the GPU. This is critical for running large language models, such as 70B-parameter models, that typically won’t fit in the VRAM of traditional consumer GPUs.
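
As a rough illustration of why that matters: the memory needed just to hold a model’s weights is approximately the parameter count multiplied by the bytes per parameter at a given quantization level. The sketch below is back-of-envelope arithmetic only; real runtimes also need headroom for the KV cache and activations.

    # Approximate weight-only memory footprint at common quantization levels.
    # Real deployments need extra headroom for KV cache, activations, and the OS.
    PARAMS = 70e9  # a 70B-parameter model

    for label, bytes_per_param in [("FP16", 2.0), ("Q8", 1.0), ("Q4", 0.5)]:
        gigabytes = PARAMS * bytes_per_param / 1e9
        print(f"{label}: ~{gigabytes:.0f} GB of weights")

    # FP16: ~140 GB  -> does not fit in 128GB of unified memory
    # Q8:   ~70 GB   -> fits, with room to spare for the KV cache
    # Q4:   ~35 GB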

Competing Solutions

AMD Ryzen AI MAX+ 395 (Strix Halo)

While NVIDIA was developing the DGX Spark, AMD wasn’t sitting idle. The Ryzen AI MAX+ 395 (Strix Halo) offers a compelling alternative:

  • 16 cores/32 threads (Zen 5 architecture)
  • Ability to allocate up to 96GB of system memory to the GPU in Windows, 110GB in Linux
  • Memory bandwidth around 256 GB/s
  • Runs standard Windows and Linux distributions
  • Mini-PC implementations expected to cost less than $2,000
  • Framework Desktop implementation at $2,100 for the 128GB configuration

Apple M4 Max

Apple’s offerings also deserve consideration:

  • Mac Studio with M4 Max and 128GB RAM
  • 546 GB/s memory bandwidth (double that of the DGX Spark)
  • Starting price around $4,999 with education discount

Performance and Value Analysis

The key metric for LLM inference is tokens per second (TPS), which is heavily influenced by memory bandwidth. Early benchmark data suggests that systems with specifications similar to the DGX Spark’s can run DeepSeek-R1-Distill-Llama-70B at Q8 quantization at approximately 3 tokens per second.

What’s puzzling is the DGX Spark’s relatively modest memory bandwidth of 273 GB/s. That places it only marginally above the AMD Ryzen AI MAX+ 395 (256 GB/s) while costing significantly more, and it falls well short of Apple’s M4 Max, which offers double the bandwidth at 546 GB/s.

For perspective, running a 70B-parameter model efficiently requires substantial memory bandwidth. The DGX Spark’s 273 GB/s, while respectable, becomes the binding bottleneck for these larger models, especially when Apple has demonstrated that far higher bandwidth is achievable in a similar form factor.
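
A simple sanity check makes the bottleneck concrete: during single-stream token generation, essentially all of the model’s weights must be read from memory for every token, so the theoretical throughput ceiling is roughly memory bandwidth divided by model size. The sketch below applies that idealized formula; real-world throughput lands below it.

    # Idealized decode ceiling: each generated token reads ~all weights once,
    # so max tokens/s ~= memory bandwidth / model size. Reality is lower.
    MODEL_GB = 70  # 70B-parameter model at Q8 (~1 byte per parameter)

    for system, bandwidth_gbps in [("DGX Spark", 273),
                                   ("Ryzen AI MAX+ 395", 256),
                                   ("Apple M4 Max", 546)]:
        print(f"{system}: <= {bandwidth_gbps / MODEL_GB:.1f} tokens/s")

    # DGX Spark:         <= 3.9 tokens/s  (consistent with the ~3 TPS benchmark)
    # Ryzen AI MAX+ 395: <= 3.7 tokens/s
    # Apple M4 Max:      <= 7.8 tokens/s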

Use Cases and Limitations

NVIDIA positions the DGX Spark as a device for prototyping, fine-tuning, and inference of AI models. However, the memory bandwidth constraint suggests that while it can technically load larger models, inference speeds for 70B+ models will be relatively slow.

Where the DGX Spark might excel is in running smaller models (7B-32B parameters), where the bandwidth limitation is far less apparent. The device also offers networking through a ConnectX-7 SmartNIC, allowing users to link multiple units together for distributed computing.
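
Linking units roughly multiplies the pool of weight memory, at the cost of shuttling activations over the ConnectX-7 link. The arithmetic below is a hedged sketch of aggregate capacity only; it optimistically assumes all 128GB per unit is usable for weights, which the OS and KV cache will eat into.

    # Back-of-envelope aggregate capacity across linked units.
    # Optimistic: ignores OS overhead, KV cache, and activation memory.
    UNIT_MEMORY_GB = 128

    def max_params_billions(units: int, bytes_per_param: float) -> float:
        return units * UNIT_MEMORY_GB / bytes_per_param

    for units in (1, 2):
        print(f"{units} unit(s): ~{max_params_billions(units, 1.0):.0f}B params at Q8, "
              f"~{max_params_billions(units, 0.5):.0f}B at Q4")

    # 1 unit(s): ~128B params at Q8, ~256B at Q4
    # 2 unit(s): ~256B params at Q8, ~512B at Q4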

One significant limitation of the DGX Spark is its specialized NVIDIA DGX OS, which restricts its versatility compared to the AMD solution, which supports standard Windows and Linux distributions. This means the AMD option can serve as both an AI workstation and a general-purpose computer, including for gaming.

Community Reaction

The reaction to the DGX Spark’s specifications has been mixed, with many enthusiasts expressing disappointment at the memory bandwidth figures. Comments on hardware forums show that many potential buyers who had been waiting for the DGX Spark are now reconsidering their options.

As one commenter pointed out, “For $3000, it needs to be around 500 GB/sec,” reflecting the general sentiment that the price-to-performance ratio doesn’t quite hit the mark. Others noted that “AMD ended up being the dark horse with a similar product for 50% less.”

The Apple comparison is particularly unfavorable for NVIDIA, with multiple commenters observing that a Mac Studio with M4 Max and 128GB RAM offers twice the memory bandwidth for only slightly more money, while also functioning as a complete workstation.

Analysis and Verdict

The DGX Spark finds itself in an awkward middle ground. It’s more expensive than AMD’s solution while offering only marginally better memory bandwidth, and it’s significantly outclassed by Apple’s offerings in terms of raw bandwidth while being limited to a specialized operating system.

For enthusiasts looking to run local LLM inference, particularly with models in the 70B+ range, the AMD Ryzen AI MAX+ 395 currently offers the better value proposition. It provides nearly the same memory bandwidth at a substantially lower price point while maintaining the flexibility of running standard operating systems.

If budget is less of a concern and maximum performance is the goal, Apple’s M4 Max machines, despite their higher price tags, offer substantially higher memory bandwidth, which translates directly into faster inference speeds.

The DGX Spark does have some unique advantages, including NVIDIA’s mature CUDA ecosystem and potentially better optimization for certain AI workloads. However, these benefits need to be weighed against the significant price premium and the limitations of being locked into NVIDIA’s ecosystem.

In the rapidly evolving landscape of local LLM inference, the DGX Spark represents an important step toward more accessible AI computing, but it doesn’t quite deliver the breakthrough many were hoping for in terms of price-to-performance ratio.