In the world of AI, the demand for local inference of large language models (LLMs) is growing. Home users and AI enthusiasts are looking for compact systems capable of running powerful models, such as quantized versions of Llama 3.1 70B, without the need for expensive and bulky GPUs. The GMK EVO-X2, which was recently showcased at AMD’s “ADVANCING AI” Summit, is designed to meet this need, packing impressive AI processing capabilities into a small form factor.

In early tests, systems built around the same Ryzen AI MAX+ 395 processor have shown promising results, achieving decent inference speeds with quantized versions of models such as DeepSeek-R1 and Llama 70B. Although benchmarks for models of this size vary, the processor’s unified memory architecture lets models that large stay resident in memory, something conventional consumer GPU setups cannot do without splitting or offloading layers.
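
For readers who want to try this kind of local inference themselves, the sketch below shows one common way to load a quantized GGUF model with the open-source llama-cpp-python bindings. The model filename, context size, and thread count are illustrative assumptions, not settings tested on the EVO-X2.

```python
# Minimal local-inference sketch using llama-cpp-python (pip install llama-cpp-python).
# The GGUF filename and tuning values below are assumptions for illustration only.
from llama_cpp import Llama

llm = Llama(
    model_path="./Meta-Llama-3.1-70B-Instruct-Q4_K_M.gguf",  # hypothetical local GGUF file
    n_ctx=4096,       # context window
    n_threads=16,     # one thread per physical Zen 5 core
    n_gpu_layers=-1,  # offload every layer to the integrated GPU (needs a Vulkan or ROCm build)
)

result = llm("Summarize why unified memory helps local LLM inference.", max_tokens=128)
print(result["choices"][0]["text"])
```

On a machine with enough unified memory, every layer can be offloaded at once, so there is no need to juggle layers between the GPU and the CPU.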

A New Approach to Local AI Inference

At the core of the GMK EVO-X2 is the AMD Ryzen AI MAX+ 395, a processor from AMD’s new Strix Halo line designed with demanding AI workloads such as LLM inference in mind. Unlike traditional desktop GPUs, which struggle with the memory needs of large models because of VRAM limits (typically 24GB on consumer cards to 48GB on workstation cards), the Ryzen AI MAX+ 395 draws on up to 128GB of unified LPDDR5X system memory, of which up to 96GB can be made available to the GPU in Windows and up to 110GB in Linux.

This large memory pool gives it a major edge for running big AI models locally: an entire model can be loaded into memory at once, with no need to split layers between VRAM and system RAM or swap data in and out of slower storage, bottlenecks that can dramatically slow inference on memory-limited GPUs.
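
A quick back-of-the-envelope calculation shows why that headroom matters. Assuming roughly 4.5 bits per weight for a Q4_K_M-style quantization (an assumption, since the article does not specify a format), a 70B model needs on the order of 40GB just for its weights:

```python
# Rough memory estimate for a quantized 70B model; all figures are illustrative.
params = 70e9              # parameter count
bits_per_weight = 4.5      # assumed Q4_K_M-style quantization
weights_gb = params * bits_per_weight / 8 / 1e9   # ~39 GB of weights
kv_cache_gb = 3            # rough allowance for KV cache and runtime buffers
total_gb = weights_gb + kv_cache_gb

print(f"Estimated footprint: ~{total_gb:.0f} GB")
# ~42 GB: too large for a 24-48 GB discrete GPU, but comfortable within
# the 96-110 GB of GPU-addressable memory described above.
```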

The Advantage of a Small Form Factor

One of the standout features of the GMK EVO-X2 is its size. Unlike bulky, power-hungry desktop towers, the EVO-X2 is a compact mini-PC that still delivers serious AI performance. Measuring just 110.19 x 107.3 x 63.2mm, it fits easily into any workspace while staying energy efficient. For home users who want to run models locally without relying on cloud computing, that combination could prove invaluable.

AMD Ryzen AI MAX+ 395 Processor: What’s Under the Hood?

The AMD Ryzen AI MAX+ 395 is built to handle demanding AI workloads. It uses a chiplet-based design that pairs Zen 5 CPU cores with an RDNA 3.5 integrated GPU, giving it genuine heterogeneous computing: general-purpose work and AI-specific tasks can each run on the hardware best suited to them. The CPU side offers 16 cores and 32 threads with a maximum boost clock of 5.1GHz, more than enough for heavily multi-threaded workloads.

Additionally, the integrated Radeon 8060S GPU, based on RDNA 3.5, provides solid graphics performance, complementing the CPU’s AI capabilities. For users working with models in the 7B to 32B range, this combination of CPU power, GPU performance, and substantial memory bandwidth makes the EVO-X2 a promising contender for local LLM inference.
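
Because token generation is largely memory-bandwidth bound, a simple ceiling estimate also gives a feel for what to expect. The bandwidth figure below (roughly 256 GB/s for Strix Halo’s LPDDR5X) and the model sizes are assumptions for illustration, not measurements from GMK or AMD.

```python
# Theoretical decode-speed ceilings for a bandwidth-bound LLM; every generated token
# must stream the full quantized weight set from memory at least once.
bandwidth_gb_s = 256  # assumed LPDDR5X bandwidth for Strix Halo, not an official figure
approx_model_sizes_gb = {"7B Q4": 4.4, "32B Q4": 19.0, "70B Q4": 40.0}  # rough GGUF sizes

for name, size_gb in approx_model_sizes_gb.items():
    ceiling = bandwidth_gb_s / size_gb
    print(f"{name}: at most ~{ceiling:.0f} tokens/sec")
```

Real-world numbers land below these ceilings, but the ordering matches the reports above: mid-size models feel responsive, while 70B-class models are usable rather than fast.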

A Compact Solution for Local AI Inference

As AI models grow larger, the need for systems that can run them efficiently, without resorting to cloud services or massive, power-hungry desktop builds, becomes more pressing. The GMK EVO-X2, powered by the Ryzen AI MAX+ 395, offers home users a compact, cost-effective way to handle large AI models locally. Its small footprint, combined with its generous memory allocation, makes it an appealing choice for anyone looking to experiment with or deploy LLMs without buying expensive discrete GPUs.

While the EVO-X2 is not yet available for purchase, it represents an exciting future for compact, powerful AI systems that can handle the demands of large-scale language models. Whether you’re an AI enthusiast, researcher, or developer, the GMK EVO-X2 could be the solution you’ve been waiting for to run your LLMs locally, efficiently, and affordably.

Stay tuned for more updates on its release and availability.