Recent MSI listings have reignited speculation about a possible 24GB variant of the NVIDIA GeForce RTX 5080. The RTX 5080 launched with 16GB of GDDR7 memory and was positioned as a high-performance gaming GPU. However, leaked product listings, including a now-infamous MSI promotional video and subsequent motherboard compatibility lists, have hinted at an alternative 24GB SKU. If true, this would have significant implications for both gaming and professional workloads – especially for home users running large language models (LLMs) like QwQ 32B.
The Leak and Speculation
The controversy began when MSI inadvertently showcased an RTX 5080 box featuring a 24GB memory specification. Initially dismissed as an error, the same “mistake” has since reappeared on multiple occasions, including in MSI's motherboard compatibility lists. NVIDIA has not confirmed anything, but the persistence of this information has fueled speculation that the company might be planning a mid-cycle refresh (e.g., RTX 5080 SUPER/Ti) featuring increased VRAM.
Adding credibility to this possibility, NVIDIA has already used 3GB GDDR7 memory modules in its RTX 5090 Laptop GPU. The RTX 5080 features a 256-bit memory interface, which corresponds to eight 32-bit memory modules: eight 2GB modules give the current 16GB, and swapping them for 3GB modules would yield 24GB without altering the memory bus width. If this SKU is real and positioned at a good price, it could introduce a compelling new option for LLM enthusiasts.
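The capacity arithmetic behind that reasoning can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes a standard (non-clamshell) layout in which each GDDR7 module occupies one 32-bit slice of the memory bus.

```python
def vram_capacity_gb(bus_width_bits: int, module_density_gb: int,
                     bits_per_module: int = 32) -> int:
    """Return total VRAM in GB for a non-clamshell memory layout.

    Assumption: each memory module occupies one 32-bit slice of the bus.
    """
    if bus_width_bits % bits_per_module != 0:
        raise ValueError("bus width must be a multiple of the per-module width")
    module_count = bus_width_bits // bits_per_module
    return module_count * module_density_gb


print(vram_capacity_gb(256, 2))  # 8 modules x 2GB = 16
print(vram_capacity_gb(256, 3))  # 8 modules x 3GB = 24
```

The bus width, and with it the number of modules, stays the same in both cases; only the per-module density changes.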
Performance Considerations for LLM Inference
For users interested in running local LLMs, the VRAM bump to 24GB would be highly beneficial. Models like QwQ 32B typically need a 24GB card to run at 4-bit quantization with a usable context window, which puts them out of reach of the 16GB RTX 5080. Because token generation is largely limited by memory bandwidth, inference speed would likely be on par with the older RTX 3090 (960 GB/s vs. 936 GB/s for the 3090), but the RTX 5080’s more advanced architecture would bring additional efficiency gains.
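To see why similar bandwidth implies similar generation speed, here is a rough back-of-the-envelope sketch. It rests on a common rule of thumb rather than a benchmark: each generated token requires reading all model weights once, so bandwidth divided by weight size gives an upper bound on tokens per second. The function names are hypothetical, the parameter count is rounded to 32 billion, and the 4.5 bits per weight figure is an assumed effective size for a 4-bit quantization format.

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB."""
    return params_billion * bits_per_weight / 8


def decode_ceiling_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on generation speed if every token reads all weights once.

    Ignores KV-cache reads, compute time, and framework overhead, so real
    throughput will be lower.
    """
    return bandwidth_gb_s / weights_gb


# Assumption: ~4.5 effective bits per weight for a 4-bit quantized 32B model.
weights = weight_size_gb(32, 4.5)
print(f"Weights: {weights:.1f} GB")

for name, bandwidth in (("RTX 5080", 960), ("RTX 3090", 936)):
    ceiling = decode_ceiling_tokens_per_s(bandwidth, weights)
    print(f"{name}: {ceiling:.1f} tokens/s ceiling")
```

Under these assumptions the two ceilings differ by less than 3%, which is why the bandwidth figures alone suggest near-identical generation speed.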
For comparison, while the RTX 3090 remains an attractive budget choice for LLM inference, it relies on the older Ampere architecture. In contrast, the RTX 5080 utilizes the new Blackwell architecture – the same as NVIDIA’s RTX PRO series – making it significantly faster in other workloads, including gaming and general productivity tasks.
For home users looking for an affordable 48GB VRAM setup for LLM inference, a pair of RTX 5080 24GB cards (if real and available at MSRP) would present a compelling alternative to second-hand enterprise GPUs:
GPU | VRAM | Price | Notes |
---|---|---|---|
2x P40 | 48GB | $800 | Passive cooling, requires custom solution for desktops, slow bandwidth (340 GB/s) |
2x RTX 3090 | 48GB | $1600 | Second-hand |
2x RTX 5080 24GB | 48GB | Unknown | Rumored SKU; assumes availability at MSRP |
2x A10 | 48GB | $3600 | Passive; second-hand |
RTX 4090 | 48GB | $3400 | Modded Chinese model, water-cooled |
2x L4 | 48GB | $4800 | Passive; second-hand, slow bandwidth (300 GB/s) |
L40 | 48GB | $5000 | Passive; second-hand |
2x Tesla V100 | 64GB | $4000 | Passive cooling, requires additional work for desktop; second-hand |
A40 | 48GB | $6800 | Passive; second-hand |
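
One way to compare the confirmed entries in the table is cost per gigabyte of VRAM. The short sketch below uses only the prices and capacities listed above and leaves out the rumored RTX 5080 24GB, since it has no price. It ignores bandwidth, cooling, and power, all of which matter in practice.

```python
# (setup, total VRAM in GB, approximate price in USD), taken from the table above
setups = [
    ("2x P40", 48, 800),
    ("2x RTX 3090", 48, 1600),
    ("2x A10", 48, 3600),
    ("RTX 4090 (modded)", 48, 3400),
    ("2x L4", 48, 4800),
    ("L40", 48, 5000),
    ("2x Tesla V100", 64, 4000),
    ("A40", 48, 6800),
]


def price_per_gb(vram_gb: int, price_usd: int) -> float:
    """Cost in USD for each GB of VRAM."""
    return price_usd / vram_gb


# Cheapest VRAM first
ranked = sorted(setups, key=lambda s: price_per_gb(s[1], s[2]))

for name, vram_gb, price_usd in ranked:
    print(f"{name:<18} ${price_per_gb(vram_gb, price_usd):7.2f} per GB")
```

By this measure the dual P40 and dual RTX 3090 setups are the cheapest per gigabyte, so they are the options a 24GB RTX 5080 would have to compete with on price.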
Dual-GPU Configurations: 48GB for 70B Models?
A particularly interesting scenario emerges when considering a dual-GPU setup. If an RTX 5080 24GB model is available at MSRP in the future, two such GPUs would offer a combined 48GB of VRAM – enough to run quantized 70B models. That would put the pair in direct competition with the heavily modded Chinese RTX 4090 variants (48GB), which currently retail for around $3,400. A dual RTX 5080 24GB setup could provide a significantly better price-to-performance ratio.
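The two claims here, that a 70B model fits in 48GB and that the pair could undercut the modded RTX 4090, can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 4.5 bits per weight value is an assumed effective size for a 4-bit quantization, and the fit check counts weights only, so whatever is left over must cover the KV cache and runtime overhead.

```python
def vram_headroom_gb(params_billion: float, bits_per_weight: float,
                     total_vram_gb: float) -> float:
    """VRAM left after loading the weights (negative means they do not fit)."""
    weights_gb = params_billion * bits_per_weight / 8
    return total_vram_gb - weights_gb


def break_even_price_per_card(competitor_price_usd: float, card_count: int) -> float:
    """Per-card price at which the multi-GPU setup costs the same as the competitor."""
    return competitor_price_usd / card_count


# Assumption: ~4.5 effective bits per weight for a 4-bit quantized 70B model.
print(f"4-bit headroom: {vram_headroom_gb(70, 4.5, 48):.3f} GB")
print(f"8-bit headroom: {vram_headroom_gb(70, 8, 48):.3f} GB")

# The modded 48GB RTX 4090 is quoted at around $3,400 in the text.
print(f"Break-even per card: ${break_even_price_per_card(3400, 2):.0f}")
```

Under these assumptions a 4-bit 70B model leaves a few gigabytes free across the two cards, while an 8-bit version does not fit at all. On price, each RTX 5080 24GB would need to cost less than about $1,700 for the pair to undercut the modded card.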
Conclusion: A Possible Value King for LLM Enthusiasts?
While a second-hand RTX 3090 continues to hold its value for AI workloads, an RTX 5080 with 24GB VRAM at MSRP could be a great option. It would not only offer a more power-efficient and feature-rich alternative but also enable cost-effective scaling for 70B parameter models in multi-GPU setups.
As of now, there is no official confirmation from NVIDIA or MSI, leaving the speculation unresolved. Enthusiasm for a potential 24GB variant continues to grow, tempered by questions about pricing, availability, and its place in NVIDIA’s Blackwell lineup. With CES 2025 showcasing the initial RTX 50-series cards, the industry is watching closely to see whether this rumored upgrade becomes reality or remains an intriguing “what if” in the world of tech.