How to Increase the VRAM of Your Apple Silicon Mac for LLMs
It is surprisingly straightforward to increase the VRAM of an Apple Silicon (M1/M2/M3) Mac and use it to load large language models. These chips use unified memory, so "VRAM" here means the share of system RAM that the GPU is allowed to use. Here’s the rundown of my experiments.
For Macs with 32 GB of system RAM
I recently experimented with a 32 GB MacBook Pro equipped with an M1 Pro chip, attempting to load a 24 GB Mixtral model, specifically mixtral-8x7b-instruct-v0.1.Q4_0.gguf. Under normal circumstances this is infeasible on a 32 GB M1 Pro/Max machine, because by default macOS caps the share of system memory available as VRAM at roughly 65%, which is less than the model needs.
However, I found a way to bypass this limitation. To allocate more of your Mac’s system RAM to VRAM (in this case 27536 MB, a little under 28 GB), run the following command in a terminal window:
sudo sysctl iogpu.wired_limit_mb=27536
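If you would rather derive the number than copy it, the value can be computed from the total RAM minus whatever you want to leave for macOS. The sketch below is only an illustration: `wired_limit_mb` is a hypothetical helper name, and the 5 GB of headroom is an assumption, not a recommendation from Apple.

```shell
#!/bin/sh
# Hypothetical helper: compute a wired-limit value in MB that leaves
# a fixed amount of RAM for macOS and other applications.
#   $1 = total system RAM in GB
#   $2 = GB to keep free for the OS (assumed headroom, tune to taste)
wired_limit_mb() {
    total_gb=$1
    reserve_gb=$2
    echo $(( (total_gb - reserve_gb) * 1024 ))
}

# Example: a 32 GB machine, keeping 5 GB for the system.
wired_limit_mb 32 5

# The result could then be applied on the Mac with:
#   sudo sysctl iogpu.wired_limit_mb="$(wired_limit_mb 32 5)"
```

For a 32 GB machine with 5 GB reserved this prints 27648, which is close to the 27536 used above.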
This pushes the MacBook beyond its default configuration and makes it possible to load models that would otherwise not fit. Proceed with caution, though: on a 32 GB machine this setting leaves only about 5 GB for macOS and every other application, so raising the limit too far can hurt overall performance and stability.
For Macs with 64 GB of system RAM
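For Mac models with 64 GB of system RAM, macOS can allocate up to 75% of it to VRAM by default. That is around 48 GB, which is enough for most quantized models. It is not enough for the 120B Goliath model at 8-bit quantization, however: at about one byte per parameter, its weights alone take roughly 120 GB, so a model of that size needs a much more aggressive quantization to fit.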
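The default limits mentioned so far are easy to work out by hand. As a rough sketch, the function below converts a RAM size and a percentage into megabytes; `default_cap_mb` is a hypothetical name, and the 65% and 75% figures are simply the ones quoted in this article rather than values taken from Apple documentation.

```shell
#!/bin/sh
# Hypothetical helper: estimate the default VRAM cap in MB.
#   $1 = total system RAM in GB
#   $2 = default cap as a percentage (65 and 75 are the article's figures)
default_cap_mb() {
    echo $(( $1 * 1024 * $2 / 100 ))
}

default_cap_mb 32 65   # 32 GB machine at 65%
default_cap_mb 64 75   # 64 GB machine at 75%
```

This prints 21299 and 49152, that is, roughly 21 GB and 48 GB. The first number shows why a 24 GB model does not fit on a 32 GB machine without raising the limit.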
But let’s say you want to go above the 75% threshold for a really large model, such as the new Yi-based MoE model mixtral_34bx2_moe_60b.Q6_K.gguf (50 GB). Use this command to allocate 55536 MB, a little under 56 GB, of the system RAM to VRAM:
sudo sysctl iogpu.wired_limit_mb=55536
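Before loading a model it can help to check whether the file is likely to fit under the limit you have set. The following is a minimal sketch, not a precise predictor: `model_fits` is a hypothetical helper, and the extra margin for context and other runtime buffers is an assumption you should adjust for your own setup.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a model file plus a safety margin
# fits under a given wired limit.
#   $1 = path to the model file
#   $2 = wired limit in MB
#   $3 = extra MB to allow for context and runtime buffers (assumed margin)
model_fits() {
    size_bytes=$(wc -c < "$1")
    # Round the file size up to whole megabytes.
    size_mb=$(( (size_bytes + 1048575) / 1048576 ))
    [ $(( size_mb + $3 )) -le "$2" ]
}

# Example usage (the path is a placeholder):
#   if model_fits ./mixtral_34bx2_moe_60b.Q6_K.gguf 55536 2048; then
#       echo "model should fit"
#   else
#       echo "model is too large for this limit"
#   fi
```

The check only looks at the file size, so treat a positive answer as a hint rather than a guarantee.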
The commands above serve as a practical guideline for allocating more VRAM on an Apple Silicon Mac. The change does not make the chip faster; it lets larger models fit entirely in GPU-accessible memory, which is what makes running them practical.
For Macs with 192 GB of system RAM
If you happen to have a Mac Studio or Mac Pro with an M2 Ultra chip and 192 GB of system memory (the M1 Ultra tops out at 128 GB), you can allocate up to 188 GB of that to VRAM. This is a huge amount of memory that will work even for training larger models with Apple’s new machine learning framework.
How to reset to the default VRAM settings
To reset your Mac’s VRAM settings to default, restart the machine or use this command:
sudo sysctl iogpu.wired_limit_mb=0
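To see which limit is currently in effect, you can read the same key back with `sysctl -n`. The sketch below assumes, as the reset command suggests, that a value of 0 means macOS is using its default; `describe_limit` is a hypothetical helper, and the key may not exist on older macOS versions.

```shell
#!/bin/sh
# Hypothetical helper: turn the raw sysctl value into a readable message.
# Assumes 0 means "use the macOS default", as the reset command implies.
describe_limit() {
    if [ "$1" -eq 0 ]; then
        echo "default (macOS decides)"
    else
        echo "custom: $1 MB"
    fi
}

# Only query the key on macOS; it does not exist on other systems.
if [ "$(uname)" = "Darwin" ]; then
    if current=$(sysctl -n iogpu.wired_limit_mb 2>/dev/null); then
        describe_limit "$current"
    else
        echo "iogpu.wired_limit_mb is not available on this system"
    fi
fi
```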
Conclusion
By adjusting the amount of RAM dedicated to VRAM, you can load models on your Mac that would not fit under the default limit, which matters most for demanding tasks like running large-scale AI models. However, the memory you give to the GPU is taken away from macOS and other applications, so such changes can affect overall system stability and performance and should be made with a clear understanding of the potential impact.