Building an AI Homelab

As part of my job over the past year, I’ve been deploying and evaluating LLMs locally across a range of hardware. In the process I became well-versed in the open-source tools, the available models, and the hardware trade-offs, so I decided to build my own homelab for local inference.

Llama-Server is (Almost) All You Need

If you’re running LLMs locally, you’ve probably used Ollama or LM Studio. Both are effectively wrappers around llama.cpp: convenient, but they impose limits that get in the way once you start trying to squeeze maximum performance out of your hardware.
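Cutting out the wrapper means invoking llama.cpp’s `llama-server` binary yourself. A minimal sketch of a launch command (the model path is a placeholder and the flag values are illustrative, not tuned recommendations):

```shell
# Start llama.cpp's built-in HTTP server directly, no wrapper needed.
# Model path is a placeholder; flag values are illustrative.
llama-server \
  -m ./models/model.gguf \       # GGUF model file to load
  -c 8192 \                      # context window size in tokens
  -ngl 99 \                      # offload (up to) this many layers to the GPU
  --host 127.0.0.1 --port 8080   # where to serve requests
```

Once running, the server exposes an OpenAI-compatible API under `http://127.0.0.1:8080/v1`, so existing OpenAI client libraries can be pointed at it unchanged.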