server.home - Hardware
This machine is a playground for local LLMs, using 3 GPUs to provide 22 GB of fast VRAM.
Physical Build
Left: The server enclosure. Right: Internal display showing system stats.
Specifications
| Component | Detail |
|---|---|
| CPU | Intel i3-6100 (2C/4T @ 3.7 GHz Skylake) |
| :material-motherboard: MB | EVGA Z170 Classified 4-way |
| RAM | 16 GB DDR4-2400 (reduced in early 2026) |
| Storage | 256 GB NVMe + 240 GB SATA SSD for Ollama |
The EVGA Z170 Classified motherboard supporting 4-way GPU configurations.
Historic Graphics Cluster (Early 2025)
Before the current optimization, the system ran a 4-GPU cluster to maximize VRAM availability.
| GPU Model | VRAM | PCIe Link | Memory Bandwidth |
|---|---|---|---|
| RTX 3060 Ti | 8 GB GDDR6 | PCIe 3.0 x16 | 448 GB/s |
| P106-100 | 6 GB GDDR5 | PCIe 3.0 x8 | 192 GB/s |
| P106-100 | 6 GB GDDR5 | PCIe 3.0 x4 | 192 GB/s |
| GTX 1060 | 6 GB GDDR5 | PCIe 1.0 x1 | 192 GB/s |
The PCIe 1.0 x1 Bottleneck
The fourth card (GTX 1060) was connected through a PCIe 1.0 x1 slot. The extra 6 GB of VRAM allowed larger models to load, but the extremely narrow bus (0.25 GB/s) added significant latency whenever the model's weights or KV cache data had to cross the link to that card.
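To put the 0.25 GB/s figure in context, here is a rough sketch of the theoretical link arithmetic. It assumes the standard per-lane signalling rates and line encodings for PCIe 1.0 and 3.0; the function names are illustrative, and real-world throughput will be lower because of protocol overhead.

```python
# Per-lane raw rate (GT/s) and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    "1.0": (2.5, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_gb_s(generation: str, lanes: int) -> float:
    """Theoretical one-way link bandwidth in GB/s (ignores protocol overhead)."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    return gt_per_s * efficiency / 8 * lanes


def transfer_seconds(size_gb: float, generation: str, lanes: int) -> float:
    """Best-case time to move size_gb gigabytes across the link."""
    return size_gb / link_bandwidth_gb_s(generation, lanes)


# Filling a 6 GB card over each link type used in the table above.
for generation, lanes in [("3.0", 16), ("3.0", 8), ("3.0", 4), ("1.0", 1)]:
    seconds = transfer_seconds(6, generation, lanes)
    print(f"PCIe {generation} x{lanes}: {seconds:.1f} s to load 6 GB")
```

Under these assumptions, loading 6 GB over the x1 link takes at least 24 seconds, while the same load over PCIe 3.0 x16 takes well under a second.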
Current Graphics Cluster (2026)
The current setup focuses on balancing thermal overhead and consistent VRAM speeds:
- P104-100 (8 GB GDDR5X): 314 GB/s
- GTX 1070 (8 GB GDDR5): 220 GB/s
- P104-100 (8 GB GDDR5X): 314 GB/s
- P106-100 (6 GB GDDR5): 176 GB/s
This rocks! All 47 model layers fit on the GPUs, and each card keeps about 1 GB of space for its local KV values.
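As an illustration of how 47 layers might be spread across cards of different sizes, here is a minimal sketch. It assumes every layer is the same size and holds back 1 GB per card for KV values; `split_layers` and its largest-remainder rounding are hypothetical and are not how Ollama actually assigns layers.

```python
def split_layers(total_layers, vram_gb, kv_reserve_gb=1.0):
    """Split layers across cards in proportion to VRAM left after the KV reserve."""
    usable = [max(v - kv_reserve_gb, 0.0) for v in vram_gb]
    total_usable = sum(usable)
    if total_usable <= 0:
        raise ValueError("no usable VRAM after the KV reserve")

    # Ideal fractional share per card, then round down.
    shares = [total_layers * u / total_usable for u in usable]
    counts = [int(s) for s in shares]

    # Hand out the leftover layers to the cards with the largest remainders.
    leftover = total_layers - sum(counts)
    by_remainder = sorted(
        range(len(shares)),
        key=lambda i: shares[i] - counts[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


# Card sizes in GB, taken from the list above.
cards_gb = [8, 8, 8, 6]
layers = split_layers(47, cards_gb)
print(layers, "total:", sum(layers))
```

The proportional split is only a starting point; in practice the per-card counts would be tuned by watching actual VRAM usage once the model is loaded.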
