TECH
EXO Labs stunned the tech world by running a Llama 2 AI model on a 1997-era Pentium II PC running Windows 98.
By Aniket Chakraborty
May 7, 2025
The team bought the vintage PC for £118.88 and tackled major hardware compatibility challenges.
Image Credit | @exolabs | X
With no USB ports, they relied on PS/2 peripherals and Ethernet for critical file transfers.
Modern code had to be rewritten to compile under the 26-year-old Borland C++ 5.02 and run on the old processor.
The lightweight llama2.c, a roughly 700-line C program, made it possible to run the AI model on the ancient hardware.
The 260K-parameter model generated 39.31 tokens per second, a historic achievement.
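To put that throughput in more familiar terms, the short Python sketch below converts a tokens-per-second figure into time per token and time for a full response. The function names and the 100-token response length are illustrative assumptions, not figures from EXO Labs; only the 39.31 tokens per second rate comes from the reported result.

```python
def ms_per_token(tokens_per_second):
    """Average time spent on one token, in milliseconds."""
    return 1000.0 / tokens_per_second


def seconds_for_response(num_tokens, tokens_per_second):
    """Time to generate num_tokens at a steady rate, in seconds."""
    return num_tokens / tokens_per_second


# Reported rate for the 260K-parameter model.
REPORTED_RATE = 39.31

# Hypothetical response length, chosen only for illustration.
EXAMPLE_TOKENS = 100

if __name__ == "__main__":
    print(f"{ms_per_token(REPORTED_RATE):.2f} ms per token")
    print(f"{seconds_for_response(EXAMPLE_TOKENS, REPORTED_RATE):.2f} s for {EXAMPLE_TOKENS} tokens")
```

At the reported rate, each token takes roughly 25 milliseconds, so even a response of a hundred tokens would finish within a few seconds.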
Larger models, at 15M and 1B parameters, ran more slowly but still proved workable on the machine.
BitNet architecture, using ternary weights, played a key role in reducing computational load.
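Why ternary weights lighten the load can be seen in a small sketch. When every weight is -1, 0, or +1, a matrix-vector product needs no multiplications: each input is added, subtracted, or skipped. The Python below is a minimal illustration of that idea, not EXO Labs' or BitNet's actual code; the function names and the sample matrix `W` and vector `x` are made up for the example.

```python
def ternary_matvec(weights, x):
    """Multiply a ternary matrix (entries -1, 0, +1) by a vector
    using only additions and subtractions."""
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # +1: add the input
            elif w == -1:
                acc -= xi      # -1: subtract the input
            # 0: the input is skipped entirely
        out.append(acc)
    return out


def dense_matvec(weights, x):
    """Ordinary matrix-vector product with one multiply per weight,
    included only for comparison."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Hypothetical 2x3 ternary weight matrix and input vector.
W = [[1, 0, -1],
     [-1, 1, 1]]
x = [0.5, 2.0, -1.5]

if __name__ == "__main__":
    print(ternary_matvec(W, x))
    print(dense_matvec(W, x))
```

Both functions give the same result, but the ternary version never multiplies, which is the kind of saving that matters on a CPU with slow floating-point hardware.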
BitNet models are designed to run on CPUs first and are said to be over 50% more energy efficient than traditional full-precision AI models.
EXO Labs’ work hints at a future where powerful AI could run on old, low-power hardware.