TECH

EXO Labs stunned the tech world by running a Llama 2 AI model on a 1997 Windows 98 Pentium II PC.

By Aniket Chakraborty

May 7, 2025

The team bought the vintage PC for £118.88 and tackled major hardware compatibility challenges.

Image Credit | @exolabs | X

With no USB ports, they relied on PS/2 peripherals for input and on Ethernet for critical file transfers.

Modern code had to be reworked to compile under the 26-year-old Borland C++ 5.02 and run on the old processor.

The lightweight llama2.c, a roughly 700-line C program, made it possible to run the AI model on the ancient hardware.

The 260K-parameter model generated 39.31 tokens per second, a historic achievement.

Larger models, at 15M and 1B parameters, ran more slowly but proved workable on the machine.

The BitNet architecture, which restricts weights to the ternary values -1, 0, and 1, plays a key role in reducing computational load.
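
To see why ternary weights lighten the load, consider a matrix-vector product in which every weight is -1, 0, or 1: each multiplication collapses into an addition, a subtraction, or a skip. The C sketch below is only an illustration of that idea, not EXO Labs' or BitNet's actual code. The function name ternary_matvec and the one-byte-per-weight storage are assumptions made for readability; real implementations pack weights far more tightly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes y = W * x where every weight in W is
 * -1, 0, or +1. W is stored row-major, one int8_t per weight (an
 * assumption for clarity, not how BitNet packs its weights). */
void ternary_matvec(const int8_t *w, const float *x, float *y,
                    int rows, int cols)
{
    int r, c; /* declared up front, as pre-C99 compilers require */

    assert(rows >= 0 && cols >= 0);

    for (r = 0; r < rows; r++) {
        float acc = 0.0f;
        for (c = 0; c < cols; c++) {
            int8_t t = w[r * cols + c];
            if (t > 0) {
                acc += x[c];      /* weight +1: add the input */
            } else if (t < 0) {
                acc -= x[c];      /* weight -1: subtract the input */
            }                     /* weight 0: contributes nothing */
        }
        y[r] = acc;
    }
}
```

The inner loop contains no floating-point multiplications, which is the property that makes this style of model attractive for CPUs without fast multiply hardware.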

BitNet models are CPU-first and more than 50% more energy-efficient than full-precision AI models.

EXO Labs’ work hints at a future where powerful AI could run on old, low-power hardware.
