
Meta, the company formerly known as Facebook, has recently unveiled Llama 3, the latest iteration of its large language model. The model is available in two versions: an eight billion (8B) parameter version and a 70 billion (70B) parameter version. In this article, we will explore how to run the 8B parameter version of Llama 3 locally, the more feasible option since standard desktops and laptops may struggle to run the larger 70B version.

Llama 3’s performance overview

Llama 3 is an impressive large language model. The 8B parameter version, trained using 1.3 million hours of GPU time, outperforms its predecessor, Llama 2, across the board: it scores 34% better than Llama 2's 7 billion parameter version and 14% better than Llama 2's 13 billion parameter version. It falls only 8% short of Llama 2's 70B parameter version, a remarkable result for a model of its size.