Docker – Why does running the Llama 3.1 70B model underutilise the GPU?
I have deployed Llama 3.1 70B and Llama 3.1 8B on my system, and the 8B model works perfectly. When I tested the 70B model, it underutilized the GPU and took a long time to respond. Here…
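
The question body is truncated above, so the exact deployment commands are unknown. As a minimal sketch, assuming the model is served through an Ollama container (an assumption, not the poster's confirmed setup), the two most common causes of this symptom are the container not being given GPU access and the 70B weights not fitting in VRAM, so layers spill to the CPU:

```bash
# Sketch of an assumed Ollama-in-Docker setup; the poster's actual
# commands are truncated, so treat names and paths as illustrative.

# Expose the GPU to the container; without --gpus, inference falls back
# to the CPU and a 70B model will respond very slowly.
docker run -d --gpus all -v ollama:/root/.ollama \
  -p 11434:11434 --name ollama ollama/ollama

# Pull and run the 70B model inside the container.
docker exec -it ollama ollama run llama3.1:70b

# Check VRAM usage: a 4-bit-quantized 70B model needs roughly 40+ GB of
# VRAM, so on a smaller GPU most layers get offloaded to the CPU, which
# shows up as low GPU utilization and long response times.
nvidia-smi
```

The key diagnostic is `nvidia-smi` while a request is in flight: if VRAM usage is far below the model's footprint, the slowdown is CPU offloading rather than a Docker misconfiguration.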