Home/Models/Llama 3.1 8B

Llama 3.1 8B

FreeOpen Source
by Meta· Released 2024-07
Official Docs ↗

Llama 3.1 8B is one of the best small models available, capable of running on a single consumer GPU. It's free to use via Groq's API and can handle straightforward classification, translation, and simple Q&A tasks effectively.

Input Price
$0.06/1M
per 1M tokens
Output Price
$0.06/1M
per 1M tokens
Context Window
131K tokens
max tokens
ELO Score
1176
LMSYS Arena

Pricing Breakdown

VolumeInput CostOutput CostCombined (50/50)
1,000 tokens$0.0001$0.0001$0.0001
10,000 tokens$0.0006$0.0006$0.0006
100,000 tokens$0.0055$0.0055$0.0055
1,000,000 tokens$0.0550$0.0550$0.0550

Strengths

  • Essentially free to run via Groq or local deployment
  • Open source — full data privacy
  • Fast inference on commodity hardware
  • 128K context window for an 8B model

Weaknesses

  • Noticeably weaker on complex tasks
  • Not suitable for nuanced reasoning or analysis

Best For

fast responselow costtranslationcustomer support

Frequently Asked Questions

How much does Llama 3.1 8B cost?

Llama 3.1 8B costs $0.06/1M per 1M input tokens and $0.06/1M per 1M output tokens. It is available for free.

What is Llama 3.1 8B's context window?

Llama 3.1 8B has a context window of 131K tokens, which means it can process up to 131,072 tokens in a single request.

Is Llama 3.1 8B open source?

Yes, Llama 3.1 8B is open source. The model weights are publicly available and can be self-hosted.

What is Llama 3.1 8B best used for?

Llama 3.1 8B is best suited for: fast-response, low-cost, translation, customer-support. Essentially free to run via Groq or local deployment.

What is Llama 3.1 8B's ELO score?

Llama 3.1 8B has an ELO score of 1176 on the LMSYS Chatbot Arena leaderboard, placing it in the mid-tier range.