Llama 4 Scout is Meta's efficient open-weight model designed for single-H100 deployment. Its 10M token context window is the largest of any commonly hosted model, making it ideal for massive document ingestion, codebase analysis, and long-context RAG.
Pricing Breakdown
| Volume | Input Cost (USD) | Output Cost (USD) | Combined (50% input / 50% output) |
|---|---|---|---|
| 1,000 tokens | $0.00008 | $0.0003 | $0.00019 |
| 10,000 tokens | $0.0008 | $0.0030 | $0.0019 |
| 100,000 tokens | $0.0080 | $0.0300 | $0.0190 |
| 1,000,000 tokens | $0.0800 | $0.3000 | $0.1900 |
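
The figures in the table follow directly from the per-million rates, so they can be reproduced for any volume. The sketch below is a minimal illustration, assuming the listed rates of $0.08 per 1M input tokens and $0.30 per 1M output tokens; the function names `estimate_cost` and `combined_cost` are hypothetical helpers, not part of any provider SDK.

```python
# Rates taken from the pricing table, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.08
OUTPUT_PRICE_PER_M = 0.30


def estimate_cost(input_tokens: float, output_tokens: float) -> float:
    """Estimated USD cost of a request with the given token counts."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


def combined_cost(total_tokens: float, input_share: float = 0.5) -> float:
    """Cost of `total_tokens` split between input and output.

    The default input_share of 0.5 mirrors the table's 50/50 column.
    """
    input_tokens = total_tokens * input_share
    output_tokens = total_tokens - input_tokens
    return estimate_cost(input_tokens, output_tokens)


if __name__ == "__main__":
    for volume in (1_000, 10_000, 100_000, 1_000_000):
        print(f"{volume:>9,} tokens: ${combined_cost(volume):.5f}")
```

For instance, summarizing a 200,000-token document into a 2,000-token summary would come to `estimate_cost(200_000, 2_000)`, or about $0.0166 at these rates.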
Strengths
- ✓ Runs on a single H100: cheapest self-host target in the Llama 4 family
- ✓ 10M token context window: industry-leading for long context
- ✓ Open weights
- ✓ Natively multimodal
Weaknesses
- ✗ Weaker than Maverick on complex reasoning
- ✗ Smaller community than the Llama 3 ecosystem
Frequently Asked Questions
How much does Llama 4 Scout cost?
Llama 4 Scout costs $0.08 per 1M input tokens and $0.30 per 1M output tokens.
What is Llama 4 Scout's context window?
Llama 4 Scout has a context window of 10M tokens, which means it can process up to 10,000,000 tokens in a single request.
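As a rough pre-flight check, you can estimate whether a document is likely to fit before sending it. The sketch below assumes about 4 characters per token, which is a common rule of thumb for English text and not the model's actual tokenizer, so treat the result as an approximation. The helper names and the `reserved_output_tokens` parameter are hypothetical, and the check assumes the generated output shares the same window as the input.

```python
CONTEXT_WINDOW_TOKENS = 10_000_000  # context window stated above
CHARS_PER_TOKEN = 4  # ASSUMPTION: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(text: str, reserved_output_tokens: int = 0) -> bool:
    """True if the estimated input plus reserved output fits the window."""
    needed = estimate_tokens(text) + reserved_output_tokens
    return needed <= CONTEXT_WINDOW_TOKENS
```

For an exact count, replace `estimate_tokens` with the tokenizer that ships with the model.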
Is Llama 4 Scout open source?
Llama 4 Scout is open-weight rather than open source in the strict sense. The model weights are publicly available and can be self-hosted, subject to Meta's Llama 4 Community License.
What is Llama 4 Scout best used for?
Llama 4 Scout is best suited for long-context, low-cost workloads such as summarization, data extraction, and document analysis. It runs on a single H100, which makes it the cheapest self-host target in the Llama 4 family.
What is Llama 4 Scout's Elo score?
Llama 4 Scout has an Elo score of 1280 on the LMSYS Chatbot Arena leaderboard, placing it in the mid-tier range.