Gemini 2.5 Flash-Lite is Google's cheapest long-context model, offering a 1M-token context window at $0.10 per 1M input tokens and $0.40 per 1M output tokens. It is ideal for high-volume classification, extraction, and summarization workloads where cost dominates.
Pricing Breakdown
| Volume | Input Cost | Output Cost | Combined (50/50 input/output split) |
|---|---|---|---|
| 1,000 tokens | $0.0001 | $0.0004 | $0.00025 |
| 10,000 tokens | $0.0010 | $0.0040 | $0.0025 |
| 100,000 tokens | $0.0100 | $0.0400 | $0.0250 |
| 1,000,000 tokens | $0.1000 | $0.4000 | $0.2500 |
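The table above is simple arithmetic on the two list prices, which a short helper makes explicit (the rates are the ones quoted in this article; the function name is just for illustration):

```python
# Estimate Gemini 2.5 Flash-Lite request cost from token counts,
# using the list prices quoted above: $0.10 per 1M input tokens
# and $0.40 per 1M output tokens.

INPUT_RATE_PER_M = 0.10   # dollars per 1M input tokens
OUTPUT_RATE_PER_M = 0.40  # dollars per 1M output tokens

def flash_lite_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a request at Flash-Lite list prices."""
    raw = (input_tokens * INPUT_RATE_PER_M
           + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000
    # Round to micro-dollars to hide float noise in the tiny per-request sums.
    return round(raw, 6)

# A 50/50 split of 1M tokens costs $0.25, matching the last table row.
print(flash_lite_cost(500_000, 500_000))  # 0.25
```

At these prices a million fully mixed tokens costs a quarter, which is why the model is pitched at bulk classification and extraction rather than interactive chat.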
Strengths
- ✓ One of the cheapest high-quality models available
- ✓ 1M token context window at $0.10/1M input
- ✓ Very fast inference
- ✓ Good for classification and extraction
Weaknesses
- ✗ Weaker reasoning vs. Flash or Pro
- ✗ Limited for complex instruction following
Frequently Asked Questions
How much does Gemini 2.5 Flash-Lite cost?
Gemini 2.5 Flash-Lite costs $0.10 per 1M input tokens and $0.40 per 1M output tokens.
What is Gemini 2.5 Flash-Lite's context window?
Gemini 2.5 Flash-Lite has a context window of 1M tokens, which means it can process up to 1,000,000 tokens in a single request.
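For a quick pre-flight check of whether a document fits in that window, a rough character-based estimate is often enough. This sketch assumes the common ~4 characters per token heuristic for English text; it is not the model's tokenizer, and the Gemini API's token-counting endpoint should be used when exact numbers matter:

```python
# Rough check that a prompt fits in Flash-Lite's 1M-token context window.
# The ~4 characters per token ratio is a heuristic for English text, not
# an official tokenizer count.

CONTEXT_WINDOW = 1_000_000
CHARS_PER_TOKEN = 4  # heuristic, varies by language and content

def fits_in_context(text: str, reserve_for_output: int = 8_192) -> bool:
    """Return True if the estimated input token count leaves room
    for `reserve_for_output` tokens of model output."""
    estimated_tokens = len(text) / CHARS_PER_TOKEN
    return estimated_tokens + reserve_for_output <= CONTEXT_WINDOW

# 500,000 characters -> ~125k estimated tokens, well under 1M.
print(fits_in_context("word " * 100_000))  # True
```

Reserving some budget for the response matters because input and output share the same context window.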
Is Gemini 2.5 Flash-Lite open source?
No, Gemini 2.5 Flash-Lite is a proprietary model by Google and is not open source.
What is Gemini 2.5 Flash-Lite best used for?
Gemini 2.5 Flash-Lite is best suited for low-cost, fast-response workloads such as data extraction, summarization, and translation. It is one of the cheapest high-quality models available.
What is Gemini 2.5 Flash-Lite's ELO score?
Gemini 2.5 Flash-Lite has an Elo score of 1250 on the LMSYS Chatbot Arena leaderboard, placing it in the mid-tier range.