Aggregate the free tiers from Google, Groq, Cerebras, SambaNova, NVIDIA, Mistral, OpenRouter, GitHub Models, Hugging Face, Cohere, Cloudflare, Zhipu, Moonshot, and MiniMax behind a single `/v1/chat/completions` endpoint. Keys are stored encrypted. A router picks the best available model for each request, fails over to the next provider when one is rate-limited, and tracks per-key usage so you stay under every free-tier cap.
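The routing-and-failover behavior amounts to walking an ordered provider list, skipping keys already at their cap, and moving on whenever a provider pushes back with a rate-limit error. A minimal sketch of that idea (the `Provider` shape, `RateLimited` exception, and cap bookkeeping are illustrative assumptions, not this project's actual internals):

```python
# Illustrative sketch of the routing-and-failover loop described above;
# not the project's actual implementation. Provider, RateLimited, and
# the usage-cap bookkeeping are all assumed names.
from dataclasses import dataclass
from typing import Callable


class RateLimited(Exception):
    """Raised when a provider rejects a request for hitting its limit."""


@dataclass
class Provider:
    name: str
    call: Callable[[dict], tuple[str, int]]  # request -> (text, tokens used)
    monthly_cap: int                         # free-tier token cap for this key
    used: int = 0                            # tokens consumed so far this month

    def at_cap(self) -> bool:
        return self.used >= self.monthly_cap


def route(providers: list[Provider], request: dict) -> str:
    # Walk providers in priority order, skipping exhausted keys and
    # failing over whenever one returns a rate-limit error.
    for p in providers:
        if p.at_cap():
            continue
        try:
            text, tokens = p.call(request)
        except RateLimited:
            continue  # try the next provider
        p.used += tokens  # track per-key usage against the free-tier cap
        return text
    raise RuntimeError("all providers exhausted or rate-limited")
```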
...
## Why this exists
Every serious AI lab now offers a free tier — a few million tokens a month, a few thousand requests a day. On its own each tier is a toy. Stacked together, they add up to roughly **1.3 billion tokens per month** of working inference capacity, across dozens of models from small-and-fast to reasonably capable.
The problem is that stacking them by hand is painful: fourteen different SDKs, fourteen different rate limits, fourteen places a request can fail. FreeLLMAPI collapses that into one OpenAI-compatible endpoint. Point any OpenAI client library at your local server, and it routes transparently across whichever providers you've added keys for.
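Concretely, with the official `openai` Python package it looks like this (the port, the placeholder API key, and the `"auto"` model alias are assumptions; substitute whatever your local instance actually exposes):

```python
# A minimal sketch using the official openai Python client. The port,
# the placeholder API key, and the "auto" model alias are assumptions;
# substitute whatever your local FreeLLMAPI instance is configured with.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # local FreeLLMAPI server (assumed port)
    api_key="unused",                     # the proxy holds the real provider keys
)

response = client.chat.completions.create(
    model="auto",  # let the router pick the best available model (assumed alias)
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

Any other OpenAI-compatible client, in any language, works the same way: change the base URL and nothing else.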