- 25 Apr, 2026 1 commit
-
-
Tashfeen authored
Live-probed against real free-tier keys on 2026-04-25. Adds 8 models that returned 200 with content, drops the one OR :free route that 404s, and corrects two Google rate limits whose catalog values were ~10x-50x too high.

Adds:
- Cloudflare: @cf/moonshotai/kimi-k2.5, @cf/qwen/qwen3-30b-a3b-fp8, @cf/deepseek-ai/deepseek-r1-distill-qwen-32b
- Google preview: gemini-3-flash-preview, gemini-3.1-flash-lite-preview, gemini-3.1-pro-preview (Pro confirmed free-tier-eligible by the free_tier_requests quota metric in 429 errors)
- OpenRouter: google/gemma-4-31b-it:free, liquid/lfm-2.5-1.2b-instruct:free

Removes:
- openrouter/arcee-ai/trinity-large-preview:free (404 No endpoints found)

Corrects:
- gemini-2.5-flash and gemini-2.5-flash-lite RPD 250/1000 -> 20. The free tier now uniformly enforces 20 RPD per model per project.

Updates router test rationale: gemini-3.1-pro-preview at rank 1 now outranks Groq's gpt-oss-120b (rank 6) when keys exist for both.
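The uniform 20-RPD enforcement described above can be sketched as a per-(project, model) daily counter. This is a hypothetical illustration, not the proxy's actual tracker; the name RpdLimiter and its API are assumptions.

```python
import datetime
from collections import defaultdict

class RpdLimiter:
    """Tracks requests-per-day per (project, model) and enforces a daily cap."""

    def __init__(self, rpd_cap=20):
        self.rpd_cap = rpd_cap
        self._counts = defaultdict(int)  # (project, model, date) -> count

    def allow(self, project, model, today=None):
        """Record and permit the request if today's count is under the cap."""
        today = today or datetime.date.today().isoformat()
        key = (project, model, today)
        if self._counts[key] >= self.rpd_cap:
            return False
        self._counts[key] += 1
        return True

limiter = RpdLimiter(rpd_cap=20)
results = [limiter.allow("proj-a", "gemini-2.5-flash", today="2026-04-25")
           for _ in range(21)]
# the first 20 calls pass, the 21st is rejected
```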
-
- 23 Apr, 2026 1 commit
-
-
Tashfeen authored
Google moved Pro-tier Gemini off the free tier on 2026-04-01. Cerebras added z.ai GLM-4.7 (355B) to their free tier, throttled to 10 RPM / 100 RPD with an 8192 context cap while demand stays high.
-
- 22 Apr, 2026 6 commits
-
-
Tashfeen authored
Adds a worked Python example showing the round-trip (assistant tool_calls → tool role follow-up → final answer), notes streaming support, and clarifies the pass-through vs. Gemini-translation split.
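The round-trip mentioned above follows the standard OpenAI chat format; a minimal sketch of the message shapes involved (the tool name get_weather, its arguments, and the dispatch callback are hypothetical, and the final assistant turn is stubbed rather than produced by a model):

```python
import json

def run_round_trip(dispatch):
    """Illustrates the three-step tool-calling exchange in OpenAI chat format."""
    messages = [{"role": "user", "content": "Weather in Lahore?"}]

    # Step 1: the assistant replies with tool_calls instead of content.
    assistant_msg = {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather",
                         "arguments": json.dumps({"city": "Lahore"})},
        }],
    }
    messages.append(assistant_msg)

    # Step 2: the client executes each tool and appends a tool-role follow-up.
    for call in assistant_msg["tool_calls"]:
        args = json.loads(call["function"]["arguments"])
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(dispatch(call["function"]["name"], args)),
        })

    # Step 3: the model would now generate the final answer from the tool result.
    messages.append({"role": "assistant", "content": "It is 31 C in Lahore."})
    return messages

history = run_round_trip(lambda name, args: {"temp_c": 31})
```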
-
Tashfeen authored
Cloudflare's OpenAI-compat endpoint rejects assistant messages with content: null, even when tool_calls are present, although null content alongside tool_calls is standard OpenAI format. Added normalizeMessages() that converts null content to "" before dispatch, plus a regression test covering the null-content + tool_calls case. Also credits @moaaz12-web in README Contributors for the tool-calling PR.
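The normalization described above can be sketched as follows. The shipped normalizeMessages() lives in the JavaScript server; this is an assumed Python equivalent for illustration only.

```python
def normalize_messages(messages):
    """Replace content: None with "" so endpoints that reject null content
    (e.g. Cloudflare's OpenAI-compat API) still accept assistant messages
    carrying tool_calls. Returns new dicts; the input is left untouched."""
    normalized = []
    for msg in messages:
        if msg.get("content") is None:
            msg = {**msg, "content": ""}  # shallow copy with content patched
        normalized.append(msg)
    return normalized

msgs = [{"role": "assistant", "content": None, "tool_calls": [{"id": "call_1"}]}]
fixed = normalize_messages(msgs)
```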
-
Moaaz Siddiqui authored
-
Tashfeen authored
-
Tashfeen authored
The stack grew wider; the headline had to bow.
-
Tashfeen authored
Kimi, Gemma, M1, HF depart the stage, DeepSeek, Maverick, Ling, and GPT-OSS engage. Twenty-two new rows, four fade from view — Agentic rank reshapes what falls through.
-
- 21 Apr, 2026 1 commit
-
-
tashfeenahmed authored
Self-hosted OpenAI-compatible proxy that aggregates the free tiers of fourteen LLM providers — Google, Groq, Cerebras, SambaNova, NVIDIA, Mistral, OpenRouter, GitHub Models, Hugging Face, Cohere, Cloudflare, Zhipu, Moonshot, MiniMax — behind a single /v1/chat/completions endpoint.

Server:
- Express + SQLite, per-provider adapters with streaming and non-streaming support, automatic failover on 429/5xx, per-key RPM/RPD/TPM/TPD tracking, sticky sessions for multi-turn, AES-256-GCM encrypted key storage, unified bearer-token auth, periodic health checks.

Client:
- React + Vite + shadcn/ui admin dashboard: keys, fallback chain (drag to reorder, color-coded per-provider monthly token budget), playground, analytics with per-provider breakdowns.

Tooling:
- GitHub Actions CI (server tests + client build), MIT license, README with provider-by-provider ToS review.

For personal experimentation, not production.
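The automatic failover on 429/5xx could be sketched like this. A hypothetical simplification: in the real server the chain order comes from the dashboard config and each provider is a full adapter, not a stub callable.

```python
def complete_with_failover(providers, request):
    """Try each (name, call) pair in chain order; fall through on 429 or 5xx."""
    last_status = None
    for name, call in providers:
        status, body = call(request)
        if status == 200:
            return name, body
        if status == 429 or status >= 500:
            last_status = status
            continue  # rate-limited or server error: try the next provider
        raise RuntimeError(f"{name} returned non-retryable status {status}")
    raise RuntimeError(f"all providers exhausted (last status {last_status})")

# Stub providers standing in for real adapters.
chain = [
    ("groq", lambda req: (429, None)),      # rate-limited
    ("cerebras", lambda req: (503, None)),  # transient server error
    ("google", lambda req: (200, {"choices": [{"message": {"content": "ok"}}]})),
]
winner, resp = complete_with_failover(chain, {"messages": []})
```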
-