Commits · 73698e6566ff32a3419b2e7a4c94388a2026014c · Vũ Hoàng Anh / freellmapi

23 Apr, 2026 1 commit

feat(catalog): disable Gemini 2.5 Pro, add Cerebras zai-glm-4.7 · 73698e65

Tashfeen authored Apr 23, 2026

Google moved Pro-tier Gemini off the free tier on 2026-04-01. Cerebras
added z.ai GLM-4.7 (355B) to their free tier, throttled to 10 RPM /
100 RPD with an 8192 context cap while demand stays high.

73698e65

22 Apr, 2026 6 commits

docs: document tool calling in the Using the API section · e1a5b9f1

Tashfeen authored Apr 22, 2026

Adds a worked Python example showing the round-trip (assistant tool_calls →
tool role follow-up → final answer), notes streaming support, and clarifies
the pass-through vs. Gemini-translation split.

e1a5b9f1

fix(cloudflare): normalize null content to empty string for tool round-trips · 4c149373

Tashfeen authored Apr 22, 2026

Cloudflare's OpenAI-compat endpoint rejects assistant messages with
content: null, even when tool_calls are present (standard OpenAI format).
Added normalizeMessages() that converts null content to "" before dispatch,
plus a regression test covering the null-content + tool_calls case.

Also credits @moaaz12-web in README Contributors for the tool-calling PR.

4c149373

feat: tool-calling support across providers (#3) · 1a80b0a1
Moaaz Siddiqui authored Apr 22, 2026

1a80b0a1
Refresh fallback-chain diagram to match the new catalog. · 19beb5cb
Tashfeen authored Apr 22, 2026

19beb5cb
Eight hundred million? No — one-point-three billion now. · 4e8dcf6a
Tashfeen authored Apr 22, 2026
```
The stack grew wider; the headline had to bow.
```
4e8dcf6a

Live-probed each tier; culled the silent tools. · 0262b1c6

Tashfeen authored Apr 22, 2026

Kimi, Gemma, M1, HF depart the stage,
DeepSeek, Maverick, Ling, and GPT-OSS engage.
Twenty-two new rows, four fade from view —
Agentic rank reshapes what falls through.

0262b1c6

21 Apr, 2026 1 commit

Initial release of FreeLLMAPI · 04e15037

tashfeenahmed authored Apr 21, 2026

Self-hosted OpenAI-compatible proxy that aggregates the free tiers of
fourteen LLM providers — Google, Groq, Cerebras, SambaNova, NVIDIA,
Mistral, OpenRouter, GitHub Models, Hugging Face, Cohere, Cloudflare,
Zhipu, Moonshot, MiniMax — behind a single /v1/chat/completions endpoint.

Server:
- Express + SQLite, per-provider adapters with streaming and non-streaming
  support, automatic fallover on 429/5xx, per-key RPM/RPD/TPM/TPD tracking,
  sticky sessions for multi-turn, AES-256-GCM encrypted key storage,
  unified bearer-token auth, periodic health checks.

Client:
- React + Vite + shadcn/ui admin dashboard: keys, fallback chain (drag
  to reorder, color-coded per-provider monthly token budget), playground,
  analytics with per-provider breakdowns.

Tooling:
- GitHub Actions CI (server tests + client build), MIT license,
  README with provider-by-provider ToS review.

For personal experimentation, not production.

04e15037