- 23 Apr, 2026 1 commit
-
-
Tashfeen authored
Google moved Pro-tier Gemini off the free tier on 2026-04-01. Cerebras added z.ai GLM-4.7 (355B) to their free tier, throttled to 10 RPM / 100 RPD with an 8192 context cap while demand stays high.
-
- 22 Apr, 2026 6 commits
-
-
Tashfeen authored
Adds a worked Python example showing the round-trip (assistant tool_calls → tool role follow-up → final answer), notes streaming support, and clarifies the pass-through vs. Gemini-translation split.
-
Tashfeen authored
Cloudflare's OpenAI-compat endpoint rejects assistant messages with content: null, even when tool_calls are present (standard OpenAI format). Added normalizeMessages() that converts null content to "" before dispatch, plus a regression test covering the null-content + tool_calls case. Also credits @moaaz12-web in README Contributors for the tool-calling PR.
-
Moaaz Siddiqui authored
-
Tashfeen authored
-
Tashfeen authored
The stack grew wider; the headline had to bow.
-
Tashfeen authored
Kimi, Gemma, M1, HF depart the stage, DeepSeek, Maverick, Ling, and GPT-OSS engage. Twenty-two new rows, four fade from view — Agentic rank reshapes what falls through.
-
- 21 Apr, 2026 1 commit
-
-
tashfeenahmed authored
Self-hosted OpenAI-compatible proxy that aggregates the free tiers of fourteen LLM providers — Google, Groq, Cerebras, SambaNova, NVIDIA, Mistral, OpenRouter, GitHub Models, Hugging Face, Cohere, Cloudflare, Zhipu, Moonshot, MiniMax — behind a single /v1/chat/completions endpoint. Server: - Express + SQLite, per-provider adapters with streaming and non-streaming support, automatic fallover on 429/5xx, per-key RPM/RPD/TPM/TPD tracking, sticky sessions for multi-turn, AES-256-GCM encrypted key storage, unified bearer-token auth, periodic health checks. Client: - React + Vite + shadcn/ui admin dashboard: keys, fallback chain (drag to reorder, color-coded per-provider monthly token budget), playground, analytics with per-provider breakdowns. Tooling: - GitHub Actions CI (server tests + client build), MIT license, README with provider-by-provider ToS review. For personal experimentation, not production.
-