Commit 04e15037 authored by tashfeenahmed

Initial release of FreeLLMAPI

Self-hosted OpenAI-compatible proxy that aggregates the free tiers of
fourteen LLM providers — Google, Groq, Cerebras, SambaNova, NVIDIA,
Mistral, OpenRouter, GitHub Models, Hugging Face, Cohere, Cloudflare,
Zhipu, Moonshot, MiniMax — behind a single /v1/chat/completions endpoint.

Server:
- Express + SQLite, per-provider adapters with streaming and non-streaming
  support, automatic failover on 429/5xx, per-key RPM/RPD/TPM/TPD tracking,
  sticky sessions for multi-turn, AES-256-GCM encrypted key storage,
  unified bearer-token auth, periodic health checks.

Client:
- React + Vite + shadcn/ui admin dashboard: keys, fallback chain (drag
  to reorder, color-coded per-provider monthly token budget), playground,
  analytics with per-provider breakdowns.

Tooling:
- GitHub Actions CI (server tests + client build), MIT license,
  README with provider-by-provider ToS review.

For personal experimentation, not production.
# Server encryption key for API key storage (generate with: node -e "console.log(require('crypto').randomBytes(32).toString('hex'))")
ENCRYPTION_KEY=your-64-char-hex-key-here
# Server port (default: 3001)
PORT=3001
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  test:
    name: Test & build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - name: Install dependencies
        run: npm install
      - name: Run server tests
        run: npm test -w server
      - name: Build server
        run: npm run build -w server
      - name: Build client
        run: npm run build -w client
node_modules/
dist/
server/data/
*.db
*.db-wal
*.db-shm
.env
.env.local
.DS_Store
# Personal deployment scripts (contain keys/credentials — kept local)
deploy-pi.sh
update-hermes.sh
MIT License
Copyright (c) 2026 Tashfeen Ahmed
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
<div align="center">
# FreeLLMAPI
**One OpenAI-compatible endpoint. Fourteen free LLM providers. ~800M tokens per month.**
Aggregate the free tiers from Google, Groq, Cerebras, SambaNova, NVIDIA, Mistral, OpenRouter, GitHub Models, Hugging Face, Cohere, Cloudflare, Zhipu, Moonshot, and MiniMax behind a single `/v1/chat/completions` endpoint. Keys are stored encrypted. A router picks the best available model for each request, fails over to the next provider when one is rate-limited, and tracks per-key usage so you stay under every free-tier cap.
[![CI](https://github.com/tashfeenahmed/freellmapi/actions/workflows/ci.yml/badge.svg)](https://github.com/tashfeenahmed/freellmapi/actions/workflows/ci.yml)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](./LICENSE)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](#contributing)
![Fallback chain with per-provider token budget](repo-assets/fallback-chain.png)
</div>
---
## Contents
- [Why this exists](#why-this-exists)
- [Supported providers](#supported-providers)
- [Features](#features)
- [Not yet supported](#not-yet-supported)
- [Quick start](#quick-start)
- [Using the API](#using-the-api)
- [Screenshots](#screenshots)
- [How it works](#how-it-works)
- [Limitations](#limitations)
- [Contributing](#contributing)
- [Terms of Service review](#terms-of-service-review)
- [Disclaimer](#disclaimer)
## Why this exists
Every serious AI lab now offers a free tier — a few million tokens a month, a few thousand requests a day. On its own each tier is a toy. Stacked together, they add up to roughly **800 million tokens per month** of working inference capacity, across dozens of models from small-and-fast to reasonably capable.
The problem is that stacking them by hand is painful: fourteen different SDKs, fourteen different rate limits, fourteen places a request can fail. FreeLLMAPI collapses that into one OpenAI-compatible endpoint. Point any OpenAI client library at your local server, and it routes transparently across whichever providers you've added keys for.
## Supported providers
<table>
<tr>
<td align="center" width="180"><a href="https://ai.google.dev"><b>Google</b><br/>Gemini 2.5 Pro / Flash</a></td>
<td align="center" width="180"><a href="https://groq.com"><b>Groq</b><br/>Llama 4, Qwen, Kimi</a></td>
<td align="center" width="180"><a href="https://cerebras.ai"><b>Cerebras</b><br/>Llama 3.3, Qwen</a></td>
<td align="center" width="180"><a href="https://cloud.sambanova.ai"><b>SambaNova</b><br/>Llama 3.3 70B</a></td>
</tr>
<tr>
<td align="center"><a href="https://build.nvidia.com"><b>NVIDIA</b><br/>NIM catalog</a></td>
<td align="center"><a href="https://mistral.ai"><b>Mistral</b><br/>La Plateforme</a></td>
<td align="center"><a href="https://openrouter.ai"><b>OpenRouter</b><br/>Free-tier models</a></td>
<td align="center"><a href="https://github.com/marketplace/models"><b>GitHub Models</b><br/>GPT-4o, Llama, Phi</a></td>
</tr>
<tr>
<td align="center"><a href="https://huggingface.co"><b>Hugging Face</b><br/>Inference Providers</a></td>
<td align="center"><a href="https://cohere.com"><b>Cohere</b><br/>Command R+ (trial)</a></td>
<td align="center"><a href="https://developers.cloudflare.com/workers-ai"><b>Cloudflare</b><br/>Workers AI</a></td>
<td align="center"><a href="https://bigmodel.cn"><b>Zhipu</b><br/>GLM-4 series</a></td>
</tr>
<tr>
<td align="center"><a href="https://platform.moonshot.cn"><b>Moonshot</b><br/>Kimi</a></td>
<td align="center"><a href="https://platform.minimax.io"><b>MiniMax</b><br/>abab / hailuo</a></td>
<td align="center" colspan="2"><i>Adding another? See <a href="#contributing">Contributing</a>.</i></td>
</tr>
</table>
## Features
- **OpenAI-compatible** — `POST /v1/chat/completions` and `GET /v1/models` work with the official OpenAI SDKs and any OpenAI-compatible client (LangChain, LlamaIndex, Continue, Hermes, etc.). Just change `base_url`.
- **Streaming and non-streaming** — Server-Sent Events for `stream: true`, JSON response otherwise. Every provider adapter implements both.
- **Automatic failover** — If the chosen provider returns a 429 or 5xx, or times out, the router skips it, puts the key on a short cooldown, and retries the next model in your fallback chain (up to 20 attempts).
- **Per-key rate tracking** — RPM, RPD, TPM, and TPD counters per `(platform, model, key)` so the router always picks a key that's under its caps.
- **Sticky sessions** — Multi-turn conversations keep talking to the same model for 30 minutes to avoid the hallucination spike that comes from mid-conversation model switches.
- **Encrypted key storage** — API keys are encrypted with AES-256-GCM before hitting SQLite; decryption happens in-memory just before a request.
- **Unified API key** — Clients authenticate to your proxy with a single `freellmapi-…` bearer token. You never expose upstream provider keys to your apps.
- **Health checks** — Periodic probes mark keys as `healthy`, `rate_limited`, `invalid`, or `error` so the router skips dead ones automatically.
- **Admin dashboard** — React + Vite UI to manage keys, reorder the fallback chain, inspect analytics, and run prompts in a playground. Dark mode included.
- **Analytics** — Per-request logging with latency, token counts, success rate, and per-provider breakdowns.
- **Deploys to a Raspberry Pi** — Runs happily on a Pi 4 under PM2 behind nginx. ~40 MB RSS at idle.
## Not yet supported
The scope is deliberately narrow. If a feature isn't in the Features list above, assume it isn't there yet. Known gaps:
- **Embeddings** (`/v1/embeddings`)
- **Image generation** (`/v1/images/*`)
- **Audio / speech** (`/v1/audio/*`)
- **Function / tool calling** — the request schema doesn't pass `tools` through yet
- **Vision / multimodal inputs** — message content is text-only
- **Legacy completions** (`/v1/completions`) — only the chat endpoint is implemented
- **Moderation** (`/v1/moderations`)
- **`n > 1`** (multiple completions per request)
- **Per-user billing / multi-tenant auth** — single-user by design
PRs that add any of these are very welcome. See [Contributing](#contributing).
## Quick start
**Prerequisites:** Node.js 20+, npm.
```bash
git clone https://github.com/tashfeenahmed/freellmapi.git
cd freellmapi
npm install
# Generate an encryption key for at-rest key storage
cp .env.example .env
echo "ENCRYPTION_KEY=$(node -e "console.log(require('crypto').randomBytes(32).toString('hex'))")" >> .env
# Start server + dashboard together
npm run dev
```
Open http://localhost:5173 (the Vite dev UI), add your provider keys on the **Keys** page, reorder the **Fallback Chain** to taste, and grab your unified API key from the **Keys** page header. That unified key is what you point your OpenAI SDK at.
For a production build:
```bash
npm run build
node server/dist/index.js # server + dashboard both served on :3001
```
## Using the API
Any OpenAI-compatible client works. Examples:
**Python**
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3001/v1",
    api_key="freellmapi-your-unified-key",
)

resp = client.chat.completions.create(
    model="auto",  # let the router pick; or specify e.g. "gemini-2.5-flash"
    messages=[{"role": "user", "content": "Summarise the fall of Rome in one sentence."}],
)
print(resp.choices[0].message.content)

# The parsed response object has no .headers; use the SDK's raw-response
# wrapper to read routing headers like X-Routed-Via:
raw = client.chat.completions.with_raw_response.create(
    model="auto",
    messages=[{"role": "user", "content": "hi"}],
)
print("Routed via:", raw.headers.get("x-routed-via"))
```
**curl**
```bash
curl http://localhost:3001/v1/chat/completions \
  -H "Authorization: Bearer freellmapi-your-unified-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "hi"}]
  }'
```
**Streaming**
```python
stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Stream me a haiku about SQLite."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```
Every response carries an `X-Routed-Via: <platform>/<model>` header so you can see which provider actually served each call. If a request failed over between providers, you'll also see `X-Fallback-Attempts: N`.
## Screenshots
### Keys
Manage provider credentials and grab the unified API key your apps connect with. Each key shows a status dot and when it was last health-checked.
![Keys page](repo-assets/keys.png)
### Playground
Send a chat completion through the router and see which provider served it, with the model ID and latency printed right on the message.
![Playground page](repo-assets/playground.png)
### Analytics
Request volume, success rate, tokens in and out, average latency, and per-provider breakdowns over 24h / 7d / 30d windows.
![Analytics page](repo-assets/analytics.png)
## How it works
```
┌──────────────────┐ Bearer freellmapi-… ┌─────────────────────────┐
│ OpenAI SDK / │ ──────────────────────▶ │ Express proxy (:3001) │
│ curl / any │ ◀────────────────────── │ /v1/chat/completions │
│ OpenAI client │ streamed tokens └────────────┬────────────┘
└──────────────────┘ │
┌────────────────────────────────────────────────┐
│ Router │
│ 1. Pick highest-priority model that │
│ (a) has a healthy key and │
│ (b) is under all its rate limits. │
│ 2. Decrypt key, call provider SDK. │
│ 3. On 429/5xx → cooldown + retry next model. │
└────────────────────────────────────────────────┘
┌──────────────┬────────────┬──────────┴─────────┬─────────────┬──────────┐
▼ ▼ ▼ ▼ ▼ ▼
Google Groq Cerebras OpenRouter HF …10 more
```
- **Router** (`server/src/services/router.ts`) — picks a model per request.
- **Rate-limit ledger** (`server/src/services/ratelimit.ts`) — in-memory RPM/RPD/TPM/TPD counters backed by SQLite, with cooldowns on 429s.
- **Provider adapters** (`server/src/providers/*.ts`) — one file per provider, implementing the `Provider` base class: `chatCompletion()` and `streamChatCompletion()`.
- **Health service** (`server/src/services/health.ts`) — periodic probe keeps key status fresh.
- **Dashboard** (`client/`) — React + Vite + shadcn/ui admin surface.
- **Storage** — SQLite (`better-sqlite3`) with AES-256-GCM envelope encryption for keys.
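The failover logic in step 3 boils down to a loop like the following. This is a simplified illustration, not the actual `router.ts` — `routeChat`, `callProvider`, and the cooldown constants are hypothetical stand-ins:

```javascript
// Simplified sketch of the router's failover loop. The chain entries,
// callProvider, and cooldown policy are illustrative assumptions.
const MAX_ATTEMPTS = 20;
const COOLDOWN_MS = 60_000;
const coolingDown = new Map(); // model id -> timestamp when usable again

async function routeChat(chain, request, callProvider) {
  let attempts = 0;
  for (const model of chain) {
    if (attempts >= MAX_ATTEMPTS) break;
    if ((coolingDown.get(model.id) ?? 0) > Date.now()) continue; // still cooling
    attempts++;
    try {
      const result = await callProvider(model, request);
      return { ...result, routedVia: `${model.platform}/${model.id}`, attempts };
    } catch (err) {
      if (err.status === 429 || err.status >= 500) {
        // Transient upstream failure: cool this model down, try the next one.
        coolingDown.set(model.id, Date.now() + COOLDOWN_MS);
        continue;
      }
      throw err; // other 4xx: the request itself is bad, don't retry
    }
  }
  throw new Error('All providers exhausted');
}
```

Note the asymmetry: 429/5xx advance down the chain, while a 400-class error surfaces immediately, since retrying a malformed request on another provider would just burn quota.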
## Limitations
Stacking free tiers has real trade-offs. Be honest with yourself about them:
- **No frontier models.** The free-tier catalog tops out around Llama 3.3 70B, GLM-4.5, Qwen 3 Coder, and Gemini 2.5 Pro. You will not get GPT-5 or Claude Opus class reasoning through this. For hard problems, pay for a real API.
- **Intelligence degrades as the day progresses.** Your top-ranked models (usually Gemini 2.5 Pro, GPT-4o via GitHub Models) have the lowest daily caps. Once they hit their limits, the router falls down your priority chain to smaller/weaker models. Expect the effective intelligence of the endpoint to drop in the late hours of each day — then reset at UTC midnight.
- **Latency is highly variable.** Cerebras and Groq are extremely fast; others are not. You get whichever one is available.
- **Free tiers can change without notice.** Providers regularly tighten, loosen, or remove free tiers. When that happens you'll see 429s or auth errors until you update the catalog. Re-seed scripts live in `server/src/scripts/`.
- **No SLA, by definition.** If you need reliability, use a paid provider with a contract.
- **Local-first.** There's no multi-tenant auth. Run this for yourself; don't expose it to the internet.
## Contributing
Contributors very welcome! Good first PRs:
- **Add a provider** — copy `server/src/providers/openai-compat.ts` as a template, wire it into `server/src/providers/index.ts`, seed its models in `server/src/db/index.ts`, add a test in `server/src/__tests__/providers/`.
- **Add an endpoint** — embeddings, images, moderations. The provider base class can grow new methods; adapters declare which they support.
- **Improve the router** — cost-aware routing (cheapest-healthy-fastest tradeoffs), better latency-weighted priority, regional pinning.
- **Dashboard polish** — charts on the Analytics page, key rotation UX, batch import of keys from `.env`.
- **Docs** — more examples, client library snippets for Go/Rust/etc., a deployment recipe for Docker or Fly.
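For orientation, a new provider adapter boils down to the two methods named above. A minimal hypothetical shape — the real `Provider` base class in `server/src/providers/` will differ in its details:

```javascript
// Minimal illustrative shape of a provider adapter. The class names, fields,
// and upstream URL are assumptions, not the project's actual API.
class Provider {
  constructor(name) { this.name = name; }
  async chatCompletion(request, apiKey) { throw new Error('not implemented'); }
  async *streamChatCompletion(request, apiKey) { throw new Error('not implemented'); }
}

// Hypothetical adapter for an OpenAI-compatible upstream.
class ExampleProvider extends Provider {
  constructor() {
    super('example');
    this.baseUrl = 'https://api.example.com/v1'; // placeholder upstream
  }

  async chatCompletion(request, apiKey) {
    const res = await fetch(`${this.baseUrl}/chat/completions`, {
      method: 'POST',
      headers: { Authorization: `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ ...request, stream: false }),
    });
    if (!res.ok) {
      const err = new Error(`upstream ${res.status}`);
      err.status = res.status; // lets the router decide whether to fail over
      throw err;
    }
    return res.json();
  }

  async *streamChatCompletion(request, apiKey) {
    // A real adapter would parse the upstream SSE stream and yield deltas;
    // elided in this sketch.
  }
}
```

Attaching the HTTP status to the thrown error is what lets the router distinguish retryable failures (429/5xx) from requests that shouldn't be replayed.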
**Development loop:**
```bash
npm install
npm run dev # server on :3001, dashboard on :5173, both with HMR
npm test # vitest — 69 tests across providers, routes, router, ratelimit
```
PRs should include a test, keep the existing test suite green, and match the `.editorconfig` / tsconfig defaults already in the repo. Issues and discussions are open.
## Terms of Service review
A self-hosted, single-user, personal-use setup was reviewed against each provider's ToS (April 2026). Summary:
| Provider | Verdict | Notes |
|---|---|---|
| Google Gemini | ✅ Likely OK | No adverse clause; proxy for personal use not prohibited. |
| Groq | ✅ Likely OK | Explicitly permits integrating into a "Customer Application." |
| Cerebras | ✅ Likely OK | Permitted; don't resell keys. |
| Mistral | ✅ Likely OK | APIs allowed for personal/internal business use. |
| OpenRouter | ✅ Likely OK | Private-use only; don't expose the proxy publicly. |
| Hugging Face | ✅ Likely OK | BYO-key proxying is the documented pattern. |
| Zhipu | ✅ Likely OK | Explicit "personal, non-commercial research" carve-out. |
| Moonshot / Kimi | ✅ Likely OK | Competitive-products clause is broad but not aimed at single-user proxies. |
| SambaNova | ⚠️ Ambiguous | Public terms are silent on APIs. |
| MiniMax | ⚠️ Ambiguous | Public terms silent. |
| Cloudflare Workers AI | ⚠️ Ambiguous | No adverse clause found. |
| NVIDIA NIM | ⚠️ Caution | Free tier is "evaluation only, not production." |
| GitHub Models | ⚠️ Caution | Free tier scoped to "experimentation." |
| Cohere | ❌ Avoid | Trial ToS §14 explicitly forbids personal/household use. |
Rules of thumb that keep most providers happy: **one account per provider**, **no reselling**, **no sharing your endpoint with other humans**, **don't hammer a free tier as a paid production backend**. This is informational, not legal advice — read each provider's ToS and make your own call.
## Disclaimer
**This project is for personal experimentation and learning, not production.** Free tiers exist so developers can prototype against them; they aren't a stable, supported inference substrate and shouldn't be treated as one. If you build something real on top of FreeLLMAPI, swap in a paid API before you ship. Your relationship with each upstream provider is governed by the terms you accepted when you created your account — those terms still apply when the traffic is proxied through this project, and you're responsible for complying with them.
## License
[MIT](./LICENSE)
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*
node_modules
dist
dist-ssr
*.local
# Editor directories and files
.vscode/*
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?
# React + TypeScript + Vite
This template provides a minimal setup to get React working in Vite with HMR and some ESLint rules.
Currently, two official plugins are available:
- [@vitejs/plugin-react](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react) uses [Oxc](https://oxc.rs)
- [@vitejs/plugin-react-swc](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react-swc) uses [SWC](https://swc.rs/)
## React Compiler
The React Compiler is not enabled on this template because of its impact on dev & build performance. To add it, see [this documentation](https://react.dev/learn/react-compiler/installation).
## Expanding the ESLint configuration
If you are developing a production application, we recommend updating the configuration to enable type-aware lint rules:
```js
import tseslint from 'typescript-eslint'
import { defineConfig, globalIgnores } from 'eslint/config'

export default defineConfig([
globalIgnores(['dist']),
{
files: ['**/*.{ts,tsx}'],
extends: [
// Other configs...
// Remove tseslint.configs.recommended and replace with this
tseslint.configs.recommendedTypeChecked,
// Alternatively, use this for stricter rules
tseslint.configs.strictTypeChecked,
// Optionally, add this for stylistic rules
tseslint.configs.stylisticTypeChecked,
// Other configs...
],
languageOptions: {
parserOptions: {
project: ['./tsconfig.node.json', './tsconfig.app.json'],
tsconfigRootDir: import.meta.dirname,
},
// other options...
},
},
])
```
You can also install [eslint-plugin-react-x](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-x) and [eslint-plugin-react-dom](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-dom) for React-specific lint rules:
```js
// eslint.config.js
import reactX from 'eslint-plugin-react-x'
import reactDom from 'eslint-plugin-react-dom'
import { defineConfig, globalIgnores } from 'eslint/config'
export default defineConfig([
globalIgnores(['dist']),
{
files: ['**/*.{ts,tsx}'],
extends: [
// Other configs...
// Enable lint rules for React
reactX.configs['recommended-typescript'],
// Enable lint rules for React DOM
reactDom.configs.recommended,
],
languageOptions: {
parserOptions: {
project: ['./tsconfig.node.json', './tsconfig.app.json'],
tsconfigRootDir: import.meta.dirname,
},
// other options...
},
},
])
```
{
"$schema": "https://ui.shadcn.com/schema.json",
"style": "base-nova",
"rsc": false,
"tsx": true,
"tailwind": {
"config": "",
"css": "src/index.css",
"baseColor": "neutral",
"cssVariables": true,
"prefix": ""
},
"iconLibrary": "lucide",
"rtl": false,
"aliases": {
"components": "@/components",
"utils": "@/lib/utils",
"ui": "@/components/ui",
"lib": "@/lib",
"hooks": "@/hooks"
},
"menuColor": "default",
"menuAccent": "subtle",
"registries": {}
}
import js from '@eslint/js'
import globals from 'globals'
import reactHooks from 'eslint-plugin-react-hooks'
import reactRefresh from 'eslint-plugin-react-refresh'
import tseslint from 'typescript-eslint'
import { defineConfig, globalIgnores } from 'eslint/config'
export default defineConfig([
globalIgnores(['dist']),
{
files: ['**/*.{ts,tsx}'],
extends: [
js.configs.recommended,
tseslint.configs.recommended,
reactHooks.configs.flat.recommended,
reactRefresh.configs.vite,
],
languageOptions: {
ecmaVersion: 2020,
globals: globals.browser,
},
},
])
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/favicon.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>FreeLLMAPI · Unified LLM Router</title>
</head>
<body>
<div id="root"></div>
<script type="module" src="/src/main.tsx"></script>
</body>
</html>
{
"name": "@freellmapi/client",
"private": true,
"version": "0.0.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "tsc -b && vite build",
"lint": "eslint .",
"preview": "vite preview"
},
"dependencies": {
"@base-ui/react": "^1.3.0",
"@dnd-kit/core": "^6.3.1",
"@dnd-kit/sortable": "^10.0.0",
"@dnd-kit/utilities": "^3.2.2",
"@fontsource-variable/geist": "^5.2.8",
"@fontsource-variable/geist-mono": "^5.2.7",
"@tailwindcss/vite": "^4.2.2",
"@tanstack/react-query": "^5.97.0",
"class-variance-authority": "^0.7.1",
"clsx": "^2.1.1",
"lucide-react": "^1.8.0",
"react": "^19.2.4",
"react-dom": "^19.2.4",
"react-router-dom": "^7.14.0",
"recharts": "^3.8.1",
"shadcn": "^4.2.0",
"tailwind-merge": "^3.5.0",
"tailwindcss": "^4.2.2",
"tw-animate-css": "^1.4.0"
},
"devDependencies": {
"@eslint/js": "^9.39.4",
"@types/node": "^24.12.2",
"@types/react": "^19.2.14",
"@types/react-dom": "^19.2.3",
"@vitejs/plugin-react": "^6.0.1",
"eslint": "^9.39.4",
"eslint-plugin-react-hooks": "^7.0.1",
"eslint-plugin-react-refresh": "^0.5.2",
"globals": "^17.4.0",
"typescript": "~6.0.2",
"typescript-eslint": "^8.58.0",
"vite": "^8.0.4"
}
}
<svg xmlns="http://www.w3.org/2000/svg" width="48" height="46" fill="none" viewBox="0 0 48 46"><path fill="#863bff" d="M25.946 44.938c-.664.845-2.021.375-2.021-.698V33.937a2.26 2.26 0 0 0-2.262-2.262H10.287c-.92 0-1.456-1.04-.92-1.788l7.48-10.471c1.07-1.497 0-3.578-1.842-3.578H1.237c-.92 0-1.456-1.04-.92-1.788L10.013.474c.214-.297.556-.474.92-.474h28.894c.92 0 1.456 1.04.92 1.788l-7.48 10.471c-1.07 1.498 0 3.579 1.842 3.579h11.377c.943 0 1.473 1.088.89 1.83L25.947 44.94z" style="fill:#863bff;fill:color(display-p3 .5252 .23 1);fill-opacity:1"/><mask id="a" width="48" height="46" x="0" y="0" maskUnits="userSpaceOnUse" style="mask-type:alpha"><path fill="#000" d="M25.842 44.938c-.664.844-2.021.375-2.021-.698V33.937a2.26 2.26 0 0 0-2.262-2.262H10.183c-.92 0-1.456-1.04-.92-1.788l7.48-10.471c1.07-1.498 0-3.579-1.842-3.579H1.133c-.92 0-1.456-1.04-.92-1.787L9.91.473c.214-.297.556-.474.92-.474h28.894c.92 0 1.456 1.04.92 1.788l-7.48 10.471c-1.07 1.498 0 3.578 1.842 3.578h11.377c.943 0 1.473 1.088.89 1.832L25.843 44.94z" style="fill:#000;fill-opacity:1"/></mask><g mask="url(#a)"><g filter="url(#b)"><ellipse cx="5.508" cy="14.704" fill="#ede6ff" rx="5.508" ry="14.704" style="fill:#ede6ff;fill:color(display-p3 .9275 .9033 1);fill-opacity:1" transform="matrix(.00324 1 1 -.00324 -4.47 31.516)"/></g><g filter="url(#c)"><ellipse cx="10.399" cy="29.851" fill="#ede6ff" rx="10.399" ry="29.851" style="fill:#ede6ff;fill:color(display-p3 .9275 .9033 1);fill-opacity:1" transform="matrix(.00324 1 1 -.00324 -39.328 7.883)"/></g><g filter="url(#d)"><ellipse cx="5.508" cy="30.487" fill="#7e14ff" rx="5.508" ry="30.487" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(89.814 -25.913 -14.639)scale(1 -1)"/></g><g filter="url(#e)"><ellipse cx="5.508" cy="30.599" fill="#7e14ff" rx="5.508" ry="30.599" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(89.814 -32.644 -3.334)scale(1 -1)"/></g><g filter="url(#f)"><ellipse 
cx="5.508" cy="30.599" fill="#7e14ff" rx="5.508" ry="30.599" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="matrix(.00324 1 1 -.00324 -34.34 30.47)"/></g><g filter="url(#g)"><ellipse cx="14.072" cy="22.078" fill="#ede6ff" rx="14.072" ry="22.078" style="fill:#ede6ff;fill:color(display-p3 .9275 .9033 1);fill-opacity:1" transform="rotate(93.35 24.506 48.493)scale(-1 1)"/></g><g filter="url(#h)"><ellipse cx="3.47" cy="21.501" fill="#7e14ff" rx="3.47" ry="21.501" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(89.009 28.708 47.59)scale(-1 1)"/></g><g filter="url(#i)"><ellipse cx="3.47" cy="21.501" fill="#7e14ff" rx="3.47" ry="21.501" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(89.009 28.708 47.59)scale(-1 1)"/></g><g filter="url(#j)"><ellipse cx=".387" cy="8.972" fill="#7e14ff" rx="4.407" ry="29.108" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(39.51 .387 8.972)"/></g><g filter="url(#k)"><ellipse cx="47.523" cy="-6.092" fill="#7e14ff" rx="4.407" ry="29.108" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(37.892 47.523 -6.092)"/></g><g filter="url(#l)"><ellipse cx="41.412" cy="6.333" fill="#47bfff" rx="5.971" ry="9.665" style="fill:#47bfff;fill:color(display-p3 .2799 .748 1);fill-opacity:1" transform="rotate(37.892 41.412 6.333)"/></g><g filter="url(#m)"><ellipse cx="-1.879" cy="38.332" fill="#7e14ff" rx="4.407" ry="29.108" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(37.892 -1.88 38.332)"/></g><g filter="url(#n)"><ellipse cx="-1.879" cy="38.332" fill="#7e14ff" rx="4.407" ry="29.108" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(37.892 -1.88 38.332)"/></g><g filter="url(#o)"><ellipse cx="35.651" cy="29.907" fill="#7e14ff" rx="4.407" ry="29.108" 
style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(37.892 35.651 29.907)"/></g><g filter="url(#p)"><ellipse cx="38.418" cy="32.4" fill="#47bfff" rx="5.971" ry="15.297" style="fill:#47bfff;fill:color(display-p3 .2799 .748 1);fill-opacity:1" transform="rotate(37.892 38.418 32.4)"/></g></g><defs><filter id="b" width="60.045" height="41.654" x="-19.77" y="16.149" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="7.659"/></filter><filter id="c" width="90.34" height="51.437" x="-54.613" y="-7.533" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="7.659"/></filter><filter id="d" width="79.355" height="29.4" x="-49.64" y="2.03" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="e" width="79.579" height="29.4" x="-45.045" y="20.029" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="f" width="79.579" height="29.4" x="-43.513" y="21.178" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur 
result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="g" width="74.749" height="58.852" x="15.756" y="-17.901" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="7.659"/></filter><filter id="h" width="61.377" height="25.362" x="23.548" y="2.284" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="i" width="61.377" height="25.362" x="23.548" y="2.284" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="j" width="56.045" height="63.649" x="-27.636" y="-22.853" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="k" width="54.814" height="64.646" x="20.116" y="-38.415" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="l" width="33.541" height="35.313" x="24.641" y="-11.323" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" 
result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="m" width="54.814" height="64.646" x="-29.286" y="6.009" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="n" width="54.814" height="64.646" x="-29.286" y="6.009" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="o" width="54.814" height="64.646" x="8.244" y="-2.416" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="p" width="39.409" height="43.623" x="18.713" y="10.588" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter></defs></svg>
<svg xmlns="http://www.w3.org/2000/svg">
<symbol id="bluesky-icon" viewBox="0 0 16 17">
<g clip-path="url(#bluesky-clip)"><path fill="#08060d" d="M7.75 7.735c-.693-1.348-2.58-3.86-4.334-5.097-1.68-1.187-2.32-.981-2.74-.79C.188 2.065.1 2.812.1 3.251s.241 3.602.398 4.13c.52 1.744 2.367 2.333 4.07 2.145-2.495.37-4.71 1.278-1.805 4.512 3.196 3.309 4.38-.71 4.987-2.746.608 2.036 1.307 5.91 4.93 2.746 2.72-2.746.747-4.143-1.747-4.512 1.702.189 3.55-.4 4.07-2.145.156-.528.397-3.691.397-4.13s-.088-1.186-.575-1.406c-.42-.19-1.06-.395-2.741.79-1.755 1.24-3.64 3.752-4.334 5.099"/></g>
<defs><clipPath id="bluesky-clip"><path fill="#fff" d="M.1.85h15.3v15.3H.1z"/></clipPath></defs>
</symbol>
<symbol id="discord-icon" viewBox="0 0 20 19">
<path fill="#08060d" d="M16.224 3.768a14.5 14.5 0 0 0-3.67-1.153c-.158.286-.343.67-.47.976a13.5 13.5 0 0 0-4.067 0c-.128-.306-.317-.69-.476-.976A14.4 14.4 0 0 0 3.868 3.77C1.546 7.28.916 10.703 1.231 14.077a14.7 14.7 0 0 0 4.5 2.306q.545-.748.965-1.587a9.5 9.5 0 0 1-1.518-.74q.191-.14.372-.293c2.927 1.369 6.107 1.369 8.999 0q.183.152.372.294-.723.437-1.52.74.418.838.963 1.588a14.6 14.6 0 0 0 4.504-2.308c.37-3.911-.63-7.302-2.644-10.309m-9.13 8.234c-.878 0-1.599-.82-1.599-1.82 0-.998.705-1.82 1.6-1.82.894 0 1.614.82 1.599 1.82.001 1-.705 1.82-1.6 1.82m5.91 0c-.878 0-1.599-.82-1.599-1.82 0-.998.705-1.82 1.6-1.82.893 0 1.614.82 1.599 1.82 0 1-.706 1.82-1.6 1.82"/>
</symbol>
<symbol id="documentation-icon" viewBox="0 0 21 20">
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="m15.5 13.333 1.533 1.322c.645.555.967.833.967 1.178s-.322.623-.967 1.179L15.5 18.333m-3.333-5-1.534 1.322c-.644.555-.966.833-.966 1.178s.322.623.966 1.179l1.534 1.321"/>
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M17.167 10.836v-4.32c0-1.41 0-2.117-.224-2.68-.359-.906-1.118-1.621-2.08-1.96-.599-.21-1.349-.21-2.848-.21-2.623 0-3.935 0-4.983.369-1.684.591-3.013 1.842-3.641 3.428C3 6.449 3 7.684 3 10.154v2.122c0 2.558 0 3.838.706 4.726q.306.383.713.671c.76.536 1.79.64 3.581.66"/>
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M3 10a2.78 2.78 0 0 1 2.778-2.778c.555 0 1.209.097 1.748-.047.48-.129.854-.503.982-.982.145-.54.048-1.194.048-1.749a2.78 2.78 0 0 1 2.777-2.777"/>
</symbol>
<symbol id="github-icon" viewBox="0 0 19 19">
<path fill="#08060d" fill-rule="evenodd" d="M9.356 1.85C5.05 1.85 1.57 5.356 1.57 9.694a7.84 7.84 0 0 0 5.324 7.44c.387.079.528-.168.528-.376 0-.182-.013-.805-.013-1.454-2.165.467-2.616-.935-2.616-.935-.349-.91-.864-1.143-.864-1.143-.71-.48.051-.48.051-.48.787.051 1.2.805 1.2.805.695 1.194 1.817.857 2.268.649.064-.507.27-.857.49-1.052-1.728-.182-3.545-.857-3.545-3.87 0-.857.31-1.558.8-2.104-.078-.195-.349-1 .077-2.078 0 0 .657-.208 2.14.805a7.5 7.5 0 0 1 1.946-.26c.657 0 1.328.092 1.946.26 1.483-1.013 2.14-.805 2.14-.805.426 1.078.155 1.883.078 2.078.502.546.799 1.247.799 2.104 0 3.013-1.818 3.675-3.558 3.87.284.247.528.714.528 1.454 0 1.052-.012 1.896-.012 2.156 0 .208.142.455.528.377a7.84 7.84 0 0 0 5.324-7.441c.013-4.338-3.48-7.844-7.773-7.844" clip-rule="evenodd"/>
</symbol>
<symbol id="social-icon" viewBox="0 0 20 20">
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M12.5 6.667a4.167 4.167 0 1 0-8.334 0 4.167 4.167 0 0 0 8.334 0"/>
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M2.5 16.667a5.833 5.833 0 0 1 8.75-5.053m3.837.474.513 1.035c.07.144.257.282.414.309l.93.155c.596.1.736.536.307.965l-.723.73a.64.64 0 0 0-.152.531l.207.903c.164.715-.213.991-.84.618l-.872-.52a.63.63 0 0 0-.577 0l-.872.52c-.624.373-1.003.094-.84-.618l.207-.903a.64.64 0 0 0-.152-.532l-.723-.729c-.426-.43-.289-.864.306-.964l.93-.156a.64.64 0 0 0 .412-.31l.513-1.034c.28-.562.735-.562 1.012 0"/>
</symbol>
<symbol id="x-icon" viewBox="0 0 19 19">
<path fill="#08060d" fill-rule="evenodd" d="M1.893 1.98c.052.072 1.245 1.769 2.653 3.77l2.892 4.114c.183.261.333.48.333.486s-.068.089-.152.183l-.522.593-.765.867-3.597 4.087c-.375.426-.734.834-.798.905a1 1 0 0 0-.118.148c0 .01.236.017.664.017h.663l.729-.83c.4-.457.796-.906.879-.999a692 692 0 0 0 1.794-2.038c.034-.037.301-.34.594-.675l.551-.624.345-.392a7 7 0 0 1 .34-.374c.006 0 .93 1.306 2.052 2.903l2.084 2.965.045.063h2.275c1.87 0 2.273-.003 2.266-.021-.008-.02-1.098-1.572-3.894-5.547-2.013-2.862-2.28-3.246-2.273-3.266.008-.019.282-.332 2.085-2.38l2-2.274 1.567-1.782c.022-.028-.016-.03-.65-.03h-.674l-.3.342a871 871 0 0 1-1.782 2.025c-.067.075-.405.458-.75.852a100 100 0 0 1-.803.91c-.148.172-.299.344-.99 1.127-.304.343-.32.358-.345.327-.015-.019-.904-1.282-1.976-2.808L6.365 1.85H1.8zm1.782.91 8.078 11.294c.772 1.08 1.413 1.973 1.425 1.984.016.017.241.02 1.05.017l1.03-.004-2.694-3.766L7.796 5.75 5.722 2.852l-1.039-.004-1.039-.004z" clip-rule="evenodd"/>
</symbol>
</svg>
import { useEffect, useState, type ReactNode } from 'react'
import { BrowserRouter, Routes, Route, Navigate, NavLink } from 'react-router-dom'
import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
import { Button } from '@/components/ui/button'
import KeysPage from '@/pages/KeysPage'
import PlaygroundPage from '@/pages/PlaygroundPage'
import FallbackPage from '@/pages/FallbackPage'
import AnalyticsPage from '@/pages/AnalyticsPage'
const queryClient = new QueryClient()
function NavItem({ to, children }: { to: string; children: ReactNode }) {
return (
<NavLink
to={to}
className={({ isActive }) =>
`relative text-sm px-1 py-4 transition-colors ${
isActive
? 'text-foreground after:absolute after:inset-x-0 after:-bottom-px after:h-px after:bg-foreground'
: 'text-muted-foreground hover:text-foreground'
}`
}
>
{children}
</NavLink>
)
}
// Light/dark toggle: keeps the `dark` class on <html> in sync with the stored
// preference (localStorage `theme`), falling back to the OS color scheme on
// first visit.
function DarkModeToggle() {
const [dark, setDark] = useState(() =>
typeof window !== 'undefined' && document.documentElement.classList.contains('dark')
)
useEffect(() => {
const stored = localStorage.getItem('theme')
if (stored === 'dark' || (!stored && window.matchMedia('(prefers-color-scheme: dark)').matches)) {
document.documentElement.classList.add('dark')
setDark(true)
}
}, [])
function toggle() {
const next = !dark
setDark(next)
document.documentElement.classList.toggle('dark', next)
localStorage.setItem('theme', next ? 'dark' : 'light')
}
return (
<Button variant="ghost" size="sm" onClick={toggle} aria-label="Toggle theme">
{dark ? (
<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><circle cx="12" cy="12" r="4"/><path d="M12 2v2"/><path d="M12 20v2"/><path d="m4.93 4.93 1.41 1.41"/><path d="m17.66 17.66 1.41 1.41"/><path d="M2 12h2"/><path d="M20 12h2"/><path d="m6.34 17.66-1.41 1.41"/><path d="m19.07 4.93-1.41 1.41"/></svg>
) : (
<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M12 3a6 6 0 0 0 9 9 9 9 0 1 1-9-9Z"/></svg>
)}
</Button>
)
}
function Brand() {
return (
<div className="flex items-center gap-2">
<span className="inline-block size-2 rounded-full bg-foreground" />
<span className="font-semibold tracking-tight text-sm">FreeLLMAPI</span>
</div>
)
}
function App() {
return (
<QueryClientProvider client={queryClient}>
<BrowserRouter basename={import.meta.env.BASE_URL}>
<div className="min-h-screen bg-background">
<header className="sticky top-0 z-40 bg-background/80 backdrop-blur border-b">
<div className="max-w-6xl mx-auto px-6 flex items-center">
<Brand />
<nav className="flex items-center gap-6 ml-10">
<NavItem to="/playground">Playground</NavItem>
<NavItem to="/keys">Keys</NavItem>
<NavItem to="/fallback">Fallback</NavItem>
<NavItem to="/analytics">Analytics</NavItem>
</nav>
<div className="ml-auto py-2">
<DarkModeToggle />
</div>
</div>
</header>
<main className="max-w-6xl mx-auto px-6 py-8">
<Routes>
<Route path="/" element={<Navigate to="/playground" replace />} />
<Route path="/playground" element={<PlaygroundPage />} />
<Route path="/keys" element={<KeysPage />} />
<Route path="/fallback" element={<FallbackPage />} />
<Route path="/analytics" element={<AnalyticsPage />} />
<Route path="/test" element={<Navigate to="/playground" replace />} />
<Route path="/health" element={<Navigate to="/keys" replace />} />
</Routes>
</main>
</div>
</BrowserRouter>
</QueryClientProvider>
)
}
export default App
<svg xmlns="http://www.w3.org/2000/svg" width="77" height="47" fill="none" aria-labelledby="vite-logo-title" viewBox="0 0 77 47"><title id="vite-logo-title">Vite</title><style>.parenthesis{fill:#000}@media (prefers-color-scheme:dark){.parenthesis{fill:#fff}}</style><path fill="#9135ff" d="M40.151 45.71c-.663.844-2.02.374-2.02-.699V34.708a2.26 2.26 0 0 0-2.262-2.262H24.493c-.92 0-1.457-1.04-.92-1.788l7.479-10.471c1.07-1.498 0-3.578-1.842-3.578H15.443c-.92 0-1.456-1.04-.92-1.788l9.696-13.576c.213-.297.556-.474.92-.474h28.894c.92 0 1.456 1.04.92 1.788l-7.48 10.472c-1.07 1.497 0 3.578 1.842 3.578h11.376c.944 0 1.474 1.087.89 1.83L40.153 45.712z"/><mask id="a" width="48" height="47" x="14" y="0" maskUnits="userSpaceOnUse" style="mask-type:alpha"><path fill="#000" d="M40.047 45.71c-.663.843-2.02.374-2.02-.699V34.708a2.26 2.26 0 0 0-2.262-2.262H24.389c-.92 0-1.457-1.04-.92-1.788l7.479-10.472c1.07-1.497 0-3.578-1.842-3.578H15.34c-.92 0-1.456-1.04-.92-1.788l9.696-13.575c.213-.297.556-.474.92-.474H53.93c.92 0 1.456 1.04.92 1.788L47.37 13.03c-1.07 1.498 0 3.578 1.842 3.578h11.376c.944 0 1.474 1.088.89 1.831L40.049 45.712z"/></mask><g mask="url(#a)"><g filter="url(#b)"><ellipse cx="5.508" cy="14.704" fill="#eee6ff" rx="5.508" ry="14.704" transform="rotate(269.814 20.96 11.29)scale(-1 1)"/></g><g filter="url(#c)"><ellipse cx="10.399" cy="29.851" fill="#eee6ff" rx="10.399" ry="29.851" transform="rotate(89.814 -16.902 -8.275)scale(1 -1)"/></g><g filter="url(#d)"><ellipse cx="5.508" cy="30.487" fill="#8900ff" rx="5.508" ry="30.487" transform="rotate(89.814 -19.197 -7.127)scale(1 -1)"/></g><g filter="url(#e)"><ellipse cx="5.508" cy="30.599" fill="#8900ff" rx="5.508" ry="30.599" transform="rotate(89.814 -25.928 4.177)scale(1 -1)"/></g><g filter="url(#f)"><ellipse cx="5.508" cy="30.599" fill="#8900ff" rx="5.508" ry="30.599" transform="rotate(89.814 -25.738 5.52)scale(1 -1)"/></g><g filter="url(#g)"><ellipse cx="14.072" cy="22.078" fill="#eee6ff" rx="14.072" ry="22.078" 
transform="rotate(93.35 31.245 55.578)scale(-1 1)"/></g><g filter="url(#h)"><ellipse cx="3.47" cy="21.501" fill="#8900ff" rx="3.47" ry="21.501" transform="rotate(89.009 35.419 55.202)scale(-1 1)"/></g><g filter="url(#i)"><ellipse cx="3.47" cy="21.501" fill="#8900ff" rx="3.47" ry="21.501" transform="rotate(89.009 35.419 55.202)scale(-1 1)"/></g><g filter="url(#j)"><ellipse cx="14.592" cy="9.743" fill="#8900ff" rx="4.407" ry="29.108" transform="rotate(39.51 14.592 9.743)"/></g><g filter="url(#k)"><ellipse cx="61.728" cy="-5.321" fill="#8900ff" rx="4.407" ry="29.108" transform="rotate(37.892 61.728 -5.32)"/></g><g filter="url(#l)"><ellipse cx="55.618" cy="7.104" fill="#00c2ff" rx="5.971" ry="9.665" transform="rotate(37.892 55.618 7.104)"/></g><g filter="url(#m)"><ellipse cx="12.326" cy="39.103" fill="#8900ff" rx="4.407" ry="29.108" transform="rotate(37.892 12.326 39.103)"/></g><g filter="url(#n)"><ellipse cx="12.326" cy="39.103" fill="#8900ff" rx="4.407" ry="29.108" transform="rotate(37.892 12.326 39.103)"/></g><g filter="url(#o)"><ellipse cx="49.857" cy="30.678" fill="#8900ff" rx="4.407" ry="29.108" transform="rotate(37.892 49.857 30.678)"/></g><g filter="url(#p)"><ellipse cx="52.623" cy="33.171" fill="#00c2ff" rx="5.971" ry="15.297" transform="rotate(37.892 52.623 33.17)"/></g></g><path d="M6.919 0c-9.198 13.166-9.252 33.575 0 46.789h6.215c-9.25-13.214-9.196-33.623 0-46.789zm62.424 0h-6.215c9.198 13.166 9.252 33.575 0 46.789h6.215c9.25-13.214 9.196-33.623 0-46.789" class="parenthesis"/><defs><filter id="b" width="60.045" height="41.654" x="-5.564" y="16.92" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="7.659"/></filter><filter id="c" width="90.34" height="51.437" x="-40.407" y="-6.762" color-interpolation-filters="sRGB" 
filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="7.659"/></filter><filter id="d" width="79.355" height="29.4" x="-35.435" y="2.801" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="e" width="79.579" height="29.4" x="-30.84" y="20.8" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="f" width="79.579" height="29.4" x="-29.307" y="21.949" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="g" width="74.749" height="58.852" x="29.961" y="-17.13" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="7.659"/></filter><filter id="h" width="61.377" height="25.362" x="37.754" y="3.055" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="i" 
width="61.377" height="25.362" x="37.754" y="3.055" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="j" width="56.045" height="63.649" x="-13.43" y="-22.082" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="k" width="54.814" height="64.646" x="34.321" y="-37.644" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="l" width="33.541" height="35.313" x="38.847" y="-10.552" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="m" width="54.814" height="64.646" x="-15.081" y="6.78" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="n" width="54.814" height="64.646" x="-15.081" y="6.78" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur 
result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="o" width="54.814" height="64.646" x="22.45" y="-1.645" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="p" width="39.409" height="43.623" x="32.919" y="11.36" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter></defs></svg>
import type { ReactNode } from 'react'
export function PageHeader({
title,
description,
actions,
}: {
title: string
description?: string
actions?: ReactNode
}) {
return (
<div className="flex items-end justify-between gap-6 pb-6 mb-6 border-b">
<div className="min-w-0">
<h1 className="text-2xl font-semibold tracking-tight">{title}</h1>
{description && (
<p className="text-sm text-muted-foreground mt-1">{description}</p>
)}
</div>
{actions && <div className="flex items-center gap-2 shrink-0">{actions}</div>}
</div>
)
}
import { mergeProps } from "@base-ui/react/merge-props"
import { useRender } from "@base-ui/react/use-render"
import { cva, type VariantProps } from "class-variance-authority"
import { cn } from "@/lib/utils"
const badgeVariants = cva(
"group/badge inline-flex h-5 w-fit shrink-0 items-center justify-center gap-1 overflow-hidden rounded-4xl border border-transparent px-2 py-0.5 text-xs font-medium whitespace-nowrap transition-all focus-visible:border-ring focus-visible:ring-[3px] focus-visible:ring-ring/50 has-data-[icon=inline-end]:pr-1.5 has-data-[icon=inline-start]:pl-1.5 aria-invalid:border-destructive aria-invalid:ring-destructive/20 dark:aria-invalid:ring-destructive/40 [&>svg]:pointer-events-none [&>svg]:size-3!",
{
variants: {
variant: {
default: "bg-primary text-primary-foreground [a]:hover:bg-primary/80",
secondary:
"bg-secondary text-secondary-foreground [a]:hover:bg-secondary/80",
destructive:
"bg-destructive/10 text-destructive focus-visible:ring-destructive/20 dark:bg-destructive/20 dark:focus-visible:ring-destructive/40 [a]:hover:bg-destructive/20",
outline:
"border-border text-foreground [a]:hover:bg-muted [a]:hover:text-muted-foreground",
ghost:
"hover:bg-muted hover:text-muted-foreground dark:hover:bg-muted/50",
link: "text-primary underline-offset-4 hover:underline",
},
},
defaultVariants: {
variant: "default",
},
}
)
function Badge({
className,
variant = "default",
render,
...props
}: useRender.ComponentProps<"span"> & VariantProps<typeof badgeVariants>) {
return useRender({
defaultTagName: "span",
props: mergeProps<"span">(
{
className: cn(badgeVariants({ variant }), className),
},
props
),
render,
state: {
slot: "badge",
variant,
},
})
}
export { Badge, badgeVariants }
import { Button as ButtonPrimitive } from "@base-ui/react/button"
import { cva, type VariantProps } from "class-variance-authority"
import { cn } from "@/lib/utils"
const buttonVariants = cva(
"group/button inline-flex shrink-0 items-center justify-center rounded-lg border border-transparent bg-clip-padding text-sm font-medium whitespace-nowrap transition-all outline-none select-none focus-visible:border-ring focus-visible:ring-3 focus-visible:ring-ring/50 active:not-aria-[haspopup]:translate-y-px disabled:pointer-events-none disabled:opacity-50 aria-invalid:border-destructive aria-invalid:ring-3 aria-invalid:ring-destructive/20 dark:aria-invalid:border-destructive/50 dark:aria-invalid:ring-destructive/40 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4",
{
variants: {
variant: {
default: "bg-primary text-primary-foreground [a]:hover:bg-primary/80",
outline:
"border-border bg-background hover:bg-muted hover:text-foreground aria-expanded:bg-muted aria-expanded:text-foreground dark:border-input dark:bg-input/30 dark:hover:bg-input/50",
secondary:
"bg-secondary text-secondary-foreground hover:bg-secondary/80 aria-expanded:bg-secondary aria-expanded:text-secondary-foreground",
ghost:
"hover:bg-muted hover:text-foreground aria-expanded:bg-muted aria-expanded:text-foreground dark:hover:bg-muted/50",
destructive:
"bg-destructive/10 text-destructive hover:bg-destructive/20 focus-visible:border-destructive/40 focus-visible:ring-destructive/20 dark:bg-destructive/20 dark:hover:bg-destructive/30 dark:focus-visible:ring-destructive/40",
link: "text-primary underline-offset-4 hover:underline",
},
size: {
default:
"h-8 gap-1.5 px-2.5 has-data-[icon=inline-end]:pr-2 has-data-[icon=inline-start]:pl-2",
xs: "h-6 gap-1 rounded-[min(var(--radius-md),10px)] px-2 text-xs in-data-[slot=button-group]:rounded-lg has-data-[icon=inline-end]:pr-1.5 has-data-[icon=inline-start]:pl-1.5 [&_svg:not([class*='size-'])]:size-3",
sm: "h-7 gap-1 rounded-[min(var(--radius-md),12px)] px-2.5 text-[0.8rem] in-data-[slot=button-group]:rounded-lg has-data-[icon=inline-end]:pr-1.5 has-data-[icon=inline-start]:pl-1.5 [&_svg:not([class*='size-'])]:size-3.5",
lg: "h-9 gap-1.5 px-2.5 has-data-[icon=inline-end]:pr-2 has-data-[icon=inline-start]:pl-2",
icon: "size-8",
"icon-xs":
"size-6 rounded-[min(var(--radius-md),10px)] in-data-[slot=button-group]:rounded-lg [&_svg:not([class*='size-'])]:size-3",
"icon-sm":
"size-7 rounded-[min(var(--radius-md),12px)] in-data-[slot=button-group]:rounded-lg",
"icon-lg": "size-9",
},
},
defaultVariants: {
variant: "default",
size: "default",
},
}
)
function Button({
className,
variant = "default",
size = "default",
...props
}: ButtonPrimitive.Props & VariantProps<typeof buttonVariants>) {
return (
<ButtonPrimitive
data-slot="button"
className={cn(buttonVariants({ variant, size, className }))}
{...props}
/>
)
}
export { Button, buttonVariants }
import * as React from "react"
import { cn } from "@/lib/utils"
function Card({
className,
size = "default",
...props
}: React.ComponentProps<"div"> & { size?: "default" | "sm" }) {
return (
<div
data-slot="card"
data-size={size}
className={cn(
"group/card flex flex-col gap-4 overflow-hidden rounded-xl bg-card py-4 text-sm text-card-foreground ring-1 ring-foreground/10 has-data-[slot=card-footer]:pb-0 has-[>img:first-child]:pt-0 data-[size=sm]:gap-3 data-[size=sm]:py-3 data-[size=sm]:has-data-[slot=card-footer]:pb-0 *:[img:first-child]:rounded-t-xl *:[img:last-child]:rounded-b-xl",
className
)}
{...props}
/>
)
}
function CardHeader({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-header"
className={cn(
"group/card-header @container/card-header grid auto-rows-min items-start gap-1 rounded-t-xl px-4 group-data-[size=sm]/card:px-3 has-data-[slot=card-action]:grid-cols-[1fr_auto] has-data-[slot=card-description]:grid-rows-[auto_auto] [.border-b]:pb-4 group-data-[size=sm]/card:[.border-b]:pb-3",
className
)}
{...props}
/>
)
}
function CardTitle({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-title"
className={cn(
"font-heading text-base leading-snug font-medium group-data-[size=sm]/card:text-sm",
className
)}
{...props}
/>
)
}
function CardDescription({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-description"
className={cn("text-sm text-muted-foreground", className)}
{...props}
/>
)
}
function CardAction({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-action"
className={cn(
"col-start-2 row-span-2 row-start-1 self-start justify-self-end",
className
)}
{...props}
/>
)
}
function CardContent({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-content"
className={cn("px-4 group-data-[size=sm]/card:px-3", className)}
{...props}
/>
)
}
function CardFooter({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-footer"
className={cn(
"flex items-center rounded-b-xl border-t bg-muted/50 p-4 group-data-[size=sm]/card:p-3",
className
)}
{...props}
/>
)
}
export {
Card,
CardHeader,
CardFooter,
CardTitle,
CardAction,
CardDescription,
CardContent,
}
import * as React from "react"
import { Input as InputPrimitive } from "@base-ui/react/input"
import { cn } from "@/lib/utils"
function Input({ className, type, ...props }: React.ComponentProps<"input">) {
return (
<InputPrimitive
type={type}
data-slot="input"
className={cn(
"h-8 w-full min-w-0 rounded-lg border border-input bg-transparent px-2.5 py-1 text-base transition-colors outline-none file:inline-flex file:h-6 file:border-0 file:bg-transparent file:text-sm file:font-medium file:text-foreground placeholder:text-muted-foreground focus-visible:border-ring focus-visible:ring-3 focus-visible:ring-ring/50 disabled:pointer-events-none disabled:cursor-not-allowed disabled:bg-input/50 disabled:opacity-50 aria-invalid:border-destructive aria-invalid:ring-3 aria-invalid:ring-destructive/20 md:text-sm dark:bg-input/30 dark:disabled:bg-input/80 dark:aria-invalid:border-destructive/50 dark:aria-invalid:ring-destructive/40",
className
)}
{...props}
/>
)
}
export { Input }
import * as React from "react"
import { cn } from "@/lib/utils"
function Label({ className, ...props }: React.ComponentProps<"label">) {
return (
<label
data-slot="label"
className={cn(
"flex items-center gap-2 text-sm leading-none font-medium select-none group-data-[disabled=true]:pointer-events-none group-data-[disabled=true]:opacity-50 peer-disabled:cursor-not-allowed peer-disabled:opacity-50",
className
)}
{...props}
/>
)
}
export { Label }
"use client"
import * as React from "react"
import { Select as SelectPrimitive } from "@base-ui/react/select"
import { cn } from "@/lib/utils"
import { ChevronDownIcon, CheckIcon, ChevronUpIcon } from "lucide-react"
const Select = SelectPrimitive.Root
function SelectGroup({ className, ...props }: SelectPrimitive.Group.Props) {
return (
<SelectPrimitive.Group
data-slot="select-group"
className={cn("scroll-my-1 p-1", className)}
{...props}
/>
)
}
function SelectValue({ className, ...props }: SelectPrimitive.Value.Props) {
return (
<SelectPrimitive.Value
data-slot="select-value"
className={cn("flex flex-1 text-left", className)}
{...props}
/>
)
}
function SelectTrigger({
className,
size = "default",
children,
...props
}: SelectPrimitive.Trigger.Props & {
size?: "sm" | "default"
}) {
return (
<SelectPrimitive.Trigger
data-slot="select-trigger"
data-size={size}
className={cn(
"flex w-fit items-center justify-between gap-1.5 rounded-lg border border-input bg-transparent py-2 pr-2 pl-2.5 text-sm whitespace-nowrap transition-colors outline-none select-none focus-visible:border-ring focus-visible:ring-3 focus-visible:ring-ring/50 disabled:cursor-not-allowed disabled:opacity-50 aria-invalid:border-destructive aria-invalid:ring-3 aria-invalid:ring-destructive/20 data-placeholder:text-muted-foreground data-[size=default]:h-8 data-[size=sm]:h-7 data-[size=sm]:rounded-[min(var(--radius-md),10px)] *:data-[slot=select-value]:line-clamp-1 *:data-[slot=select-value]:flex *:data-[slot=select-value]:items-center *:data-[slot=select-value]:gap-1.5 dark:bg-input/30 dark:hover:bg-input/50 dark:aria-invalid:border-destructive/50 dark:aria-invalid:ring-destructive/40 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4",
className
)}
{...props}
>
{children}
<SelectPrimitive.Icon
render={
<ChevronDownIcon className="pointer-events-none size-4 text-muted-foreground" />
}
/>
</SelectPrimitive.Trigger>
)
}
function SelectContent({
className,
children,
side = "bottom",
sideOffset = 4,
align = "center",
alignOffset = 0,
alignItemWithTrigger = true,
...props
}: SelectPrimitive.Popup.Props &
Pick<
SelectPrimitive.Positioner.Props,
"align" | "alignOffset" | "side" | "sideOffset" | "alignItemWithTrigger"
>) {
return (
<SelectPrimitive.Portal>
<SelectPrimitive.Positioner
side={side}
sideOffset={sideOffset}
align={align}
alignOffset={alignOffset}
alignItemWithTrigger={alignItemWithTrigger}
className="isolate z-50"
>
<SelectPrimitive.Popup
data-slot="select-content"
data-align-trigger={alignItemWithTrigger}
className={cn(
"relative isolate z-50 max-h-(--available-height) w-(--anchor-width) min-w-36 origin-(--transform-origin) overflow-x-hidden overflow-y-auto rounded-lg bg-popover text-popover-foreground shadow-md ring-1 ring-foreground/10 duration-100 data-[align-trigger=true]:animate-none data-[side=bottom]:slide-in-from-top-2 data-[side=inline-end]:slide-in-from-left-2 data-[side=inline-start]:slide-in-from-right-2 data-[side=left]:slide-in-from-right-2 data-[side=right]:slide-in-from-left-2 data-[side=top]:slide-in-from-bottom-2 data-open:animate-in data-open:fade-in-0 data-open:zoom-in-95 data-closed:animate-out data-closed:fade-out-0 data-closed:zoom-out-95",
className
)}
{...props}
>
<SelectScrollUpButton />
<SelectPrimitive.List>{children}</SelectPrimitive.List>
<SelectScrollDownButton />
</SelectPrimitive.Popup>
</SelectPrimitive.Positioner>
</SelectPrimitive.Portal>
)
}
function SelectLabel({
className,
...props
}: SelectPrimitive.GroupLabel.Props) {
return (
<SelectPrimitive.GroupLabel
data-slot="select-label"
className={cn("px-1.5 py-1 text-xs text-muted-foreground", className)}
{...props}
/>
)
}
function SelectItem({
className,
children,
...props
}: SelectPrimitive.Item.Props) {
return (
<SelectPrimitive.Item
data-slot="select-item"
className={cn(
"relative flex w-full cursor-default items-center gap-1.5 rounded-md py-1 pr-8 pl-1.5 text-sm outline-hidden select-none focus:bg-accent focus:text-accent-foreground not-data-[variant=destructive]:focus:**:text-accent-foreground data-disabled:pointer-events-none data-disabled:opacity-50 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4 *:[span]:last:flex *:[span]:last:items-center *:[span]:last:gap-2",
className
)}
{...props}
>
<SelectPrimitive.ItemText className="flex flex-1 shrink-0 gap-2 whitespace-nowrap">
{children}
</SelectPrimitive.ItemText>
<SelectPrimitive.ItemIndicator
render={
<span className="pointer-events-none absolute right-2 flex size-4 items-center justify-center" />
}
>
<CheckIcon className="pointer-events-none" />
</SelectPrimitive.ItemIndicator>
</SelectPrimitive.Item>
)
}
function SelectSeparator({
className,
...props
}: SelectPrimitive.Separator.Props) {
return (
<SelectPrimitive.Separator
data-slot="select-separator"
className={cn("pointer-events-none -mx-1 my-1 h-px bg-border", className)}
{...props}
/>
)
}
function SelectScrollUpButton({
className,
...props
}: React.ComponentProps<typeof SelectPrimitive.ScrollUpArrow>) {
return (
<SelectPrimitive.ScrollUpArrow
data-slot="select-scroll-up-button"
className={cn(
"top-0 z-10 flex w-full cursor-default items-center justify-center bg-popover py-1 [&_svg:not([class*='size-'])]:size-4",
className
)}
{...props}
>
<ChevronUpIcon />
</SelectPrimitive.ScrollUpArrow>
)
}
function SelectScrollDownButton({
className,
...props
}: React.ComponentProps<typeof SelectPrimitive.ScrollDownArrow>) {
return (
<SelectPrimitive.ScrollDownArrow
data-slot="select-scroll-down-button"
className={cn(
"bottom-0 z-10 flex w-full cursor-default items-center justify-center bg-popover py-1 [&_svg:not([class*='size-'])]:size-4",
className
)}
{...props}
>
<ChevronDownIcon />
</SelectPrimitive.ScrollDownArrow>
)
}
export {
Select,
SelectContent,
SelectGroup,
SelectItem,
SelectLabel,
SelectScrollDownButton,
SelectScrollUpButton,
SelectSeparator,
SelectTrigger,
SelectValue,
}
import { Separator as SeparatorPrimitive } from "@base-ui/react/separator"
import { cn } from "@/lib/utils"
function Separator({
className,
orientation = "horizontal",
...props
}: SeparatorPrimitive.Props) {
return (
<SeparatorPrimitive
data-slot="separator"
orientation={orientation}
className={cn(
"shrink-0 bg-border data-horizontal:h-px data-horizontal:w-full data-vertical:w-px data-vertical:self-stretch",
className
)}
{...props}
/>
)
}
export { Separator }
import { Switch as SwitchPrimitive } from "@base-ui/react/switch"
import { cn } from "@/lib/utils"
function Switch({
className,
size = "default",
...props
}: SwitchPrimitive.Root.Props & {
size?: "sm" | "default"
}) {
return (
<SwitchPrimitive.Root
data-slot="switch"
data-size={size}
className={cn(
"peer group/switch relative inline-flex shrink-0 items-center rounded-full border border-transparent transition-all outline-none after:absolute after:-inset-x-3 after:-inset-y-2 focus-visible:border-ring focus-visible:ring-3 focus-visible:ring-ring/50 aria-invalid:border-destructive aria-invalid:ring-3 aria-invalid:ring-destructive/20 data-[size=default]:h-[18.4px] data-[size=default]:w-[32px] data-[size=sm]:h-[14px] data-[size=sm]:w-[24px] dark:aria-invalid:border-destructive/50 dark:aria-invalid:ring-destructive/40 data-checked:bg-primary data-unchecked:bg-input dark:data-unchecked:bg-input/80 data-disabled:cursor-not-allowed data-disabled:opacity-50",
className
)}
{...props}
>
<SwitchPrimitive.Thumb
data-slot="switch-thumb"
className="pointer-events-none block rounded-full bg-background ring-0 transition-transform group-data-[size=default]/switch:size-4 group-data-[size=sm]/switch:size-3 group-data-[size=default]/switch:data-checked:translate-x-[calc(100%-2px)] group-data-[size=sm]/switch:data-checked:translate-x-[calc(100%-2px)] dark:data-checked:bg-primary-foreground group-data-[size=default]/switch:data-unchecked:translate-x-0 group-data-[size=sm]/switch:data-unchecked:translate-x-0 dark:data-unchecked:bg-foreground"
/>
</SwitchPrimitive.Root>
)
}
export { Switch }
"use client"
import * as React from "react"
import { cn } from "@/lib/utils"
function Table({ className, ...props }: React.ComponentProps<"table">) {
return (
<div
data-slot="table-container"
className="relative w-full overflow-x-auto"
>
<table
data-slot="table"
className={cn("w-full caption-bottom text-sm", className)}
{...props}
/>
</div>
)
}
function TableHeader({ className, ...props }: React.ComponentProps<"thead">) {
return (
<thead
data-slot="table-header"
className={cn("[&_tr]:border-b", className)}
{...props}
/>
)
}
function TableBody({ className, ...props }: React.ComponentProps<"tbody">) {
return (
<tbody
data-slot="table-body"
className={cn("[&_tr:last-child]:border-0", className)}
{...props}
/>
)
}
function TableFooter({ className, ...props }: React.ComponentProps<"tfoot">) {
return (
<tfoot
data-slot="table-footer"
className={cn(
"border-t bg-muted/50 font-medium [&>tr]:last:border-b-0",
className
)}
{...props}
/>
)
}
function TableRow({ className, ...props }: React.ComponentProps<"tr">) {
return (
<tr
data-slot="table-row"
className={cn(
"border-b transition-colors hover:bg-muted/50 has-aria-expanded:bg-muted/50 data-[state=selected]:bg-muted",
className
)}
{...props}
/>
)
}
function TableHead({ className, ...props }: React.ComponentProps<"th">) {
return (
<th
data-slot="table-head"
className={cn(
"h-10 px-2 text-left align-middle font-medium whitespace-nowrap text-foreground [&:has([role=checkbox])]:pr-0",
className
)}
{...props}
/>
)
}
function TableCell({ className, ...props }: React.ComponentProps<"td">) {
return (
<td
data-slot="table-cell"
className={cn(
"p-2 align-middle whitespace-nowrap [&:has([role=checkbox])]:pr-0",
className
)}
{...props}
/>
)
}
function TableCaption({
className,
...props
}: React.ComponentProps<"caption">) {
return (
<caption
data-slot="table-caption"
className={cn("mt-4 text-sm text-muted-foreground", className)}
{...props}
/>
)
}
export {
Table,
TableHeader,
TableBody,
TableFooter,
TableHead,
TableRow,
TableCell,
TableCaption,
}
import * as React from "react"
import { cn } from "@/lib/utils"
function Textarea({ className, ...props }: React.ComponentProps<"textarea">) {
return (
<textarea
data-slot="textarea"
className={cn(
"flex field-sizing-content min-h-16 w-full rounded-lg border border-input bg-transparent px-2.5 py-2 text-base transition-colors outline-none placeholder:text-muted-foreground focus-visible:border-ring focus-visible:ring-3 focus-visible:ring-ring/50 disabled:cursor-not-allowed disabled:bg-input/50 disabled:opacity-50 aria-invalid:border-destructive aria-invalid:ring-3 aria-invalid:ring-destructive/20 md:text-sm dark:bg-input/30 dark:disabled:bg-input/80 dark:aria-invalid:border-destructive/50 dark:aria-invalid:ring-destructive/40",
className
)}
{...props}
/>
)
}
export { Textarea }
@import "tailwindcss";
@import "tw-animate-css";
@import "shadcn/tailwind.css";
@import "@fontsource-variable/geist";
@import "@fontsource-variable/geist-mono";
@custom-variant dark (&:is(.dark *));
@theme inline {
--font-heading: var(--font-sans);
--font-sans: 'Geist Variable', ui-sans-serif, system-ui, -apple-system, sans-serif;
--font-mono: 'Geist Mono Variable', ui-monospace, SFMono-Regular, Menlo, monospace;
--color-sidebar-ring: var(--sidebar-ring);
--color-sidebar-border: var(--sidebar-border);
--color-sidebar-accent-foreground: var(--sidebar-accent-foreground);
--color-sidebar-accent: var(--sidebar-accent);
--color-sidebar-primary-foreground: var(--sidebar-primary-foreground);
--color-sidebar-primary: var(--sidebar-primary);
--color-sidebar-foreground: var(--sidebar-foreground);
--color-sidebar: var(--sidebar);
--color-chart-5: var(--chart-5);
--color-chart-4: var(--chart-4);
--color-chart-3: var(--chart-3);
--color-chart-2: var(--chart-2);
--color-chart-1: var(--chart-1);
--color-ring: var(--ring);
--color-input: var(--input);
--color-border: var(--border);
--color-destructive: var(--destructive);
--color-accent-foreground: var(--accent-foreground);
--color-accent: var(--accent);
--color-muted-foreground: var(--muted-foreground);
--color-muted: var(--muted);
--color-secondary-foreground: var(--secondary-foreground);
--color-secondary: var(--secondary);
--color-primary-foreground: var(--primary-foreground);
--color-primary: var(--primary);
--color-popover-foreground: var(--popover-foreground);
--color-popover: var(--popover);
--color-card-foreground: var(--card-foreground);
--color-card: var(--card);
--color-foreground: var(--foreground);
--color-background: var(--background);
--radius-sm: calc(var(--radius) * 0.6);
--radius-md: calc(var(--radius) * 0.8);
--radius-lg: var(--radius);
--radius-xl: calc(var(--radius) * 1.4);
--radius-2xl: calc(var(--radius) * 1.8);
--radius-3xl: calc(var(--radius) * 2.2);
--radius-4xl: calc(var(--radius) * 2.6);
}
:root {
--background: oklch(0.995 0 0);
--foreground: oklch(0.18 0 0);
--card: oklch(1 0 0);
--card-foreground: oklch(0.18 0 0);
--popover: oklch(1 0 0);
--popover-foreground: oklch(0.18 0 0);
--primary: oklch(0.22 0 0);
--primary-foreground: oklch(0.98 0 0);
--secondary: oklch(0.97 0 0);
--secondary-foreground: oklch(0.22 0 0);
--muted: oklch(0.97 0 0);
--muted-foreground: oklch(0.52 0 0);
--accent: oklch(0.96 0 0);
--accent-foreground: oklch(0.22 0 0);
--destructive: oklch(0.577 0.245 27.325);
--border: oklch(0.92 0 0);
--input: oklch(0.92 0 0);
--ring: oklch(0.55 0 0);
--chart-1: oklch(0.28 0 0);
--chart-2: oklch(0.45 0 0);
--chart-3: oklch(0.6 0 0);
--chart-4: oklch(0.75 0 0);
--chart-5: oklch(0.87 0 0);
--radius: 0.5rem;
--sidebar: oklch(0.98 0 0);
--sidebar-foreground: oklch(0.18 0 0);
--sidebar-primary: oklch(0.22 0 0);
--sidebar-primary-foreground: oklch(0.98 0 0);
--sidebar-accent: oklch(0.96 0 0);
--sidebar-accent-foreground: oklch(0.22 0 0);
--sidebar-border: oklch(0.92 0 0);
--sidebar-ring: oklch(0.55 0 0);
}
.dark {
--background: oklch(0.135 0 0);
--foreground: oklch(0.98 0 0);
--card: oklch(0.175 0 0);
--card-foreground: oklch(0.98 0 0);
--popover: oklch(0.175 0 0);
--popover-foreground: oklch(0.98 0 0);
--primary: oklch(0.93 0 0);
--primary-foreground: oklch(0.18 0 0);
--secondary: oklch(0.24 0 0);
--secondary-foreground: oklch(0.98 0 0);
--muted: oklch(0.22 0 0);
--muted-foreground: oklch(0.68 0 0);
--accent: oklch(0.24 0 0);
--accent-foreground: oklch(0.98 0 0);
--destructive: oklch(0.704 0.191 22.216);
--border: oklch(1 0 0 / 10%);
--input: oklch(1 0 0 / 14%);
--ring: oklch(0.6 0 0);
--chart-1: oklch(0.92 0 0);
--chart-2: oklch(0.78 0 0);
--chart-3: oklch(0.6 0 0);
--chart-4: oklch(0.45 0 0);
--chart-5: oklch(0.32 0 0);
--sidebar: oklch(0.175 0 0);
--sidebar-foreground: oklch(0.98 0 0);
--sidebar-primary: oklch(0.93 0 0);
--sidebar-primary-foreground: oklch(0.18 0 0);
--sidebar-accent: oklch(0.24 0 0);
--sidebar-accent-foreground: oklch(0.98 0 0);
--sidebar-border: oklch(1 0 0 / 10%);
--sidebar-ring: oklch(0.6 0 0);
}
@layer base {
* {
@apply border-border outline-ring/50;
}
html {
@apply font-sans antialiased;
font-feature-settings: 'ss01', 'cv11';
text-rendering: optimizeLegibility;
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
body {
@apply bg-background text-foreground;
font-feature-settings: 'ss01', 'cv11', 'tnum';
}
code, pre, kbd, samp {
font-family: var(--font-mono);
font-feature-settings: 'ss02', 'ss03';
}
/* Tabular numerals on any element that opts in */
.tabular-nums {
font-variant-numeric: tabular-nums;
}
/* Tighter focus ring across inputs */
input, textarea, select {
font-feature-settings: 'ss01';
}
}
const BASE = import.meta.env.BASE_URL.replace(/\/$/, '');
export async function apiFetch<T>(path: string, options?: RequestInit): Promise<T> {
  const res = await fetch(`${BASE}${path}`, {
    ...options,
    // Spread options first so the merged headers below always win; spreading
    // options after would let options.headers replace the merge wholesale and
    // silently drop the Content-Type default.
    headers: { 'Content-Type': 'application/json', ...options?.headers },
  });
if (!res.ok) {
const body = await res.json().catch(() => ({ error: { message: res.statusText } }));
throw new Error(body.error?.message ?? `HTTP ${res.status}`);
}
return res.json();
}
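A minimal sketch of why spread order matters when merging fetch options (object and key names here are illustrative, not part of the proxy's API):

```typescript
// Object merge is last-wins: whichever `headers` lands later replaces the
// earlier one wholesale. Object.assign stands in for the literal-then-spread
// order, which TypeScript itself flags as always-overwritten (ts2783).
const defaults = { 'Content-Type': 'application/json' };
const callerOptions = { method: 'POST', headers: { Authorization: 'Bearer abc' } };

// headers written first, caller merged after: caller's headers win wholesale.
const clobbered = Object.assign(
  { headers: { ...defaults, ...callerOptions.headers } },
  callerOptions
);

// Caller spread first, merged headers written last: both keys survive.
const merged = { ...callerOptions, headers: { ...defaults, ...callerOptions.headers } };

console.log('Content-Type' in clobbered.headers, 'Content-Type' in merged.headers);
// → false true
```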
import { clsx, type ClassValue } from "clsx"
import { twMerge } from "tailwind-merge"
export function cn(...inputs: ClassValue[]) {
return twMerge(clsx(inputs))
}
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import './index.css'
import App from './App'
createRoot(document.getElementById('root')!).render(
<StrictMode>
<App />
</StrictMode>,
)
import { useState } from 'react'
import { useQuery } from '@tanstack/react-query'
import {
BarChart, Bar, XAxis, YAxis, CartesianGrid, Tooltip, ResponsiveContainer,
LineChart, Line, Legend,
} from 'recharts'
import { apiFetch } from '@/lib/api'
import { Button } from '@/components/ui/button'
import { Table, TableBody, TableCell, TableHead, TableHeader, TableRow } from '@/components/ui/table'
import { PageHeader } from '@/components/page-header'
type TimeRange = '24h' | '7d' | '30d'
function formatTokens(n?: number): string {
if (!n) return '0'
if (n >= 1_000_000) return `${(n / 1_000_000).toFixed(1)}M`
if (n >= 1_000) return `${(n / 1_000).toFixed(1)}K`
return String(n)
}
function Stat({ label, value, className }: { label: string; value: string | number; className?: string }) {
return (
<div className="rounded-lg border bg-card px-4 py-3">
<p className="text-[11px] text-muted-foreground uppercase tracking-wider">{label}</p>
<p className={`text-xl font-semibold tabular-nums mt-1 ${className ?? ''}`}>{value}</p>
</div>
)
}
function Panel({ title, children }: { title: string; children: React.ReactNode }) {
return (
<div className="rounded-lg border bg-card">
<div className="px-4 py-3 border-b">
<h3 className="text-sm font-medium">{title}</h3>
</div>
<div className="p-4">{children}</div>
</div>
)
}
const axisStyle = { fontSize: 11, fill: 'var(--muted-foreground)' } as const
const gridStyle = 'var(--border)'
const primaryFill = 'var(--foreground)'
export default function AnalyticsPage() {
const [range, setRange] = useState<TimeRange>('7d')
const { data: summary } = useQuery({
queryKey: ['analytics', 'summary', range],
queryFn: () => apiFetch<any>(`/api/analytics/summary?range=${range}`),
})
const { data: byPlatform = [] } = useQuery({
queryKey: ['analytics', 'by-platform', range],
queryFn: () => apiFetch<any[]>(`/api/analytics/by-platform?range=${range}`),
})
const { data: timeline = [] } = useQuery({
queryKey: ['analytics', 'timeline', range],
queryFn: () => apiFetch<any[]>(`/api/analytics/timeline?range=${range}`),
})
const { data: byModel = [] } = useQuery({
queryKey: ['analytics', 'by-model', range],
queryFn: () => apiFetch<any[]>(`/api/analytics/by-model?range=${range}`),
})
const { data: errors = [] } = useQuery({
queryKey: ['analytics', 'errors', range],
queryFn: () => apiFetch<any[]>(`/api/analytics/errors?range=${range}`),
})
const { data: errorDist } = useQuery({
queryKey: ['analytics', 'error-distribution', range],
queryFn: () => apiFetch<{ byCategory: any[]; byPlatform: any[]; detailed: any[] }>(`/api/analytics/error-distribution?range=${range}`),
})
return (
<div>
<PageHeader
title="Analytics"
description="Request volume, latency, token usage, and failures."
actions={
<div className="flex gap-1 rounded-md border p-0.5">
{(['24h', '7d', '30d'] as TimeRange[]).map(r => (
<Button
key={r}
variant={range === r ? 'secondary' : 'ghost'}
size="xs"
onClick={() => setRange(r)}
>
{r}
</Button>
))}
</div>
}
/>
<div className="space-y-6">
{/* Summary stats */}
<div className="grid grid-cols-2 sm:grid-cols-3 lg:grid-cols-6 gap-3">
<Stat label="Requests" value={summary?.totalRequests ?? 0} />
<Stat label="Success rate" value={`${summary?.successRate ?? 0}%`} />
<Stat label="Input tokens" value={formatTokens(summary?.totalInputTokens)} />
<Stat label="Output tokens" value={formatTokens(summary?.totalOutputTokens)} />
<Stat label="Avg latency" value={`${summary?.avgLatencyMs ?? 0} ms`} />
<Stat label="Est. savings" value={`$${summary?.estimatedCostSavings ?? '0.00'}`} />
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
<Panel title="Requests by provider">
{byPlatform.length === 0 ? (
<p className="text-sm text-muted-foreground text-center py-8">No data yet</p>
) : (
<ResponsiveContainer width="100%" height={240}>
<BarChart data={byPlatform} margin={{ top: 6, right: 6, left: -12, bottom: 0 }}>
<CartesianGrid strokeDasharray="2 4" stroke={gridStyle} />
<XAxis dataKey="platform" tick={axisStyle} tickLine={false} axisLine={{ stroke: gridStyle }} />
<YAxis tick={axisStyle} tickLine={false} axisLine={false} />
<Tooltip contentStyle={{ backgroundColor: 'var(--popover)', border: '1px solid var(--border)', borderRadius: 8, fontSize: 12 }} />
<Bar dataKey="requests" fill={primaryFill} radius={[3, 3, 0, 0]} />
</BarChart>
</ResponsiveContainer>
)}
</Panel>
<Panel title="Avg latency by provider">
{byPlatform.length === 0 ? (
<p className="text-sm text-muted-foreground text-center py-8">No data yet</p>
) : (
<ResponsiveContainer width="100%" height={240}>
<BarChart data={byPlatform} margin={{ top: 6, right: 6, left: -12, bottom: 0 }}>
<CartesianGrid strokeDasharray="2 4" stroke={gridStyle} />
<XAxis dataKey="platform" tick={axisStyle} tickLine={false} axisLine={{ stroke: gridStyle }} />
<YAxis unit="ms" tick={axisStyle} tickLine={false} axisLine={false} />
<Tooltip contentStyle={{ backgroundColor: 'var(--popover)', border: '1px solid var(--border)', borderRadius: 8, fontSize: 12 }} />
<Bar dataKey="avgLatencyMs" name="Latency (ms)" fill="var(--muted-foreground)" radius={[3, 3, 0, 0]} />
</BarChart>
</ResponsiveContainer>
)}
</Panel>
<div className="lg:col-span-2">
<Panel title="Requests over time">
{timeline.length === 0 ? (
<p className="text-sm text-muted-foreground text-center py-8">No data yet</p>
) : (
<ResponsiveContainer width="100%" height={240}>
<LineChart data={timeline} margin={{ top: 6, right: 6, left: -12, bottom: 0 }}>
<CartesianGrid strokeDasharray="2 4" stroke={gridStyle} />
<XAxis dataKey="timestamp" tick={axisStyle} tickLine={false} axisLine={{ stroke: gridStyle }} />
<YAxis tick={axisStyle} tickLine={false} axisLine={false} />
<Tooltip contentStyle={{ backgroundColor: 'var(--popover)', border: '1px solid var(--border)', borderRadius: 8, fontSize: 12 }} />
<Legend wrapperStyle={{ fontSize: 12 }} iconType="line" />
<Line type="monotone" dataKey="successCount" name="Success" stroke={primaryFill} strokeWidth={1.5} dot={false} />
<Line type="monotone" dataKey="failureCount" name="Failures" stroke="var(--destructive)" strokeWidth={1.5} dot={false} />
</LineChart>
</ResponsiveContainer>
)}
</Panel>
</div>
<div className="lg:col-span-2">
<Panel title="Per-model breakdown">
{byModel.length === 0 ? (
<p className="text-sm text-muted-foreground text-center py-8">No data yet</p>
) : (
<div className="max-h-[360px] overflow-y-auto -mx-4">
<Table>
<TableHeader>
<TableRow>
<TableHead className="pl-4">Model</TableHead>
<TableHead>Provider</TableHead>
<TableHead className="text-right">Requests</TableHead>
<TableHead className="text-right">Success</TableHead>
<TableHead className="text-right">Latency</TableHead>
<TableHead className="text-right">In tokens</TableHead>
<TableHead className="text-right pr-4">Out tokens</TableHead>
</TableRow>
</TableHeader>
<TableBody>
{byModel.map((m: any, i: number) => (
<TableRow key={i}>
<TableCell className="pl-4 text-sm font-medium">{m.displayName}</TableCell>
<TableCell className="text-xs text-muted-foreground">{m.platform}</TableCell>
<TableCell className="text-right tabular-nums">{m.requests}</TableCell>
<TableCell className="text-right tabular-nums">{m.successRate}%</TableCell>
<TableCell className="text-right tabular-nums">{m.avgLatencyMs} ms</TableCell>
<TableCell className="text-right tabular-nums">{formatTokens(m.totalInputTokens)}</TableCell>
<TableCell className="text-right tabular-nums pr-4">{formatTokens(m.totalOutputTokens)}</TableCell>
</TableRow>
))}
</TableBody>
</Table>
</div>
)}
</Panel>
</div>
<Panel title="Errors by provider">
{!errorDist?.byPlatform?.length ? (
<p className="text-sm text-muted-foreground text-center py-8">No errors</p>
) : (
<ResponsiveContainer width="100%" height={240}>
<BarChart data={errorDist.byPlatform} margin={{ top: 6, right: 6, left: -12, bottom: 0 }}>
<CartesianGrid strokeDasharray="2 4" stroke={gridStyle} />
<XAxis dataKey="platform" tick={axisStyle} tickLine={false} axisLine={{ stroke: gridStyle }} />
<YAxis tick={axisStyle} tickLine={false} axisLine={false} />
<Tooltip contentStyle={{ backgroundColor: 'var(--popover)', border: '1px solid var(--border)', borderRadius: 8, fontSize: 12 }} />
<Bar dataKey="count" fill="var(--destructive)" radius={[3, 3, 0, 0]} />
</BarChart>
</ResponsiveContainer>
)}
</Panel>
<Panel title="Recent errors">
{errors.length === 0 ? (
<p className="text-sm text-muted-foreground text-center py-8">No errors</p>
) : (
<div className="max-h-[240px] overflow-y-auto -mx-4">
<Table>
<TableHeader>
<TableRow>
<TableHead className="pl-4">Provider</TableHead>
<TableHead>Message</TableHead>
<TableHead className="text-right pr-4">Time</TableHead>
</TableRow>
</TableHeader>
<TableBody>
{errors.slice(0, 20).map((e: any) => (
<TableRow key={e.id}>
<TableCell className="pl-4 text-xs">{e.platform}</TableCell>
<TableCell className="text-xs max-w-[200px] truncate">{e.error}</TableCell>
<TableCell className="text-right text-xs text-muted-foreground tabular-nums pr-4">
{new Date(e.createdAt).toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' })}
</TableCell>
</TableRow>
))}
</TableBody>
</Table>
</div>
)}
</Panel>
</div>
</div>
</div>
)
}
import { useState } from 'react'
import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query'
import {
DndContext,
closestCenter,
KeyboardSensor,
PointerSensor,
useSensor,
useSensors,
type DragEndEvent,
} from '@dnd-kit/core'
import {
arrayMove,
SortableContext,
sortableKeyboardCoordinates,
useSortable,
verticalListSortingStrategy,
} from '@dnd-kit/sortable'
import { CSS } from '@dnd-kit/utilities'
import { apiFetch } from '@/lib/api'
import { Button } from '@/components/ui/button'
import { Switch } from '@/components/ui/switch'
import { PageHeader } from '@/components/page-header'
interface FallbackEntry {
modelDbId: number
priority: number
effectivePriority: number
penalty: number
rateLimitHits: number
enabled: boolean
platform: string
modelId: string
displayName: string
intelligenceRank: number
speedRank: number
sizeLabel: string
rpmLimit: number | null
rpdLimit: number | null
monthlyTokenBudget: string
keyCount: number
}
function formatTokens(n: number): string {
if (n >= 1_000_000_000) return `${(n / 1_000_000_000).toFixed(1)}B`
if (n >= 1_000_000) return `${(n / 1_000_000).toFixed(1)}M`
if (n >= 1_000) return `${(n / 1_000).toFixed(1)}K`
return String(n)
}
interface TokenUsageData {
totalBudget: number
totalUsed: number
models: { displayName: string; platform: string; budget: number }[]
}
const platformColors: Record<string, string> = {
google: '#4285f4',
groq: '#f55036',
cerebras: '#8b5cf6',
sambanova: '#14b8a6',
nvidia: '#76b900',
mistral: '#f59e0b',
openrouter: '#ec4899',
github: '#6e7b8b',
huggingface: '#ffd21e',
cohere: '#d946ef',
cloudflare: '#f38020',
zhipu: '#06b6d4',
moonshot: '#4f46e5',
minimax: '#a855f7',
}
function TokenUsageBar({ data }: { data: TokenUsageData }) {
const { totalBudget, totalUsed, models } = data
const remaining = Math.max(0, totalBudget - totalUsed)
const remainingPct = totalBudget > 0 ? Math.round((remaining / totalBudget) * 100) : 0
// Scale each model's segment proportionally so the colored portion of the
// bar sums to `remaining`; the grey tail represents what's been used.
const modelsWithWidth = models.map(m => ({
...m,
remainingTokens: totalBudget > 0 ? (m.budget / totalBudget) * remaining : 0,
widthPct: totalBudget > 0 ? (m.budget / totalBudget) * (remaining / totalBudget) * 100 : 0,
}))
const usedPct = totalBudget > 0 ? (totalUsed / totalBudget) * 100 : 0
return (
<section className="rounded-lg border bg-card p-5">
<div className="flex items-baseline justify-between mb-3">
<h2 className="text-sm font-medium">Monthly token budget</h2>
<span className="text-xs text-muted-foreground tabular-nums">
<span className="text-foreground font-medium">{formatTokens(remaining)}</span> remaining
<span className="mx-1.5">·</span>
{remainingPct}% of {formatTokens(totalBudget)}
</span>
</div>
<div className="flex h-2.5 rounded-full overflow-hidden bg-muted">
{modelsWithWidth.map((m, i) => (
<div
key={i}
title={`${m.displayName} (${m.platform}) — ${formatTokens(m.remainingTokens)} remaining`}
style={{
width: `${m.widthPct}%`,
backgroundColor: platformColors[m.platform] ?? '#94a3b8',
}}
/>
))}
{totalUsed > 0 && (
<div
title={`Used — ${formatTokens(totalUsed)}`}
className="bg-muted-foreground/30"
style={{ width: `${usedPct}%` }}
/>
)}
</div>
<div className="mt-4 grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-3 gap-x-5 gap-y-1.5 text-xs tabular-nums">
{modelsWithWidth.map((m, i) => (
<div key={i} className="flex items-center gap-2 min-w-0">
<span
className="size-2 rounded-sm flex-shrink-0"
style={{ backgroundColor: platformColors[m.platform] ?? '#94a3b8' }}
/>
<span className="truncate">{m.displayName}</span>
<span className="flex-1" />
<span className="font-mono text-muted-foreground">{formatTokens(m.remainingTokens)}</span>
</div>
))}
</div>
</section>
)
}
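The segment math above can be checked with hypothetical numbers: a 1,000-token budget with 250 used leaves colored segments summing to 75% of the bar, with the grey tail filling the rest.

```typescript
// Hypothetical budgets; mirrors the widthPct formula in TokenUsageBar.
const totalBudget = 1_000;
const totalUsed = 250;
const remaining = totalBudget - totalUsed; // 750 tokens left this month
const budgets = [400, 600]; // two models sharing the budget

// Each colored segment: the model's share of the budget, scaled by the
// fraction of the budget still remaining.
const widths = budgets.map(b => (b / totalBudget) * (remaining / totalBudget) * 100);
const usedPct = (totalUsed / totalBudget) * 100; // grey "used" tail

// widths ≈ [30, 45]; 30 + 45 + 25 fills the whole bar.
console.log(widths.map(w => Math.round(w)), Math.round(usedPct));
```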
function SortableModelRow({
entry,
index,
onToggle,
}: {
entry: FallbackEntry
index: number
onToggle: (modelDbId: number, enabled: boolean) => void
}) {
const { attributes, listeners, setNodeRef, transform, transition, isDragging } = useSortable({
id: entry.modelDbId,
})
const style = {
transform: CSS.Transform.toString(transform),
transition,
}
return (
<div
ref={setNodeRef}
style={style}
className={`group flex items-center gap-3 px-4 py-3 bg-card ${isDragging ? 'opacity-50' : ''} ${entry.enabled ? '' : 'opacity-50'}`}
>
<button
{...attributes}
{...listeners}
className="cursor-grab active:cursor-grabbing text-muted-foreground/50 hover:text-foreground transition-colors"
aria-label="Drag to reorder"
>
<svg width="14" height="14" viewBox="0 0 24 24" fill="currentColor">
<circle cx="9" cy="6" r="1.5" /><circle cx="15" cy="6" r="1.5" />
<circle cx="9" cy="12" r="1.5" /><circle cx="15" cy="12" r="1.5" />
<circle cx="9" cy="18" r="1.5" /><circle cx="15" cy="18" r="1.5" />
</svg>
</button>
<span className="text-xs font-mono text-muted-foreground w-5 tabular-nums">{index + 1}</span>
<div className="flex-1 min-w-0">
<div className="flex items-center gap-2 flex-wrap">
<span className="font-medium text-sm">{entry.displayName}</span>
<span className="text-xs text-muted-foreground">{entry.platform}</span>
{entry.penalty > 0 && (
<span className="text-xs text-amber-600 dark:text-amber-400">
{entry.penalty} penalty
</span>
)}
</div>
<div className="flex gap-3 mt-0.5 text-xs text-muted-foreground tabular-nums">
<span>Intel #{entry.intelligenceRank}</span>
<span>Speed #{entry.speedRank}</span>
{entry.rpmLimit && <span>{entry.rpmLimit} rpm</span>}
{entry.rpdLimit && <span>{entry.rpdLimit} rpd</span>}
<span>{entry.monthlyTokenBudget} tok/mo</span>
</div>
</div>
<Switch
checked={entry.enabled}
onCheckedChange={(checked) => onToggle(entry.modelDbId, checked)}
/>
</div>
)
}
export default function FallbackPage() {
const queryClient = useQueryClient()
const [localEntries, setLocalEntries] = useState<FallbackEntry[] | null>(null)
const { data: entries = [], isLoading } = useQuery<FallbackEntry[]>({
queryKey: ['fallback'],
queryFn: () => apiFetch('/api/fallback'),
})
const { data: tokenUsage } = useQuery<TokenUsageData>({
queryKey: ['fallback', 'token-usage'],
queryFn: () => apiFetch('/api/fallback/token-usage'),
})
const saveMutation = useMutation({
mutationFn: (data: { modelDbId: number; priority: number; enabled: boolean }[]) =>
apiFetch('/api/fallback', { method: 'PUT', body: JSON.stringify(data) }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['fallback'] })
setLocalEntries(null)
},
})
const sortMutation = useMutation({
mutationFn: (preset: string) =>
apiFetch(`/api/fallback/sort/${preset}`, { method: 'POST' }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['fallback'] })
setLocalEntries(null)
},
})
const allEntries = localEntries ?? entries
const displayEntries = allEntries.filter(e => e.keyCount > 0)
const unconfiguredPlatforms = [...new Set(allEntries.filter(e => e.keyCount === 0).map(e => e.platform))]
const sensors = useSensors(
useSensor(PointerSensor),
useSensor(KeyboardSensor, { coordinateGetter: sortableKeyboardCoordinates }),
)
function handleDragEnd(event: DragEndEvent) {
const { active, over } = event
if (!over || active.id === over.id) return
const oldIndex = displayEntries.findIndex(e => e.modelDbId === active.id)
const newIndex = displayEntries.findIndex(e => e.modelDbId === over.id)
const reorderedVisible = arrayMove(displayEntries, oldIndex, newIndex)
const unconfigured = allEntries.filter(e => e.keyCount === 0)
const merged = [
...reorderedVisible.map((e, i) => ({ ...e, priority: i + 1 })),
...unconfigured.map((e, i) => ({ ...e, priority: reorderedVisible.length + i + 1 })),
]
setLocalEntries(merged)
}
function handleToggle(modelDbId: number, enabled: boolean) {
const updated = allEntries.map(e =>
e.modelDbId === modelDbId ? { ...e, enabled } : e
)
setLocalEntries(updated)
}
function handleSave() {
if (!localEntries) return
saveMutation.mutate(
localEntries.map(e => ({
modelDbId: e.modelDbId,
priority: e.priority,
enabled: e.enabled,
}))
)
}
const hasChanges = localEntries !== null
return (
<div>
<PageHeader
title="Fallback chain"
description="Drag to reorder. Requests try models top-to-bottom until one succeeds."
actions={
<>
<Button variant="outline" size="sm" onClick={() => sortMutation.mutate('intelligence')} disabled={sortMutation.isPending}>
Sort by intelligence
</Button>
<Button variant="outline" size="sm" onClick={() => sortMutation.mutate('speed')} disabled={sortMutation.isPending}>
Sort by speed
</Button>
<Button variant="outline" size="sm" onClick={() => sortMutation.mutate('budget')} disabled={sortMutation.isPending}>
Sort by budget
</Button>
</>
}
/>
<div className="space-y-6">
{tokenUsage && tokenUsage.totalBudget > 0 && (
<TokenUsageBar data={tokenUsage} />
)}
{isLoading ? (
<p className="text-sm text-muted-foreground">Loading…</p>
) : displayEntries.length === 0 ? (
<div className="rounded-lg border border-dashed p-8 text-center">
<p className="text-sm text-muted-foreground">
No models available. Add API keys on the <a href="/keys" className="underline text-foreground">Keys page</a> first.
</p>
</div>
) : (
<>
<div className="rounded-lg border divide-y overflow-hidden">
<DndContext
sensors={sensors}
collisionDetection={closestCenter}
onDragEnd={handleDragEnd}
>
<SortableContext
items={displayEntries.map(e => e.modelDbId)}
strategy={verticalListSortingStrategy}
>
{displayEntries.map((entry, index) => (
<SortableModelRow
key={entry.modelDbId}
entry={entry}
index={index}
onToggle={handleToggle}
/>
))}
</SortableContext>
</DndContext>
</div>
{hasChanges && (
<div className="flex justify-end gap-2">
<Button variant="outline" size="sm" onClick={() => setLocalEntries(null)}>
Discard
</Button>
<Button size="sm" onClick={handleSave} disabled={saveMutation.isPending}>
{saveMutation.isPending ? 'Saving…' : 'Save order'}
</Button>
</div>
)}
{unconfiguredPlatforms.length > 0 && (
<p className="text-xs text-muted-foreground">
Hidden (no keys): {unconfiguredPlatforms.join(', ')}
</p>
)}
</>
)}
</div>
</div>
)
}
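The reorder-then-renumber step in `handleDragEnd` can be sketched in isolation (a local stand-in for @dnd-kit's `arrayMove`; the model names are hypothetical):

```typescript
// Minimal re-implementation of arrayMove, for illustration only.
function moveItem<T>(arr: T[], from: number, to: number): T[] {
  const copy = arr.slice();
  const [item] = copy.splice(from, 1);
  copy.splice(to, 0, item);
  return copy;
}

// Dragging the first visible model to the end, then renumbering 1..n,
// exactly as the merged fallback chain does.
const visible = ['gemini', 'llama', 'mistral'];
const reordered = moveItem(visible, 0, 2); // ['llama', 'mistral', 'gemini']
const withPriority = reordered.map((id, i) => ({ id, priority: i + 1 }));

console.log(withPriority.map(e => `${e.priority}:${e.id}`).join(' '));
// → "1:llama 2:mistral 3:gemini"
```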
import { useState } from 'react'
import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query'
import { apiFetch } from '@/lib/api'
import { Button } from '@/components/ui/button'
import { Input } from '@/components/ui/input'
import { Label } from '@/components/ui/label'
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from '@/components/ui/select'
import { PageHeader } from '@/components/page-header'
import type { ApiKey, Platform } from '../../../shared/types'
const PLATFORMS: { value: Platform; label: string }[] = [
{ value: 'google', label: 'Google AI Studio' },
{ value: 'groq', label: 'Groq' },
{ value: 'cerebras', label: 'Cerebras' },
{ value: 'sambanova', label: 'SambaNova' },
{ value: 'nvidia', label: 'NVIDIA NIM' },
{ value: 'mistral', label: 'Mistral' },
{ value: 'openrouter', label: 'OpenRouter' },
{ value: 'github', label: 'GitHub Models' },
{ value: 'huggingface', label: 'Hugging Face' },
{ value: 'cohere', label: 'Cohere' },
{ value: 'cloudflare', label: 'Cloudflare Workers AI' },
{ value: 'zhipu', label: 'Zhipu AI (Z.ai)' },
{ value: 'moonshot', label: 'Moonshot (Kimi)' },
{ value: 'minimax', label: 'MiniMax' },
]
const statusDot: Record<string, string> = {
healthy: 'bg-emerald-500',
rate_limited: 'bg-amber-500',
invalid: 'bg-rose-500',
error: 'bg-rose-500',
unknown: 'bg-muted-foreground/40',
}
const statusLabel: Record<string, string> = {
healthy: 'healthy',
rate_limited: 'rate-limited',
invalid: 'invalid',
error: 'error',
unknown: 'unchecked',
}
interface HealthPlatform {
platform: string
totalKeys: number
healthyKeys: number
rateLimitedKeys: number
invalidKeys: number
errorKeys: number
unknownKeys: number
}
interface HealthData {
platforms: HealthPlatform[]
keys: { id: number; platform: string; status: string; lastCheckedAt: string | null }[]
}
function UnifiedKeySection() {
const queryClient = useQueryClient()
const [showKey, setShowKey] = useState(false)
const [copied, setCopied] = useState(false)
const { data } = useQuery<{ apiKey: string }>({
queryKey: ['unified-key'],
queryFn: () => apiFetch('/api/settings/api-key'),
})
const regenerate = useMutation({
mutationFn: () => apiFetch('/api/settings/api-key/regenerate', { method: 'POST' }),
onSuccess: () => queryClient.invalidateQueries({ queryKey: ['unified-key'] }),
})
const apiKey = data?.apiKey ?? ''
const masked = apiKey ? apiKey.slice(0, 13) + '•'.repeat(32) : '…'
function copy() {
  if (!apiKey) return
  navigator.clipboard.writeText(apiKey).catch(() => {})
  setCopied(true)
  setTimeout(() => setCopied(false), 1500)
}
return (
<section className="rounded-lg border bg-card p-5">
<div className="flex items-start justify-between gap-4 mb-3">
<div>
<h2 className="text-sm font-medium">Your unified API key</h2>
<p className="text-xs text-muted-foreground mt-0.5">
Use this as your OpenAI <code className="font-mono">api_key</code>; it authenticates requests to this proxy.
</p>
</div>
<Button
variant="ghost"
size="sm"
onClick={() => regenerate.mutate()}
disabled={regenerate.isPending}
>
Regenerate
</Button>
</div>
<div className="flex items-center gap-2">
<code className="flex-1 font-mono text-xs bg-muted px-3 py-2 rounded-md select-all truncate tabular-nums">
{showKey ? apiKey : masked}
</code>
<Button variant="outline" size="sm" onClick={() => setShowKey(!showKey)}>
{showKey ? 'Hide' : 'Show'}
</Button>
<Button variant="outline" size="sm" onClick={copy}>
{copied ? 'Copied' : 'Copy'}
</Button>
</div>
<div className="mt-4 grid grid-cols-[auto_1fr] gap-x-4 gap-y-1.5 text-xs">
<span className="text-muted-foreground">Base URL</span>
<code className="font-mono">http://localhost:3001/v1</code>
<span className="text-muted-foreground">Endpoint</span>
<code className="font-mono">/v1/chat/completions</code>
</div>
</section>
)
}
export default function KeysPage() {
const queryClient = useQueryClient()
const [platform, setPlatform] = useState<Platform | ''>('')
const [apiKey, setApiKey] = useState('')
const [accountId, setAccountId] = useState('')
const [label, setLabel] = useState('')
const { data: keys = [], isLoading } = useQuery<ApiKey[]>({
queryKey: ['keys'],
queryFn: () => apiFetch('/api/keys'),
})
const { data: healthData } = useQuery<HealthData>({
queryKey: ['health'],
queryFn: () => apiFetch('/api/health'),
refetchInterval: 30000,
})
const addKey = useMutation({
mutationFn: (body: { platform: string; key: string; label?: string }) =>
apiFetch('/api/keys', { method: 'POST', body: JSON.stringify(body) }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['keys'] })
queryClient.invalidateQueries({ queryKey: ['health'] })
queryClient.invalidateQueries({ queryKey: ['fallback'] })
setPlatform('')
setApiKey('')
setAccountId('')
setLabel('')
},
})
const deleteKey = useMutation({
mutationFn: (id: number) => apiFetch(`/api/keys/${id}`, { method: 'DELETE' }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['keys'] })
queryClient.invalidateQueries({ queryKey: ['health'] })
},
})
const checkAll = useMutation({
mutationFn: () => apiFetch('/api/health/check-all', { method: 'POST' }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['health'] })
queryClient.invalidateQueries({ queryKey: ['keys'] })
},
})
const checkKey = useMutation({
mutationFn: (keyId: number) => apiFetch(`/api/health/check/${keyId}`, { method: 'POST' }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['health'] })
queryClient.invalidateQueries({ queryKey: ['keys'] })
},
})
const needsAccountId = platform === 'cloudflare'
const handleSubmit = (e: React.FormEvent) => {
e.preventDefault()
if (!platform || !apiKey) return
if (needsAccountId && !accountId) return
const key = needsAccountId ? `${accountId}:${apiKey}` : apiKey
addKey.mutate({ platform, key, label: label || undefined })
}
const healthKeyMap = new Map<number, { status: string; lastCheckedAt: string | null }>()
for (const k of healthData?.keys ?? []) healthKeyMap.set(k.id, k)
const grouped = PLATFORMS.map(p => ({
...p,
keys: keys.filter(k => k.platform === p.value),
})).filter(p => p.keys.length > 0)
return (
<div>
<PageHeader
title="Keys"
description="Provider credentials and the unified API key your apps connect with."
actions={
keys.length > 0 && (
<Button variant="outline" size="sm" onClick={() => checkAll.mutate()} disabled={checkAll.isPending}>
{checkAll.isPending ? 'Checking…' : 'Check all'}
</Button>
)
}
/>
<div className="space-y-8">
<UnifiedKeySection />
<section>
<h2 className="text-sm font-medium mb-3">Add a provider key</h2>
<form onSubmit={handleSubmit} className="flex flex-wrap items-end gap-3 rounded-lg border p-4 bg-card">
<div className="space-y-1.5">
<Label className="text-xs">Platform</Label>
<Select value={platform} onValueChange={(v) => setPlatform(v as Platform)}>
<SelectTrigger className="w-[220px]">
<SelectValue placeholder="Select provider" />
</SelectTrigger>
<SelectContent>
{PLATFORMS.map(p => (
<SelectItem key={p.value} value={p.value}>{p.label}</SelectItem>
))}
</SelectContent>
</Select>
</div>
{needsAccountId && (
<div className="space-y-1.5">
<Label className="text-xs">Account ID</Label>
<Input
value={accountId}
onChange={e => setAccountId(e.target.value)}
placeholder="a1b2c3d4…"
className="w-[200px] font-mono text-xs"
/>
</div>
)}
<div className="space-y-1.5 flex-1 min-w-[240px]">
<Label className="text-xs">{needsAccountId ? 'API token' : 'API key'}</Label>
<Input
type="password"
value={apiKey}
onChange={e => setApiKey(e.target.value)}
placeholder={needsAccountId ? 'Bearer token' : 'Paste key here'}
className="font-mono text-xs"
/>
</div>
<div className="space-y-1.5">
<Label className="text-xs">Label</Label>
<Input
value={label}
onChange={e => setLabel(e.target.value)}
placeholder="optional"
className="w-[160px]"
/>
</div>
<Button type="submit" size="sm" disabled={!platform || !apiKey || (needsAccountId && !accountId) || addKey.isPending}>
{addKey.isPending ? 'Adding…' : 'Add key'}
</Button>
</form>
{addKey.isError && (
<p className="text-destructive text-xs mt-2">{(addKey.error as Error).message}</p>
)}
</section>
<section>
<h2 className="text-sm font-medium mb-3">Configured providers</h2>
{isLoading ? (
<p className="text-sm text-muted-foreground">Loading…</p>
) : keys.length === 0 ? (
<div className="rounded-lg border border-dashed p-8 text-center">
<p className="text-sm text-muted-foreground">
No provider keys yet. Add one above to start routing.
</p>
</div>
) : (
<div className="space-y-6">
{grouped.map(group => (
<div key={group.value}>
<div className="flex items-baseline justify-between mb-2">
<h3 className="text-sm font-medium">{group.label}</h3>
<span className="text-xs text-muted-foreground tabular-nums">
{group.keys.length} key{group.keys.length === 1 ? '' : 's'}
</span>
</div>
<div className="rounded-lg border divide-y bg-card overflow-hidden">
{group.keys.map(k => {
const h = healthKeyMap.get(k.id)
const status = h?.status ?? k.status
const lastChecked = h?.lastCheckedAt
return (
<div key={k.id} className="flex items-center gap-3 px-4 py-3 hover:bg-muted/40 transition-colors">
<span className={`size-1.5 rounded-full flex-shrink-0 ${statusDot[status] ?? statusDot.unknown}`} />
<code className="text-xs font-mono flex-shrink-0">{k.maskedKey}</code>
{k.label && <span className="text-xs text-muted-foreground">{k.label}</span>}
<span className="text-xs text-muted-foreground">{statusLabel[status] ?? status}</span>
<div className="flex-1" />
{lastChecked && (
<span className="text-[11px] text-muted-foreground tabular-nums">
{new Date(lastChecked).toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' })}
</span>
)}
<Button variant="ghost" size="xs" onClick={() => checkKey.mutate(k.id)} disabled={checkKey.isPending}>
Check
</Button>
<Button variant="ghost" size="xs" className="text-muted-foreground hover:text-destructive" onClick={() => deleteKey.mutate(k.id)} disabled={deleteKey.isPending}>
Remove
</Button>
</div>
)
})}
</div>
</div>
))}
</div>
)}
</section>
</div>
</div>
)
}
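The Keys page above surfaces a base URL and a unified bearer key for OpenAI-compatible clients. A minimal request-builder sketch of what a consumer would send (the helper name and key value are hypothetical; the URL and header shape mirror what the playground uses):

```typescript
// Hypothetical helper mirroring what the dashboard's playground sends.
// Base URL matches the default shown on the Keys page; the key is a placeholder.
const BASE_URL = 'http://localhost:3001/v1'

function buildChatRequest(apiKey: string, prompt: string, model?: string) {
  const body: Record<string, unknown> = {
    messages: [{ role: 'user', content: prompt }],
  }
  // Omitting "model" lets the proxy walk its fallback chain ("auto" mode).
  if (model) body.model = model
  return {
    url: `${BASE_URL}/chat/completions`,
    method: 'POST' as const,
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  }
}
```

Any OpenAI-compatible SDK can be pointed at the same base URL instead of hand-building requests.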
import { useState, useRef, useEffect } from 'react'
import { useQuery } from '@tanstack/react-query'
import { apiFetch } from '@/lib/api'
import { Button } from '@/components/ui/button'
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from '@/components/ui/select'
import { PageHeader } from '@/components/page-header'
interface FallbackEntry {
modelDbId: number
priority: number
enabled: boolean
platform: string
modelId: string
displayName: string
sizeLabel: string
keyCount: number
}
interface ChatMessage {
role: 'user' | 'assistant'
content: string
meta?: {
platform?: string
model?: string
latency?: number
fallbackAttempts?: number
}
}
export default function PlaygroundPage() {
const [messages, setMessages] = useState<ChatMessage[]>([])
const [input, setInput] = useState('')
const [loading, setLoading] = useState(false)
const [selectedModel, setSelectedModel] = useState<string>('auto')
const messagesEndRef = useRef<HTMLDivElement>(null)
const inputRef = useRef<HTMLTextAreaElement>(null)
const { data: keyData } = useQuery<{ apiKey: string }>({
queryKey: ['unified-key'],
queryFn: () => apiFetch('/api/settings/api-key'),
})
const { data: fallbackEntries = [] } = useQuery<FallbackEntry[]>({
queryKey: ['fallback'],
queryFn: () => apiFetch('/api/fallback'),
})
const availableModels = fallbackEntries.filter(e => e.keyCount > 0 && e.enabled)
useEffect(() => {
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' })
}, [messages])
const handleSend = async () => {
const text = input.trim()
if (!text || loading) return
const userMsg: ChatMessage = { role: 'user', content: text }
const newMessages = [...messages, userMsg]
setMessages(newMessages)
setInput('')
setLoading(true)
inputRef.current?.focus()
try {
const headers: Record<string, string> = { 'Content-Type': 'application/json' }
if (keyData?.apiKey) headers['Authorization'] = `Bearer ${keyData.apiKey}`
const body: any = {
messages: newMessages.map(m => ({ role: m.role, content: m.content })),
}
if (selectedModel !== 'auto') body.model = selectedModel
const base = import.meta.env.BASE_URL.replace(/\/$/, '')
const start = Date.now()
const res = await fetch(`${base}/v1/chat/completions`, {
method: 'POST',
headers,
body: JSON.stringify(body),
})
const latency = Date.now() - start
const routedVia = res.headers.get('X-Routed-Via')
const fallbackAttempts = res.headers.get('X-Fallback-Attempts')
if (!res.ok) {
const err = await res.json().catch(() => ({ error: { message: `HTTP ${res.status}` } }))
setMessages([...newMessages, {
role: 'assistant',
content: `Error: ${err.error?.message ?? 'Unknown error'}`,
}])
return
}
const data = await res.json()
const content = data.choices?.[0]?.message?.content ?? JSON.stringify(data, null, 2)
const via = data._routed_via ?? (routedVia ? {
platform: routedVia.split('/')[0],
model: routedVia.split('/').slice(1).join('/'),
} : undefined)
setMessages([...newMessages, {
role: 'assistant',
content,
meta: {
platform: via?.platform,
model: via?.model,
latency,
fallbackAttempts: fallbackAttempts ? parseInt(fallbackAttempts, 10) : undefined,
},
}])
} catch (err: any) {
setMessages([...newMessages, {
role: 'assistant',
content: `Error: ${err.message}`,
}])
} finally {
setLoading(false)
setTimeout(() => inputRef.current?.focus(), 0)
}
}
const handleKeyDown = (e: React.KeyboardEvent) => {
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault()
handleSend()
}
}
const handleClear = () => {
setMessages([])
inputRef.current?.focus()
}
const activeModelLabel = selectedModel === 'auto'
? 'Auto (fallback chain)'
: availableModels.find(m => m.modelId === selectedModel)?.displayName ?? selectedModel
return (
<div className="flex flex-col h-[calc(100vh-8rem)]">
<PageHeader
title="Playground"
description="Send a chat completion through the router and see which provider serves it."
actions={
<>
<Select value={selectedModel} onValueChange={(v) => setSelectedModel(v ?? 'auto')}>
<SelectTrigger className="w-[260px]">
<SelectValue />
</SelectTrigger>
<SelectContent>
<SelectItem value="auto">Auto (fallback chain)</SelectItem>
{availableModels.map(m => (
<SelectItem key={m.modelDbId} value={m.modelId}>
<span className="flex items-center gap-2">
<span>{m.displayName}</span>
<span className="text-xs text-muted-foreground">{m.platform}</span>
</span>
</SelectItem>
))}
</SelectContent>
</Select>
{messages.length > 0 && (
<Button variant="outline" size="sm" onClick={handleClear}>
Clear
</Button>
)}
</>
}
/>
<div className="flex-1 flex flex-col rounded-lg border bg-card overflow-hidden min-h-0">
<div className="flex-1 overflow-y-auto p-6 space-y-4">
{messages.length === 0 ? (
<div className="flex items-center justify-center h-full text-center">
<div className="space-y-2 max-w-sm">
<p className="text-base font-medium">Send a message to get started.</p>
<p className="text-sm text-muted-foreground">
Using <span className="text-foreground">{activeModelLabel}</span>. Switch models in the selector above.
</p>
</div>
</div>
) : (
<>
{messages.map((msg, i) => (
<div key={i} className={`flex ${msg.role === 'user' ? 'justify-end' : 'justify-start'}`}>
<div
className={`max-w-[78%] rounded-2xl px-4 py-2.5 text-sm leading-relaxed ${
msg.role === 'user'
? 'bg-primary text-primary-foreground'
: 'bg-muted'
}`}
>
<div className="whitespace-pre-wrap">{msg.content}</div>
{msg.meta && (
<div className="flex items-center gap-2 mt-2 flex-wrap text-[11px] opacity-70 tabular-nums">
{msg.meta.platform && <span>{msg.meta.platform}</span>}
{msg.meta.model && <span className="font-mono">· {msg.meta.model}</span>}
{msg.meta.latency != null && <span>· {msg.meta.latency} ms</span>}
{msg.meta.fallbackAttempts != null && msg.meta.fallbackAttempts > 0 && (
<span>· {msg.meta.fallbackAttempts} fallback{msg.meta.fallbackAttempts > 1 ? 's' : ''}</span>
)}
</div>
)}
</div>
</div>
))}
{loading && (
<div className="flex justify-start">
<div className="bg-muted rounded-2xl px-4 py-3">
<div className="flex gap-1">
<span className="size-1.5 rounded-full bg-muted-foreground/50 animate-bounce" style={{ animationDelay: '0ms' }} />
<span className="size-1.5 rounded-full bg-muted-foreground/50 animate-bounce" style={{ animationDelay: '150ms' }} />
<span className="size-1.5 rounded-full bg-muted-foreground/50 animate-bounce" style={{ animationDelay: '300ms' }} />
</div>
</div>
</div>
)}
<div ref={messagesEndRef} />
</>
)}
</div>
<div className="border-t bg-background/50 p-3">
<div className="flex gap-2 items-end">
<textarea
ref={inputRef}
value={input}
onChange={e => setInput(e.target.value)}
onKeyDown={handleKeyDown}
placeholder="Type a message… (⏎ to send, ⇧⏎ for newline)"
rows={1}
className="flex-1 resize-none rounded-md border bg-background px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-ring/50 min-h-[40px] max-h-[160px]"
style={{ height: 'auto', overflow: 'hidden' }}
onInput={e => {
const el = e.target as HTMLTextAreaElement
el.style.height = 'auto'
el.style.height = Math.min(el.scrollHeight, 160) + 'px'
}}
/>
<Button onClick={handleSend} disabled={loading || !input.trim()} size="default">
{loading ? 'Sending…' : 'Send'}
</Button>
</div>
</div>
</div>
</div>
)
}
{
"compilerOptions": {
"tsBuildInfoFile": "./node_modules/.tmp/tsconfig.app.tsbuildinfo",
"target": "es2023",
"lib": ["ES2023", "DOM", "DOM.Iterable"],
"module": "esnext",
"types": ["vite/client"],
"skipLibCheck": true,
/* Bundler mode */
"moduleResolution": "bundler",
"allowImportingTsExtensions": true,
"verbatimModuleSyntax": true,
"moduleDetection": "force",
"noEmit": true,
"jsx": "react-jsx",
/* Linting */
"noUnusedLocals": true,
"noUnusedParameters": true,
"erasableSyntaxOnly": true,
"noFallthroughCasesInSwitch": true,
"paths": {
"@/*": ["./src/*"]
}
},
"include": ["src"]
}
{
"files": [],
"references": [
{ "path": "./tsconfig.app.json" },
{ "path": "./tsconfig.node.json" }
],
"compilerOptions": {
"baseUrl": ".",
"paths": {
"@/*": ["./src/*"]
}
}
}
{
"compilerOptions": {
"tsBuildInfoFile": "./node_modules/.tmp/tsconfig.node.tsbuildinfo",
"target": "es2023",
"lib": ["ES2023"],
"module": "esnext",
"types": ["node"],
"skipLibCheck": true,
/* Bundler mode */
"moduleResolution": "bundler",
"allowImportingTsExtensions": true,
"verbatimModuleSyntax": true,
"moduleDetection": "force",
"noEmit": true,
/* Linting */
"noUnusedLocals": true,
"noUnusedParameters": true,
"erasableSyntaxOnly": true,
"noFallthroughCasesInSwitch": true
},
"include": ["vite.config.ts"]
}
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'
import path from 'path'
export default defineConfig({
plugins: [react(), tailwindcss()],
base: process.env.VITE_BASE ?? '/',
resolve: {
alias: {
'@': path.resolve(__dirname, './src'),
},
},
server: {
proxy: {
'/api': 'http://localhost:3001',
'/v1': 'http://localhost:3001',
},
},
})
#set page(margin: (x: 1.2in, y: 1in), numbering: "1")
#set text(font: "New Computer Modern", size: 10.5pt)
#set heading(numbering: "1.1")
#set par(justify: true, leading: 0.65em)
#set table(stroke: 0.5pt + luma(180))
#show heading.where(level: 1): it => {
v(0.8em)
text(size: 16pt, weight: "bold", it)
v(0.4em)
}
#show heading.where(level: 2): it => {
v(0.6em)
text(size: 13pt, weight: "bold", it)
v(0.3em)
}
#show heading.where(level: 3): it => {
v(0.4em)
text(size: 11pt, weight: "bold", it)
v(0.2em)
}
#show table: set text(size: 9pt)
// Title page
#align(center)[
#v(2in)
#text(size: 26pt, weight: "bold")[Free AI API Platforms]
#v(0.2em)
#text(size: 14pt, fill: luma(80))[Ongoing Free Tiers for a Unified LLM Routing Service]
#v(1em)
#line(length: 40%, stroke: 0.5pt + luma(120))
#v(0.5em)
#text(size: 11pt, fill: luma(100))[April 2026]
#v(0.3em)
#text(size: 10pt, fill: luma(130))[Only platforms with ongoing monthly free access --- no expiring trial credits.]
]
#pagebreak()
#outline(title: "Contents", indent: 1.5em)
#pagebreak()
= Executive Summary
This report catalogs every major platform offering *ongoing* free API access to LLMs (not one-time expiring trial credits). The goal: a service where users contribute their free API keys, and a unified endpoint routes requests to the best available free LLM, ranked by intelligence.
*Key Findings:*
- *13 platforms* offer genuinely ongoing free tiers. None require a credit card.
- *Google AI Studio* (Gemini 2.5 Pro) offers the highest-intelligence model for free.
- *Cerebras* and *NVIDIA NIM* offer the most generous throughput.
- *Groq* and *Cerebras* offer the fastest inference speeds.
= Platform-by-Platform Analysis
== Google AI Studio (Gemini API)
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing, no expiration],
[*Credit Card*], [No],
[*Best Free Model*], [Gemini 2.5 Pro],
[*Other Models*], [Gemini 2.5 Flash, Gemini 2.5 Flash-Lite],
)
*Rate Limits:*
#table(
columns: (auto, 0.5in, 0.5in, 0.7in),
align: (left, center, center, center),
[*Model*], [*RPM*], [*RPD*], [*TPM*],
[Gemini 2.5 Pro], [5], [100], [250,000],
[Gemini 2.5 Flash], [10], [250], [250,000],
[Gemini 2.5 Flash-Lite], [15], [1,000], [250,000],
)
*Monthly Token Budget:* ~12M tokens (Pro), ~30M (Flash), ~120M (Flash-Lite)
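The budget estimates above are consistent with assuming roughly 4,000 tokens consumed per request over a 30-day month; a quick sanity check under that assumption (the per-request figure is an assumption here, not a number Google publishes):

```typescript
// Assumption: ~4,000 tokens per request, 30 days per month.
// Monthly budget ≈ requests/day × days × tokens/request.
const monthlyTokens = (rpd: number, tokensPerReq = 4_000): number =>
  rpd * 30 * tokensPerReq

// Gemini 2.5 Pro (100 RPD), Flash (250 RPD), Flash-Lite (1,000 RPD)
const estimates = [100, 250, 1_000].map(rpd => monthlyTokens(rpd))
// → [12_000_000, 30_000_000, 120_000_000], matching the ~12M / ~30M / ~120M figures
```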
*Benchmarks (Gemini 2.5 Pro):*
#table(
columns: (auto, auto),
align: (left, center),
[*Benchmark*], [*Score*],
[Global MMLU], [89.8%],
[MMLU-Pro], [86.0%],
[AIME], [88.0%],
[GPQA], [84.0%],
[SWE-Bench Verified], [63.8%],
[Chatbot Arena ELO], [~1450+],
)
*Speed:* ~80--150 tokens/sec
*Limitations:* Free tier data may be used for training. Rate limits reduced 50--80% in Dec 2025 due to abuse. Limits are per-project.
== Groq
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing, no expiration],
[*Credit Card*], [No],
[*Best Free Model*], [Llama 3.3 70B Versatile],
[*Other Models*], [Llama 4 Scout, Qwen3 32B, Llama 3.1 8B, Kimi K2, 15+ more],
)
*Rate Limits:*
#table(
columns: (auto, 0.4in, 0.55in, 0.55in, 0.65in),
align: (left, center, center, center, center),
[*Model*], [*RPM*], [*RPD*], [*TPM*], [*TPD*],
[Llama 3.3 70B], [30], [1,000], [6,000], [~500K],
[Llama 4 Scout 17B], [30], [1,000], [30,000], [~1M],
[Qwen3 32B], [60], [~1,000], [~6,000], [~500K],
[Llama 3.1 8B], [30], [14,400], [6,000], [500K],
)
*Monthly Token Budget:* ~15M/month per model, ~45--60M combined
*Benchmarks (Llama 3.3 70B):* MMLU 82.0%, HumanEval 88.4%, Arena ELO ~1250
*Speed:* 276--316 tok/sec (standard), up to 1,665 tok/sec (speculative decoding)
*Limitations:* Only open-source models are offered. (Cached tokens don't count toward limits, which works in Groq's favor.)
== Cerebras
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing, no expiration],
[*Credit Card*], [No],
[*Best Free Model*], [Qwen3 235B-A22B Instruct],
[*Other Models*], [Llama 3.1 8B/70B, Llama 4 Scout, GPT-OSS 120B],
)
*Rate Limits:*
#table(
columns: (auto, auto),
align: (left, center),
[RPM], [30],
[TPM], [60,000],
[Tokens/Day], [1,000,000],
[Context Window (free)], [8,192 tokens],
)
*Monthly Token Budget:* ~30M tokens/month
*Benchmarks (Qwen3 235B):* MMLU 88.4%, HumanEval 79.2%, AIME '24 85.7%, Arena ELO 1422
*Speed:* ~1,400 tok/sec (Qwen3 235B), ~2,600 tok/sec (Scout), ~1,800 tok/sec (8B)
*Limitations:* Context window capped at 8,192 tokens on free tier (major limitation).
== SambaNova
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing (after initial \$5 credit expires)],
[*Credit Card*], [No],
[*Best Free Model*], [Llama 3.1 405B / MiniMax-M2.5],
[*Other Models*], [Llama 3.3 70B, Qwen3 32B, DeepSeek V3.1, Llama 4 Maverick],
)
*Rate Limits:*
#table(
columns: (auto, 0.5in, 0.7in),
align: (left, center, center),
[*Model*], [*RPM*], [*TPD*],
[Llama 3.1 405B], [10], [~200K],
[Llama 3.3 70B], [20], [~200K],
[Llama 3.1 8B], [30], [~200K],
)
*Monthly Token Budget:* ~6M tokens/month
*Benchmarks (Llama 3.1 405B):* MMLU 88.6%, HumanEval 89.0%, MATH 73.8%, Arena ELO ~1320
*Speed:* ~114 tok/sec (405B)
== NVIDIA NIM
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing, rate-limited (no token cap)],
[*Credit Card*], [No (requires NVIDIA Developer signup)],
[*Best Free Model*], [Nemotron 3 Super 120B, Kimi K2.5, GLM-5 744B, DeepSeek-R1 671B],
[*Catalog*], [100+ models],
)
*Rate Limits:* 40 RPM, no daily token cap
*Monthly Token Budget:* ~50--100M tokens/month in practice (rate-limited rather than token-capped)
*Speed:* Varies by model; NIM-optimized for throughput
*Limitations:* Intended for prototyping/evaluation. Heavy models may be slow at peak times.
== Mistral (Experiment Plan)
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing (Experiment plan)],
[*Credit Card*], [No (requires phone verification)],
[*Best Free Model*], [Mistral Large 3],
[*Other Models*], [Codestral, Mistral Small, all Mistral models],
)
*Rate Limits:* 2 RPM, 500K TPM, 1B monthly token cap
*Monthly Token Budget:* ~50--100M tokens/month (2 RPM is the bottleneck)
*Benchmarks (Mistral Large 3):* MMLU 85.5%, Arena ELO ~1280
*Limitations:* Only 2 RPM is extremely restrictive. Data may be used for training.
== OpenRouter (Free Models)
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing, free model variants],
[*Credit Card*], [No],
[*Best Free Model*], [DeepSeek R1 (free), Qwen3 Coder 480B (free)],
[*Free Models*], [29 total, including Gemma 3, Nemotron 3 Super],
)
*Rate Limits:*
#table(
columns: (auto, 0.5in, 0.5in),
align: (left, center, center),
[*Tier*], [*RPM*], [*RPD*],
[No credits purchased], [20], [50],
[\$10+ credits purchased], [20], [1,000],
)
*Monthly Token Budget:* ~6M (no credits) / ~120M (\$10 purchase)
*Benchmarks (DeepSeek R1 free):* MMLU 90.8%, AIME '24 79.8%, Arena ELO 1398
== GitHub Models
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing],
[*Credit Card*], [No (requires GitHub account)],
[*Best Free Model*], [GPT-4o, DeepSeek-R1, Llama 3.3 70B],
)
*Rate Limits:*
#table(
columns: (auto, 0.4in, 0.4in, 0.8in, 0.8in),
align: (left, center, center, center, center),
[*Tier*], [*RPM*], [*RPD*], [*Input Tok/Req*], [*Output Tok/Req*],
[High (GPT-4o)], [10], [50], [8,000], [4,000],
[Low (smaller)], [15], [150], [8,000], [4,000],
)
*Monthly Token Budget:* ~18M (high), ~54M (low)
*Benchmarks (GPT-4o):* MMLU 88.7%, HumanEval 90.2%, Arena ELO ~1350
== Other Platforms
#table(
columns: (auto, auto, auto, auto),
align: (left, left, center, left),
[*Platform*], [*Best Free Model*], [*Monthly Tokens*], [*Notes*],
[Hugging Face], [Various (1000s)], [~5--10M], [100K inference credits/mo],
[Cohere], [Command R+], [~4M], [1,000 calls/mo, 20 RPM],
[Cloudflare Workers AI], [Llama 3.1 70B], [~18--45M], [10K neurons/day],
[Fireworks AI], [Open-source], [~5--10M], [10 RPM (after \$1 credit)],
)
#pagebreak()
= Comprehensive Rankings
== By Intelligence (Best Free Model Per Platform)
#table(
columns: (0.3in, auto, auto, 0.5in, 0.55in, 0.55in, auto),
align: (center, left, left, center, center, center, left),
[*\#*], [*Platform*], [*Best Free Model*], [*MMLU*], [*Human\ Eval*], [*Arena\ ELO*], [*Tier*],
[1], [Google AI Studio], [Gemini 2.5 Pro], [89.8%], [~92%], [~1450], [Frontier],
[2], [OpenRouter], [DeepSeek R1 (free)], [90.8%], [~85%], [1398], [Frontier],
[3], [Cerebras], [Qwen3 235B], [88.4%], [79.2%], [1422], [Near-Frontier],
[4], [SambaNova], [Llama 3.1 405B], [88.6%], [89.0%], [~1320], [Near-Frontier],
[5], [GitHub Models], [GPT-4o], [88.7%], [90.2%], [~1350], [Near-Frontier],
[6], [Cohere], [Command R+], [88.2%], [--], [~1200], [Strong],
[7], [Mistral], [Mistral Large 3], [85.5%], [--], [~1280], [Strong],
[8], [NVIDIA NIM], [Nemotron 3 / GLM-5], [--], [--], [~1300], [Strong],
[9], [Groq], [Llama 3.3 70B], [82.0%], [88.4%], [~1250], [Good],
[10], [Cloudflare], [Llama 3.1 70B], [82.0%], [88.4%], [~1250], [Good],
)
== By Monthly Token Budget
#table(
columns: (0.3in, auto, auto, auto),
align: (center, left, right, left),
[*\#*], [*Platform*], [*Est. Monthly Tokens*], [*Budget Tier*],
[1], [NVIDIA NIM], [~50--100M+], [Excellent],
[2], [Mistral], [~50--100M], [Excellent],
[3], [Google AI Studio (Flash-Lite)], [~120M], [Excellent],
[4], [Cloudflare Workers AI], [~18--45M], [Very Good],
[5], [Cerebras], [~30M], [Very Good],
[6], [GitHub Models], [~18--54M], [Good],
[7], [Groq], [~15--60M], [Good],
[8], [Hugging Face], [~5--10M], [Moderate],
[9], [SambaNova], [~6M], [Moderate],
[10], [OpenRouter (no credits)], [~6M], [Moderate],
[11], [Fireworks AI], [~5--10M], [Moderate],
[12], [Cohere], [~4M], [Limited],
)
== Final Composite Ranking
Scoring: Intelligence (0--40) + Generosity (0--30) + Usability (0--20) + Reliability (0--10)
#table(
columns: (0.3in, auto, auto, 0.4in, 0.45in, 0.45in, 0.4in, 0.45in),
align: (center, left, left, center, center, center, center, center),
[*\#*], [*Platform*], [*Best Model*], [*Intel*], [*Gener.*], [*Usab.*], [*Rel.*], [*Total*],
[1], [*Google AI Studio*], [Gemini 2.5 Pro], [40], [18], [14], [8], [*80*],
[2], [*Cerebras*], [Qwen3 235B], [35], [22], [16], [7], [*80*],
[3], [*NVIDIA NIM*], [100+ models], [32], [28], [15], [5], [*80*],
[4], [*Groq*], [Llama 3.3 70B], [28], [20], [20], [8], [*76*],
[5], [*Cloudflare*], [Llama 3.1 70B], [28], [22], [12], [6], [*68*],
[6], [*OpenRouter*], [DeepSeek R1], [38], [12], [12], [6], [*68*],
[7], [*GitHub Models*], [GPT-4o], [34], [16], [10], [7], [*67*],
[8], [*Mistral*], [Mistral Large 3], [30], [24], [6], [6], [*66*],
[9], [*SambaNova*], [Llama 3.1 405B], [34], [10], [12], [7], [*63*],
[10], [*Cohere*], [Command R+], [30], [8], [14], [7], [*59*],
[11], [*Hugging Face*], [Various], [25], [12], [10], [5], [*52*],
[12], [*Fireworks AI*], [Open-source], [25], [10], [10], [5], [*50*],
)
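The composite totals are plain sums of the four sub-scores; sketched as a check against the table:

```typescript
// Composite score = Intelligence (0-40) + Generosity (0-30)
//                 + Usability (0-20) + Reliability (0-10)
interface SubScores { intel: number; gener: number; usab: number; rel: number }
const composite = (s: SubScores): number => s.intel + s.gener + s.usab + s.rel

// Rows from the table above:
const googleTotal = composite({ intel: 40, gener: 18, usab: 14, rel: 8 }) // 80
const groqTotal = composite({ intel: 28, gener: 20, usab: 20, rel: 8 })   // 76
```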
#pagebreak()
= Architecture for Unified Routing Service
== Routing Priority (by intelligence)
+ *Gemini 2.5 Pro* (Google AI Studio) --- highest intelligence, 100 RPD/key
+ *DeepSeek R1* (OpenRouter free) --- near-frontier reasoning, 50 RPD/key
+ *Qwen3 235B* (Cerebras) --- near-frontier, 1M tokens/day, 8K context limit
+ *GPT-4o* (GitHub Models) --- strong, 50 RPD/key
+ *Llama 3.1 405B* (SambaNova) --- strong, 10 RPM
+ *Mistral Large 3* (Mistral) --- good, 2 RPM bottleneck
+ *Llama 3.3 70B* (Groq) --- good intelligence, fastest speed, 1,000 RPD
+ *Any NIM model* (NVIDIA) --- huge variety, no daily token cap
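The priority order above can be sketched as an ordered chain the router walks, skipping platforms whose keys are exhausted (the entry shape and function names are illustrative, not the server's actual schema):

```typescript
// Illustrative fallback chain; per-key limits are the figures cited above.
interface ChainEntry {
  platform: string
  model: string
  perKeyLimit: string
}

const fallbackChain: ChainEntry[] = [
  { platform: 'google', model: 'gemini-2.5-pro', perKeyLimit: '100 RPD' },
  { platform: 'openrouter', model: 'deepseek-r1:free', perKeyLimit: '50 RPD' },
  { platform: 'cerebras', model: 'qwen3-235b', perKeyLimit: '1M tokens/day' },
  { platform: 'github', model: 'gpt-4o', perKeyLimit: '50 RPD' },
  { platform: 'sambanova', model: 'llama-3.1-405b', perKeyLimit: '10 RPM' },
  { platform: 'mistral', model: 'mistral-large-3', perKeyLimit: '2 RPM' },
  { platform: 'groq', model: 'llama-3.3-70b', perKeyLimit: '1,000 RPD' },
  { platform: 'nvidia', model: '*', perKeyLimit: '40 RPM' },
]

// The router takes the first entry whose platform still has usable keys.
function nextCandidate(exhausted: Set<string>): ChainEntry | undefined {
  return fallbackChain.find(e => !exhausted.has(e.platform))
}
```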
== Key Pooling Multiplier
If 100 users each contribute one API key per platform:
#table(
columns: (auto, auto, auto),
align: (left, center, center),
[*Platform*], [*Per Key RPD*], [*100 Keys RPD*],
[Google AI Studio (Pro)], [100], [*10,000*],
[Groq (70B)], [1,000], [*100,000*],
[Cerebras], [~33K tok/hr], [*3.3M tok/hr*],
[OpenRouter (R1)], [50], [*5,000*],
[GitHub Models (GPT-4o)], [50], [*5,000*],
[NVIDIA NIM], [40 RPM], [*4,000 RPM*],
)
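The pooling multiplier is linear in the number of contributed keys; a quick check against the table (per-key RPD figures from above):

```typescript
// Aggregate daily request capacity scales linearly with pooled keys.
const perKeyRpd: Record<string, number> = {
  'google-pro': 100,
  'groq-70b': 1_000,
  'openrouter-r1': 50,
  'github-gpt4o': 50,
}

const pooledRpd = (platform: string, keys: number): number =>
  perKeyRpd[platform] * keys

// pooledRpd('google-pro', 100) → 10_000; pooledRpd('groq-70b', 100) → 100_000
```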
== Recommended Architecture
*"Quality burst" backends* (highest intelligence, low per-key limits):
- Gemini 2.5 Pro, DeepSeek R1, GPT-4o
*"Workhorse" backends* (high throughput, good intelligence):
- Cerebras Qwen3 235B (30M tok/mo/key)
- NVIDIA NIM (no daily cap, 100+ models)
- Groq Llama 3.3 70B (fast, reliable)
*"Speed" backends* (real-time chat):
- Groq: 276--1,665 tok/sec
- Cerebras: 1,400--2,600 tok/sec
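One way to read the three tiers above is as a routing hint: serve a request from its preferred tier, falling through to another tier when every backend in it is saturated. A sketch under that reading (tier names and membership mirror the bullets above; the selection order is a hypothetical policy, not the project's actual router):

```typescript
// Tiers from the report; membership lists mirror the bullets above.
type Tier = 'quality' | 'workhorse' | 'speed'

const tiers: Record<Tier, string[]> = {
  quality: ['gemini-2.5-pro', 'deepseek-r1', 'gpt-4o'],
  workhorse: ['cerebras/qwen3-235b', 'nvidia-nim', 'groq/llama-3.3-70b'],
  speed: ['groq/llama-3.3-70b', 'cerebras/qwen3-235b'],
}

// Hypothetical policy: prefer the requested tier, then fall through
// to the remaining tiers in a fixed order.
function pickBackend(intent: Tier, saturated: Set<string>): string | undefined {
  const order: Tier[] = [intent, 'workhorse', 'quality', 'speed']
  for (const t of order) {
    const hit = tiers[t].find(b => !saturated.has(b))
    if (hit) return hit
  }
  return undefined
}
```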
== Excluded Platforms
#table(
columns: (auto, auto),
align: (left, left),
[*Platform*], [*Reason*],
[OpenAI], [One-time \$5 trial credit, expires],
[Anthropic], [One-time trial credits only],
[Together AI], [\$25 signup credit, no confirmed ongoing free tier],
[DeepSeek], [5M free tokens expire in 30 days (but API is near-free at \$0.28/M)],
)
#v(1em)
#line(length: 100%, stroke: 0.5pt + luma(150))
#text(size: 8.5pt, fill: luma(100))[
_Free tier details change frequently. Verify current limits on each platform's pricing page. Benchmark scores from published papers, LMSYS Chatbot Arena, and OpenLLM Leaderboard as of April 2026._
]
{
"name": "freellmapi",
"private": true,
"workspaces": [
"shared",
"server",
"client"
],
"scripts": {
"dev": "concurrently \"npm run dev -w server\" \"npm run dev -w client\"",
"test": "npm run test -w server && npm run test -w client",
"build": "npm run build -w server && npm run build -w client",
"build:server": "npm run build -w server"
},
"devDependencies": {
"concurrently": "^9.1.2"
}
}
{
"name": "@freellmapi/server",
"version": "0.1.0",
"private": true,
"type": "module",
"scripts": {
"dev": "tsx watch src/index.ts",
"build": "tsc",
"start": "node dist/index.js",
"test": "vitest run",
"test:watch": "vitest"
},
"dependencies": {
"@freellmapi/shared": "*",
"better-sqlite3": "^11.8.1",
"cors": "^2.8.5",
"drizzle-orm": "^0.44.2",
"express": "^5.1.0",
"helmet": "^8.1.0",
"zod": "^3.24.4"
},
"devDependencies": {
"@types/better-sqlite3": "^7.6.13",
"@types/cors": "^2.8.17",
"@types/express": "^5.0.2",
"@types/node": "^22.15.3",
"drizzle-kit": "^0.31.1",
"tsx": "^4.19.4",
"typescript": "^5.8.3",
"vitest": "^3.1.3"
}
}
import { describe, it, expect, beforeAll, vi } from 'vitest';
import type { Express } from 'express';
import { createApp } from '../../app.js';
import { initDb, getDb } from '../../db/index.js';
async function req(app: Express, method: string, path: string, body?: any) {
const server = app.listen(0);
const addr = server.address() as any;
const url = `http://127.0.0.1:${addr.port}${path}`;
const res = await fetch(url, {
method,
headers: body ? { 'Content-Type': 'application/json' } : {},
body: body ? JSON.stringify(body) : undefined,
});
const data = await res.text();
server.close();
let json: any = null;
try { json = JSON.parse(data); } catch {}
return { status: res.status, body: json, headers: res.headers, raw: data };
}
describe('Full Integration Flow', () => {
let app: Express;
beforeAll(() => {
process.env.ENCRYPTION_KEY = '0'.repeat(64);
initDb(':memory:');
app = createApp();
// Clean
const db = getDb();
db.prepare('DELETE FROM api_keys').run();
db.prepare('DELETE FROM requests').run();
});
it('Step 1: Verify models are seeded', async () => {
const { status, body } = await req(app, 'GET', '/api/models');
expect(status).toBe(200);
expect(body.length).toBeGreaterThanOrEqual(14);
expect(body[0]).toHaveProperty('modelId');
expect(body[0]).toHaveProperty('hasProvider');
// All should have providers
for (const m of body) {
expect(m.hasProvider).toBe(true);
}
});
it('Step 2: Verify fallback chain is populated', async () => {
const { status, body } = await req(app, 'GET', '/api/fallback');
expect(status).toBe(200);
expect(body.length).toBeGreaterThanOrEqual(14);
expect(body[0]).toHaveProperty('priority');
expect(body[0]).toHaveProperty('enabled');
});
it('Step 3: Proxy returns 429 with no keys', async () => {
const { status, body } = await req(app, 'POST', '/v1/chat/completions', {
messages: [{ role: 'user', content: 'hello' }],
});
// 429 (all exhausted) or 502 (provider error) or 503 (no route)
expect([429, 502, 503]).toContain(status);
expect(body.error).toBeDefined();
});
it('Step 4: Add a Groq key', async () => {
const { status, body } = await req(app, 'POST', '/api/keys', {
platform: 'groq',
key: 'gsk_integration_test_key',
label: 'Integration Test',
});
expect(status).toBe(201);
expect(body.platform).toBe('groq');
expect(body.maskedKey).toContain('...');
});
it('Step 5: Proxy routes to Groq and handles provider error gracefully', async () => {
// Mock fetch to simulate a Groq API error
const origFetch = global.fetch;
vi.spyOn(global, 'fetch').mockImplementation(async (url, init) => {
const urlStr = typeof url === 'string' ? url : url.toString();
// If it's calling the Groq API, return an error
if (urlStr.includes('api.groq.com')) {
return {
ok: false,
status: 401,
statusText: 'Unauthorized',
json: () => Promise.resolve({ error: { message: 'Invalid API Key' } }),
} as any;
}
// Otherwise pass through (for our test server)
return origFetch(url, init);
});
const { status, body } = await req(app, 'POST', '/v1/chat/completions', {
messages: [{ role: 'user', content: 'hello' }],
});
// 502 (provider error) or 429 (all exhausted after retries)
expect([502, 429]).toContain(status);
expect(body.error).toBeDefined();
vi.restoreAllMocks();
});
it('Step 6: Error was logged in analytics', async () => {
const { status, body } = await req(app, 'GET', '/api/analytics/summary?range=24h');
expect(status).toBe(200);
// May or may not have logged depending on retry behavior
expect(body.totalRequests).toBeGreaterThanOrEqual(0);
});
it('Step 7: Sort fallback by speed', async () => {
const { status } = await req(app, 'POST', '/api/fallback/sort/speed');
expect(status).toBe(200);
const { body } = await req(app, 'GET', '/api/fallback');
expect(body[0].speedRank).toBe(1);
});
it('Step 8: Health endpoint works', async () => {
const { status, body } = await req(app, 'GET', '/api/health');
expect(status).toBe(200);
expect(body).toHaveProperty('platforms');
expect(body).toHaveProperty('keys');
});
it('Step 9: Delete a key if any exist', async () => {
// Add a fresh key to ensure we have one to delete
await req(app, 'POST', '/api/keys', {
platform: 'groq', key: 'gsk_delete_test', label: 'delete-test',
});
const { body: keys } = await req(app, 'GET', '/api/keys');
const target = keys.find((k: any) => k.label === 'delete-test');
expect(target).toBeDefined();
const { status } = await req(app, 'DELETE', `/api/keys/${target.id}`);
expect(status).toBe(200);
});
it('Step 10: Validate request schema', async () => {
const { status } = await req(app, 'POST', '/v1/chat/completions', {
messages: [], // empty
});
expect(status).toBe(400);
const { status: s2 } = await req(app, 'POST', '/v1/chat/completions', {
// missing messages entirely
});
expect(s2).toBe(400);
});
});
import { describe, it, expect, beforeAll } from 'vitest';
import { initDb } from '../../db/index.js';
import { encrypt, decrypt, maskKey } from '../../lib/crypto.js';
describe('Crypto', () => {
beforeAll(() => {
process.env.ENCRYPTION_KEY = '0'.repeat(64);
initDb(':memory:');
});
it('should encrypt and decrypt a key round-trip', () => {
const original = 'gsk_test1234567890abcdef';
const { encrypted, iv, authTag } = encrypt(original);
const decrypted = decrypt(encrypted, iv, authTag);
expect(decrypted).toBe(original);
});
it('should produce different ciphertext for same input (random IV)', () => {
const original = 'same-key';
const a = encrypt(original);
const b = encrypt(original);
expect(a.encrypted).not.toBe(b.encrypted);
expect(a.iv).not.toBe(b.iv);
});
it('should fail to decrypt with wrong auth tag', () => {
const { encrypted, iv } = encrypt('test-key');
expect(() => decrypt(encrypted, iv, 'a'.repeat(32))).toThrow();
});
describe('maskKey', () => {
it('should mask long keys', () => {
expect(maskKey('gsk_test1234567890abcdef')).toBe('gsk_...cdef');
});
it('should mask short keys', () => {
expect(maskKey('abcd')).toBe('****abcd');
});
});
});
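The round-trip behavior these tests assert can be sketched with Node's built-in AES-256-GCM primitives. This is an illustration of the expected interface, not the actual `lib/crypto.js` implementation — the hex encodings and 12-byte IV are assumptions inferred from the test expectations:

```typescript
import crypto from 'node:crypto';

// 32-byte key from a 64-char hex string, mirroring ENCRYPTION_KEY in the tests.
const key = Buffer.from('0'.repeat(64), 'hex');

function encrypt(plain: string) {
  const iv = crypto.randomBytes(12); // 96-bit IV, the GCM-recommended size
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const encrypted = Buffer.concat([cipher.update(plain, 'utf8'), cipher.final()]).toString('hex');
  // getAuthTag() is only valid after final(); a random IV makes ciphertexts differ per call
  return { encrypted, iv: iv.toString('hex'), authTag: cipher.getAuthTag().toString('hex') };
}

function decrypt(encrypted: string, ivHex: string, authTagHex: string): string {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, Buffer.from(ivHex, 'hex'));
  decipher.setAuthTag(Buffer.from(authTagHex, 'hex')); // a tampered tag makes final() throw
  return Buffer.concat([decipher.update(Buffer.from(encrypted, 'hex')), decipher.final()]).toString('utf8');
}
```

The auth-tag check is what makes the "wrong auth tag" test above meaningful: GCM authenticates the ciphertext, so decryption fails loudly rather than returning garbage.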
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { CerebrasProvider } from '../../providers/cerebras.js';
describe('CerebrasProvider', () => {
let provider: CerebrasProvider;
beforeEach(() => {
provider = new CerebrasProvider();
});
it('should have correct platform and name', () => {
expect(provider.platform).toBe('cerebras');
expect(provider.name).toBe('Cerebras');
});
it('should call Cerebras API with OpenAI-compatible format', async () => {
const mockResponse = {
id: 'chatcmpl-456',
object: 'chat.completion',
created: 1234567890,
model: 'qwen3-235b',
choices: [{
index: 0,
message: { role: 'assistant', content: 'Response from Cerebras' },
finish_reason: 'stop',
}],
usage: { prompt_tokens: 8, completion_tokens: 4, total_tokens: 12 },
};
let capturedUrl = '';
vi.spyOn(global, 'fetch').mockImplementation(async (url, _init) => {
capturedUrl = String(url); // fetch may receive string | URL | Request
return {
ok: true,
json: () => Promise.resolve(mockResponse),
} as any;
});
const result = await provider.chatCompletion(
'csk_test456',
[{ role: 'user', content: 'Hello' }],
'qwen3-235b',
);
expect(capturedUrl).toContain('api.cerebras.ai');
expect(result.choices[0].message.content).toBe('Response from Cerebras');
expect(result._routed_via?.platform).toBe('cerebras');
expect(result._routed_via?.model).toBe('qwen3-235b');
});
it('should validate key', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: true } as any);
expect(await provider.validateKey('valid')).toBe(true);
});
});
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { CloudflareProvider } from '../../providers/cloudflare.js';
describe('CloudflareProvider', () => {
let provider: CloudflareProvider;
beforeEach(() => {
provider = new CloudflareProvider();
});
it('should have correct platform and name', () => {
expect(provider.platform).toBe('cloudflare');
expect(provider.name).toBe('Cloudflare Workers AI');
});
it('should parse account_id:token key format', async () => {
let capturedUrl = '';
let capturedHeaders: Record<string, string> = {};
vi.spyOn(global, 'fetch').mockImplementation(async (url, init) => {
capturedUrl = String(url); // fetch may receive string | URL | Request
capturedHeaders = (init as any).headers;
return {
ok: true,
json: () => Promise.resolve({ result: { response: 'Hello from CF!' } }),
} as any;
});
const result = await provider.chatCompletion(
'abc123:my-token-here',
[{ role: 'user', content: 'Hi' }],
'@cf/meta/llama-3.1-70b-instruct',
);
expect(capturedUrl).toContain('abc123');
expect(capturedUrl).toContain('@cf/meta/llama-3.1-70b-instruct');
expect(capturedHeaders['Authorization']).toBe('Bearer my-token-here');
expect(result.choices[0].message.content).toBe('Hello from CF!');
});
it('should throw if key format is wrong', async () => {
await expect(
provider.chatCompletion('no-colon-here', [{ role: 'user', content: 'Hi' }], 'model')
).rejects.toThrow(/account_id:api_token/);
});
});
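The key-format handling these tests imply can be sketched as a small parser. This is an assumption about what `CloudflareProvider` does internally, not the shipped code; splitting on the first colon lets the API token itself contain colons:

```typescript
// Parse a combined "account_id:api_token" credential into its two parts.
function parseCloudflareKey(key: string): { accountId: string; token: string } {
  const idx = key.indexOf(':');
  if (idx === -1) {
    throw new Error('Cloudflare key must be in account_id:api_token format');
  }
  return { accountId: key.slice(0, idx), token: key.slice(idx + 1) };
}
```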
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { CohereProvider } from '../../providers/cohere.js';
describe('CohereProvider', () => {
let provider: CohereProvider;
beforeEach(() => {
provider = new CohereProvider();
});
it('should have correct platform and name', () => {
expect(provider.platform).toBe('cohere');
expect(provider.name).toBe('Cohere');
});
it('should translate response to OpenAI format', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({
ok: true,
json: () => Promise.resolve({
id: 'cohere-123',
message: { content: [{ type: 'text', text: 'Hello from Cohere!' }] },
finish_reason: 'COMPLETE',
usage: { tokens: { input_tokens: 10, output_tokens: 5 } },
}),
} as any);
const result = await provider.chatCompletion(
'test-key',
[{ role: 'user', content: 'Hi' }],
'command-r-plus-08-2024',
);
expect(result.object).toBe('chat.completion');
expect(result.choices[0].message.content).toBe('Hello from Cohere!');
expect(result.usage.prompt_tokens).toBe(10);
expect(result.usage.completion_tokens).toBe(5);
expect(result._routed_via?.platform).toBe('cohere');
});
it('should validate key', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: true } as any);
expect(await provider.validateKey('valid')).toBe(true);
});
});
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { GoogleProvider } from '../../providers/google.js';
describe('GoogleProvider', () => {
let provider: GoogleProvider;
beforeEach(() => {
provider = new GoogleProvider();
});
it('should have correct platform and name', () => {
expect(provider.platform).toBe('google');
expect(provider.name).toBe('Google AI Studio');
});
it('should call Gemini API and return OpenAI-compatible response', async () => {
const mockResponse = {
candidates: [{
content: { parts: [{ text: 'Hello from Gemini!' }] },
finishReason: 'STOP',
}],
usageMetadata: {
promptTokenCount: 10,
candidatesTokenCount: 5,
totalTokenCount: 15,
},
};
vi.spyOn(global, 'fetch').mockResolvedValueOnce({
ok: true,
json: () => Promise.resolve(mockResponse),
} as any);
const result = await provider.chatCompletion(
'test-key',
[{ role: 'user', content: 'Hi' }],
'gemini-2.5-pro',
);
expect(result.object).toBe('chat.completion');
expect(result.choices[0].message.content).toBe('Hello from Gemini!');
expect(result.choices[0].message.role).toBe('assistant');
expect(result.usage.prompt_tokens).toBe(10);
expect(result.usage.completion_tokens).toBe(5);
expect(result._routed_via?.platform).toBe('google');
});
it('should throw on API error', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({
ok: false,
status: 429,
statusText: 'Too Many Requests',
json: () => Promise.resolve({ error: { message: 'Rate limit exceeded' } }),
} as any);
await expect(
provider.chatCompletion('test-key', [{ role: 'user', content: 'Hi' }], 'gemini-2.5-pro')
).rejects.toThrow(/Rate limit exceeded/);
});
it('should validate key via models endpoint', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: true } as any);
expect(await provider.validateKey('valid-key')).toBe(true);
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: false, status: 401 } as any);
expect(await provider.validateKey('invalid-key')).toBe(false);
});
it('should translate system messages to systemInstruction', async () => {
let capturedBody: any;
vi.spyOn(global, 'fetch').mockImplementation(async (_url, init) => {
capturedBody = JSON.parse((init as any).body);
return {
ok: true,
json: () => Promise.resolve({
candidates: [{ content: { parts: [{ text: 'ok' }] }, finishReason: 'STOP' }],
usageMetadata: { promptTokenCount: 1, candidatesTokenCount: 1, totalTokenCount: 2 },
}),
} as any;
});
await provider.chatCompletion(
'test-key',
[
{ role: 'system', content: 'You are helpful' },
{ role: 'user', content: 'Hi' },
],
'gemini-2.5-pro',
);
expect(capturedBody.systemInstruction).toEqual({ parts: [{ text: 'You are helpful' }] });
expect(capturedBody.contents).toHaveLength(1);
expect(capturedBody.contents[0].role).toBe('user');
});
});
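The OpenAI-to-Gemini translation these tests assert can be sketched as a pure function — an assumption inferred from the assertions above (system messages collapse into `systemInstruction`, `assistant` maps to Gemini's `model` role), not the real provider code:

```typescript
type ChatMsg = { role: 'system' | 'user' | 'assistant'; content: string };

// Translate an OpenAI-style messages array into a Gemini generateContent body.
function toGeminiBody(messages: ChatMsg[]) {
  const system = messages
    .filter((m) => m.role === 'system')
    .map((m) => m.content)
    .join('\n');
  const contents = messages
    .filter((m) => m.role !== 'system')
    .map((m) => ({
      role: m.role === 'assistant' ? 'model' : 'user', // Gemini has no 'assistant' role
      parts: [{ text: m.content }],
    }));
  return system
    ? { systemInstruction: { parts: [{ text: system }] }, contents }
    : { contents };
}
```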
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { GroqProvider } from '../../providers/groq.js';
describe('GroqProvider', () => {
let provider: GroqProvider;
beforeEach(() => {
provider = new GroqProvider();
});
it('should have correct platform and name', () => {
expect(provider.platform).toBe('groq');
expect(provider.name).toBe('Groq');
});
it('should call Groq API with OpenAI-compatible format', async () => {
const mockResponse = {
id: 'chatcmpl-123',
object: 'chat.completion',
created: 1234567890,
model: 'llama-3.3-70b-versatile',
choices: [{
index: 0,
message: { role: 'assistant', content: 'Hello!' },
finish_reason: 'stop',
}],
usage: { prompt_tokens: 5, completion_tokens: 2, total_tokens: 7 },
};
let capturedHeaders: Record<string, string> = {};
vi.spyOn(global, 'fetch').mockImplementation(async (_url, init) => {
capturedHeaders = Object.fromEntries(
Object.entries((init as any).headers)
);
return {
ok: true,
json: () => Promise.resolve(mockResponse),
} as any;
});
const result = await provider.chatCompletion(
'gsk_test123',
[{ role: 'user', content: 'Hi' }],
'llama-3.3-70b-versatile',
);
expect(capturedHeaders['Authorization']).toBe('Bearer gsk_test123');
expect(result.choices[0].message.content).toBe('Hello!');
expect(result._routed_via?.platform).toBe('groq');
});
it('should throw on API error', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({
ok: false,
status: 401,
statusText: 'Unauthorized',
json: () => Promise.resolve({ error: { message: 'Invalid API key' } }),
} as any);
await expect(
provider.chatCompletion('bad-key', [{ role: 'user', content: 'Hi' }], 'llama-3.3-70b-versatile')
).rejects.toThrow(/Invalid API key/);
});
it('should validate key', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: true } as any);
expect(await provider.validateKey('valid')).toBe(true);
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: false } as any);
expect(await provider.validateKey('invalid')).toBe(false);
});
});
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { OpenAICompatProvider } from '../../providers/openai-compat.js';
describe('OpenAICompatProvider', () => {
let provider: OpenAICompatProvider;
beforeEach(() => {
provider = new OpenAICompatProvider({
platform: 'groq',
name: 'TestProvider',
baseUrl: 'https://api.test.com/v1',
extraHeaders: { 'X-Custom': 'test' },
});
});
it('should set platform and name from config', () => {
expect(provider.platform).toBe('groq');
expect(provider.name).toBe('TestProvider');
});
it('should call API with correct URL and headers', async () => {
let capturedUrl = '';
let capturedHeaders: Record<string, string> = {};
vi.spyOn(global, 'fetch').mockImplementation(async (url, init) => {
capturedUrl = String(url); // fetch may receive string | URL | Request
capturedHeaders = (init as any).headers;
return {
ok: true,
json: () => Promise.resolve({
id: 'test-id',
object: 'chat.completion',
created: 123,
model: 'test-model',
choices: [{ index: 0, message: { role: 'assistant', content: 'hi' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 },
}),
} as any;
});
await provider.chatCompletion('my-key', [{ role: 'user', content: 'test' }], 'test-model');
expect(capturedUrl).toBe('https://api.test.com/v1/chat/completions');
expect(capturedHeaders['Authorization']).toBe('Bearer my-key');
expect(capturedHeaders['X-Custom']).toBe('test');
});
it('should throw on error response', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({
ok: false,
status: 429,
statusText: 'Rate Limited',
json: () => Promise.resolve({ error: { message: 'Too many requests' } }),
} as any);
await expect(
provider.chatCompletion('key', [{ role: 'user', content: 'hi' }], 'model')
).rejects.toThrow(/Too many requests/);
});
it('should validate key using models endpoint', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: true } as any);
expect(await provider.validateKey('valid')).toBe(true);
});
});
describe('OpenAICompatProvider - platform instances', () => {
const platforms = [
{ platform: 'sambanova', name: 'SambaNova', baseUrl: 'https://api.sambanova.ai/v1' },
{ platform: 'nvidia', name: 'NVIDIA NIM', baseUrl: 'https://integrate.api.nvidia.com/v1' },
{ platform: 'mistral', name: 'Mistral', baseUrl: 'https://api.mistral.ai/v1' },
{ platform: 'openrouter', name: 'OpenRouter', baseUrl: 'https://openrouter.ai/api/v1' },
{ platform: 'github', name: 'GitHub Models', baseUrl: 'https://models.inference.ai.azure.com' },
{ platform: 'fireworks', name: 'Fireworks AI', baseUrl: 'https://api.fireworks.ai/inference/v1' },
] as const;
for (const p of platforms) {
it(`${p.name} provider should make requests to ${p.baseUrl}`, async () => {
const provider = new OpenAICompatProvider(p as any);
let capturedUrl = '';
vi.spyOn(global, 'fetch').mockImplementation(async (url) => {
capturedUrl = String(url); // fetch may receive string | URL | Request
return {
ok: true,
json: () => Promise.resolve({
id: 'id', object: 'chat.completion', created: 1, model: 'm',
choices: [{ index: 0, message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
}),
} as any;
});
const result = await provider.chatCompletion('key', [{ role: 'user', content: 'hi' }], 'model');
expect(capturedUrl).toContain(p.baseUrl);
expect(result._routed_via?.platform).toBe(p.platform);
});
}
});
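The request shape the `OpenAICompatProvider` tests assert — base URL joined with `/chat/completions`, a bearer token, plus any configured extra headers — can be sketched as a builder. Illustrative only, under the assumption that the provider does a straightforward fetch with these pieces:

```typescript
type Message = { role: string; content: string };

// Build the URL and fetch init for an OpenAI-compatible chat completion call.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: Message[],
  extraHeaders: Record<string, string> = {},
) {
  return {
    url: `${baseUrl}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
        ...extraHeaders, // e.g. OpenRouter's attribution headers
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

Because every platform in the table above speaks this same wire format, one adapter class parameterized by `baseUrl` covers six providers.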
import { describe, it, expect, beforeAll } from 'vitest';
import type { Express } from 'express';
import { createApp } from '../../app.js';
import { initDb } from '../../db/index.js';
async function request(app: Express, method: string, path: string, body?: any) {
const server = app.listen(0);
const addr = server.address() as any;
const url = `http://127.0.0.1:${addr.port}${path}`;
const res = await fetch(url, {
method,
headers: body ? { 'Content-Type': 'application/json' } : {},
body: body ? JSON.stringify(body) : undefined,
});
const data = await res.json().catch(() => null);
server.close();
return { status: res.status, body: data };
}
describe('Fallback API', () => {
let app: Express;
beforeAll(() => {
process.env.ENCRYPTION_KEY = '0'.repeat(64);
initDb(':memory:');
app = createApp();
});
it('GET /api/fallback returns fallback chain', async () => {
const { status, body } = await request(app, 'GET', '/api/fallback');
expect(status).toBe(200);
expect(Array.isArray(body)).toBe(true);
expect(body.length).toBeGreaterThan(0);
// Should be sorted by priority
for (let i = 1; i < body.length; i++) {
expect(body[i].priority).toBeGreaterThanOrEqual(body[i - 1].priority);
}
});
it('GET /api/fallback entries have expected fields', async () => {
const { body } = await request(app, 'GET', '/api/fallback');
const first = body[0];
expect(first).toHaveProperty('modelDbId');
expect(first).toHaveProperty('priority');
expect(first).toHaveProperty('enabled');
expect(first).toHaveProperty('platform');
expect(first).toHaveProperty('displayName');
expect(first).toHaveProperty('intelligenceRank');
});
it('PUT /api/fallback updates order', async () => {
const { body: original } = await request(app, 'GET', '/api/fallback');
// Reverse the order
const reversed = original.map((e: any, i: number) => ({
modelDbId: e.modelDbId,
priority: original.length - i,
enabled: e.enabled,
}));
const { status } = await request(app, 'PUT', '/api/fallback', reversed);
expect(status).toBe(200);
// Verify order changed
const { body: after } = await request(app, 'GET', '/api/fallback');
expect(after[0].modelDbId).toBe(original[original.length - 1].modelDbId);
// Restore original order
const restore = original.map((e: any, i: number) => ({
modelDbId: e.modelDbId,
priority: i + 1,
enabled: e.enabled,
}));
await request(app, 'PUT', '/api/fallback', restore);
});
it('POST /api/fallback/sort/intelligence sorts by intelligence', async () => {
const { status } = await request(app, 'POST', '/api/fallback/sort/intelligence');
expect(status).toBe(200);
const { body } = await request(app, 'GET', '/api/fallback');
// Should be sorted ascending by intelligence rank
for (let i = 1; i < body.length; i++) {
expect(body[i].intelligenceRank).toBeGreaterThanOrEqual(body[i - 1].intelligenceRank);
}
});
it('POST /api/fallback/sort/speed sorts by speed', async () => {
const { status } = await request(app, 'POST', '/api/fallback/sort/speed');
expect(status).toBe(200);
const { body } = await request(app, 'GET', '/api/fallback');
// Should be sorted ascending by speed rank
for (let i = 1; i < body.length; i++) {
expect(body[i].speedRank).toBeGreaterThanOrEqual(body[i - 1].speedRank);
}
});
it('POST /api/fallback/sort/invalid returns 400', async () => {
const { status } = await request(app, 'POST', '/api/fallback/sort/invalid');
expect(status).toBe(400);
});
});
import { describe, it, expect, beforeAll, beforeEach } from 'vitest';
import type { Express } from 'express';
import { createApp } from '../../app.js';
import { initDb, getDb } from '../../db/index.js';
async function request(app: Express, method: string, path: string, body?: any) {
const server = app.listen(0);
const addr = server.address() as any;
const url = `http://127.0.0.1:${addr.port}${path}`;
const res = await fetch(url, {
method,
headers: body ? { 'Content-Type': 'application/json' } : {},
body: body ? JSON.stringify(body) : undefined,
});
const data = await res.json().catch(() => null);
server.close();
return { status: res.status, body: data };
}
describe('Keys API', () => {
let app: Express;
beforeAll(() => {
process.env.ENCRYPTION_KEY = '0'.repeat(64);
initDb(':memory:');
app = createApp();
});
beforeEach(() => {
const db = getDb();
db.prepare('DELETE FROM api_keys').run();
});
it('GET /api/keys returns empty array initially', async () => {
const { status, body } = await request(app, 'GET', '/api/keys');
expect(status).toBe(200);
expect(body).toEqual([]);
});
it('POST /api/keys creates a new key', async () => {
const { status, body } = await request(app, 'POST', '/api/keys', {
platform: 'groq',
key: 'gsk_test123456789',
label: 'My Groq Key',
});
expect(status).toBe(201);
expect(body.platform).toBe('groq');
expect(body.label).toBe('My Groq Key');
expect(body.maskedKey).toContain('...');
});
it('GET /api/keys returns the created key', async () => {
// First create a key
await request(app, 'POST', '/api/keys', {
platform: 'groq',
key: 'gsk_test123456789',
});
const { status, body } = await request(app, 'GET', '/api/keys');
expect(status).toBe(200);
expect(body).toHaveLength(1);
expect(body[0].platform).toBe('groq');
});
it('POST /api/keys rejects invalid platform', async () => {
const { status } = await request(app, 'POST', '/api/keys', {
platform: 'invalid_platform',
key: 'test',
});
expect(status).toBe(400);
});
it('POST /api/keys rejects missing key', async () => {
const { status } = await request(app, 'POST', '/api/keys', {
platform: 'groq',
});
expect(status).toBe(400);
});
it('DELETE /api/keys/:id removes a key', async () => {
const { body: created } = await request(app, 'POST', '/api/keys', {
platform: 'groq',
key: 'gsk_test123456789',
});
const { status } = await request(app, 'DELETE', `/api/keys/${created.id}`);
expect(status).toBe(200);
const { body: after } = await request(app, 'GET', '/api/keys');
expect(after).toHaveLength(0);
});
it('DELETE /api/keys/:id returns 404 for nonexistent key', async () => {
const { status } = await request(app, 'DELETE', '/api/keys/99999');
expect(status).toBe(404);
});
});
import { describe, it, expect, beforeEach } from 'vitest';
import {
canMakeRequest,
canUseTokens,
recordRequest,
recordTokens,
getRateLimitStatus,
} from '../../services/ratelimit.js';
describe('Rate Limiter', () => {
// Use unique identifiers per test to avoid cross-contamination
let testId: number;
beforeEach(() => {
testId = Math.floor(Math.random() * 1_000_000);
});
describe('canMakeRequest', () => {
it('should allow request when under RPM limit', () => {
expect(canMakeRequest('groq', 'llama-70b', testId, {
rpm: 30, rpd: null, tpm: null, tpd: null,
})).toBe(true);
});
it('should deny request when RPM limit reached', () => {
const limits = { rpm: 2, rpd: null, tpm: null, tpd: null };
recordRequest('groq', 'llama-70b', testId);
recordRequest('groq', 'llama-70b', testId);
expect(canMakeRequest('groq', 'llama-70b', testId, limits)).toBe(false);
});
it('should deny request when RPD limit reached', () => {
const limits = { rpm: null, rpd: 1, tpm: null, tpd: null };
recordRequest('google', 'gemini', testId);
expect(canMakeRequest('google', 'gemini', testId, limits)).toBe(false);
});
it('should allow request when limits are null (unlimited)', () => {
expect(canMakeRequest('nvidia', 'nemotron', testId, {
rpm: null, rpd: null, tpm: null, tpd: null,
})).toBe(true);
});
});
describe('canUseTokens', () => {
it('should allow tokens when under TPM limit', () => {
expect(canUseTokens('groq', 'llama-70b', testId, 500, {
tpm: 6000, tpd: null,
})).toBe(true);
});
it('should deny tokens when TPM limit would be exceeded', () => {
recordTokens('cerebras', 'qwen3', testId, 50000);
expect(canUseTokens('cerebras', 'qwen3', testId, 20000, {
tpm: 60000, tpd: null,
})).toBe(false);
});
it('should allow when limit is null', () => {
expect(canUseTokens('nvidia', 'nemotron', testId, 100000, {
tpm: null, tpd: null,
})).toBe(true);
});
});
describe('getRateLimitStatus', () => {
it('should return current usage counts', () => {
const limits = { rpm: 30, rpd: 1000, tpm: 6000, tpd: null };
recordRequest('groq', 'test-model', testId);
recordRequest('groq', 'test-model', testId);
recordTokens('groq', 'test-model', testId, 500);
const status = getRateLimitStatus('groq', 'test-model', testId, limits);
expect(status.rpm.used).toBe(2);
expect(status.rpm.limit).toBe(30);
expect(status.rpd.used).toBe(2);
expect(status.tpm.used).toBe(500);
});
});
});
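The semantics these tests rely on — per-(platform, model, key) counters, with a `null` limit meaning unlimited — can be sketched as a fixed-window counter. The real `services/ratelimit.js` may use a different window strategy; this is a sketch, not the shipped code:

```typescript
const requestCounts = new Map<string, number>();

// Bucket counts by wall-clock minute so the window resets automatically.
function minuteKey(platform: string, model: string, keyId: number): string {
  const minute = Math.floor(Date.now() / 60_000);
  return `${platform}:${model}:${keyId}:${minute}`;
}

function recordRequest(platform: string, model: string, keyId: number): void {
  const k = minuteKey(platform, model, keyId);
  requestCounts.set(k, (requestCounts.get(k) ?? 0) + 1);
}

function canMakeRequest(platform: string, model: string, keyId: number, rpm: number | null): boolean {
  if (rpm === null) return true; // null limit means unlimited
  return (requestCounts.get(minuteKey(platform, model, keyId)) ?? 0) < rpm;
}
```

A fixed window is the simplest scheme that passes tests like these; its known weakness is a burst straddling two windows, which a sliding window or token bucket would smooth out.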
import { describe, it, expect, beforeAll, beforeEach } from 'vitest';
import { initDb, getDb } from '../../db/index.js';
import { encrypt } from '../../lib/crypto.js';
import { routeRequest } from '../../services/router.js';
describe('Router', () => {
beforeAll(() => {
process.env.ENCRYPTION_KEY = '0'.repeat(64);
initDb(':memory:');
});
beforeEach(() => {
const db = getDb();
db.prepare('DELETE FROM api_keys').run();
// Reset fallback order to intelligence ranking
const models = db.prepare('SELECT id, intelligence_rank FROM models ORDER BY intelligence_rank ASC').all() as any[];
const update = db.prepare('UPDATE fallback_config SET priority = ? WHERE model_db_id = ?');
for (let i = 0; i < models.length; i++) {
update.run(i + 1, models[i].id);
}
});
it('should throw when no keys are configured', () => {
expect(() => routeRequest()).toThrow(/exhausted/i);
});
it('should route to highest priority model with available key', () => {
const db = getDb();
const { encrypted, iv, authTag } = encrypt('test-groq-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('groq', 'test', encrypted, iv, authTag, 'healthy', 1);
const result = routeRequest();
expect(result.platform).toBe('groq');
expect(result.apiKey).toBe('test-groq-key');
});
it('should prefer higher-priority model when keys exist for multiple platforms', () => {
const db = getDb();
const googleKey = encrypt('test-google-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('google', 'test', googleKey.encrypted, googleKey.iv, googleKey.authTag, 'healthy', 1);
const groqKey = encrypt('test-groq-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('groq', 'test', groqKey.encrypted, groqKey.iv, groqKey.authTag, 'healthy', 1);
const result = routeRequest();
expect(result.platform).toBe('google');
});
it('should skip disabled keys', () => {
const db = getDb();
const googleKey = encrypt('test-google-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('google', 'disabled', googleKey.encrypted, googleKey.iv, googleKey.authTag, 'healthy', 0);
const groqKey = encrypt('test-groq-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('groq', 'test', groqKey.encrypted, groqKey.iv, groqKey.authTag, 'healthy', 1);
const result = routeRequest();
expect(result.platform).toBe('groq');
});
it('should skip invalid keys', () => {
const db = getDb();
const invalidKey = encrypt('invalid-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('google', 'invalid', invalidKey.encrypted, invalidKey.iv, invalidKey.authTag, 'invalid', 1);
const groqKey = encrypt('test-groq-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('groq', 'test', groqKey.encrypted, groqKey.iv, groqKey.authTag, 'healthy', 1);
const result = routeRequest();
expect(result.platform).toBe('groq');
});
});
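The routing rule these tests exercise — walk the fallback chain in priority order and pick the first model whose platform has an enabled, healthy key — can be sketched as a pure function. Names and shapes here are illustrative, not the real `services/router.js`:

```typescript
type ChainEntry = { platform: string; model: string; priority: number };
type StoredKey = { platform: string; apiKey: string; enabled: boolean; status: string };

// First usable (enabled + healthy) key along the priority-ordered chain wins.
function pickRoute(chain: ChainEntry[], keys: StoredKey[]): { platform: string; model: string; apiKey: string } {
  const sorted = [...chain].sort((a, b) => a.priority - b.priority);
  for (const entry of sorted) {
    const key = keys.find((k) => k.platform === entry.platform && k.enabled && k.status === 'healthy');
    if (key) return { platform: entry.platform, model: entry.model, apiKey: key.apiKey };
  }
  throw new Error('All providers exhausted: no usable key for any model in the chain');
}
```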
import express from 'express';
import cors from 'cors';
import helmet from 'helmet';
import path from 'path';
import { fileURLToPath } from 'url';
import { keysRouter } from './routes/keys.js';
import { modelsRouter } from './routes/models.js';
import { proxyRouter } from './routes/proxy.js';
import { fallbackRouter } from './routes/fallback.js';
import { analyticsRouter } from './routes/analytics.js';
import { healthRouter } from './routes/health.js';
import { settingsRouter } from './routes/settings.js';
import { errorHandler } from './middleware/errorHandler.js';
const __dirname = path.dirname(fileURLToPath(import.meta.url));
export function createApp() {
const app = express();
app.use(helmet({ contentSecurityPolicy: false, hsts: false }));
app.use(cors());
app.use(express.json({ limit: '1mb' }));
// API routes
app.use('/api/keys', keysRouter);
app.use('/api/models', modelsRouter);
app.use('/api/fallback', fallbackRouter);
app.use('/api/analytics', analyticsRouter);
app.use('/api/health', healthRouter);
app.use('/api/settings', settingsRouter);
// OpenAI-compatible proxy
app.use('/v1', proxyRouter);
// Health check
app.get('/api/ping', (_req, res) => {
res.json({ status: 'ok', timestamp: new Date().toISOString() });
});
// Error handler (for API routes)
app.use(errorHandler);
// Serve client static files (after API error handler)
const clientDist = path.resolve(__dirname, '../../client/dist');
app.use(express.static(clientDist));
// SPA fallback — serve index.html for non-API routes
app.use((req, res, next) => {
if (req.path.startsWith('/api/') || req.path.startsWith('/v1/')) {
next();
return;
}
res.sendFile(path.join(clientDist, 'index.html'));
});
return app;
}
import crypto from 'crypto';
import Database from 'better-sqlite3';
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
import { initEncryptionKey } from '../lib/crypto.js';
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const DB_PATH = path.resolve(__dirname, '../../data/freeapi.db');
let db: Database.Database;
export function getDb(): Database.Database {
if (!db) {
throw new Error('Database not initialized. Call initDb() first.');
}
return db;
}
export function initDb(dbPath?: string): Database.Database {
const resolvedPath = dbPath ?? DB_PATH;
const isMemory = resolvedPath === ':memory:';
if (!isMemory) {
const dataDir = path.dirname(resolvedPath);
if (!fs.existsSync(dataDir)) {
fs.mkdirSync(dataDir, { recursive: true });
}
}
db = new Database(resolvedPath);
if (!isMemory) db.pragma('journal_mode = WAL');
db.pragma('foreign_keys = ON');
createTables(db);
initEncryptionKey(db);
seedModels(db);
migrateModels(db);
migrateModelsV2(db);
migrateModelsV3Ranks(db);
ensureUnifiedKey(db);
console.log(`Database initialized at ${resolvedPath}`);
return db;
}
function createTables(db: Database.Database) {
db.exec(`
CREATE TABLE IF NOT EXISTS models (
id INTEGER PRIMARY KEY AUTOINCREMENT,
platform TEXT NOT NULL,
model_id TEXT NOT NULL,
display_name TEXT NOT NULL,
intelligence_rank INTEGER NOT NULL,
speed_rank INTEGER NOT NULL,
size_label TEXT NOT NULL DEFAULT '',
rpm_limit INTEGER,
rpd_limit INTEGER,
tpm_limit INTEGER,
tpd_limit INTEGER,
monthly_token_budget TEXT NOT NULL DEFAULT '',
context_window INTEGER,
enabled INTEGER NOT NULL DEFAULT 1,
UNIQUE(platform, model_id)
);
CREATE TABLE IF NOT EXISTS api_keys (
id INTEGER PRIMARY KEY AUTOINCREMENT,
platform TEXT NOT NULL,
label TEXT NOT NULL DEFAULT '',
encrypted_key TEXT NOT NULL,
iv TEXT NOT NULL,
auth_tag TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'unknown',
enabled INTEGER NOT NULL DEFAULT 1,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
last_checked_at TEXT
);
CREATE TABLE IF NOT EXISTS requests (
id INTEGER PRIMARY KEY AUTOINCREMENT,
platform TEXT NOT NULL,
model_id TEXT NOT NULL,
status TEXT NOT NULL,
input_tokens INTEGER NOT NULL DEFAULT 0,
output_tokens INTEGER NOT NULL DEFAULT 0,
latency_ms INTEGER NOT NULL DEFAULT 0,
error TEXT,
created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS fallback_config (
id INTEGER PRIMARY KEY AUTOINCREMENT,
model_db_id INTEGER NOT NULL REFERENCES models(id),
priority INTEGER NOT NULL,
enabled INTEGER NOT NULL DEFAULT 1,
UNIQUE(model_db_id)
);
CREATE TABLE IF NOT EXISTS settings (
key TEXT PRIMARY KEY,
value TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_requests_created_at ON requests(created_at);
CREATE INDEX IF NOT EXISTS idx_requests_platform ON requests(platform);
CREATE INDEX IF NOT EXISTS idx_api_keys_platform ON api_keys(platform);
`);
}
function seedModels(db: Database.Database) {
const count = db.prepare('SELECT COUNT(*) as cnt FROM models').get() as { cnt: number };
if (count.cnt > 0) return;
const insert = db.prepare(`
INSERT INTO models (platform, model_id, display_name, intelligence_rank, speed_rank, size_label, rpm_limit, rpd_limit, tpm_limit, tpd_limit, monthly_token_budget, context_window)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
// NOTE: Limits current as of April 2026. See migrateModels() for in-place updates.
const models = [
// Google — gemini-2.5-flash free quotas were cut Dec 2025 (now ~20 RPD, budget much lower than before)
['google', 'gemini-2.5-pro', 'Gemini 2.5 Pro', 1, 8, 'Frontier', 5, 100, 250000, null, '~12M', 1048576],
['google', 'gemini-2.5-flash', 'Gemini 2.5 Flash', 4, 5, 'Large', 10, 20, 250000, null, '~3M', 1048576],
['google', 'gemini-2.5-flash-lite', 'Gemini 2.5 Flash-Lite', 8, 3, 'Medium', 15, 1000, 250000, null, '~120M', 1048576],
// OpenRouter — upgraded DeepSeek R1 -> V3.1 (stronger reasoning); default RPD ~200
['openrouter', 'deepseek/deepseek-v3.1:free', 'DeepSeek V3.1 (free)', 2, 10, 'Frontier', 20, 200, null, null, '~6M', 131072],
['openrouter', 'moonshotai/kimi-k2:free', 'Kimi K2 (free)', 2, 9, 'Frontier', 20, 200, null, null, '~6M', 131072],
['openrouter', 'qwen/qwen3-coder:free', 'Qwen3 Coder (free)', 3, 9, 'Frontier', 20, 200, null, null, '~6M', 262144],
['openrouter', 'z-ai/glm-4.5-air:free', 'GLM-4.5 Air (free)', 4, 9, 'Large', 20, 200, null, null, '~6M', 131072],
// Cerebras — same 30 RPM / 1M TPD free pool; adding frontier coder, Llama 4 Maverick, GPT-OSS
['cerebras', 'qwen-3-coder-480b', 'Qwen3-Coder 480B', 2, 1, 'Frontier', 30, null, 60000, 1000000, '~30M', 131072],
['cerebras', 'llama-4-maverick-17b-128e-instruct', 'Llama 4 Maverick', 3, 1, 'Frontier', 30, null, 60000, 1000000, '~30M', 131072],
['cerebras', 'qwen3-235b', 'Qwen3 235B', 3, 1, 'Large', 30, null, 60000, 1000000, '~30M', 8192],
['cerebras', 'gpt-oss-120b', 'GPT-OSS 120B', 3, 1, 'Large', 30, null, 60000, 1000000, '~30M', 131072],
// GitHub Models — GPT-4o replaced with GPT-5 (same free tier key)
['github', 'openai/gpt-5', 'GPT-5 (GitHub)', 1, 7, 'Frontier', 10, 50, null, null, '~18M', 128000],
// SambaNova — 70B RPM bumped to 20
['sambanova', 'Meta-Llama-3.3-70B-Instruct', 'Llama 3.3 70B', 6, 9, 'Large', 20, null, null, 200000, '~6M', 8192],
// Mistral — Experiment pool ~1B tokens/mo shared across all models
['mistral', 'mistral-large-latest', 'Mistral Large 3', 7, 8, 'Large', 2, null, 500000, null, '~50-100M', 131072],
['mistral', 'magistral-medium-latest', 'Magistral Medium', 4, 8, 'Large', 2, null, 500000, null, '~50-100M', 40000],
['mistral', 'codestral-latest', 'Codestral', 6, 6, 'Medium', 2, null, 500000, null, '~50-100M', 32000],
// Groq — scout TPM corrected to 6k (not 30k)
['groq', 'llama-3.3-70b-versatile', 'Llama 3.3 70B', 9, 2, 'Medium', 30, 1000, 6000, 500000, '~15M', 131072],
['groq', 'llama-4-scout-17b-16e-instruct', 'Llama 4 Scout', 10, 2, 'Medium', 30, 1000, 6000, 1000000, '~30M', 131072],
// NVIDIA NIM — moved to credit-based model in 2025; no longer truly recurring monthly. Disabled by default.
['nvidia', 'meta/llama-3.1-70b-instruct', 'Llama 3.1 70B (NV)', 11, 6, 'Large', 40, null, null, null, 'credits-based', 131072],
// Cohere — trial tier is 1000 calls/mo total → realistic budget 1-2M
['cohere', 'command-r-plus-08-2024', 'Command R+ (08-2024)', 12, 11, 'Large', 20, 33, null, null, '~1-2M', 131072],
['cloudflare', '@cf/meta/llama-3.1-70b-instruct', 'Llama 3.1 70B (CF)', 13, 11, 'Medium', null, null, null, null, '~18-45M', 131072],
// Hugging Face — free Inference credits are ~$0.10/mo → budget closer to 1-3M on a 70B model
['huggingface', 'accounts/fireworks/models/llama-v3p3-70b-instruct', 'Llama 3.3 70B (HF)', 14, 11, 'Medium', null, null, null, null, '~1-3M', 131072],
// New providers — recurring monthly free tiers, no card required
['zhipu', 'glm-4.5-flash', 'GLM-4.5 Flash', 5, 4, 'Large', null, null, null, 1000000, '~30M', 131072],
['moonshot', 'kimi-latest', 'Kimi Latest', 4, 8, 'Large', 60, null, null, 500000, '~15M', 200000],
['minimax', 'MiniMax-M1', 'MiniMax M1', 5, 8, 'Large', 20, null, 1000000, null, '~30M', 200000],
];
const insertMany = db.transaction(() => {
for (const m of models) {
insert.run(...m);
}
});
insertMany();
// Seed default fallback config from models
const allModels = db.prepare('SELECT id, intelligence_rank FROM models ORDER BY intelligence_rank ASC').all() as { id: number; intelligence_rank: number }[];
const insertFallback = db.prepare('INSERT INTO fallback_config (model_db_id, priority, enabled) VALUES (?, ?, 1)');
const insertFallbacks = db.transaction(() => {
for (let i = 0; i < allModels.length; i++) {
insertFallback.run(allModels[i].id, i + 1);
}
});
insertFallbacks();
console.log(`Seeded ${models.length} models and fallback config`);
}
/**
* Idempotent migration to bring existing DBs up to the April 2026 pool.
* Covers: replaces outdated models (DeepSeek R1 → V3.1, GPT-4o → GPT-5),
* corrects stale rate-limits / monthly budgets, adds new smarter models
* and three new providers (Zhipu, Moonshot, MiniMax).
*/
function migrateModels(db: Database.Database) {
// 1) Replace outdated models in-place (preserves fallback_config & any references)
const renameStmt = db.prepare(`
UPDATE models
SET model_id = ?, display_name = ?, intelligence_rank = ?,
monthly_token_budget = ?, rpd_limit = COALESCE(?, rpd_limit),
context_window = COALESCE(?, context_window),
size_label = COALESCE(?, size_label)
WHERE platform = ? AND model_id = ?
`);
// DeepSeek R1 (free) -> DeepSeek V3.1 (free)
renameStmt.run('deepseek/deepseek-v3.1:free', 'DeepSeek V3.1 (free)', 2, '~6M', 200, 131072, 'Frontier', 'openrouter', 'deepseek/deepseek-r1:free');
// GitHub GPT-4o -> GPT-5
renameStmt.run('openai/gpt-5', 'GPT-5 (GitHub)', 1, '~18M', null, 128000, 'Frontier', 'github', 'gpt-4o');
// 2) Correct stale limits / budgets on existing rows
db.prepare(`UPDATE models SET rpd_limit = 20, monthly_token_budget = '~3M' WHERE platform = 'google' AND model_id = 'gemini-2.5-flash'`).run();
db.prepare(`UPDATE models SET rpm_limit = 20 WHERE platform = 'sambanova' AND model_id = 'Meta-Llama-3.3-70B-Instruct'`).run();
db.prepare(`UPDATE models SET tpm_limit = 6000 WHERE platform = 'groq' AND model_id = 'llama-4-scout-17b-16e-instruct'`).run();
db.prepare(`UPDATE models SET monthly_token_budget = '~1-2M' WHERE platform = 'cohere' AND model_id = 'command-r-plus-08-2024'`).run();
db.prepare(`UPDATE models SET monthly_token_budget = '~1-3M' WHERE platform = 'huggingface' AND model_id = 'accounts/fireworks/models/llama-v3p3-70b-instruct'`).run();
// NVIDIA moved to credit model — disable and label accordingly
db.prepare(`UPDATE models SET monthly_token_budget = 'credits-based', enabled = 0 WHERE platform = 'nvidia' AND model_id = 'meta/llama-3.1-70b-instruct'`).run();
// 3) Insert new models (UNIQUE(platform, model_id) makes this idempotent)
const insert = db.prepare(`
INSERT OR IGNORE INTO models (platform, model_id, display_name, intelligence_rank, speed_rank, size_label, rpm_limit, rpd_limit, tpm_limit, tpd_limit, monthly_token_budget, context_window)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
const newModels: Array<[string, string, string, number, number, string, number | null, number | null, number | null, number | null, string, number | null]> = [
// Cerebras — same free pool as qwen3-235b
['cerebras', 'qwen-3-coder-480b', 'Qwen3-Coder 480B', 2, 1, 'Frontier', 30, null, 60000, 1000000, '~30M', 131072],
['cerebras', 'llama-4-maverick-17b-128e-instruct', 'Llama 4 Maverick', 3, 1, 'Frontier', 30, null, 60000, 1000000, '~30M', 131072],
['cerebras', 'gpt-oss-120b', 'GPT-OSS 120B', 3, 1, 'Large', 30, null, 60000, 1000000, '~30M', 131072],
// OpenRouter free tier
['openrouter', 'deepseek/deepseek-v3.1:free', 'DeepSeek V3.1 (free)', 2, 10, 'Frontier', 20, 200, null, null, '~6M', 131072],
['openrouter', 'moonshotai/kimi-k2:free', 'Kimi K2 (free)', 2, 9, 'Frontier', 20, 200, null, null, '~6M', 131072],
['openrouter', 'qwen/qwen3-coder:free', 'Qwen3 Coder (free)', 3, 9, 'Frontier', 20, 200, null, null, '~6M', 262144],
['openrouter', 'z-ai/glm-4.5-air:free', 'GLM-4.5 Air (free)', 4, 9, 'Large', 20, 200, null, null, '~6M', 131072],
// Mistral Experiment pool — shared ~1B/mo across models
['mistral', 'magistral-medium-latest', 'Magistral Medium', 4, 8, 'Large', 2, null, 500000, null, '~50-100M', 40000],
['mistral', 'codestral-latest', 'Codestral', 6, 6, 'Medium', 2, null, 500000, null, '~50-100M', 32000],
// New providers
['zhipu', 'glm-4.5-flash', 'GLM-4.5 Flash', 5, 4, 'Large', null, null, null, 1000000, '~30M', 131072],
['moonshot', 'kimi-latest', 'Kimi Latest', 4, 8, 'Large', 60, null, null, 500000, '~15M', 200000],
['minimax', 'MiniMax-M1', 'MiniMax M1', 5, 8, 'Large', 20, null, 1000000, null, '~30M', 200000],
];
const apply = db.transaction(() => {
for (const m of newModels) insert.run(...m);
// Ensure every model has a fallback_config row (new inserts + any orphans)
const missing = db.prepare(`
SELECT m.id FROM models m
LEFT JOIN fallback_config f ON m.id = f.model_db_id
WHERE f.id IS NULL
ORDER BY m.intelligence_rank ASC
`).all() as { id: number }[];
if (missing.length > 0) {
const maxPriority = (db.prepare('SELECT COALESCE(MAX(priority), 0) AS mx FROM fallback_config').get() as { mx: number }).mx;
const addFallback = db.prepare('INSERT INTO fallback_config (model_db_id, priority, enabled) VALUES (?, ?, 1)');
for (let i = 0; i < missing.length; i++) {
addFallback.run(missing[i].id, maxPriority + i + 1);
}
}
});
apply();
}
/**
* Second-pass migration after live-testing every model against its provider.
* Corrects model IDs verified wrong, removes models not actually available on
* the current free tier, and adds real :free OpenRouter models found in the
* live catalog (April 2026).
*/
function migrateModelsV2(db: Database.Database) {
// Helper: delete a model and its fallback_config entry first (SQLite FKs default to NO ACTION, which blocks deleting a referenced row)
const deleteModel = db.prepare(`DELETE FROM models WHERE platform = ? AND model_id = ?`);
const deleteFallback = db.prepare(`
DELETE FROM fallback_config WHERE model_db_id IN (
SELECT id FROM models WHERE platform = ? AND model_id = ?
)
`);
const removals: Array<[string, string]> = [
// GitHub free tier does NOT include GPT-5 (only catalog-listed). Revert handled below.
// Cerebras: qwen-3-coder-480b and llama-4-maverick not on free tier; gpt-oss-120b is listed
// but requires special access — our key gets 404. Remove all three.
['cerebras', 'qwen-3-coder-480b'],
['cerebras', 'llama-4-maverick-17b-128e-instruct'],
['cerebras', 'gpt-oss-120b'],
// These OpenRouter :free variants do not exist in the live catalog (April 2026)
['openrouter', 'deepseek/deepseek-v3.1:free'],
['openrouter', 'moonshotai/kimi-k2:free'],
];
const applyRemovals = db.transaction(() => {
for (const [p, m] of removals) {
deleteFallback.run(p, m);
deleteModel.run(p, m);
}
});
applyRemovals();
// GitHub: gpt-5 is in the model catalog but returns "unavailable_model" on free tier
// inference. Revert to gpt-4o which works. This only runs if the gpt-5 row exists.
db.prepare(`
UPDATE models
SET model_id = 'gpt-4o', display_name = 'GPT-4o', intelligence_rank = 5,
size_label = 'Large', context_window = 8000, monthly_token_budget = '~18M'
WHERE platform = 'github' AND model_id = 'openai/gpt-5'
`).run();
// Groq: scout requires the meta-llama/ publisher prefix
db.prepare(`
UPDATE models SET model_id = 'meta-llama/llama-4-scout-17b-16e-instruct'
WHERE platform = 'groq' AND model_id = 'llama-4-scout-17b-16e-instruct'
`).run();
// Add real OpenRouter :free models that exist in the live catalog
const insert = db.prepare(`
INSERT OR IGNORE INTO models (platform, model_id, display_name, intelligence_rank, speed_rank, size_label, rpm_limit, rpd_limit, tpm_limit, tpd_limit, monthly_token_budget, context_window)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
const additions: Array<[string, string, string, number, number, string, number | null, number | null, number | null, number | null, string, number | null]> = [
// Frontier-tier free models verified in OR catalog 2026-04
['openrouter', 'nvidia/nemotron-3-super-120b-a12b:free', 'Nemotron 3 Super 120B (free)', 2, 9, 'Frontier', 20, 200, null, null, '~6M', 262144],
['openrouter', 'qwen/qwen3-next-80b-a3b-instruct:free', 'Qwen3-Next 80B (free)', 3, 9, 'Large', 20, 200, null, null, '~6M', 262144],
['openrouter', 'minimax/minimax-m2.5:free', 'MiniMax M2.5 (free)', 3, 9, 'Large', 20, 200, null, null, '~6M', 196608],
['openrouter', 'google/gemma-4-31b-it:free', 'Gemma 4 31B (free)', 5, 9, 'Medium', 20, 200, null, null, '~6M', 262144],
];
const applyAdditions = db.transaction(() => {
for (const a of additions) insert.run(...a);
// Fallback entries for new models
const missing = db.prepare(`
SELECT m.id FROM models m
LEFT JOIN fallback_config f ON m.id = f.model_db_id
WHERE f.id IS NULL ORDER BY m.intelligence_rank ASC
`).all() as { id: number }[];
if (missing.length > 0) {
const maxPriority = (db.prepare('SELECT COALESCE(MAX(priority), 0) AS mx FROM fallback_config').get() as { mx: number }).mx;
const addFb = db.prepare('INSERT INTO fallback_config (model_db_id, priority, enabled) VALUES (?, ?, 1)');
for (let i = 0; i < missing.length; i++) addFb.run(missing[i].id, maxPriority + i + 1);
}
});
applyAdditions();
}
/**
* Re-rank intelligence based on April 2026 coding + agentic tool-use benchmarks:
* SWE-bench Verified, Terminal-Bench 2, TAU-Bench, Aider Polyglot.
* Higher rank = weaker. Ties are allowed (same weights across providers).
*/
function migrateModelsV3Ranks(db: Database.Database) {
const setRank = db.prepare(`UPDATE models SET intelligence_rank = ? WHERE platform = ? AND model_id = ?`);
const ranks: Array<[number, string, string]> = [
// #1-10 frontier coders / agents
[1, 'openrouter', 'minimax/minimax-m2.5:free'], // SWE-V ~80%, TB2 ~57%
[2, 'openrouter', 'qwen/qwen3-coder:free'], // SWE-V ~70%
[3, 'openrouter', 'qwen/qwen3-next-80b-a3b-instruct:free'], // SWE-V ~70.6%
[4, 'moonshot', 'kimi-latest'], // K2: SWE-V ~71%
[5, 'cerebras', 'qwen-3-235b-a22b-instruct-2507'], // SWE-V ~65-72%
[6, 'google', 'gemini-2.5-pro'], // SWE-V 63.8%, Aider 83%
[7, 'openrouter', 'z-ai/glm-4.5-air:free'], // ~58% SWE-V (distill of 4.5)
[8, 'openrouter', 'openai/gpt-oss-120b:free'], // SWE-V 62.4%
[9, 'openrouter', 'nvidia/nemotron-3-super-120b-a12b:free'], // SWE-V 53.7%
[10, 'minimax', 'MiniMax-M1'], // M1 predecessor, ~45-55%
// #11-15 mid-tier specialists
[11, 'mistral', 'codestral-latest'], // HumanEval 86.6%
[12, 'mistral', 'mistral-large-latest'],
[13, 'mistral', 'magistral-medium-latest'], // reasoning, not code-tuned
[14, 'google', 'gemini-2.5-flash'],
[15, 'zhipu', 'glm-4.5-flash'],
// #16 Llama 3.3 70B — identical weights across providers (tie)
[16, 'groq', 'llama-3.3-70b-versatile'],
[16, 'sambanova', 'Meta-Llama-3.3-70B-Instruct'],
[16, 'openrouter', 'meta-llama/llama-3.3-70b-instruct:free'],
[16, 'huggingface', 'accounts/fireworks/models/llama-v3p3-70b-instruct'],
// #17-23 weaker
[17, 'openrouter', 'nousresearch/hermes-3-llama-3.1-405b:free'], // L3.1 base with tool-use tune
[18, 'groq', 'meta-llama/llama-4-scout-17b-16e-instruct'], // multimodal focus
[19, 'openrouter', 'google/gemma-4-31b-it:free'],
[20, 'google', 'gemini-2.5-flash-lite'],
[21, 'github', 'gpt-4o'], // Aug 2024, SWE-V ~33%
[22, 'nvidia', 'meta/llama-3.1-70b-instruct'], // older Llama 3.1 tune
[22, 'cloudflare', '@cf/meta/llama-3.1-70b-instruct'], // same base weights
[23, 'cohere', 'command-r-plus-08-2024'], // RAG-focused, weakest on code
];
const apply = db.transaction(() => {
for (const [rank, platform, modelId] of ranks) {
setRank.run(rank, platform, modelId);
}
});
apply();
}
function ensureUnifiedKey(db: Database.Database) {
const existing = db.prepare("SELECT value FROM settings WHERE key = 'unified_api_key'").get() as { value: string } | undefined;
if (!existing) {
const key = `freellmapi-${crypto.randomBytes(24).toString('hex')}`;
db.prepare("INSERT INTO settings (key, value) VALUES ('unified_api_key', ?)").run(key);
console.log(`\n Your unified API key: ${key}\n`);
}
}
export function getUnifiedApiKey(): string {
const db = getDb();
const row = db.prepare("SELECT value FROM settings WHERE key = 'unified_api_key'").get() as { value: string };
return row.value;
}
export function regenerateUnifiedKey(): string {
const db = getDb();
const key = `freellmapi-${crypto.randomBytes(24).toString('hex')}`;
db.prepare("UPDATE settings SET value = ? WHERE key = 'unified_api_key'").run(key);
return key;
}
import { createApp } from './app.js';
import { initDb } from './db/index.js';
import { startHealthChecker } from './services/health.js';
const PORT = process.env.PORT ?? 3001;
async function main() {
initDb();
const app = createApp();
app.listen(Number(PORT), '0.0.0.0', () => {
console.log(`Server running on http://0.0.0.0:${PORT}`);
console.log(`Proxy endpoint: http://0.0.0.0:${PORT}/v1/chat/completions`);
startHealthChecker();
});
}
main().catch(console.error);
import crypto from 'crypto';
import Database from 'better-sqlite3';
const ALGORITHM = 'aes-256-gcm';
let cachedKey: Buffer | null = null;
/**
* Initialize encryption key from env, DB, or generate a new one.
* Must be called after DB is initialized.
*/
export function initEncryptionKey(db: Database.Database): void {
// 1. Check env var
const envKey = process.env.ENCRYPTION_KEY;
if (envKey && envKey !== 'your-64-char-hex-key-here') {
const buf = Buffer.from(envKey, 'hex');
if (buf.length !== 32) {
throw new Error('ENCRYPTION_KEY must be 64 hex characters (32 bytes)');
}
cachedKey = buf;
return;
}
// 2. Check DB for persisted key
const row = db.prepare("SELECT value FROM settings WHERE key = 'encryption_key'").get() as { value: string } | undefined;
if (row) {
cachedKey = Buffer.from(row.value, 'hex');
return;
}
// 3. Generate and persist
cachedKey = crypto.randomBytes(32);
db.prepare("INSERT INTO settings (key, value) VALUES ('encryption_key', ?)").run(cachedKey.toString('hex'));
}
function getEncryptionKey(): Buffer {
if (!cachedKey) {
throw new Error('Encryption key not initialized. Call initEncryptionKey() first.');
}
return cachedKey;
}
export function encrypt(text: string): { encrypted: string; iv: string; authTag: string } {
const key = getEncryptionKey();
const iv = crypto.randomBytes(16);
const cipher = crypto.createCipheriv(ALGORITHM, key, iv);
let encrypted = cipher.update(text, 'utf8', 'hex');
encrypted += cipher.final('hex');
const authTag = cipher.getAuthTag().toString('hex');
return {
encrypted,
iv: iv.toString('hex'),
authTag,
};
}
export function decrypt(encrypted: string, iv: string, authTag: string): string {
const key = getEncryptionKey();
const decipher = crypto.createDecipheriv(ALGORITHM, key, Buffer.from(iv, 'hex'));
decipher.setAuthTag(Buffer.from(authTag, 'hex'));
let decrypted = decipher.update(encrypted, 'hex', 'utf8');
decrypted += decipher.final('utf8');
return decrypted;
}
export function maskKey(key: string): string {
if (key.length <= 8) return '****' + key.slice(-4);
return key.slice(0, 4) + '...' + key.slice(-4);
}
import type { Request, Response, NextFunction } from 'express';
export function errorHandler(err: Error, _req: Request, res: Response, _next: NextFunction) {
console.error('[Error]', err.message);
const status = (err as any).status ?? 500;
res.status(status).json({
error: {
message: err.message,
type: err.name ?? 'server_error',
},
});
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
Platform,
} from '@freellmapi/shared/types.js';
export interface CompletionOptions {
model?: string;
temperature?: number;
max_tokens?: number;
top_p?: number;
}
export abstract class BaseProvider {
abstract readonly platform: Platform;
abstract readonly name: string;
abstract chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse>;
abstract streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk>;
abstract validateKey(apiKey: string): Promise<boolean>;
protected async fetchWithTimeout(
url: string,
init: RequestInit,
timeoutMs = 15000,
): Promise<Response> {
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), timeoutMs);
try {
return await fetch(url, { ...init, signal: controller.signal });
} finally {
clearTimeout(timeout);
}
}
protected makeId(): string {
return `chatcmpl-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
}
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
const API_BASE = 'https://api.cerebras.ai/v1';
export class CerebrasProvider extends BaseProvider {
readonly platform = 'cerebras' as const;
readonly name = 'Cerebras';
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Cerebras API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const data = await res.json() as ChatCompletionResponse;
data._routed_via = { platform: 'cerebras', model: modelId };
return data;
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Cerebras API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const data = trimmed.slice(6);
if (data === '[DONE]') return;
try {
yield JSON.parse(data) as ChatCompletionChunk;
} catch { /* skip malformed or keep-alive lines */ }
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const res = await this.fetchWithTimeout(`${API_BASE}/models`, {
method: 'GET',
headers: { 'Authorization': `Bearer ${apiKey}` },
}, 10000);
return res.ok;
} catch {
return false;
}
}
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
/**
* Cloudflare Workers AI provider.
* API key format expected: "account_id:api_token"
* The account_id is extracted from the key to build the URL.
*/
export class CloudflareProvider extends BaseProvider {
readonly platform = 'cloudflare' as const;
readonly name = 'Cloudflare Workers AI';
private parseKey(apiKey: string): { accountId: string; token: string } {
const sep = apiKey.indexOf(':');
if (sep === -1) throw new Error('Cloudflare key must be in format "account_id:api_token"');
return { accountId: apiKey.slice(0, sep), token: apiKey.slice(sep + 1) };
}
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const { accountId, token } = this.parseKey(apiKey);
const url = `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${modelId}`;
const res = await this.fetchWithTimeout(url, {
method: 'POST',
headers: {
'Authorization': `Bearer ${token}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
messages,
max_tokens: options?.max_tokens,
temperature: options?.temperature,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
const errors = (err as any).errors;
throw new Error(`Cloudflare API error ${res.status}: ${errors?.[0]?.message ?? res.statusText}`);
}
const data = await res.json() as any;
const text = data.result?.response ?? '';
return {
id: this.makeId(),
object: 'chat.completion',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{
index: 0,
message: { role: 'assistant', content: text },
finish_reason: 'stop',
}],
usage: {
// Workers AI does not return OpenAI-style token counts on this endpoint
prompt_tokens: 0,
completion_tokens: 0,
total_tokens: 0,
},
_routed_via: { platform: 'cloudflare', model: modelId },
};
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const { accountId, token } = this.parseKey(apiKey);
const url = `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${modelId}`;
const res = await this.fetchWithTimeout(url, {
method: 'POST',
headers: {
'Authorization': `Bearer ${token}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
messages,
max_tokens: options?.max_tokens,
temperature: options?.temperature,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Cloudflare API error ${res.status}: ${(err as any).errors?.[0]?.message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
const id = this.makeId();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const data = trimmed.slice(6);
if (data === '[DONE]') return;
try {
const parsed = JSON.parse(data);
if (parsed.response) {
yield {
id,
object: 'chat.completion.chunk',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{ index: 0, delta: { content: parsed.response }, finish_reason: null }],
};
}
} catch { /* skip */ }
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const { token } = this.parseKey(apiKey);
const res = await this.fetchWithTimeout(
'https://api.cloudflare.com/client/v4/user/tokens/verify',
{ method: 'GET', headers: { 'Authorization': `Bearer ${token}` } },
10000,
);
if (!res.ok) return false;
const data = await res.json() as any;
return data.success === true && data.result?.status === 'active';
} catch {
return false;
}
}
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
const API_BASE = 'https://api.cohere.com/v2';
interface CohereResponse {
id: string;
message?: { content?: { type: string; text: string }[] };
finish_reason?: string;
usage?: {
tokens?: { input_tokens?: number; output_tokens?: number };
};
}
export class CohereProvider extends BaseProvider {
readonly platform = 'cohere' as const;
readonly name = 'Cohere';
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const cohereMessages = messages.map(m => ({
role: m.role === 'system' ? 'system' as const : m.role === 'assistant' ? 'assistant' as const : 'user' as const,
content: m.content,
}));
const res = await this.fetchWithTimeout(`${API_BASE}/chat`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages: cohereMessages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
p: options?.top_p,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Cohere API error ${res.status}: ${(err as any).message ?? res.statusText}`);
}
const data = await res.json() as CohereResponse;
const text = data.message?.content?.[0]?.text ?? '';
return {
id: data.id ?? this.makeId(),
object: 'chat.completion',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{
index: 0,
message: { role: 'assistant', content: text },
finish_reason: data.finish_reason === 'MAX_TOKENS' ? 'length' : 'stop',
}],
usage: {
prompt_tokens: data.usage?.tokens?.input_tokens ?? 0,
completion_tokens: data.usage?.tokens?.output_tokens ?? 0,
total_tokens: (data.usage?.tokens?.input_tokens ?? 0) + (data.usage?.tokens?.output_tokens ?? 0),
},
_routed_via: { platform: 'cohere', model: modelId },
};
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const cohereMessages = messages.map(m => ({
role: m.role === 'system' ? 'system' as const : m.role === 'assistant' ? 'assistant' as const : 'user' as const,
content: m.content,
}));
const res = await this.fetchWithTimeout(`${API_BASE}/chat`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages: cohereMessages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Cohere API error ${res.status}: ${(err as any).message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
const id = this.makeId();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || trimmed.startsWith('event:')) continue;
// Cohere v2 streams SSE lines; strip the "data: " prefix before parsing the JSON event
const payload = trimmed.startsWith('data: ') ? trimmed.slice(6) : trimmed;
try {
const event = JSON.parse(payload);
if (event.type === 'content-delta') {
const text = event.delta?.message?.content?.text ?? '';
if (text) {
yield {
id,
object: 'chat.completion.chunk',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{ index: 0, delta: { content: text }, finish_reason: null }],
};
}
} else if (event.type === 'message-end') {
yield {
id,
object: 'chat.completion.chunk',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
};
}
} catch {
// Skip malformed lines
}
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const res = await this.fetchWithTimeout(`${API_BASE}/models`, {
method: 'GET',
headers: { 'Authorization': `Bearer ${apiKey}` },
}, 10000);
return res.ok;
} catch {
return false;
}
}
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
TokenUsage,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
const API_BASE = 'https://generativelanguage.googleapis.com/v1beta';
// Translate OpenAI messages to Gemini format
function toGeminiContents(messages: ChatMessage[]) {
const systemInstruction = messages.find(m => m.role === 'system');
const contents = messages
.filter(m => m.role !== 'system')
.map(m => ({
role: m.role === 'assistant' ? 'model' : 'user',
parts: [{ text: m.content }],
}));
return {
contents,
systemInstruction: systemInstruction
? { parts: [{ text: systemInstruction.content }] }
: undefined,
};
}
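For reference, a self-contained sketch of what the translation above produces for a typical conversation (a re-statement of the same logic, since the module itself is not importable here):

```typescript
// Minimal re-statement of the OpenAI -> Gemini message translation.
type Msg = { role: 'system' | 'user' | 'assistant'; content: string };

function toGeminiContentsSketch(messages: Msg[]) {
  const system = messages.find(m => m.role === 'system');
  const contents = messages
    .filter(m => m.role !== 'system')
    .map(m => ({
      // Gemini uses 'model' for assistant turns; everything else maps to 'user'.
      role: m.role === 'assistant' ? 'model' : 'user',
      parts: [{ text: m.content }],
    }));
  return {
    contents,
    // The system prompt moves out of the message list into systemInstruction.
    systemInstruction: system ? { parts: [{ text: system.content }] } : undefined,
  };
}

const out = toGeminiContentsSketch([
  { role: 'system', content: 'Be terse.' },
  { role: 'user', content: 'Hi' },
  { role: 'assistant', content: 'Hello' },
]);
// out.contents has two entries (user, model); the system prompt is lifted out.
```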
interface GeminiResponse {
candidates?: {
content?: { parts?: { text?: string }[] };
finishReason?: string;
}[];
usageMetadata?: {
promptTokenCount?: number;
candidatesTokenCount?: number;
totalTokenCount?: number;
};
}
export class GoogleProvider extends BaseProvider {
readonly platform = 'google' as const;
readonly name = 'Google AI Studio';
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const { contents, systemInstruction } = toGeminiContents(messages);
const body: Record<string, unknown> = {
contents,
generationConfig: {
temperature: options?.temperature,
maxOutputTokens: options?.max_tokens,
topP: options?.top_p,
},
};
if (systemInstruction) body.systemInstruction = systemInstruction;
const url = `${API_BASE}/models/${modelId}:generateContent?key=${apiKey}`;
const res = await this.fetchWithTimeout(url, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Google API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const data = await res.json() as GeminiResponse;
const text = data.candidates?.[0]?.content?.parts?.[0]?.text ?? '';
const usage: TokenUsage = {
prompt_tokens: data.usageMetadata?.promptTokenCount ?? 0,
completion_tokens: data.usageMetadata?.candidatesTokenCount ?? 0,
total_tokens: data.usageMetadata?.totalTokenCount ?? 0,
};
return {
id: this.makeId(),
object: 'chat.completion',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{
index: 0,
message: { role: 'assistant', content: text },
finish_reason: data.candidates?.[0]?.finishReason?.toLowerCase() === 'max_tokens' ? 'length' : 'stop',
}],
usage,
_routed_via: { platform: 'google', model: modelId },
};
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const { contents, systemInstruction } = toGeminiContents(messages);
const body: Record<string, unknown> = {
contents,
generationConfig: {
temperature: options?.temperature,
maxOutputTokens: options?.max_tokens,
topP: options?.top_p,
},
};
if (systemInstruction) body.systemInstruction = systemInstruction;
const url = `${API_BASE}/models/${modelId}:streamGenerateContent?alt=sse&key=${apiKey}`;
const res = await this.fetchWithTimeout(url, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Google API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
const id = this.makeId();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const raw = trimmed.slice(6);
if (raw === '[DONE]') return;
let chunk: GeminiResponse;
try {
chunk = JSON.parse(raw) as GeminiResponse;
} catch {
continue; // Skip malformed SSE lines
}
const text = chunk.candidates?.[0]?.content?.parts?.[0]?.text ?? '';
if (!text) continue;
yield {
id,
object: 'chat.completion.chunk',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{
index: 0,
delta: { content: text },
finish_reason: null,
}],
};
}
}
// Final chunk
yield {
id,
object: 'chat.completion.chunk',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{
index: 0,
delta: {},
finish_reason: 'stop',
}],
};
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const res = await this.fetchWithTimeout(
`${API_BASE}/models?key=${apiKey}`,
{ method: 'GET' },
10000,
);
return res.ok;
} catch {
return false;
}
}
}
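The streaming loops in these adapters all share one pattern: accumulate bytes into a string buffer, split on newlines, and carry the (possibly incomplete) last line over to the next network chunk. A standalone sketch of that buffering, fed with chunks that split a line mid-word:

```typescript
// Sketch of the SSE line-buffering loop shared by the streaming adapters.
function* sseDataLines(chunks: string[]): Generator<string> {
  let buffer = '';
  for (const chunk of chunks) {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep the incomplete tail for the next chunk
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.startsWith('data: ')) yield trimmed.slice(6);
    }
  }
}

// 'data: [DONE]' arrives split across two chunks; the buffer reassembles it.
const payloads = [...sseDataLines(['data: {"a":1}\nda', 'ta: [DONE]\n'])];
// payloads === ['{"a":1}', '[DONE]']
```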
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
const API_BASE = 'https://api.groq.com/openai/v1';
export class GroqProvider extends BaseProvider {
readonly platform = 'groq' as const;
readonly name = 'Groq';
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Groq API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const data = await res.json() as ChatCompletionResponse;
data._routed_via = { platform: 'groq', model: modelId };
return data;
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Groq API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const data = trimmed.slice(6);
if (data === '[DONE]') return;
yield JSON.parse(data) as ChatCompletionChunk;
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const res = await this.fetchWithTimeout(`${API_BASE}/models`, {
method: 'GET',
headers: { 'Authorization': `Bearer ${apiKey}` },
}, 10000);
return res.ok;
} catch {
return false;
}
}
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
const API_BASE = 'https://router.huggingface.co/fireworks-ai/inference/v1';
export class HuggingFaceProvider extends BaseProvider {
readonly platform = 'huggingface' as const;
readonly name = 'Hugging Face';
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
// The HF router endpoint (Fireworks AI backend) exposes an OpenAI-compatible chat endpoint
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`HuggingFace API error ${res.status}: ${(err as any).error ?? res.statusText}`);
}
const data = await res.json() as ChatCompletionResponse;
data._routed_via = { platform: 'huggingface', model: modelId };
return data;
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`HuggingFace API error ${res.status}: ${(err as any).error ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const data = trimmed.slice(6);
if (data === '[DONE]') return;
try {
yield JSON.parse(data) as ChatCompletionChunk;
} catch { /* skip */ }
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const res = await this.fetchWithTimeout('https://huggingface.co/api/whoami-v2', {
method: 'GET',
headers: { 'Authorization': `Bearer ${apiKey}` },
}, 10000);
return res.ok;
} catch {
return false;
}
}
}
import type { Platform } from '@freellmapi/shared/types.js';
import type { BaseProvider } from './base.js';
import { GoogleProvider } from './google.js';
import { OpenAICompatProvider } from './openai-compat.js';
import { CohereProvider } from './cohere.js';
import { CloudflareProvider } from './cloudflare.js';
import { HuggingFaceProvider } from './huggingface.js';
const providers = new Map<Platform, BaseProvider>();
function register(provider: BaseProvider) {
providers.set(provider.platform, provider);
}
// Google - unique Gemini API format
register(new GoogleProvider());
// Groq - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'groq',
name: 'Groq',
baseUrl: 'https://api.groq.com/openai/v1',
}));
// Cerebras - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'cerebras',
name: 'Cerebras',
baseUrl: 'https://api.cerebras.ai/v1',
}));
// SambaNova - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'sambanova',
name: 'SambaNova',
baseUrl: 'https://api.sambanova.ai/v1',
}));
// NVIDIA NIM - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'nvidia',
name: 'NVIDIA NIM',
baseUrl: 'https://integrate.api.nvidia.com/v1',
}));
// Mistral - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'mistral',
name: 'Mistral',
baseUrl: 'https://api.mistral.ai/v1',
}));
// OpenRouter - OpenAI-compatible with extra headers
register(new OpenAICompatProvider({
platform: 'openrouter',
name: 'OpenRouter',
baseUrl: 'https://openrouter.ai/api/v1',
extraHeaders: {
'HTTP-Referer': 'http://localhost:3001',
'X-Title': 'FreeLLMAPI',
},
}));
// GitHub Models - OpenAI-compatible via Azure endpoint
register(new OpenAICompatProvider({
platform: 'github',
name: 'GitHub Models',
baseUrl: 'https://models.inference.ai.azure.com',
}));
// Cohere - unique API format
register(new CohereProvider());
// Cloudflare Workers AI - unique API format (key = "account_id:token")
register(new CloudflareProvider());
// Hugging Face - OpenAI-compatible per-model endpoint
register(new HuggingFaceProvider());
// Zhipu (Z.ai / bigmodel.cn) - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'zhipu',
name: 'Zhipu AI',
baseUrl: 'https://open.bigmodel.cn/api/paas/v4',
}));
// Moonshot (Kimi) - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'moonshot',
name: 'Moonshot',
baseUrl: 'https://api.moonshot.ai/v1',
}));
// MiniMax - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'minimax',
name: 'MiniMax',
baseUrl: 'https://api.minimax.io/v1',
}));
export function getProvider(platform: Platform): BaseProvider | undefined {
return providers.get(platform);
}
export function getAllProviders(): BaseProvider[] {
return Array.from(providers.values());
}
export function hasProvider(platform: Platform): boolean {
return providers.has(platform);
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
Platform,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
/**
* Generic provider for platforms that use an OpenAI-compatible API.
 * Covers: Groq, Cerebras, SambaNova, NVIDIA NIM, Mistral, OpenRouter,
 * GitHub Models, Zhipu, Moonshot, and MiniMax.
*/
export class OpenAICompatProvider extends BaseProvider {
readonly platform: Platform;
readonly name: string;
private readonly baseUrl: string;
private readonly extraHeaders: Record<string, string>;
private readonly validateUrl?: string;
constructor(opts: {
platform: Platform;
name: string;
baseUrl: string;
extraHeaders?: Record<string, string>;
validateUrl?: string;
}) {
super();
this.platform = opts.platform;
this.name = opts.name;
this.baseUrl = opts.baseUrl;
this.extraHeaders = opts.extraHeaders ?? {};
this.validateUrl = opts.validateUrl;
}
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const res = await this.fetchWithTimeout(`${this.baseUrl}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
...this.extraHeaders,
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`${this.name} API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const data = await res.json() as ChatCompletionResponse;
data._routed_via = { platform: this.platform, model: modelId };
return data;
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const res = await this.fetchWithTimeout(`${this.baseUrl}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
...this.extraHeaders,
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`${this.name} API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const data = trimmed.slice(6);
if (data === '[DONE]') return;
try {
yield JSON.parse(data) as ChatCompletionChunk;
} catch {
// Skip malformed chunks
}
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const url = this.validateUrl ?? `${this.baseUrl}/models`;
const res = await this.fetchWithTimeout(url, {
method: 'GET',
headers: {
'Authorization': `Bearer ${apiKey}`,
...this.extraHeaders,
},
}, 10000);
// 401/403 = bad key, anything else (200, 404, etc) = key is valid
return res.status !== 401 && res.status !== 403;
} catch {
return false;
}
}
}
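The key check above is deliberately lenient: only 401/403 condemn a key, because some OpenAI-compatible providers lack a /models endpoint and may 404 even for a valid key. The predicate, isolated:

```typescript
// Lenient key validation: only explicit auth failures mean a bad key.
// A 404 (no /models endpoint) or 200 both count as "key looks valid".
const keyLooksValid = (status: number): boolean =>
  status !== 401 && status !== 403;
```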
import { Router } from 'express';
import type { Request, Response } from 'express';
import { getDb } from '../db/index.js';
export const analyticsRouter = Router();
function getTimeFilter(range: string): string {
switch (range) {
case '24h': return "datetime('now', '-1 day')";
case '7d': return "datetime('now', '-7 days')";
case '30d': return "datetime('now', '-30 days')";
default: return "datetime('now', '-7 days')";
}
}
// Summary stats
analyticsRouter.get('/summary', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const since = getTimeFilter(range);
const db = getDb();
const stats = db.prepare(`
SELECT
COUNT(*) as total_requests,
SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) as success_count,
SUM(input_tokens) as total_input_tokens,
SUM(output_tokens) as total_output_tokens,
AVG(latency_ms) as avg_latency_ms
FROM requests
WHERE created_at >= ${since}
`).get() as any;
const totalRequests = stats.total_requests ?? 0;
const successRate = totalRequests > 0 ? (stats.success_count / totalRequests) * 100 : 0;
// Estimate cost savings at rough frontier-model list prices (~$3/M input + $15/M output)
const inputCost = ((stats.total_input_tokens ?? 0) / 1_000_000) * 3;
const outputCost = ((stats.total_output_tokens ?? 0) / 1_000_000) * 15;
res.json({
totalRequests,
successRate: Math.round(successRate * 10) / 10,
totalInputTokens: stats.total_input_tokens ?? 0,
totalOutputTokens: stats.total_output_tokens ?? 0,
avgLatencyMs: Math.round(stats.avg_latency_ms ?? 0),
estimatedCostSavings: Math.round((inputCost + outputCost) * 100) / 100,
});
});
// Stats grouped by model
analyticsRouter.get('/by-model', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const since = getTimeFilter(range);
const db = getDb();
const rows = db.prepare(`
SELECT
r.platform,
r.model_id,
m.display_name,
COUNT(*) as requests,
SUM(CASE WHEN r.status = 'success' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) as success_rate,
AVG(r.latency_ms) as avg_latency_ms,
SUM(r.input_tokens) as total_input_tokens,
SUM(r.output_tokens) as total_output_tokens
FROM requests r
LEFT JOIN models m ON m.platform = r.platform AND m.model_id = r.model_id
WHERE r.created_at >= ${since}
GROUP BY r.platform, r.model_id
ORDER BY requests DESC
`).all() as any[];
res.json(rows.map(r => ({
platform: r.platform,
modelId: r.model_id,
displayName: r.display_name ?? r.model_id,
requests: r.requests,
successRate: Math.round(r.success_rate * 10) / 10,
avgLatencyMs: Math.round(r.avg_latency_ms),
totalInputTokens: r.total_input_tokens ?? 0,
totalOutputTokens: r.total_output_tokens ?? 0,
})));
});
// Stats grouped by platform
analyticsRouter.get('/by-platform', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const since = getTimeFilter(range);
const db = getDb();
const rows = db.prepare(`
SELECT
platform,
COUNT(*) as requests,
SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) as success_rate,
AVG(latency_ms) as avg_latency_ms,
SUM(input_tokens) as total_input_tokens,
SUM(output_tokens) as total_output_tokens
FROM requests
WHERE created_at >= ${since}
GROUP BY platform
ORDER BY requests DESC
`).all() as any[];
res.json(rows.map(r => ({
platform: r.platform,
requests: r.requests,
successRate: Math.round(r.success_rate * 10) / 10,
avgLatencyMs: Math.round(r.avg_latency_ms),
totalInputTokens: r.total_input_tokens ?? 0,
totalOutputTokens: r.total_output_tokens ?? 0,
})));
});
// Timeline data
analyticsRouter.get('/timeline', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const interval = (req.query.interval as string) ?? (range === '24h' ? 'hour' : 'day');
const since = getTimeFilter(range);
const db = getDb();
const dateFormat = interval === 'hour' ? '%Y-%m-%dT%H:00:00' : '%Y-%m-%d';
const rows = db.prepare(`
SELECT
strftime('${dateFormat}', created_at) as timestamp,
COUNT(*) as requests,
SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) as success_count,
SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) as failure_count
FROM requests
WHERE created_at >= ${since}
GROUP BY strftime('${dateFormat}', created_at)
ORDER BY timestamp ASC
`).all() as any[];
res.json(rows.map(r => ({
timestamp: r.timestamp,
requests: r.requests,
successCount: r.success_count,
failureCount: r.failure_count,
})));
});
// Error distribution (grouped by error type and platform)
analyticsRouter.get('/error-distribution', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const since = getTimeFilter(range);
const db = getDb();
// Group errors by category (extract the key part of the error message)
const rows = db.prepare(`
SELECT
platform,
model_id,
CASE
WHEN error LIKE '%429%' OR error LIKE '%rate limit%' OR error LIKE '%too many%' OR error LIKE '%quota%' THEN 'Rate Limited (429)'
WHEN error LIKE '%401%' OR error LIKE '%unauthorized%' OR error LIKE '%invalid%key%' THEN 'Auth Error (401)'
WHEN error LIKE '%403%' OR error LIKE '%forbidden%' THEN 'Forbidden (403)'
WHEN error LIKE '%404%' OR error LIKE '%not found%' THEN 'Not Found (404)'
WHEN error LIKE '%timeout%' OR error LIKE '%ETIMEDOUT%' OR error LIKE '%ECONNREFUSED%' THEN 'Timeout/Connection'
WHEN error LIKE '%500%' OR error LIKE '%internal server%' THEN 'Server Error (500)'
WHEN error LIKE '%503%' OR error LIKE '%unavailable%' THEN 'Unavailable (503)'
ELSE 'Other'
END as error_category,
COUNT(*) as count
FROM requests
WHERE status = 'error' AND created_at >= ${since}
GROUP BY platform, error_category
ORDER BY count DESC
`).all() as any[];
// Also get totals by category
const byCategory = db.prepare(`
SELECT
CASE
WHEN error LIKE '%429%' OR error LIKE '%rate limit%' OR error LIKE '%too many%' OR error LIKE '%quota%' THEN 'Rate Limited (429)'
WHEN error LIKE '%401%' OR error LIKE '%unauthorized%' OR error LIKE '%invalid%key%' THEN 'Auth Error (401)'
WHEN error LIKE '%403%' OR error LIKE '%forbidden%' THEN 'Forbidden (403)'
WHEN error LIKE '%404%' OR error LIKE '%not found%' THEN 'Not Found (404)'
WHEN error LIKE '%timeout%' OR error LIKE '%ETIMEDOUT%' OR error LIKE '%ECONNREFUSED%' THEN 'Timeout/Connection'
WHEN error LIKE '%500%' OR error LIKE '%internal server%' THEN 'Server Error (500)'
WHEN error LIKE '%503%' OR error LIKE '%unavailable%' THEN 'Unavailable (503)'
ELSE 'Other'
END as category,
COUNT(*) as count
FROM requests
WHERE status = 'error' AND created_at >= ${since}
GROUP BY category
ORDER BY count DESC
`).all() as any[];
// Errors by platform
const byPlatform = db.prepare(`
SELECT platform, COUNT(*) as count
FROM requests
WHERE status = 'error' AND created_at >= ${since}
GROUP BY platform
ORDER BY count DESC
`).all() as any[];
res.json({
byCategory,
byPlatform,
detailed: rows,
});
});
// Recent errors
analyticsRouter.get('/errors', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const since = getTimeFilter(range);
const db = getDb();
const rows = db.prepare(`
SELECT id, platform, model_id, error, latency_ms, created_at
FROM requests
WHERE status = 'error' AND created_at >= ${since}
ORDER BY created_at DESC
LIMIT 50
`).all() as any[];
res.json(rows.map(r => ({
id: r.id,
platform: r.platform,
modelId: r.model_id,
error: r.error,
latencyMs: r.latency_ms,
createdAt: r.created_at,
})));
});
import { Router } from 'express';
import type { Request, Response } from 'express';
import { z } from 'zod';
import { getDb } from '../db/index.js';
import { getAllPenalties } from '../services/router.js';
export const fallbackRouter = Router();
// Get fallback chain (with dynamic penalties)
fallbackRouter.get('/', (_req: Request, res: Response) => {
const db = getDb();
const rows = db.prepare(`
SELECT fc.model_db_id, fc.priority, fc.enabled,
m.platform, m.model_id, m.display_name, m.intelligence_rank,
m.speed_rank, m.size_label, m.rpm_limit, m.rpd_limit,
m.monthly_token_budget
FROM fallback_config fc
JOIN models m ON m.id = fc.model_db_id
ORDER BY fc.priority ASC
`).all() as any[];
// Count enabled keys per platform
const keyCounts = db.prepare(`
SELECT platform, COUNT(*) as count
FROM api_keys WHERE enabled = 1
GROUP BY platform
`).all() as { platform: string; count: number }[];
const keyCountMap = new Map(keyCounts.map(k => [k.platform, k.count]));
// Get current dynamic penalties
const penalties = getAllPenalties();
const penaltyMap = new Map(penalties.map(p => [p.modelDbId, p]));
res.json(rows.map(r => {
const penalty = penaltyMap.get(r.model_db_id);
return {
modelDbId: r.model_db_id,
priority: r.priority,
effectivePriority: r.priority + (penalty?.penalty ?? 0),
penalty: penalty?.penalty ?? 0,
rateLimitHits: penalty?.count ?? 0,
enabled: r.enabled === 1,
platform: r.platform,
modelId: r.model_id,
displayName: r.display_name,
intelligenceRank: r.intelligence_rank,
speedRank: r.speed_rank,
sizeLabel: r.size_label,
rpmLimit: r.rpm_limit,
rpdLimit: r.rpd_limit,
monthlyTokenBudget: r.monthly_token_budget,
keyCount: keyCountMap.get(r.platform) ?? 0,
};
}));
});
const updateSchema = z.array(z.object({
modelDbId: z.number(),
priority: z.number(),
enabled: z.boolean(),
}));
// Update fallback chain (full replace)
fallbackRouter.put('/', (req: Request, res: Response) => {
const parsed = updateSchema.safeParse(req.body);
if (!parsed.success) {
res.status(400).json({ error: { message: parsed.error.errors.map(e => e.message).join(', ') } });
return;
}
const db = getDb();
const update = db.prepare(`
UPDATE fallback_config SET priority = ?, enabled = ? WHERE model_db_id = ?
`);
const updateAll = db.transaction(() => {
for (const entry of parsed.data) {
update.run(entry.priority, entry.enabled ? 1 : 0, entry.modelDbId);
}
});
updateAll();
res.json({ success: true });
});
// Sort presets
fallbackRouter.post('/sort/:preset', (req: Request, res: Response) => {
const { preset } = req.params;
const db = getDb();
let orderBy: string;
switch (preset) {
case 'intelligence':
orderBy = 'm.intelligence_rank ASC';
break;
case 'speed':
orderBy = 'm.speed_rank ASC';
break;
case 'budget':
orderBy = "CASE m.monthly_token_budget WHEN '~120M' THEN 1 WHEN '~50-100M' THEN 2 WHEN '~30M' THEN 3 WHEN '~18-45M' THEN 4 WHEN '~18M' THEN 5 WHEN '~15M' THEN 6 WHEN '~12M' THEN 7 WHEN '~6M' THEN 8 WHEN '~5-10M' THEN 9 WHEN '~4M' THEN 10 ELSE 11 END ASC";
break;
default:
res.status(400).json({ error: { message: `Unknown preset: ${preset}. Use: intelligence, speed, budget` } });
return;
}
const models = db.prepare(`
SELECT m.id FROM models m ORDER BY ${orderBy}
`).all() as { id: number }[];
const update = db.prepare('UPDATE fallback_config SET priority = ? WHERE model_db_id = ?');
const reorder = db.transaction(() => {
for (let i = 0; i < models.length; i++) {
update.run(i + 1, models[i].id);
}
});
reorder();
res.json({ success: true, preset });
});
// Token usage per model for the stacked bar
fallbackRouter.get('/token-usage', (_req: Request, res: Response) => {
const db = getDb();
// Get platforms that have enabled keys
const platforms = db.prepare(`
SELECT DISTINCT ak.platform
FROM api_keys ak
WHERE ak.enabled = 1
`).all() as { platform: string }[];
const platformSet = new Set(platforms.map(p => p.platform));
// Get monthly budget per model, ordered by fallback priority
const models = db.prepare(`
SELECT m.platform, m.model_id, m.display_name, m.monthly_token_budget,
fc.priority
FROM models m
JOIN fallback_config fc ON fc.model_db_id = m.id
WHERE m.enabled = 1
ORDER BY fc.priority ASC
`).all() as { platform: string; model_id: string; display_name: string; monthly_token_budget: string; priority: number }[];
function parseBudget(s: string): number {
const m = s.match(/~?([\d.]+)(?:-([\d.]+))?([MK])?/);
if (!m) return 0;
const high = parseFloat(m[2] ?? m[1]);
const unit = m[3] === 'M' ? 1_000_000 : m[3] === 'K' ? 1_000 : 1;
return high * unit;
}
// Build per-model breakdown (only platforms with keys)
const modelBudgets = models
.filter(m => platformSet.has(m.platform))
.map(m => ({
displayName: m.display_name,
platform: m.platform,
budget: parseBudget(m.monthly_token_budget),
}));
const totalBudget = modelBudgets.reduce((s, m) => s + m.budget, 0);
// Tokens used this month
const usage = db.prepare(`
SELECT
COALESCE(SUM(input_tokens + output_tokens), 0) as total_used
FROM requests
WHERE created_at >= datetime('now', 'start of month')
`).get() as { total_used: number };
res.json({
totalBudget,
totalUsed: usage.total_used,
models: modelBudgets,
});
});
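The parseBudget helper in the route above resolves budget labels like "~18-45M" to the high end of the range, in absolute tokens. Restated standalone so its behavior is easy to check:

```typescript
// Restatement of parseBudget: "~18-45M" -> 45,000,000 (high end of range).
function parseBudgetSketch(s: string): number {
  const m = s.match(/~?([\d.]+)(?:-([\d.]+))?([MK])?/);
  if (!m) return 0; // unparseable labels contribute nothing to the budget
  const high = parseFloat(m[2] ?? m[1]); // prefer the upper bound of a range
  const unit = m[3] === 'M' ? 1_000_000 : m[3] === 'K' ? 1_000 : 1;
  return high * unit;
}

const single = parseBudgetSketch('~120M'); // 120,000,000
const range = parseBudgetSketch('~18-45M'); // 45,000,000
```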
import { Router } from 'express';
import type { Request, Response } from 'express';
import { getDb } from '../db/index.js';
import { checkKeyHealth, checkAllKeys } from '../services/health.js';
import { hasProvider } from '../providers/index.js';
export const healthRouter = Router();
// Get health status for all platforms
healthRouter.get('/', (_req: Request, res: Response) => {
const db = getDb();
const platforms = db.prepare(`
SELECT
platform,
COUNT(*) as total_keys,
SUM(CASE WHEN status = 'healthy' THEN 1 ELSE 0 END) as healthy_keys,
SUM(CASE WHEN status = 'rate_limited' THEN 1 ELSE 0 END) as rate_limited_keys,
SUM(CASE WHEN status = 'invalid' THEN 1 ELSE 0 END) as invalid_keys,
SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) as error_keys,
SUM(CASE WHEN status = 'unknown' THEN 1 ELSE 0 END) as unknown_keys,
SUM(CASE WHEN enabled = 1 THEN 1 ELSE 0 END) as enabled_keys
FROM api_keys
GROUP BY platform
`).all() as any[];
const keys = db.prepare(`
SELECT id, platform, label, status, enabled, created_at, last_checked_at
FROM api_keys
ORDER BY platform, created_at DESC
`).all() as any[];
res.json({
platforms: platforms.map(p => ({
platform: p.platform,
hasProvider: hasProvider(p.platform),
totalKeys: p.total_keys,
healthyKeys: p.healthy_keys,
rateLimitedKeys: p.rate_limited_keys,
invalidKeys: p.invalid_keys,
errorKeys: p.error_keys,
unknownKeys: p.unknown_keys,
enabledKeys: p.enabled_keys,
})),
keys: keys.map(k => ({
id: k.id,
platform: k.platform,
label: k.label,
status: k.status,
enabled: k.enabled === 1,
createdAt: k.created_at,
lastCheckedAt: k.last_checked_at,
})),
});
});
// Check a specific key
healthRouter.post('/check/:keyId', async (req: Request, res: Response) => {
const keyId = parseInt(req.params.keyId as string, 10);
if (isNaN(keyId)) {
res.status(400).json({ error: { message: 'Invalid key ID' } });
return;
}
const status = await checkKeyHealth(keyId);
res.json({ keyId, status });
});
// Check all keys
healthRouter.post('/check-all', async (_req: Request, res: Response) => {
await checkAllKeys();
res.json({ success: true });
});
import { Router } from 'express';
import type { Request, Response } from 'express';
import { z } from 'zod';
import { getDb } from '../db/index.js';
import { encrypt, decrypt, maskKey } from '../lib/crypto.js';
export const keysRouter = Router();
const PLATFORMS = [
'google', 'groq', 'cerebras', 'sambanova', 'nvidia', 'mistral',
'openrouter', 'github', 'huggingface', 'cohere', 'cloudflare',
'zhipu', 'moonshot', 'minimax',
] as const;
const addKeySchema = z.object({
platform: z.enum(PLATFORMS),
key: z.string().min(1),
label: z.string().optional(),
});
// List all keys (masked)
keysRouter.get('/', (_req: Request, res: Response) => {
const db = getDb();
const rows = db.prepare('SELECT * FROM api_keys ORDER BY created_at DESC').all() as any[];
const keys = rows.map(row => {
let maskedKey = '****';
try {
const realKey = decrypt(row.encrypted_key, row.iv, row.auth_tag);
maskedKey = maskKey(realKey);
} catch {
maskedKey = '[decrypt failed]';
}
return {
id: row.id,
platform: row.platform,
label: row.label,
maskedKey,
status: row.status,
enabled: row.enabled === 1,
createdAt: row.created_at,
lastCheckedAt: row.last_checked_at,
};
});
res.json(keys);
});
// Add a key
keysRouter.post('/', (req: Request, res: Response) => {
const parsed = addKeySchema.safeParse(req.body);
if (!parsed.success) {
res.status(400).json({ error: { message: parsed.error.errors.map(e => e.message).join(', ') } });
return;
}
const { platform, key, label } = parsed.data;
const { encrypted, iv, authTag } = encrypt(key);
const db = getDb();
const result = db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, 'unknown', 1)
`).run(platform, label ?? '', encrypted, iv, authTag);
res.status(201).json({
id: result.lastInsertRowid,
platform,
label: label ?? '',
maskedKey: maskKey(key),
status: 'unknown',
enabled: true,
});
});
// Delete a key
keysRouter.delete('/:id', (req: Request, res: Response) => {
const id = parseInt(req.params.id as string, 10);
if (isNaN(id)) {
res.status(400).json({ error: { message: 'Invalid key ID' } });
return;
}
const db = getDb();
const result = db.prepare('DELETE FROM api_keys WHERE id = ?').run(id);
if (result.changes === 0) {
res.status(404).json({ error: { message: 'Key not found' } });
return;
}
res.json({ success: true });
});
// Toggle enable/disable
keysRouter.patch('/:id', (req: Request, res: Response) => {
const id = parseInt(req.params.id as string, 10);
if (isNaN(id)) {
res.status(400).json({ error: { message: 'Invalid key ID' } });
return;
}
const { enabled } = req.body;
if (typeof enabled !== 'boolean') {
res.status(400).json({ error: { message: 'enabled must be a boolean' } });
return;
}
const db = getDb();
const result = db.prepare('UPDATE api_keys SET enabled = ? WHERE id = ?').run(enabled ? 1 : 0, id);
if (result.changes === 0) {
res.status(404).json({ error: { message: 'Key not found' } });
return;
}
res.json({ success: true, enabled });
});
import { Router } from 'express';
import type { Request, Response } from 'express';
import { getDb } from '../db/index.js';
import { hasProvider } from '../providers/index.js';
export const modelsRouter = Router();
// List all models with availability info
modelsRouter.get('/', (_req: Request, res: Response) => {
const db = getDb();
const models = db.prepare(`
SELECT m.*, fc.priority, fc.enabled as fallback_enabled
FROM models m
LEFT JOIN fallback_config fc ON fc.model_db_id = m.id
ORDER BY COALESCE(fc.priority, m.intelligence_rank) ASC
`).all() as any[];
// Count keys per platform
const keyCounts = db.prepare(`
SELECT platform, COUNT(*) as count
FROM api_keys
WHERE enabled = 1
GROUP BY platform
`).all() as { platform: string; count: number }[];
const keyCountMap = new Map(keyCounts.map(k => [k.platform, k.count]));
const result = models.map(m => ({
id: m.id,
platform: m.platform,
modelId: m.model_id,
displayName: m.display_name,
intelligenceRank: m.intelligence_rank,
speedRank: m.speed_rank,
sizeLabel: m.size_label,
rpmLimit: m.rpm_limit,
rpdLimit: m.rpd_limit,
tpmLimit: m.tpm_limit,
tpdLimit: m.tpd_limit,
monthlyTokenBudget: m.monthly_token_budget,
contextWindow: m.context_window,
enabled: m.enabled === 1,
priority: m.priority,
fallbackEnabled: m.fallback_enabled === 1,
hasProvider: hasProvider(m.platform),
keyCount: keyCountMap.get(m.platform) ?? 0,
}));
res.json(result);
});
import { Router } from 'express';
import type { Request, Response } from 'express';
import { z } from 'zod';
import { routeRequest, recordRateLimitHit, recordSuccess, type RouteResult } from '../services/router.js';
import { recordRequest, recordTokens, setCooldown } from '../services/ratelimit.js';
import { getDb, getUnifiedApiKey } from '../db/index.js';
export const proxyRouter = Router();
// Sticky sessions: track which model served each "session".
// Key: prefix of the first user message → model_db_id.
// Keeping one model per conversation avoids mid-conversation model switches,
// which tend to produce incoherent or contradictory replies.
const stickySessionMap = new Map<string, { modelDbId: number; lastUsed: number }>();
const STICKY_TTL_MS = 30 * 60 * 1000; // 30 min session TTL
function getSessionKey(messages: { role: string; content: string }[]): string {
// Use the first user message as session identifier
// Hermes sends the full conversation each time, so first user msg is stable
const firstUser = messages.find(m => m.role === 'user');
if (!firstUser) return '';
// Identifier: first 100 chars of first user message + a single/multi flag
return `${firstUser.content.slice(0, 100)}:${messages.length > 2 ? 'multi' : 'single'}`;
}
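The session-key scheme above can be sketched standalone (a re-implementation for illustration; the message contents are made up):

```typescript
// Illustrative re-implementation of the session-key scheme: a conversation is
// identified by a prefix of its first user message plus a single/multi flag.
type Msg = { role: string; content: string };

function sessionKey(messages: Msg[]): string {
  const firstUser = messages.find(m => m.role === 'user');
  if (!firstUser) return '';
  return `${firstUser.content.slice(0, 100)}:${messages.length > 2 ? 'multi' : 'single'}`;
}

const turn1: Msg[] = [{ role: 'user', content: 'Hello' }];
const turn2: Msg[] = [
  { role: 'user', content: 'Hello' },
  { role: 'assistant', content: 'Hi!' },
  { role: 'user', content: 'Tell me more' },
];
// turn1 → "Hello:single", turn2 → "Hello:multi"
```

Because the client resends the whole conversation each turn, both keys share the same prefix, so the follow-up turn maps back to the model that served the first one.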
function getStickyModel(messages: { role: string; content: string }[]): number | undefined {
// Only apply sticky for multi-turn (has assistant messages = continuation)
const hasAssistant = messages.some(m => m.role === 'assistant');
if (!hasAssistant) return undefined;
const key = getSessionKey(messages);
if (!key) return undefined;
const entry = stickySessionMap.get(key);
if (!entry) return undefined;
if (Date.now() - entry.lastUsed > STICKY_TTL_MS) {
stickySessionMap.delete(key);
return undefined;
}
return entry.modelDbId;
}
function setStickyModel(messages: { role: string; content: string }[], modelDbId: number) {
const key = getSessionKey(messages);
if (!key) return;
stickySessionMap.set(key, { modelDbId, lastUsed: Date.now() });
// Cleanup old entries
if (stickySessionMap.size > 500) {
const now = Date.now();
for (const [k, v] of stickySessionMap) {
if (now - v.lastUsed > STICKY_TTL_MS) stickySessionMap.delete(k);
}
}
}
// OpenAI-compatible /models endpoint (used by Hermes for metadata)
proxyRouter.get('/models', (_req: Request, res: Response) => {
const db = getDb();
const models = db.prepare('SELECT platform, model_id, display_name, context_window FROM models WHERE enabled = 1 ORDER BY intelligence_rank').all() as any[];
res.json({
object: 'list',
data: models.map(m => ({
id: m.model_id,
object: 'model',
created: 0,
owned_by: m.platform,
name: m.display_name,
context_window: m.context_window,
})),
});
});
const MAX_RETRIES = 20;
const chatCompletionSchema = z.object({
messages: z.array(z.object({
role: z.enum(['system', 'user', 'assistant']),
content: z.string(),
})).min(1),
model: z.string().optional(),
temperature: z.number().min(0).max(2).optional(),
max_tokens: z.number().int().positive().optional(),
top_p: z.number().min(0).max(1).optional(),
stream: z.boolean().optional(),
});
function isRetryableError(err: any): boolean {
const msg = (err.message ?? '').toLowerCase();
return msg.includes('429') || msg.includes('rate limit') || msg.includes('too many requests')
|| msg.includes('quota') || msg.includes('resource_exhausted')
|| msg.includes('aborted') || msg.includes('timeout') || msg.includes('etimedout')
|| msg.includes('econnrefused') || msg.includes('econnreset')
|| msg.includes('503') || msg.includes('unavailable')
|| msg.includes('500') || msg.includes('internal server error');
}
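The substring classification above can be exercised standalone (re-implemented here for illustration; the sample messages are made up):

```typescript
// Illustrative re-implementation of the retryable-error test: any message
// containing a rate-limit, timeout, or 5xx marker triggers fallback.
function retryable(message: string): boolean {
  const msg = message.toLowerCase();
  return ['429', 'rate limit', 'too many requests', 'quota', 'resource_exhausted',
    'aborted', 'timeout', 'etimedout', 'econnrefused', 'econnreset',
    '503', 'unavailable', '500', 'internal server error'].some(s => msg.includes(s));
}

retryable('HTTP 429 Too Many Requests'); // true  → skip key, set cooldown, try next
retryable('401 Unauthorized');           // false → fail fast with 502
```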
proxyRouter.post('/chat/completions', async (req: Request, res: Response) => {
const start = Date.now();
// Authenticate with the unified API key (auth is skipped only for local requests)
const authHeader = req.headers.authorization;
const isLocal = req.ip === '127.0.0.1' || req.ip === '::1' || req.ip === '::ffff:127.0.0.1';
if (!isLocal) {
const token = authHeader?.replace(/^Bearer\s+/i, '');
const unifiedKey = getUnifiedApiKey();
if (!token || token !== unifiedKey) {
res.status(401).json({
error: { message: 'Missing or invalid API key', type: 'authentication_error' },
});
return;
}
}
// Validate request
const parsed = chatCompletionSchema.safeParse(req.body);
if (!parsed.success) {
res.status(400).json({
error: {
message: `Invalid request: ${parsed.error.errors.map(e => e.message).join(', ')}`,
type: 'invalid_request_error',
},
});
return;
}
const { messages, temperature, max_tokens, top_p, stream } = parsed.data;
const estimatedInputTokens = messages.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0); // rough ~4 chars/token heuristic
const estimatedTotal = estimatedInputTokens + (max_tokens ?? 1000);
// Sticky session: prefer the same model for multi-turn conversations
const preferredModel = getStickyModel(messages);
// Retry loop: on 429/rate limit, skip that model+key and try the next one
const skipKeys = new Set<string>();
let lastError: any = null;
for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
let route: RouteResult;
try {
route = routeRequest(estimatedTotal, skipKeys.size > 0 ? skipKeys : undefined, preferredModel);
} catch (err: any) {
// No more models available
if (lastError) {
res.status(429).json({
error: {
message: `All models rate-limited. Last error: ${lastError.message}`,
type: 'rate_limit_error',
},
});
} else {
res.status(err.status ?? 503).json({
error: { message: err.message, type: 'routing_error' },
});
}
return;
}
recordRequest(route.platform, route.modelId, route.keyId);
try {
if (stream) {
// Streaming - can't retry once we start writing
res.setHeader('Content-Type', 'text/event-stream');
res.setHeader('Cache-Control', 'no-cache');
res.setHeader('Connection', 'keep-alive');
res.setHeader('X-Routed-Via', `${route.platform}/${route.modelId}`);
if (attempt > 0) res.setHeader('X-Fallback-Attempts', String(attempt));
let totalOutputTokens = 0;
const gen = route.provider.streamChatCompletion(
route.apiKey, messages, route.modelId,
{ temperature, max_tokens, top_p },
);
for await (const chunk of gen) {
const text = chunk.choices[0]?.delta?.content ?? '';
totalOutputTokens += Math.ceil(text.length / 4);
res.write(`data: ${JSON.stringify(chunk)}\n\n`);
}
res.write('data: [DONE]\n\n');
res.end();
recordTokens(route.platform, route.modelId, route.keyId, estimatedInputTokens + totalOutputTokens);
recordSuccess(route.modelDbId);
setStickyModel(messages, route.modelDbId);
logRequest(route.platform, route.modelId, 'success', estimatedInputTokens, totalOutputTokens, Date.now() - start, null);
return;
} else {
const result = await route.provider.chatCompletion(
route.apiKey, messages, route.modelId,
{ temperature, max_tokens, top_p },
);
const totalTokens = result.usage?.total_tokens ?? 0;
recordTokens(route.platform, route.modelId, route.keyId, totalTokens);
recordSuccess(route.modelDbId);
setStickyModel(messages, route.modelDbId);
res.setHeader('X-Routed-Via', `${route.platform}/${route.modelId}`);
if (attempt > 0) res.setHeader('X-Fallback-Attempts', String(attempt));
res.json(result);
logRequest(
route.platform, route.modelId, 'success',
result.usage?.prompt_tokens ?? 0,
result.usage?.completion_tokens ?? 0,
Date.now() - start, null,
);
return;
}
} catch (err: any) {
const latency = Date.now() - start;
logRequest(route.platform, route.modelId, 'error', estimatedInputTokens, 0, latency, err.message);
if (isRetryableError(err) && !res.headersSent) {
// Put this model+key on cooldown and try the next one
const skipId = `${route.platform}:${route.modelId}:${route.keyId}`;
skipKeys.add(skipId);
setCooldown(route.platform, route.modelId, route.keyId, 120_000);
recordRateLimitHit(route.modelDbId);
lastError = err;
console.log(`[Proxy] ${String(err.message ?? err).slice(0, 60)} from ${route.displayName}, falling back (attempt ${attempt + 1}/${MAX_RETRIES})`);
continue;
}
// Non-retryable error (auth, 4xx, etc.) or a stream that already started: don't retry
if (res.headersSent) {
// SSE headers/chunks are already on the wire; the most we can do is close the stream
res.end();
return;
}
res.status(502).json({
error: {
message: `Provider error (${route.displayName}): ${err.message}`,
type: 'provider_error',
},
});
return;
}
}
// Exhausted all retries
res.status(429).json({
error: {
message: `All models rate-limited after ${MAX_RETRIES} attempts. Last: ${lastError?.message}`,
type: 'rate_limit_error',
},
});
});
function logRequest(
platform: string,
modelId: string,
status: string,
inputTokens: number,
outputTokens: number,
latencyMs: number,
error: string | null,
) {
try {
const db = getDb();
db.prepare(`
INSERT INTO requests (platform, model_id, status, input_tokens, output_tokens, latency_ms, error)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run(platform, modelId, status, inputTokens, outputTokens, latencyMs, error);
} catch (e) {
console.error('Failed to log request:', e);
}
}
import { Router } from 'express';
import type { Request, Response } from 'express';
import { getUnifiedApiKey, regenerateUnifiedKey } from '../db/index.js';
export const settingsRouter = Router();
// Get the unified API key
settingsRouter.get('/api-key', (_req: Request, res: Response) => {
res.json({ apiKey: getUnifiedApiKey() });
});
// Regenerate the unified API key
settingsRouter.post('/api-key/regenerate', (_req: Request, res: Response) => {
const newKey = regenerateUnifiedKey();
res.json({ apiKey: newKey });
});
/**
* Probe every enabled model with a minimal request to find broken model IDs.
* Usage: npx tsx src/scripts/test-all-models.ts
*/
import { initDb, getDb } from '../db/index.js';
import { decrypt } from '../lib/crypto.js';
import { getProvider } from '../providers/index.js';
initDb();
const db = getDb();
interface Row {
id: number;
platform: string;
model_id: string;
display_name: string;
}
interface Key {
encrypted_key: string;
iv: string;
auth_tag: string;
}
const models = db.prepare(`
SELECT m.id, m.platform, m.model_id, m.display_name
FROM models m
WHERE m.enabled = 1
AND EXISTS (SELECT 1 FROM api_keys k WHERE k.platform = m.platform AND k.enabled = 1)
ORDER BY m.intelligence_rank, m.platform
`).all() as Row[];
const keyStmt = db.prepare(`
SELECT encrypted_key, iv, auth_tag FROM api_keys
WHERE platform = ? AND enabled = 1 ORDER BY id LIMIT 1
`);
const results: { row: Row; ok: boolean; ms: number; error?: string; reply?: string }[] = [];
for (const row of models) {
const keyRow = keyStmt.get(row.platform) as Key | undefined;
if (!keyRow) { results.push({ row, ok: false, ms: 0, error: 'no key' }); continue; }
const apiKey = decrypt(keyRow.encrypted_key, keyRow.iv, keyRow.auth_tag);
const provider = getProvider(row.platform as any);
if (!provider) { results.push({ row, ok: false, ms: 0, error: 'no provider' }); continue; }
const start = Date.now();
try {
const res = await provider.chatCompletion(apiKey, [{ role: 'user', content: 'hi' }], row.model_id, { max_tokens: 5 });
const reply = res.choices?.[0]?.message?.content?.slice(0, 40) ?? '';
results.push({ row, ok: true, ms: Date.now() - start, reply });
} catch (err: any) {
results.push({ row, ok: false, ms: Date.now() - start, error: String(err?.message ?? err).slice(0, 200) });
}
}
console.log('\n=== Results ===\n');
const pad = (s: string, n: number) => s.length > n ? s.slice(0, n - 1) + '…' : s.padEnd(n);
for (const r of results) {
const status = r.ok ? '✓' : '✗';
console.log(`${status} ${pad(r.row.platform, 12)} ${pad(r.row.model_id, 52)} ${String(r.ms).padStart(5)}ms ${r.ok ? `"${r.reply}"` : r.error}`);
}
const okCount = results.filter(r => r.ok).length;
console.log(`\n${okCount}/${results.length} models working\n`);
process.exit(0);
import { getDb } from '../db/index.js';
import { getProvider } from '../providers/index.js';
import { decrypt } from '../lib/crypto.js';
import type { Platform, KeyStatus } from '@freellmapi/shared/types.js';
const CHECK_INTERVAL_MS = 5 * 60 * 1000; // 5 minutes
const CONSECUTIVE_FAILURES_TO_DISABLE = 3;
// Track consecutive failures per key
const failureCount = new Map<number, number>();
export async function checkKeyHealth(keyId: number): Promise<KeyStatus> {
const db = getDb();
const row = db.prepare('SELECT * FROM api_keys WHERE id = ?').get(keyId) as any;
if (!row) return 'error';
const provider = getProvider(row.platform as Platform);
if (!provider) return 'error';
try {
const apiKey = decrypt(row.encrypted_key, row.iv, row.auth_tag);
const isValid = await provider.validateKey(apiKey);
const status: KeyStatus = isValid ? 'healthy' : 'invalid';
db.prepare("UPDATE api_keys SET status = ?, last_checked_at = datetime('now') WHERE id = ?")
.run(status, keyId);
if (isValid) {
failureCount.delete(keyId);
} else {
const count = (failureCount.get(keyId) ?? 0) + 1;
failureCount.set(keyId, count);
if (count >= CONSECUTIVE_FAILURES_TO_DISABLE) {
db.prepare('UPDATE api_keys SET enabled = 0 WHERE id = ?').run(keyId);
console.log(`[Health] Auto-disabled key ${keyId} after ${count} consecutive failures`);
}
}
return status;
} catch (err: any) {
console.error(`[Health] Key ${keyId} check error:`, err.message);
db.prepare("UPDATE api_keys SET status = ?, last_checked_at = datetime('now') WHERE id = ?")
.run('error', keyId);
const count = (failureCount.get(keyId) ?? 0) + 1;
failureCount.set(keyId, count);
if (count >= CONSECUTIVE_FAILURES_TO_DISABLE) {
db.prepare('UPDATE api_keys SET enabled = 0 WHERE id = ?').run(keyId);
}
return 'error';
}
}
export async function checkAllKeys(): Promise<void> {
const db = getDb();
const keys = db.prepare('SELECT id, platform FROM api_keys WHERE enabled = 1').all() as { id: number; platform: string }[];
console.log(`[Health] Checking ${keys.length} keys...`);
for (const key of keys) {
await checkKeyHealth(key.id);
}
console.log(`[Health] Check complete.`);
}
let intervalId: ReturnType<typeof setInterval> | null = null;
export function startHealthChecker(): void {
if (intervalId) return;
console.log(`[Health] Starting health checker (every ${CHECK_INTERVAL_MS / 1000}s)`);
intervalId = setInterval(() => {
checkAllKeys().catch(err => console.error('[Health] Check failed:', err));
}, CHECK_INTERVAL_MS);
}
export function stopHealthChecker(): void {
if (intervalId) {
clearInterval(intervalId);
intervalId = null;
}
}
// In-memory sliding window rate limit tracker
interface Window {
timestamps: number[];
tokenTimestamps: { ts: number; tokens: number }[];
}
// Key format: "platform:modelId:keyId:type" where type is rpm|rpd|tpm|tpd
const windows = new Map<string, Window>();
function getWindow(key: string): Window {
let w = windows.get(key);
if (!w) {
w = { timestamps: [], tokenTimestamps: [] };
windows.set(key, w);
}
return w;
}
function pruneTimestamps(timestamps: number[], windowMs: number, now: number): number[] {
const cutoff = now - windowMs;
return timestamps.filter(ts => ts > cutoff);
}
const MINUTE = 60 * 1000;
const DAY = 24 * 60 * MINUTE;
export function canMakeRequest(
platform: string,
modelId: string,
keyId: number,
limits: { rpm: number | null; rpd: number | null; tpm: number | null; tpd: number | null },
): boolean {
const now = Date.now();
if (limits.rpm !== null) {
const key = `${platform}:${modelId}:${keyId}:rpm`;
const w = getWindow(key);
w.timestamps = pruneTimestamps(w.timestamps, MINUTE, now);
if (w.timestamps.length >= limits.rpm) return false;
}
if (limits.rpd !== null) {
const key = `${platform}:${modelId}:${keyId}:rpd`;
const w = getWindow(key);
w.timestamps = pruneTimestamps(w.timestamps, DAY, now);
if (w.timestamps.length >= limits.rpd) return false;
}
return true;
}
export function canUseTokens(
platform: string,
modelId: string,
keyId: number,
estimatedTokens: number,
limits: { tpm: number | null; tpd: number | null },
): boolean {
const now = Date.now();
if (limits.tpm !== null) {
const key = `${platform}:${modelId}:${keyId}:tpm`;
const w = getWindow(key);
w.tokenTimestamps = w.tokenTimestamps.filter(t => t.ts > now - MINUTE);
const used = w.tokenTimestamps.reduce((sum, t) => sum + t.tokens, 0);
if (used + estimatedTokens > limits.tpm) return false;
}
if (limits.tpd !== null) {
const key = `${platform}:${modelId}:${keyId}:tpd`;
const w = getWindow(key);
w.tokenTimestamps = w.tokenTimestamps.filter(t => t.ts > now - DAY);
const used = w.tokenTimestamps.reduce((sum, t) => sum + t.tokens, 0);
if (used + estimatedTokens > limits.tpd) return false;
}
return true;
}
export function recordRequest(platform: string, modelId: string, keyId: number) {
const now = Date.now();
const rpmKey = `${platform}:${modelId}:${keyId}:rpm`;
getWindow(rpmKey).timestamps.push(now);
const rpdKey = `${platform}:${modelId}:${keyId}:rpd`;
getWindow(rpdKey).timestamps.push(now);
}
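The rpm/rpd tracking above boils down to a sliding-window counter: prune entries older than the window, refuse if the remainder is at the limit, record otherwise. A minimal standalone sketch (the limit and window values are arbitrary, and the clock is passed in explicitly for determinism):

```typescript
// Minimal sliding-window limiter mirroring the rpm logic above.
function makeLimiter(limit: number, windowMs: number) {
  let timestamps: number[] = [];
  return {
    tryAcquire(now: number): boolean {
      timestamps = timestamps.filter(ts => ts > now - windowMs); // prune expired
      if (timestamps.length >= limit) return false;              // window full
      timestamps.push(now);                                      // record hit
      return true;
    },
  };
}

const limiter = makeLimiter(2, 60_000); // 2 requests per minute
const t0 = 1_000_000;                   // arbitrary fixed clock
const r1 = limiter.tryAcquire(t0);          // true  (1/2 used)
const r2 = limiter.tryAcquire(t0 + 1);      // true  (2/2 used)
const r3 = limiter.tryAcquire(t0 + 2);      // false (window full)
const r4 = limiter.tryAcquire(t0 + 61_000); // true  (old entries pruned)
```

The production code keeps check and record as separate calls (`canMakeRequest` / `recordRequest`) so routing can test several candidates before committing to one.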
export function recordTokens(
platform: string,
modelId: string,
keyId: number,
tokens: number,
) {
const now = Date.now();
const tpmKey = `${platform}:${modelId}:${keyId}:tpm`;
getWindow(tpmKey).tokenTimestamps.push({ ts: now, tokens });
const tpdKey = `${platform}:${modelId}:${keyId}:tpd`;
getWindow(tpdKey).tokenTimestamps.push({ ts: now, tokens });
}
// Cooldown: when a provider returns 429, block that model+key for a period
const cooldowns = new Map<string, number>(); // key -> expiry timestamp
export function setCooldown(platform: string, modelId: string, keyId: number, durationMs = 60_000) {
const key = `${platform}:${modelId}:${keyId}:cooldown`;
cooldowns.set(key, Date.now() + durationMs);
}
export function isOnCooldown(platform: string, modelId: string, keyId: number): boolean {
const key = `${platform}:${modelId}:${keyId}:cooldown`;
const expiry = cooldowns.get(key);
if (!expiry) return false;
if (Date.now() > expiry) {
cooldowns.delete(key);
return false;
}
return true;
}
export function getRateLimitStatus(
platform: string,
modelId: string,
keyId: number,
limits: { rpm: number | null; rpd: number | null; tpm: number | null; tpd: number | null },
) {
const now = Date.now();
const rpmW = getWindow(`${platform}:${modelId}:${keyId}:rpm`);
rpmW.timestamps = pruneTimestamps(rpmW.timestamps, MINUTE, now);
const rpdW = getWindow(`${platform}:${modelId}:${keyId}:rpd`);
rpdW.timestamps = pruneTimestamps(rpdW.timestamps, DAY, now);
const tpmW = getWindow(`${platform}:${modelId}:${keyId}:tpm`);
tpmW.tokenTimestamps = tpmW.tokenTimestamps.filter(t => t.ts > now - MINUTE);
const tpmUsed = tpmW.tokenTimestamps.reduce((sum, t) => sum + t.tokens, 0);
return {
rpm: { used: rpmW.timestamps.length, limit: limits.rpm },
rpd: { used: rpdW.timestamps.length, limit: limits.rpd },
tpm: { used: tpmUsed, limit: limits.tpm },
};
}
import { getDb } from '../db/index.js';
import { getProvider } from '../providers/index.js';
import { decrypt } from '../lib/crypto.js';
import { canMakeRequest, canUseTokens, isOnCooldown } from './ratelimit.js';
import type { BaseProvider } from '../providers/base.js';
interface ModelRow {
id: number;
platform: string;
model_id: string;
display_name: string;
rpm_limit: number | null;
rpd_limit: number | null;
tpm_limit: number | null;
tpd_limit: number | null;
}
interface KeyRow {
id: number;
platform: string;
encrypted_key: string;
iv: string;
auth_tag: string;
status: string;
enabled: number;
}
interface FallbackRow {
model_db_id: number;
priority: number;
enabled: number;
}
export interface RouteResult {
provider: BaseProvider;
modelId: string;
modelDbId: number;
apiKey: string;
keyId: number;
platform: string;
displayName: string;
}
// Round-robin index per platform
const roundRobinIndex = new Map<string, number>();
// ── Dynamic priority: track 429s per model and demote accordingly ──
// Key: model_db_id → { count, lastHit, penalty }
const rateLimitPenalties = new Map<number, { count: number; lastHit: number; penalty: number }>();
// Penalty decays over time so models recover
const PENALTY_PER_429 = 3; // each 429 adds this many priority positions
const MAX_PENALTY = 10; // cap so a model doesn't sink forever
const DECAY_INTERVAL_MS = 2 * 60 * 1000; // penalty decays every 2 minutes
const DECAY_AMOUNT = 1; // remove this much penalty per decay interval
/**
* Record a 429 for a model — increases its penalty so it sinks in priority.
*/
export function recordRateLimitHit(modelDbId: number) {
const existing = rateLimitPenalties.get(modelDbId);
const now = Date.now();
if (existing) {
existing.count++;
existing.lastHit = now;
existing.penalty = Math.min(existing.penalty + PENALTY_PER_429, MAX_PENALTY);
} else {
rateLimitPenalties.set(modelDbId, { count: 1, lastHit: now, penalty: PENALTY_PER_429 });
}
}
/**
* Record a success for a model — reduces its penalty so it rises back up.
*/
export function recordSuccess(modelDbId: number) {
const existing = rateLimitPenalties.get(modelDbId);
if (existing) {
existing.penalty = Math.max(0, existing.penalty - 1);
if (existing.penalty === 0) {
rateLimitPenalties.delete(modelDbId);
}
}
}
/**
* Get the current penalty for a model (with time-based decay).
*/
function getPenalty(modelDbId: number): number {
const entry = rateLimitPenalties.get(modelDbId);
if (!entry) return 0;
// Apply time-based decay
const now = Date.now();
const elapsed = now - entry.lastHit;
const decaySteps = Math.floor(elapsed / DECAY_INTERVAL_MS);
if (decaySteps > 0) {
entry.penalty = Math.max(0, entry.penalty - (decaySteps * DECAY_AMOUNT));
entry.lastHit = now; // reset so we don't double-decay
if (entry.penalty === 0) {
rateLimitPenalties.delete(modelDbId);
return 0;
}
}
return entry.penalty;
}
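The penalty arithmetic works out as follows (a worked sketch using the constants defined above; the 429 sequence and base priority are hypothetical):

```typescript
// Worked example of the dynamic-priority penalty (constants copied from above).
const PENALTY_PER_429 = 3;
const MAX_PENALTY = 10;
const DECAY_INTERVAL_MS = 2 * 60 * 1000;
const DECAY_AMOUNT = 1;

let penalty = 0;
// two consecutive 429s:
penalty = Math.min(penalty + PENALTY_PER_429, MAX_PENALTY); // 3
penalty = Math.min(penalty + PENALTY_PER_429, MAX_PENALTY); // 6
// four minutes idle → two decay steps:
const decaySteps = Math.floor((4 * 60 * 1000) / DECAY_INTERVAL_MS); // 2
penalty = Math.max(0, penalty - decaySteps * DECAY_AMOUNT); // 4
// a model with base priority 2 now sorts at effective priority 2 + 4 = 6
```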
/**
* Get current penalties for all models (for the API/dashboard).
*/
export function getAllPenalties(): Array<{ modelDbId: number; count: number; penalty: number }> {
const result: Array<{ modelDbId: number; count: number; penalty: number }> = [];
for (const [modelDbId, entry] of rateLimitPenalties) {
const penalty = getPenalty(modelDbId);
if (penalty > 0) {
result.push({ modelDbId, count: entry.count, penalty });
}
}
return result.sort((a, b) => b.penalty - a.penalty);
}
/**
* Route a request to the best available model.
* Models are sorted by (base_priority + rate_limit_penalty) so frequently
* rate-limited models automatically sink below working ones.
*
* If preferredModelDbId is set, that model gets tried FIRST (sticky sessions).
* This prevents hallucination from model switching mid-conversation.
*
* @param estimatedTokens - estimated total tokens for rate limit check
* @param skipKeys - set of "platform:modelId:keyId" to skip (failed on this request)
* @param preferredModelDbId - try this model first (sticky session)
*/
export function routeRequest(estimatedTokens = 1000, skipKeys?: Set<string>, preferredModelDbId?: number): RouteResult {
const db = getDb();
// Get fallback chain ordered by priority
const fallbackChain = db.prepare(`
SELECT fc.model_db_id, fc.priority, fc.enabled
FROM fallback_config fc
ORDER BY fc.priority ASC
`).all() as FallbackRow[];
// Apply dynamic penalties: sort by (base priority + penalty)
const sortedChain = fallbackChain.map(entry => ({
...entry,
effectivePriority: entry.priority + getPenalty(entry.model_db_id),
})).sort((a, b) => a.effectivePriority - b.effectivePriority);
// Sticky session: move preferred model to front of chain
if (preferredModelDbId) {
const idx = sortedChain.findIndex(e => e.model_db_id === preferredModelDbId);
if (idx > 0) {
const [preferred] = sortedChain.splice(idx, 1);
sortedChain.unshift(preferred);
}
}
for (const entry of sortedChain) {
if (!entry.enabled) continue;
// Get model details
const model = db.prepare('SELECT * FROM models WHERE id = ? AND enabled = 1').get(entry.model_db_id) as ModelRow | undefined;
if (!model) continue;
// Check if we have a provider for this platform
const provider = getProvider(model.platform as any);
if (!provider) continue;
// Get all healthy, enabled keys for this platform
const keys = db.prepare(
'SELECT * FROM api_keys WHERE platform = ? AND enabled = 1 AND status != ?'
).all(model.platform, 'invalid') as KeyRow[];
if (keys.length === 0) continue;
// Round-robin across keys
const rrKey = `${model.platform}:${model.model_id}`;
let idx = roundRobinIndex.get(rrKey) ?? 0;
for (let attempt = 0; attempt < keys.length; attempt++) {
const key = keys[idx % keys.length];
idx++;
const skipId = `${model.platform}:${model.model_id}:${key.id}`;
if (skipKeys?.has(skipId)) continue;
// Check cooldown (from previous 429s)
if (isOnCooldown(model.platform, model.model_id, key.id)) continue;
const limits = {
rpm: model.rpm_limit,
rpd: model.rpd_limit,
tpm: model.tpm_limit,
tpd: model.tpd_limit,
};
if (!canMakeRequest(model.platform, model.model_id, key.id, limits)) continue;
if (!canUseTokens(model.platform, model.model_id, key.id, estimatedTokens, limits)) continue;
roundRobinIndex.set(rrKey, idx);
const decryptedKey = decrypt(key.encrypted_key, key.iv, key.auth_tag);
return {
provider,
modelId: model.model_id,
modelDbId: model.id,
apiKey: decryptedKey,
keyId: key.id,
platform: model.platform,
displayName: model.display_name,
};
}
roundRobinIndex.set(rrKey, idx);
}
const err = new Error('All models exhausted. Add more API keys or wait for rate limits to reset.') as any;
err.status = 429;
throw err;
}
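The key rotation inside `routeRequest` reduces to a persistent modulo index per platform+model; a standalone sketch (the key IDs are made up):

```typescript
// Round-robin sketch: a persistent index cycles across the available keys,
// mirroring the `roundRobinIndex` loop above.
const keyIds = [101, 102, 103];
let rrIdx = 0;
const picks: number[] = [];
for (let i = 0; i < 4; i++) {
  picks.push(keyIds[rrIdx % keyIds.length]);
  rrIdx++; // index persists across requests, so load spreads evenly
}
// picks → [101, 102, 103, 101]
```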
{
"compilerOptions": {
"target": "ES2022",
"module": "ES2022",
"moduleResolution": "bundler",
"lib": ["ES2022"],
"outDir": "./dist",
"rootDir": "./src",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true,
"declaration": true,
"declarationMap": true,
"sourceMap": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "dist", "src/__tests__"]
}
import { defineConfig } from 'vitest/config';
export default defineConfig({
test: {
globals: true,
environment: 'node',
include: ['src/__tests__/**/*.test.ts'],
},
});
{
"name": "@freellmapi/shared",
"version": "0.1.0",
"private": true,
"main": "./types.ts",
"types": "./types.ts"
}
// ---- Platform & Model Types ----
export type Platform =
| 'google'
| 'groq'
| 'cerebras'
| 'sambanova'
| 'nvidia'
| 'mistral'
| 'openrouter'
| 'github'
| 'huggingface'
| 'cohere'
| 'cloudflare'
| 'zhipu'
| 'moonshot'
| 'minimax';
export interface Model {
id: number;
platform: Platform;
modelId: string;
displayName: string;
intelligenceRank: number;
speedRank: number;
sizeLabel: string;
rpmLimit: number | null;
rpdLimit: number | null;
tpmLimit: number | null;
tpdLimit: number | null;
monthlyTokenBudget: string;
contextWindow: number | null;
enabled: boolean;
}
export type KeyStatus = 'healthy' | 'rate_limited' | 'invalid' | 'error' | 'unknown';
export interface ApiKey {
id: number;
platform: Platform;
label: string;
maskedKey: string;
status: KeyStatus;
enabled: boolean;
createdAt: string;
lastCheckedAt: string | null;
}
export interface ApiKeyCreate {
platform: Platform;
key: string;
label?: string;
}
// ---- Fallback Config ----
export interface FallbackEntry {
modelId: number;
platform: Platform;
displayName: string;
intelligenceRank: number;
speedRank: number;
priority: number;
enabled: boolean;
}
// ---- OpenAI-Compatible Types ----
export interface ChatMessage {
role: 'system' | 'user' | 'assistant';
content: string;
}
export interface ChatCompletionRequest {
model?: string;
messages: ChatMessage[];
temperature?: number;
max_tokens?: number;
stream?: boolean;
top_p?: number;
}
export interface ChatCompletionChoice {
index: number;
message: ChatMessage;
finish_reason: string | null;
}
export interface TokenUsage {
prompt_tokens: number;
completion_tokens: number;
total_tokens: number;
}
export interface ChatCompletionResponse {
id: string;
object: 'chat.completion';
created: number;
model: string;
choices: ChatCompletionChoice[];
usage: TokenUsage;
_routed_via?: {
platform: Platform;
model: string;
};
}
export interface ChatCompletionChunk {
id: string;
object: 'chat.completion.chunk';
created: number;
model: string;
choices: {
index: number;
delta: Partial<ChatMessage>;
finish_reason: string | null;
}[];
}
// ---- Analytics Types ----
export interface AnalyticsSummary {
totalRequests: number;
successRate: number;
totalInputTokens: number;
totalOutputTokens: number;
avgLatencyMs: number;
estimatedCostSavings: number;
}
export interface PlatformStats {
platform: Platform;
requests: number;
successRate: number;
avgLatencyMs: number;
totalInputTokens: number;
totalOutputTokens: number;
}
export interface TimelinePoint {
timestamp: string;
requests: number;
successCount: number;
failureCount: number;
}
export interface RequestLog {
id: number;
platform: Platform;
modelId: string;
status: 'success' | 'error';
inputTokens: number;
outputTokens: number;
latencyMs: number;
error: string | null;
createdAt: string;
}
// ---- Rate Limit Types ----
export interface RateLimitStatus {
platform: Platform;
modelId: string;
rpm: { used: number; limit: number | null };
rpd: { used: number; limit: number | null };
tpm: { used: number; limit: number | null };
available: boolean;
nextResetAt: string | null;
}