Commit 04e15037 authored by tashfeenahmed

Initial release of FreeLLMAPI

Self-hosted OpenAI-compatible proxy that aggregates the free tiers of
fourteen LLM providers — Google, Groq, Cerebras, SambaNova, NVIDIA,
Mistral, OpenRouter, GitHub Models, Hugging Face, Cohere, Cloudflare,
Zhipu, Moonshot, MiniMax — behind a single /v1/chat/completions endpoint.

Server:
- Express + SQLite, per-provider adapters with streaming and non-streaming
  support, automatic failover on 429/5xx, per-key RPM/RPD/TPM/TPD tracking,
  sticky sessions for multi-turn, AES-256-GCM encrypted key storage,
  unified bearer-token auth, periodic health checks.

Client:
- React + Vite + shadcn/ui admin dashboard: keys, fallback chain (drag
  to reorder, color-coded per-provider monthly token budget), playground,
  analytics with per-provider breakdowns.

Tooling:
- GitHub Actions CI (server tests + client build), MIT license,
  README with provider-by-provider ToS review.

For personal experimentation, not production.
# Server encryption key for API key storage (generate with: node -e "console.log(require('crypto').randomBytes(32).toString('hex'))")
ENCRYPTION_KEY=your-64-char-hex-key-here
# Server port (default: 3001)
PORT=3001
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  test:
    name: Test & build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - name: Install dependencies
        run: npm install
      - name: Run server tests
        run: npm test -w server
      - name: Build server
        run: npm run build -w server
      - name: Build client
        run: npm run build -w client
node_modules/
dist/
server/data/
*.db
*.db-wal
*.db-shm
.env
.env.local
.DS_Store
# Personal deployment scripts (contain keys/credentials — kept local)
deploy-pi.sh
update-hermes.sh
MIT License
Copyright (c) 2026 Tashfeen Ahmed
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
<div align="center">
# FreeLLMAPI
**One OpenAI-compatible endpoint. Fourteen free LLM providers. ~800M tokens per month.**
Aggregate the free tiers from Google, Groq, Cerebras, SambaNova, NVIDIA, Mistral, OpenRouter, GitHub Models, Hugging Face, Cohere, Cloudflare, Zhipu, Moonshot, and MiniMax behind a single `/v1/chat/completions` endpoint. Keys are stored encrypted. A router picks the best available model for each request, fails over to the next provider when one is rate-limited, and tracks per-key usage so you stay under every free-tier cap.
[![CI](https://github.com/tashfeenahmed/freellmapi/actions/workflows/ci.yml/badge.svg)](https://github.com/tashfeenahmed/freellmapi/actions/workflows/ci.yml)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](./LICENSE)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](#contributing)
![Fallback chain with per-provider token budget](repo-assets/fallback-chain.png)
</div>
---
## Contents
- [Why this exists](#why-this-exists)
- [Supported providers](#supported-providers)
- [Features](#features)
- [Not yet supported](#not-yet-supported)
- [Quick start](#quick-start)
- [Using the API](#using-the-api)
- [Screenshots](#screenshots)
- [How it works](#how-it-works)
- [Limitations](#limitations)
- [Contributing](#contributing)
- [Terms of Service review](#terms-of-service-review)
- [Disclaimer](#disclaimer)
## Why this exists
Every serious AI lab now offers a free tier — a few million tokens a month, a few thousand requests a day. On its own each tier is a toy. Stacked together, they add up to roughly **800 million tokens per month** of working inference capacity, across dozens of models from small-and-fast to reasonably capable.
The problem is that stacking them by hand is painful: fourteen different SDKs, fourteen different rate limits, fourteen places a request can fail. FreeLLMAPI collapses that into one OpenAI-compatible endpoint. Point any OpenAI client library at your local server, and it routes transparently across whichever providers you've added keys for.
## Supported providers
<table>
<tr>
<td align="center" width="180"><a href="https://ai.google.dev"><b>Google</b><br/>Gemini 2.5 Pro / Flash</a></td>
<td align="center" width="180"><a href="https://groq.com"><b>Groq</b><br/>Llama 4, Qwen, Kimi</a></td>
<td align="center" width="180"><a href="https://cerebras.ai"><b>Cerebras</b><br/>Llama 3.3, Qwen</a></td>
<td align="center" width="180"><a href="https://cloud.sambanova.ai"><b>SambaNova</b><br/>Llama 3.3 70B</a></td>
</tr>
<tr>
<td align="center"><a href="https://build.nvidia.com"><b>NVIDIA</b><br/>NIM catalog</a></td>
<td align="center"><a href="https://mistral.ai"><b>Mistral</b><br/>La Plateforme</a></td>
<td align="center"><a href="https://openrouter.ai"><b>OpenRouter</b><br/>Free-tier models</a></td>
<td align="center"><a href="https://github.com/marketplace/models"><b>GitHub Models</b><br/>GPT-4o, Llama, Phi</a></td>
</tr>
<tr>
<td align="center"><a href="https://huggingface.co"><b>Hugging Face</b><br/>Inference Providers</a></td>
<td align="center"><a href="https://cohere.com"><b>Cohere</b><br/>Command R+ (trial)</a></td>
<td align="center"><a href="https://developers.cloudflare.com/workers-ai"><b>Cloudflare</b><br/>Workers AI</a></td>
<td align="center"><a href="https://bigmodel.cn"><b>Zhipu</b><br/>GLM-4 series</a></td>
</tr>
<tr>
<td align="center"><a href="https://platform.moonshot.cn"><b>Moonshot</b><br/>Kimi</a></td>
<td align="center"><a href="https://platform.minimax.io"><b>MiniMax</b><br/>abab / hailuo</a></td>
<td align="center" colspan="2"><i>Adding another? See <a href="#contributing">Contributing</a>.</i></td>
</tr>
</table>
## Features
- **OpenAI-compatible** — `POST /v1/chat/completions` and `GET /v1/models` work with the official OpenAI SDKs and any OpenAI-compatible client (LangChain, LlamaIndex, Continue, Hermes, etc.). Just change `base_url`.
- **Streaming and non-streaming** — Server-Sent Events for `stream: true`, JSON response otherwise. Every provider adapter implements both.
- **Automatic failover** — If the chosen provider returns a 429 or 5xx, or times out, the router skips it, puts the key on a short cooldown, and retries the next model in your fallback chain (up to 20 attempts).
- **Per-key rate tracking** — RPM, RPD, TPM, and TPD counters per `(platform, model, key)` so the router always picks a key that's under its caps.
- **Sticky sessions** — Multi-turn conversations keep talking to the same model for 30 minutes to avoid the hallucination spike that comes from mid-conversation model switches.
- **Encrypted key storage** — API keys are encrypted with AES-256-GCM before hitting SQLite; decryption happens in-memory just before a request.
- **Unified API key** — Clients authenticate to your proxy with a single `freellmapi-…` bearer token. You never expose upstream provider keys to your apps.
- **Health checks** — Periodic probes mark keys as `healthy`, `rate_limited`, `invalid`, or `error` so the router skips dead ones automatically.
- **Admin dashboard** — React + Vite UI to manage keys, reorder the fallback chain, inspect analytics, and run prompts in a playground. Dark mode included.
- **Analytics** — Per-request logging with latency, token counts, success rate, and per-provider breakdowns.
- **Deploys to a Raspberry Pi** — Runs happily on a Pi 4 under PM2 behind nginx. ~40 MB RSS at idle.
## Not yet supported
The scope is deliberately narrow. If a feature isn't in the Features list above, assume it isn't there yet. Known gaps:
- **Embeddings** (`/v1/embeddings`)
- **Image generation** (`/v1/images/*`)
- **Audio / speech** (`/v1/audio/*`)
- **Function / tool calling** — the request schema doesn't pass `tools` through yet
- **Vision / multimodal inputs** — message content is text-only
- **Legacy completions** (`/v1/completions`) — only the chat endpoint is implemented
- **Moderation** (`/v1/moderations`)
- **`n > 1`** (multiple completions per request)
- **Per-user billing / multi-tenant auth** — single-user by design
PRs that add any of these are very welcome. See [Contributing](#contributing).
## Quick start
**Prerequisites:** Node.js 20+, npm.
```bash
git clone https://github.com/tashfeenahmed/freellmapi.git
cd freellmapi
npm install
# Generate an encryption key for at-rest key storage
cp .env.example .env
echo "ENCRYPTION_KEY=$(node -e "console.log(require('crypto').randomBytes(32).toString('hex'))")" >> .env
# Start server + dashboard together
npm run dev
```
Open http://localhost:5173 (the Vite dev UI), add your provider keys on the **Keys** page, reorder the **Fallback Chain** to taste, and grab your unified API key from the **Keys** page header. That unified key is what you point your OpenAI SDK at.
For a production build:
```bash
npm run build
node server/dist/index.js # server + dashboard both served on :3001
```
## Using the API
Any OpenAI-compatible client works. Examples:
**Python**
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3001/v1",
    api_key="freellmapi-your-unified-key",
)

resp = client.chat.completions.create(
    model="auto",  # let the router pick; or specify e.g. "gemini-2.5-flash"
    messages=[{"role": "user", "content": "Summarise the fall of Rome in one sentence."}],
)
print(resp.choices[0].message.content)

# The parsed response object has no .headers; use the SDK's raw-response
# wrapper to read routing headers like X-Routed-Via:
raw = client.chat.completions.with_raw_response.create(
    model="auto",
    messages=[{"role": "user", "content": "hi"}],
)
print("Routed via:", raw.headers.get("x-routed-via"))
```
**curl**
```bash
curl http://localhost:3001/v1/chat/completions \
  -H "Authorization: Bearer freellmapi-your-unified-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "hi"}]
  }'
```
**Streaming**
```python
stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Stream me a haiku about SQLite."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```
Every response carries an `X-Routed-Via: <platform>/<model>` header so you can see which provider actually served each call. If a request failed over between providers, you'll also see `X-Fallback-Attempts: N`.
## Screenshots
### Keys
Manage provider credentials and grab the unified API key your apps connect with. Each key shows a status dot and when it was last health-checked.
![Keys page](repo-assets/keys.png)
### Playground
Send a chat completion through the router and see which provider served it, with the model ID and latency printed right on the message.
![Playground page](repo-assets/playground.png)
### Analytics
Request volume, success rate, tokens in and out, average latency, and per-provider breakdowns over 24h / 7d / 30d windows.
![Analytics page](repo-assets/analytics.png)
## How it works
```
┌──────────────────┐ Bearer freellmapi-… ┌─────────────────────────┐
│ OpenAI SDK / │ ──────────────────────▶ │ Express proxy (:3001) │
│ curl / any │ ◀────────────────────── │ /v1/chat/completions │
│ OpenAI client │ streamed tokens └────────────┬────────────┘
└──────────────────┘ │
┌────────────────────────────────────────────────┐
│ Router │
│ 1. Pick highest-priority model that │
│ (a) has a healthy key and │
│ (b) is under all its rate limits. │
│ 2. Decrypt key, call provider SDK. │
│ 3. On 429/5xx → cooldown + retry next model. │
└────────────────────────────────────────────────┘
┌──────────────┬────────────┬──────────┴─────────┬─────────────┬──────────┐
▼ ▼ ▼ ▼ ▼ ▼
Google Groq Cerebras OpenRouter HF …10 more
```
- **Router** (`server/src/services/router.ts`) — picks a model per request.
- **Rate-limit ledger** (`server/src/services/ratelimit.ts`) — in-memory RPM/RPD/TPM/TPD counters backed by SQLite, with cooldowns on 429s.
- **Provider adapters** (`server/src/providers/*.ts`) — one file per provider, implementing the `Provider` base class: `chatCompletion()` and `streamChatCompletion()`.
- **Health service** (`server/src/services/health.ts`) — periodic probe keeps key status fresh.
- **Dashboard** (`client/`) — React + Vite + shadcn/ui admin surface.
- **Storage** — SQLite (`better-sqlite3`) with AES-256-GCM envelope encryption for keys.
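The failover logic in step 3 boils down to a loop like the following. This is a simplified illustration, not the actual `router.ts` — `routeChat`, `callProvider`, and the cooldown constants are hypothetical stand-ins:

```javascript
// Simplified sketch of the router's failover loop. The chain entries,
// callProvider, and cooldown policy are illustrative assumptions.
const MAX_ATTEMPTS = 20;
const COOLDOWN_MS = 60_000;
const coolingDown = new Map(); // model id -> timestamp when usable again

async function routeChat(chain, request, callProvider) {
  let attempts = 0;
  for (const model of chain) {
    if (attempts >= MAX_ATTEMPTS) break;
    if ((coolingDown.get(model.id) ?? 0) > Date.now()) continue; // still cooling
    attempts++;
    try {
      const result = await callProvider(model, request);
      return { ...result, routedVia: `${model.platform}/${model.id}`, attempts };
    } catch (err) {
      if (err.status === 429 || err.status >= 500) {
        // Transient upstream failure: cool this model down, try the next one.
        coolingDown.set(model.id, Date.now() + COOLDOWN_MS);
        continue;
      }
      throw err; // other 4xx: the request itself is bad, don't retry
    }
  }
  throw new Error('All providers exhausted');
}
```

Note the asymmetry: 429/5xx advance down the chain, while a 400-class error surfaces immediately, since retrying a malformed request on another provider would just burn quota.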
## Limitations
Stacking free tiers has real trade-offs. Be honest with yourself about them:
- **No frontier models.** The free-tier catalog tops out around Llama 3.3 70B, GLM-4.5, Qwen 3 Coder, and Gemini 2.5 Pro. You will not get GPT-5 or Claude Opus class reasoning through this. For hard problems, pay for a real API.
- **Intelligence degrades as the day progresses.** Your top-ranked models (usually Gemini 2.5 Pro, GPT-4o via GitHub Models) have the lowest daily caps. Once they hit their limits, the router falls down your priority chain to smaller/weaker models. Expect the effective intelligence of the endpoint to drop in the late hours of each day — then reset at UTC midnight.
- **Latency is highly variable.** Cerebras and Groq are extremely fast; others are not. You get whichever one is available.
- **Free tiers can change without notice.** Providers regularly tighten, loosen, or remove free tiers. When that happens you'll see 429s or auth errors until you update the catalog. Re-seed scripts live in `server/src/scripts/`.
- **No SLA, by definition.** If you need reliability, use a paid provider with a contract.
- **Local-first.** There's no multi-tenant auth. Run this for yourself; don't expose it to the internet.
## Contributing
Contributors very welcome! Good first PRs:
- **Add a provider** — copy `server/src/providers/openai-compat.ts` as a template, wire it into `server/src/providers/index.ts`, seed its models in `server/src/db/index.ts`, add a test in `server/src/__tests__/providers/`.
- **Add an endpoint** — embeddings, images, moderations. The provider base class can grow new methods; adapters declare which they support.
- **Improve the router** — cost-aware routing (cheapest-healthy-fastest tradeoffs), better latency-weighted priority, regional pinning.
- **Dashboard polish** — charts on the Analytics page, key rotation UX, batch import of keys from `.env`.
- **Docs** — more examples, client library snippets for Go/Rust/etc., a deployment recipe for Docker or Fly.
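For orientation, a new provider adapter boils down to the two methods named above. A minimal hypothetical shape — the real `Provider` base class in `server/src/providers/` will differ in its details:

```javascript
// Minimal illustrative shape of a provider adapter. The class names, fields,
// and upstream URL are assumptions, not the project's actual API.
class Provider {
  constructor(name) { this.name = name; }
  async chatCompletion(request, apiKey) { throw new Error('not implemented'); }
  async *streamChatCompletion(request, apiKey) { throw new Error('not implemented'); }
}

// Hypothetical adapter for an OpenAI-compatible upstream.
class ExampleProvider extends Provider {
  constructor() {
    super('example');
    this.baseUrl = 'https://api.example.com/v1'; // placeholder upstream
  }

  async chatCompletion(request, apiKey) {
    const res = await fetch(`${this.baseUrl}/chat/completions`, {
      method: 'POST',
      headers: { Authorization: `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({ ...request, stream: false }),
    });
    if (!res.ok) {
      const err = new Error(`upstream ${res.status}`);
      err.status = res.status; // lets the router decide whether to fail over
      throw err;
    }
    return res.json();
  }

  async *streamChatCompletion(request, apiKey) {
    // A real adapter would parse the upstream SSE stream and yield deltas;
    // elided in this sketch.
  }
}
```

Attaching the HTTP status to the thrown error is what lets the router distinguish retryable failures (429/5xx) from requests that shouldn't be replayed.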
**Development loop:**
```bash
npm install
npm run dev # server on :3001, dashboard on :5173, both with HMR
npm test # vitest — 69 tests across providers, routes, router, ratelimit
```
PRs should include a test, keep the existing test suite green, and match the `.editorconfig` / tsconfig defaults already in the repo. Issues and discussions are open.
## Terms of Service review
A self-hosted, single-user, personal-use setup was reviewed against each provider's ToS (April 2026). Summary:
| Provider | Verdict | Notes |
|---|---|---|
| Google Gemini | ✅ Likely OK | No adverse clause; proxy for personal use not prohibited. |
| Groq | ✅ Likely OK | Explicitly permits integrating into a "Customer Application." |
| Cerebras | ✅ Likely OK | Permitted; don't resell keys. |
| Mistral | ✅ Likely OK | APIs allowed for personal/internal business use. |
| OpenRouter | ✅ Likely OK | Private-use only; don't expose the proxy publicly. |
| Hugging Face | ✅ Likely OK | BYO-key proxying is the documented pattern. |
| Zhipu | ✅ Likely OK | Explicit "personal, non-commercial research" carve-out. |
| Moonshot / Kimi | ✅ Likely OK | Competitive-products clause is broad but not aimed at single-user proxies. |
| SambaNova | ⚠️ Ambiguous | Public terms are silent on APIs. |
| MiniMax | ⚠️ Ambiguous | Public terms silent. |
| Cloudflare Workers AI | ⚠️ Ambiguous | No adverse clause found. |
| NVIDIA NIM | ⚠️ Caution | Free tier is "evaluation only, not production." |
| GitHub Models | ⚠️ Caution | Free tier scoped to "experimentation." |
| Cohere | ❌ Avoid | Trial ToS §14 explicitly forbids personal/household use. |
Rules of thumb that keep most providers happy: **one account per provider**, **no reselling**, **no sharing your endpoint with other humans**, **don't hammer a free tier as a paid production backend**. This is informational, not legal advice — read each provider's ToS and make your own call.
## Disclaimer
**This project is for personal experimentation and learning, not production.** Free tiers exist so developers can prototype against them; they aren't a stable, supported inference substrate and shouldn't be treated as one. If you build something real on top of FreeLLMAPI, swap in a paid API before you ship. Your relationship with each upstream provider is governed by the terms you accepted when you created your account — those terms still apply when the traffic is proxied through this project, and you're responsible for complying with them.
## License
[MIT](./LICENSE)
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*
node_modules
dist
dist-ssr
*.local
# Editor directories and files
.vscode/*
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?
# React + TypeScript + Vite
This template provides a minimal setup to get React working in Vite with HMR and some ESLint rules.
Currently, two official plugins are available:
- [@vitejs/plugin-react](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react) uses [Oxc](https://oxc.rs)
- [@vitejs/plugin-react-swc](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react-swc) uses [SWC](https://swc.rs/)
## React Compiler
The React Compiler is not enabled on this template because of its impact on dev & build performance. To add it, see [this documentation](https://react.dev/learn/react-compiler/installation).
## Expanding the ESLint configuration
If you are developing a production application, we recommend updating the configuration to enable type-aware lint rules:
```js
import tseslint from 'typescript-eslint'
import { defineConfig, globalIgnores } from 'eslint/config'

export default defineConfig([
globalIgnores(['dist']),
{
files: ['**/*.{ts,tsx}'],
extends: [
// Other configs...
// Remove tseslint.configs.recommended and replace with this
tseslint.configs.recommendedTypeChecked,
// Alternatively, use this for stricter rules
tseslint.configs.strictTypeChecked,
// Optionally, add this for stylistic rules
tseslint.configs.stylisticTypeChecked,
// Other configs...
],
languageOptions: {
parserOptions: {
project: ['./tsconfig.node.json', './tsconfig.app.json'],
tsconfigRootDir: import.meta.dirname,
},
// other options...
},
},
])
```
You can also install [eslint-plugin-react-x](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-x) and [eslint-plugin-react-dom](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-dom) for React-specific lint rules:
```js
// eslint.config.js
import reactX from 'eslint-plugin-react-x'
import reactDom from 'eslint-plugin-react-dom'
import { defineConfig, globalIgnores } from 'eslint/config'
export default defineConfig([
globalIgnores(['dist']),
{
files: ['**/*.{ts,tsx}'],
extends: [
// Other configs...
// Enable lint rules for React
reactX.configs['recommended-typescript'],
// Enable lint rules for React DOM
reactDom.configs.recommended,
],
languageOptions: {
parserOptions: {
project: ['./tsconfig.node.json', './tsconfig.app.json'],
tsconfigRootDir: import.meta.dirname,
},
// other options...
},
},
])
```
{
"$schema": "https://ui.shadcn.com/schema.json",
"style": "base-nova",
"rsc": false,
"tsx": true,
"tailwind": {
"config": "",
"css": "src/index.css",
"baseColor": "neutral",
"cssVariables": true,
"prefix": ""
},
"iconLibrary": "lucide",
"rtl": false,
"aliases": {
"components": "@/components",
"utils": "@/lib/utils",
"ui": "@/components/ui",
"lib": "@/lib",
"hooks": "@/hooks"
},
"menuColor": "default",
"menuAccent": "subtle",
"registries": {}
}
import js from '@eslint/js'
import globals from 'globals'
import reactHooks from 'eslint-plugin-react-hooks'
import reactRefresh from 'eslint-plugin-react-refresh'
import tseslint from 'typescript-eslint'
import { defineConfig, globalIgnores } from 'eslint/config'
export default defineConfig([
globalIgnores(['dist']),
{
files: ['**/*.{ts,tsx}'],
extends: [
js.configs.recommended,
tseslint.configs.recommended,
reactHooks.configs.flat.recommended,
reactRefresh.configs.vite,
],
languageOptions: {
ecmaVersion: 2020,
globals: globals.browser,
},
},
])
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/favicon.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>FreeLLMAPI · Unified LLM Router</title>
</head>
<body>
<div id="root"></div>
<script type="module" src="/src/main.tsx"></script>
</body>
</html>
{
"name": "@freellmapi/client",
"private": true,
"version": "0.0.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "tsc -b && vite build",
"lint": "eslint .",
"preview": "vite preview"
},
"dependencies": {
"@base-ui/react": "^1.3.0",
"@dnd-kit/core": "^6.3.1",
"@dnd-kit/sortable": "^10.0.0",
"@dnd-kit/utilities": "^3.2.2",
"@fontsource-variable/geist": "^5.2.8",
"@fontsource-variable/geist-mono": "^5.2.7",
"@tailwindcss/vite": "^4.2.2",
"@tanstack/react-query": "^5.97.0",
"class-variance-authority": "^0.7.1",
"clsx": "^2.1.1",
"lucide-react": "^1.8.0",
"react": "^19.2.4",
"react-dom": "^19.2.4",
"react-router-dom": "^7.14.0",
"recharts": "^3.8.1",
"shadcn": "^4.2.0",
"tailwind-merge": "^3.5.0",
"tailwindcss": "^4.2.2",
"tw-animate-css": "^1.4.0"
},
"devDependencies": {
"@eslint/js": "^9.39.4",
"@types/node": "^24.12.2",
"@types/react": "^19.2.14",
"@types/react-dom": "^19.2.3",
"@vitejs/plugin-react": "^6.0.1",
"eslint": "^9.39.4",
"eslint-plugin-react-hooks": "^7.0.1",
"eslint-plugin-react-refresh": "^0.5.2",
"globals": "^17.4.0",
"typescript": "~6.0.2",
"typescript-eslint": "^8.58.0",
"vite": "^8.0.4"
}
}
<svg xmlns="http://www.w3.org/2000/svg" width="48" height="46" fill="none" viewBox="0 0 48 46"><path fill="#863bff" d="M25.946 44.938c-.664.845-2.021.375-2.021-.698V33.937a2.26 2.26 0 0 0-2.262-2.262H10.287c-.92 0-1.456-1.04-.92-1.788l7.48-10.471c1.07-1.497 0-3.578-1.842-3.578H1.237c-.92 0-1.456-1.04-.92-1.788L10.013.474c.214-.297.556-.474.92-.474h28.894c.92 0 1.456 1.04.92 1.788l-7.48 10.471c-1.07 1.498 0 3.579 1.842 3.579h11.377c.943 0 1.473 1.088.89 1.83L25.947 44.94z" style="fill:#863bff;fill:color(display-p3 .5252 .23 1);fill-opacity:1"/><mask id="a" width="48" height="46" x="0" y="0" maskUnits="userSpaceOnUse" style="mask-type:alpha"><path fill="#000" d="M25.842 44.938c-.664.844-2.021.375-2.021-.698V33.937a2.26 2.26 0 0 0-2.262-2.262H10.183c-.92 0-1.456-1.04-.92-1.788l7.48-10.471c1.07-1.498 0-3.579-1.842-3.579H1.133c-.92 0-1.456-1.04-.92-1.787L9.91.473c.214-.297.556-.474.92-.474h28.894c.92 0 1.456 1.04.92 1.788l-7.48 10.471c-1.07 1.498 0 3.578 1.842 3.578h11.377c.943 0 1.473 1.088.89 1.832L25.843 44.94z" style="fill:#000;fill-opacity:1"/></mask><g mask="url(#a)"><g filter="url(#b)"><ellipse cx="5.508" cy="14.704" fill="#ede6ff" rx="5.508" ry="14.704" style="fill:#ede6ff;fill:color(display-p3 .9275 .9033 1);fill-opacity:1" transform="matrix(.00324 1 1 -.00324 -4.47 31.516)"/></g><g filter="url(#c)"><ellipse cx="10.399" cy="29.851" fill="#ede6ff" rx="10.399" ry="29.851" style="fill:#ede6ff;fill:color(display-p3 .9275 .9033 1);fill-opacity:1" transform="matrix(.00324 1 1 -.00324 -39.328 7.883)"/></g><g filter="url(#d)"><ellipse cx="5.508" cy="30.487" fill="#7e14ff" rx="5.508" ry="30.487" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(89.814 -25.913 -14.639)scale(1 -1)"/></g><g filter="url(#e)"><ellipse cx="5.508" cy="30.599" fill="#7e14ff" rx="5.508" ry="30.599" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(89.814 -32.644 -3.334)scale(1 -1)"/></g><g filter="url(#f)"><ellipse 
cx="5.508" cy="30.599" fill="#7e14ff" rx="5.508" ry="30.599" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="matrix(.00324 1 1 -.00324 -34.34 30.47)"/></g><g filter="url(#g)"><ellipse cx="14.072" cy="22.078" fill="#ede6ff" rx="14.072" ry="22.078" style="fill:#ede6ff;fill:color(display-p3 .9275 .9033 1);fill-opacity:1" transform="rotate(93.35 24.506 48.493)scale(-1 1)"/></g><g filter="url(#h)"><ellipse cx="3.47" cy="21.501" fill="#7e14ff" rx="3.47" ry="21.501" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(89.009 28.708 47.59)scale(-1 1)"/></g><g filter="url(#i)"><ellipse cx="3.47" cy="21.501" fill="#7e14ff" rx="3.47" ry="21.501" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(89.009 28.708 47.59)scale(-1 1)"/></g><g filter="url(#j)"><ellipse cx=".387" cy="8.972" fill="#7e14ff" rx="4.407" ry="29.108" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(39.51 .387 8.972)"/></g><g filter="url(#k)"><ellipse cx="47.523" cy="-6.092" fill="#7e14ff" rx="4.407" ry="29.108" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(37.892 47.523 -6.092)"/></g><g filter="url(#l)"><ellipse cx="41.412" cy="6.333" fill="#47bfff" rx="5.971" ry="9.665" style="fill:#47bfff;fill:color(display-p3 .2799 .748 1);fill-opacity:1" transform="rotate(37.892 41.412 6.333)"/></g><g filter="url(#m)"><ellipse cx="-1.879" cy="38.332" fill="#7e14ff" rx="4.407" ry="29.108" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(37.892 -1.88 38.332)"/></g><g filter="url(#n)"><ellipse cx="-1.879" cy="38.332" fill="#7e14ff" rx="4.407" ry="29.108" style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(37.892 -1.88 38.332)"/></g><g filter="url(#o)"><ellipse cx="35.651" cy="29.907" fill="#7e14ff" rx="4.407" ry="29.108" 
style="fill:#7e14ff;fill:color(display-p3 .4922 .0767 1);fill-opacity:1" transform="rotate(37.892 35.651 29.907)"/></g><g filter="url(#p)"><ellipse cx="38.418" cy="32.4" fill="#47bfff" rx="5.971" ry="15.297" style="fill:#47bfff;fill:color(display-p3 .2799 .748 1);fill-opacity:1" transform="rotate(37.892 38.418 32.4)"/></g></g><defs><filter id="b" width="60.045" height="41.654" x="-19.77" y="16.149" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="7.659"/></filter><filter id="c" width="90.34" height="51.437" x="-54.613" y="-7.533" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="7.659"/></filter><filter id="d" width="79.355" height="29.4" x="-49.64" y="2.03" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="e" width="79.579" height="29.4" x="-45.045" y="20.029" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="f" width="79.579" height="29.4" x="-43.513" y="21.178" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur 
result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="g" width="74.749" height="58.852" x="15.756" y="-17.901" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="7.659"/></filter><filter id="h" width="61.377" height="25.362" x="23.548" y="2.284" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="i" width="61.377" height="25.362" x="23.548" y="2.284" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="j" width="56.045" height="63.649" x="-27.636" y="-22.853" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="k" width="54.814" height="64.646" x="20.116" y="-38.415" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="l" width="33.541" height="35.313" x="24.641" y="-11.323" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" 
result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="m" width="54.814" height="64.646" x="-29.286" y="6.009" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="n" width="54.814" height="64.646" x="-29.286" y="6.009" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="o" width="54.814" height="64.646" x="8.244" y="-2.416" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter><filter id="p" width="39.409" height="43.623" x="18.713" y="10.588" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17158" stdDeviation="4.596"/></filter></defs></svg>
<svg xmlns="http://www.w3.org/2000/svg">
<symbol id="bluesky-icon" viewBox="0 0 16 17">
<g clip-path="url(#bluesky-clip)"><path fill="#08060d" d="M7.75 7.735c-.693-1.348-2.58-3.86-4.334-5.097-1.68-1.187-2.32-.981-2.74-.79C.188 2.065.1 2.812.1 3.251s.241 3.602.398 4.13c.52 1.744 2.367 2.333 4.07 2.145-2.495.37-4.71 1.278-1.805 4.512 3.196 3.309 4.38-.71 4.987-2.746.608 2.036 1.307 5.91 4.93 2.746 2.72-2.746.747-4.143-1.747-4.512 1.702.189 3.55-.4 4.07-2.145.156-.528.397-3.691.397-4.13s-.088-1.186-.575-1.406c-.42-.19-1.06-.395-2.741.79-1.755 1.24-3.64 3.752-4.334 5.099"/></g>
<defs><clipPath id="bluesky-clip"><path fill="#fff" d="M.1.85h15.3v15.3H.1z"/></clipPath></defs>
</symbol>
<symbol id="discord-icon" viewBox="0 0 20 19">
<path fill="#08060d" d="M16.224 3.768a14.5 14.5 0 0 0-3.67-1.153c-.158.286-.343.67-.47.976a13.5 13.5 0 0 0-4.067 0c-.128-.306-.317-.69-.476-.976A14.4 14.4 0 0 0 3.868 3.77C1.546 7.28.916 10.703 1.231 14.077a14.7 14.7 0 0 0 4.5 2.306q.545-.748.965-1.587a9.5 9.5 0 0 1-1.518-.74q.191-.14.372-.293c2.927 1.369 6.107 1.369 8.999 0q.183.152.372.294-.723.437-1.52.74.418.838.963 1.588a14.6 14.6 0 0 0 4.504-2.308c.37-3.911-.63-7.302-2.644-10.309m-9.13 8.234c-.878 0-1.599-.82-1.599-1.82 0-.998.705-1.82 1.6-1.82.894 0 1.614.82 1.599 1.82.001 1-.705 1.82-1.6 1.82m5.91 0c-.878 0-1.599-.82-1.599-1.82 0-.998.705-1.82 1.6-1.82.893 0 1.614.82 1.599 1.82 0 1-.706 1.82-1.6 1.82"/>
</symbol>
<symbol id="documentation-icon" viewBox="0 0 21 20">
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="m15.5 13.333 1.533 1.322c.645.555.967.833.967 1.178s-.322.623-.967 1.179L15.5 18.333m-3.333-5-1.534 1.322c-.644.555-.966.833-.966 1.178s.322.623.966 1.179l1.534 1.321"/>
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M17.167 10.836v-4.32c0-1.41 0-2.117-.224-2.68-.359-.906-1.118-1.621-2.08-1.96-.599-.21-1.349-.21-2.848-.21-2.623 0-3.935 0-4.983.369-1.684.591-3.013 1.842-3.641 3.428C3 6.449 3 7.684 3 10.154v2.122c0 2.558 0 3.838.706 4.726q.306.383.713.671c.76.536 1.79.64 3.581.66"/>
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M3 10a2.78 2.78 0 0 1 2.778-2.778c.555 0 1.209.097 1.748-.047.48-.129.854-.503.982-.982.145-.54.048-1.194.048-1.749a2.78 2.78 0 0 1 2.777-2.777"/>
</symbol>
<symbol id="github-icon" viewBox="0 0 19 19">
<path fill="#08060d" fill-rule="evenodd" d="M9.356 1.85C5.05 1.85 1.57 5.356 1.57 9.694a7.84 7.84 0 0 0 5.324 7.44c.387.079.528-.168.528-.376 0-.182-.013-.805-.013-1.454-2.165.467-2.616-.935-2.616-.935-.349-.91-.864-1.143-.864-1.143-.71-.48.051-.48.051-.48.787.051 1.2.805 1.2.805.695 1.194 1.817.857 2.268.649.064-.507.27-.857.49-1.052-1.728-.182-3.545-.857-3.545-3.87 0-.857.31-1.558.8-2.104-.078-.195-.349-1 .077-2.078 0 0 .657-.208 2.14.805a7.5 7.5 0 0 1 1.946-.26c.657 0 1.328.092 1.946.26 1.483-1.013 2.14-.805 2.14-.805.426 1.078.155 1.883.078 2.078.502.546.799 1.247.799 2.104 0 3.013-1.818 3.675-3.558 3.87.284.247.528.714.528 1.454 0 1.052-.012 1.896-.012 2.156 0 .208.142.455.528.377a7.84 7.84 0 0 0 5.324-7.441c.013-4.338-3.48-7.844-7.773-7.844" clip-rule="evenodd"/>
</symbol>
<symbol id="social-icon" viewBox="0 0 20 20">
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M12.5 6.667a4.167 4.167 0 1 0-8.334 0 4.167 4.167 0 0 0 8.334 0"/>
<path fill="none" stroke="#aa3bff" stroke-linecap="round" stroke-linejoin="round" stroke-width="1.35" d="M2.5 16.667a5.833 5.833 0 0 1 8.75-5.053m3.837.474.513 1.035c.07.144.257.282.414.309l.93.155c.596.1.736.536.307.965l-.723.73a.64.64 0 0 0-.152.531l.207.903c.164.715-.213.991-.84.618l-.872-.52a.63.63 0 0 0-.577 0l-.872.52c-.624.373-1.003.094-.84-.618l.207-.903a.64.64 0 0 0-.152-.532l-.723-.729c-.426-.43-.289-.864.306-.964l.93-.156a.64.64 0 0 0 .412-.31l.513-1.034c.28-.562.735-.562 1.012 0"/>
</symbol>
<symbol id="x-icon" viewBox="0 0 19 19">
<path fill="#08060d" fill-rule="evenodd" d="M1.893 1.98c.052.072 1.245 1.769 2.653 3.77l2.892 4.114c.183.261.333.48.333.486s-.068.089-.152.183l-.522.593-.765.867-3.597 4.087c-.375.426-.734.834-.798.905a1 1 0 0 0-.118.148c0 .01.236.017.664.017h.663l.729-.83c.4-.457.796-.906.879-.999a692 692 0 0 0 1.794-2.038c.034-.037.301-.34.594-.675l.551-.624.345-.392a7 7 0 0 1 .34-.374c.006 0 .93 1.306 2.052 2.903l2.084 2.965.045.063h2.275c1.87 0 2.273-.003 2.266-.021-.008-.02-1.098-1.572-3.894-5.547-2.013-2.862-2.28-3.246-2.273-3.266.008-.019.282-.332 2.085-2.38l2-2.274 1.567-1.782c.022-.028-.016-.03-.65-.03h-.674l-.3.342a871 871 0 0 1-1.782 2.025c-.067.075-.405.458-.75.852a100 100 0 0 1-.803.91c-.148.172-.299.344-.99 1.127-.304.343-.32.358-.345.327-.015-.019-.904-1.282-1.976-2.808L6.365 1.85H1.8zm1.782.91 8.078 11.294c.772 1.08 1.413 1.973 1.425 1.984.016.017.241.02 1.05.017l1.03-.004-2.694-3.766L7.796 5.75 5.722 2.852l-1.039-.004-1.039-.004z" clip-rule="evenodd"/>
</symbol>
</svg>
import { useEffect, useState, type ReactNode } from 'react'
import { BrowserRouter, Routes, Route, Navigate, NavLink } from 'react-router-dom'
import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
import { Button } from '@/components/ui/button'
import KeysPage from '@/pages/KeysPage'
import PlaygroundPage from '@/pages/PlaygroundPage'
import FallbackPage from '@/pages/FallbackPage'
import AnalyticsPage from '@/pages/AnalyticsPage'
const queryClient = new QueryClient()
function NavItem({ to, children }: { to: string; children: ReactNode }) {
return (
<NavLink
to={to}
className={({ isActive }) =>
`relative text-sm px-1 py-4 transition-colors ${
isActive
? 'text-foreground after:absolute after:inset-x-0 after:-bottom-px after:h-px after:bg-foreground'
: 'text-muted-foreground hover:text-foreground'
}`
}
>
{children}
</NavLink>
)
}
// Light/dark toggle: keeps the `dark` class on <html> in sync with the stored
// preference (localStorage `theme`), falling back to the OS color scheme on
// first visit.
function DarkModeToggle() {
const [dark, setDark] = useState(() =>
typeof window !== 'undefined' && document.documentElement.classList.contains('dark')
)
useEffect(() => {
const stored = localStorage.getItem('theme')
if (stored === 'dark' || (!stored && window.matchMedia('(prefers-color-scheme: dark)').matches)) {
document.documentElement.classList.add('dark')
setDark(true)
}
}, [])
function toggle() {
const next = !dark
setDark(next)
document.documentElement.classList.toggle('dark', next)
localStorage.setItem('theme', next ? 'dark' : 'light')
}
return (
<Button variant="ghost" size="sm" onClick={toggle} aria-label="Toggle theme">
{dark ? (
<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><circle cx="12" cy="12" r="4"/><path d="M12 2v2"/><path d="M12 20v2"/><path d="m4.93 4.93 1.41 1.41"/><path d="m17.66 17.66 1.41 1.41"/><path d="M2 12h2"/><path d="M20 12h2"/><path d="m6.34 17.66-1.41 1.41"/><path d="m19.07 4.93-1.41 1.41"/></svg>
) : (
<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round"><path d="M12 3a6 6 0 0 0 9 9 9 9 0 1 1-9-9Z"/></svg>
)}
</Button>
)
}
function Brand() {
return (
<div className="flex items-center gap-2">
<span className="inline-block size-2 rounded-full bg-foreground" />
<span className="font-semibold tracking-tight text-sm">FreeLLMAPI</span>
</div>
)
}
function App() {
return (
<QueryClientProvider client={queryClient}>
<BrowserRouter basename={import.meta.env.BASE_URL}>
<div className="min-h-screen bg-background">
<header className="sticky top-0 z-40 bg-background/80 backdrop-blur border-b">
<div className="max-w-6xl mx-auto px-6 flex items-center">
<Brand />
<nav className="flex items-center gap-6 ml-10">
<NavItem to="/playground">Playground</NavItem>
<NavItem to="/keys">Keys</NavItem>
<NavItem to="/fallback">Fallback</NavItem>
<NavItem to="/analytics">Analytics</NavItem>
</nav>
<div className="ml-auto py-2">
<DarkModeToggle />
</div>
</div>
</header>
<main className="max-w-6xl mx-auto px-6 py-8">
<Routes>
<Route path="/" element={<Navigate to="/playground" replace />} />
<Route path="/playground" element={<PlaygroundPage />} />
<Route path="/keys" element={<KeysPage />} />
<Route path="/fallback" element={<FallbackPage />} />
<Route path="/analytics" element={<AnalyticsPage />} />
<Route path="/test" element={<Navigate to="/playground" replace />} />
<Route path="/health" element={<Navigate to="/keys" replace />} />
</Routes>
</main>
</div>
</BrowserRouter>
</QueryClientProvider>
)
}
export default App
<svg xmlns="http://www.w3.org/2000/svg" width="77" height="47" fill="none" aria-labelledby="vite-logo-title" viewBox="0 0 77 47"><title id="vite-logo-title">Vite</title><style>.parenthesis{fill:#000}@media (prefers-color-scheme:dark){.parenthesis{fill:#fff}}</style><path fill="#9135ff" d="M40.151 45.71c-.663.844-2.02.374-2.02-.699V34.708a2.26 2.26 0 0 0-2.262-2.262H24.493c-.92 0-1.457-1.04-.92-1.788l7.479-10.471c1.07-1.498 0-3.578-1.842-3.578H15.443c-.92 0-1.456-1.04-.92-1.788l9.696-13.576c.213-.297.556-.474.92-.474h28.894c.92 0 1.456 1.04.92 1.788l-7.48 10.472c-1.07 1.497 0 3.578 1.842 3.578h11.376c.944 0 1.474 1.087.89 1.83L40.153 45.712z"/><mask id="a" width="48" height="47" x="14" y="0" maskUnits="userSpaceOnUse" style="mask-type:alpha"><path fill="#000" d="M40.047 45.71c-.663.843-2.02.374-2.02-.699V34.708a2.26 2.26 0 0 0-2.262-2.262H24.389c-.92 0-1.457-1.04-.92-1.788l7.479-10.472c1.07-1.497 0-3.578-1.842-3.578H15.34c-.92 0-1.456-1.04-.92-1.788l9.696-13.575c.213-.297.556-.474.92-.474H53.93c.92 0 1.456 1.04.92 1.788L47.37 13.03c-1.07 1.498 0 3.578 1.842 3.578h11.376c.944 0 1.474 1.088.89 1.831L40.049 45.712z"/></mask><g mask="url(#a)"><g filter="url(#b)"><ellipse cx="5.508" cy="14.704" fill="#eee6ff" rx="5.508" ry="14.704" transform="rotate(269.814 20.96 11.29)scale(-1 1)"/></g><g filter="url(#c)"><ellipse cx="10.399" cy="29.851" fill="#eee6ff" rx="10.399" ry="29.851" transform="rotate(89.814 -16.902 -8.275)scale(1 -1)"/></g><g filter="url(#d)"><ellipse cx="5.508" cy="30.487" fill="#8900ff" rx="5.508" ry="30.487" transform="rotate(89.814 -19.197 -7.127)scale(1 -1)"/></g><g filter="url(#e)"><ellipse cx="5.508" cy="30.599" fill="#8900ff" rx="5.508" ry="30.599" transform="rotate(89.814 -25.928 4.177)scale(1 -1)"/></g><g filter="url(#f)"><ellipse cx="5.508" cy="30.599" fill="#8900ff" rx="5.508" ry="30.599" transform="rotate(89.814 -25.738 5.52)scale(1 -1)"/></g><g filter="url(#g)"><ellipse cx="14.072" cy="22.078" fill="#eee6ff" rx="14.072" ry="22.078" 
transform="rotate(93.35 31.245 55.578)scale(-1 1)"/></g><g filter="url(#h)"><ellipse cx="3.47" cy="21.501" fill="#8900ff" rx="3.47" ry="21.501" transform="rotate(89.009 35.419 55.202)scale(-1 1)"/></g><g filter="url(#i)"><ellipse cx="3.47" cy="21.501" fill="#8900ff" rx="3.47" ry="21.501" transform="rotate(89.009 35.419 55.202)scale(-1 1)"/></g><g filter="url(#j)"><ellipse cx="14.592" cy="9.743" fill="#8900ff" rx="4.407" ry="29.108" transform="rotate(39.51 14.592 9.743)"/></g><g filter="url(#k)"><ellipse cx="61.728" cy="-5.321" fill="#8900ff" rx="4.407" ry="29.108" transform="rotate(37.892 61.728 -5.32)"/></g><g filter="url(#l)"><ellipse cx="55.618" cy="7.104" fill="#00c2ff" rx="5.971" ry="9.665" transform="rotate(37.892 55.618 7.104)"/></g><g filter="url(#m)"><ellipse cx="12.326" cy="39.103" fill="#8900ff" rx="4.407" ry="29.108" transform="rotate(37.892 12.326 39.103)"/></g><g filter="url(#n)"><ellipse cx="12.326" cy="39.103" fill="#8900ff" rx="4.407" ry="29.108" transform="rotate(37.892 12.326 39.103)"/></g><g filter="url(#o)"><ellipse cx="49.857" cy="30.678" fill="#8900ff" rx="4.407" ry="29.108" transform="rotate(37.892 49.857 30.678)"/></g><g filter="url(#p)"><ellipse cx="52.623" cy="33.171" fill="#00c2ff" rx="5.971" ry="15.297" transform="rotate(37.892 52.623 33.17)"/></g></g><path d="M6.919 0c-9.198 13.166-9.252 33.575 0 46.789h6.215c-9.25-13.214-9.196-33.623 0-46.789zm62.424 0h-6.215c9.198 13.166 9.252 33.575 0 46.789h6.215c9.25-13.214 9.196-33.623 0-46.789" class="parenthesis"/><defs><filter id="b" width="60.045" height="41.654" x="-5.564" y="16.92" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="7.659"/></filter><filter id="c" width="90.34" height="51.437" x="-40.407" y="-6.762" color-interpolation-filters="sRGB" 
filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="7.659"/></filter><filter id="d" width="79.355" height="29.4" x="-35.435" y="2.801" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="e" width="79.579" height="29.4" x="-30.84" y="20.8" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="f" width="79.579" height="29.4" x="-29.307" y="21.949" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="g" width="74.749" height="58.852" x="29.961" y="-17.13" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="7.659"/></filter><filter id="h" width="61.377" height="25.362" x="37.754" y="3.055" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="i" 
width="61.377" height="25.362" x="37.754" y="3.055" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="j" width="56.045" height="63.649" x="-13.43" y="-22.082" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="k" width="54.814" height="64.646" x="34.321" y="-37.644" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="l" width="33.541" height="35.313" x="38.847" y="-10.552" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="m" width="54.814" height="64.646" x="-15.081" y="6.78" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="n" width="54.814" height="64.646" x="-15.081" y="6.78" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur 
result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="o" width="54.814" height="64.646" x="22.45" y="-1.645" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter><filter id="p" width="39.409" height="43.623" x="32.919" y="11.36" color-interpolation-filters="sRGB" filterUnits="userSpaceOnUse"><feFlood flood-opacity="0" result="BackgroundImageFix"/><feBlend in="SourceGraphic" in2="BackgroundImageFix" result="shape"/><feGaussianBlur result="effect1_foregroundBlur_2002_17286" stdDeviation="4.596"/></filter></defs></svg>
import type { ReactNode } from 'react'
export function PageHeader({
title,
description,
actions,
}: {
title: string
description?: string
actions?: ReactNode
}) {
return (
<div className="flex items-end justify-between gap-6 pb-6 mb-6 border-b">
<div className="min-w-0">
<h1 className="text-2xl font-semibold tracking-tight">{title}</h1>
{description && (
<p className="text-sm text-muted-foreground mt-1">{description}</p>
)}
</div>
{actions && <div className="flex items-center gap-2 shrink-0">{actions}</div>}
</div>
)
}
import { mergeProps } from "@base-ui/react/merge-props"
import { useRender } from "@base-ui/react/use-render"
import { cva, type VariantProps } from "class-variance-authority"
import { cn } from "@/lib/utils"
const badgeVariants = cva(
"group/badge inline-flex h-5 w-fit shrink-0 items-center justify-center gap-1 overflow-hidden rounded-4xl border border-transparent px-2 py-0.5 text-xs font-medium whitespace-nowrap transition-all focus-visible:border-ring focus-visible:ring-[3px] focus-visible:ring-ring/50 has-data-[icon=inline-end]:pr-1.5 has-data-[icon=inline-start]:pl-1.5 aria-invalid:border-destructive aria-invalid:ring-destructive/20 dark:aria-invalid:ring-destructive/40 [&>svg]:pointer-events-none [&>svg]:size-3!",
{
variants: {
variant: {
default: "bg-primary text-primary-foreground [a]:hover:bg-primary/80",
secondary:
"bg-secondary text-secondary-foreground [a]:hover:bg-secondary/80",
destructive:
"bg-destructive/10 text-destructive focus-visible:ring-destructive/20 dark:bg-destructive/20 dark:focus-visible:ring-destructive/40 [a]:hover:bg-destructive/20",
outline:
"border-border text-foreground [a]:hover:bg-muted [a]:hover:text-muted-foreground",
ghost:
"hover:bg-muted hover:text-muted-foreground dark:hover:bg-muted/50",
link: "text-primary underline-offset-4 hover:underline",
},
},
defaultVariants: {
variant: "default",
},
}
)
function Badge({
className,
variant = "default",
render,
...props
}: useRender.ComponentProps<"span"> & VariantProps<typeof badgeVariants>) {
return useRender({
defaultTagName: "span",
props: mergeProps<"span">(
{
className: cn(badgeVariants({ variant }), className),
},
props
),
render,
state: {
slot: "badge",
variant,
},
})
}
export { Badge, badgeVariants }
import { Button as ButtonPrimitive } from "@base-ui/react/button"
import { cva, type VariantProps } from "class-variance-authority"
import { cn } from "@/lib/utils"
const buttonVariants = cva(
"group/button inline-flex shrink-0 items-center justify-center rounded-lg border border-transparent bg-clip-padding text-sm font-medium whitespace-nowrap transition-all outline-none select-none focus-visible:border-ring focus-visible:ring-3 focus-visible:ring-ring/50 active:not-aria-[haspopup]:translate-y-px disabled:pointer-events-none disabled:opacity-50 aria-invalid:border-destructive aria-invalid:ring-3 aria-invalid:ring-destructive/20 dark:aria-invalid:border-destructive/50 dark:aria-invalid:ring-destructive/40 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4",
{
variants: {
variant: {
default: "bg-primary text-primary-foreground [a]:hover:bg-primary/80",
outline:
"border-border bg-background hover:bg-muted hover:text-foreground aria-expanded:bg-muted aria-expanded:text-foreground dark:border-input dark:bg-input/30 dark:hover:bg-input/50",
secondary:
"bg-secondary text-secondary-foreground hover:bg-secondary/80 aria-expanded:bg-secondary aria-expanded:text-secondary-foreground",
ghost:
"hover:bg-muted hover:text-foreground aria-expanded:bg-muted aria-expanded:text-foreground dark:hover:bg-muted/50",
destructive:
"bg-destructive/10 text-destructive hover:bg-destructive/20 focus-visible:border-destructive/40 focus-visible:ring-destructive/20 dark:bg-destructive/20 dark:hover:bg-destructive/30 dark:focus-visible:ring-destructive/40",
link: "text-primary underline-offset-4 hover:underline",
},
size: {
default:
"h-8 gap-1.5 px-2.5 has-data-[icon=inline-end]:pr-2 has-data-[icon=inline-start]:pl-2",
xs: "h-6 gap-1 rounded-[min(var(--radius-md),10px)] px-2 text-xs in-data-[slot=button-group]:rounded-lg has-data-[icon=inline-end]:pr-1.5 has-data-[icon=inline-start]:pl-1.5 [&_svg:not([class*='size-'])]:size-3",
sm: "h-7 gap-1 rounded-[min(var(--radius-md),12px)] px-2.5 text-[0.8rem] in-data-[slot=button-group]:rounded-lg has-data-[icon=inline-end]:pr-1.5 has-data-[icon=inline-start]:pl-1.5 [&_svg:not([class*='size-'])]:size-3.5",
lg: "h-9 gap-1.5 px-2.5 has-data-[icon=inline-end]:pr-2 has-data-[icon=inline-start]:pl-2",
icon: "size-8",
"icon-xs":
"size-6 rounded-[min(var(--radius-md),10px)] in-data-[slot=button-group]:rounded-lg [&_svg:not([class*='size-'])]:size-3",
"icon-sm":
"size-7 rounded-[min(var(--radius-md),12px)] in-data-[slot=button-group]:rounded-lg",
"icon-lg": "size-9",
},
},
defaultVariants: {
variant: "default",
size: "default",
},
}
)
function Button({
className,
variant = "default",
size = "default",
...props
}: ButtonPrimitive.Props & VariantProps<typeof buttonVariants>) {
return (
<ButtonPrimitive
data-slot="button"
className={cn(buttonVariants({ variant, size, className }))}
{...props}
/>
)
}
export { Button, buttonVariants }
import * as React from "react"
import { cn } from "@/lib/utils"
function Card({
className,
size = "default",
...props
}: React.ComponentProps<"div"> & { size?: "default" | "sm" }) {
return (
<div
data-slot="card"
data-size={size}
className={cn(
"group/card flex flex-col gap-4 overflow-hidden rounded-xl bg-card py-4 text-sm text-card-foreground ring-1 ring-foreground/10 has-data-[slot=card-footer]:pb-0 has-[>img:first-child]:pt-0 data-[size=sm]:gap-3 data-[size=sm]:py-3 data-[size=sm]:has-data-[slot=card-footer]:pb-0 *:[img:first-child]:rounded-t-xl *:[img:last-child]:rounded-b-xl",
className
)}
{...props}
/>
)
}
function CardHeader({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-header"
className={cn(
"group/card-header @container/card-header grid auto-rows-min items-start gap-1 rounded-t-xl px-4 group-data-[size=sm]/card:px-3 has-data-[slot=card-action]:grid-cols-[1fr_auto] has-data-[slot=card-description]:grid-rows-[auto_auto] [.border-b]:pb-4 group-data-[size=sm]/card:[.border-b]:pb-3",
className
)}
{...props}
/>
)
}
function CardTitle({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-title"
className={cn(
"font-heading text-base leading-snug font-medium group-data-[size=sm]/card:text-sm",
className
)}
{...props}
/>
)
}
function CardDescription({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-description"
className={cn("text-sm text-muted-foreground", className)}
{...props}
/>
)
}
function CardAction({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-action"
className={cn(
"col-start-2 row-span-2 row-start-1 self-start justify-self-end",
className
)}
{...props}
/>
)
}
function CardContent({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-content"
className={cn("px-4 group-data-[size=sm]/card:px-3", className)}
{...props}
/>
)
}
function CardFooter({ className, ...props }: React.ComponentProps<"div">) {
return (
<div
data-slot="card-footer"
className={cn(
"flex items-center rounded-b-xl border-t bg-muted/50 p-4 group-data-[size=sm]/card:p-3",
className
)}
{...props}
/>
)
}
export {
Card,
CardHeader,
CardFooter,
CardTitle,
CardAction,
CardDescription,
CardContent,
}
import * as React from "react"
import { Input as InputPrimitive } from "@base-ui/react/input"
import { cn } from "@/lib/utils"
function Input({ className, type, ...props }: React.ComponentProps<"input">) {
return (
<InputPrimitive
type={type}
data-slot="input"
className={cn(
"h-8 w-full min-w-0 rounded-lg border border-input bg-transparent px-2.5 py-1 text-base transition-colors outline-none file:inline-flex file:h-6 file:border-0 file:bg-transparent file:text-sm file:font-medium file:text-foreground placeholder:text-muted-foreground focus-visible:border-ring focus-visible:ring-3 focus-visible:ring-ring/50 disabled:pointer-events-none disabled:cursor-not-allowed disabled:bg-input/50 disabled:opacity-50 aria-invalid:border-destructive aria-invalid:ring-3 aria-invalid:ring-destructive/20 md:text-sm dark:bg-input/30 dark:disabled:bg-input/80 dark:aria-invalid:border-destructive/50 dark:aria-invalid:ring-destructive/40",
className
)}
{...props}
/>
)
}
export { Input }
import * as React from "react"
import { cn } from "@/lib/utils"
function Label({ className, ...props }: React.ComponentProps<"label">) {
return (
<label
data-slot="label"
className={cn(
"flex items-center gap-2 text-sm leading-none font-medium select-none group-data-[disabled=true]:pointer-events-none group-data-[disabled=true]:opacity-50 peer-disabled:cursor-not-allowed peer-disabled:opacity-50",
className
)}
{...props}
/>
)
}
export { Label }
"use client"
import * as React from "react"
import { Select as SelectPrimitive } from "@base-ui/react/select"
import { cn } from "@/lib/utils"
import { ChevronDownIcon, CheckIcon, ChevronUpIcon } from "lucide-react"
const Select = SelectPrimitive.Root
function SelectGroup({ className, ...props }: SelectPrimitive.Group.Props) {
return (
<SelectPrimitive.Group
data-slot="select-group"
className={cn("scroll-my-1 p-1", className)}
{...props}
/>
)
}
function SelectValue({ className, ...props }: SelectPrimitive.Value.Props) {
return (
<SelectPrimitive.Value
data-slot="select-value"
className={cn("flex flex-1 text-left", className)}
{...props}
/>
)
}
function SelectTrigger({
className,
size = "default",
children,
...props
}: SelectPrimitive.Trigger.Props & {
size?: "sm" | "default"
}) {
return (
<SelectPrimitive.Trigger
data-slot="select-trigger"
data-size={size}
className={cn(
"flex w-fit items-center justify-between gap-1.5 rounded-lg border border-input bg-transparent py-2 pr-2 pl-2.5 text-sm whitespace-nowrap transition-colors outline-none select-none focus-visible:border-ring focus-visible:ring-3 focus-visible:ring-ring/50 disabled:cursor-not-allowed disabled:opacity-50 aria-invalid:border-destructive aria-invalid:ring-3 aria-invalid:ring-destructive/20 data-placeholder:text-muted-foreground data-[size=default]:h-8 data-[size=sm]:h-7 data-[size=sm]:rounded-[min(var(--radius-md),10px)] *:data-[slot=select-value]:line-clamp-1 *:data-[slot=select-value]:flex *:data-[slot=select-value]:items-center *:data-[slot=select-value]:gap-1.5 dark:bg-input/30 dark:hover:bg-input/50 dark:aria-invalid:border-destructive/50 dark:aria-invalid:ring-destructive/40 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4",
className
)}
{...props}
>
{children}
<SelectPrimitive.Icon
render={
<ChevronDownIcon className="pointer-events-none size-4 text-muted-foreground" />
}
/>
</SelectPrimitive.Trigger>
)
}
function SelectContent({
className,
children,
side = "bottom",
sideOffset = 4,
align = "center",
alignOffset = 0,
alignItemWithTrigger = true,
...props
}: SelectPrimitive.Popup.Props &
Pick<
SelectPrimitive.Positioner.Props,
"align" | "alignOffset" | "side" | "sideOffset" | "alignItemWithTrigger"
>) {
return (
<SelectPrimitive.Portal>
<SelectPrimitive.Positioner
side={side}
sideOffset={sideOffset}
align={align}
alignOffset={alignOffset}
alignItemWithTrigger={alignItemWithTrigger}
className="isolate z-50"
>
<SelectPrimitive.Popup
data-slot="select-content"
data-align-trigger={alignItemWithTrigger}
className={cn(
"relative isolate z-50 max-h-(--available-height) w-(--anchor-width) min-w-36 origin-(--transform-origin) overflow-x-hidden overflow-y-auto rounded-lg bg-popover text-popover-foreground shadow-md ring-1 ring-foreground/10 duration-100 data-[align-trigger=true]:animate-none data-[side=bottom]:slide-in-from-top-2 data-[side=inline-end]:slide-in-from-left-2 data-[side=inline-start]:slide-in-from-right-2 data-[side=left]:slide-in-from-right-2 data-[side=right]:slide-in-from-left-2 data-[side=top]:slide-in-from-bottom-2 data-open:animate-in data-open:fade-in-0 data-open:zoom-in-95 data-closed:animate-out data-closed:fade-out-0 data-closed:zoom-out-95",
className
)}
{...props}
>
<SelectScrollUpButton />
<SelectPrimitive.List>{children}</SelectPrimitive.List>
<SelectScrollDownButton />
</SelectPrimitive.Popup>
</SelectPrimitive.Positioner>
</SelectPrimitive.Portal>
)
}
function SelectLabel({
className,
...props
}: SelectPrimitive.GroupLabel.Props) {
return (
<SelectPrimitive.GroupLabel
data-slot="select-label"
className={cn("px-1.5 py-1 text-xs text-muted-foreground", className)}
{...props}
/>
)
}
function SelectItem({
className,
children,
...props
}: SelectPrimitive.Item.Props) {
return (
<SelectPrimitive.Item
data-slot="select-item"
className={cn(
"relative flex w-full cursor-default items-center gap-1.5 rounded-md py-1 pr-8 pl-1.5 text-sm outline-hidden select-none focus:bg-accent focus:text-accent-foreground not-data-[variant=destructive]:focus:**:text-accent-foreground data-disabled:pointer-events-none data-disabled:opacity-50 [&_svg]:pointer-events-none [&_svg]:shrink-0 [&_svg:not([class*='size-'])]:size-4 *:[span]:last:flex *:[span]:last:items-center *:[span]:last:gap-2",
className
)}
{...props}
>
<SelectPrimitive.ItemText className="flex flex-1 shrink-0 gap-2 whitespace-nowrap">
{children}
</SelectPrimitive.ItemText>
<SelectPrimitive.ItemIndicator
render={
<span className="pointer-events-none absolute right-2 flex size-4 items-center justify-center" />
}
>
<CheckIcon className="pointer-events-none" />
</SelectPrimitive.ItemIndicator>
</SelectPrimitive.Item>
)
}
function SelectSeparator({
className,
...props
}: SelectPrimitive.Separator.Props) {
return (
<SelectPrimitive.Separator
data-slot="select-separator"
className={cn("pointer-events-none -mx-1 my-1 h-px bg-border", className)}
{...props}
/>
)
}
function SelectScrollUpButton({
className,
...props
}: React.ComponentProps<typeof SelectPrimitive.ScrollUpArrow>) {
return (
<SelectPrimitive.ScrollUpArrow
data-slot="select-scroll-up-button"
className={cn(
"top-0 z-10 flex w-full cursor-default items-center justify-center bg-popover py-1 [&_svg:not([class*='size-'])]:size-4",
className
)}
{...props}
>
<ChevronUpIcon />
</SelectPrimitive.ScrollUpArrow>
)
}
function SelectScrollDownButton({
className,
...props
}: React.ComponentProps<typeof SelectPrimitive.ScrollDownArrow>) {
return (
<SelectPrimitive.ScrollDownArrow
data-slot="select-scroll-down-button"
className={cn(
"bottom-0 z-10 flex w-full cursor-default items-center justify-center bg-popover py-1 [&_svg:not([class*='size-'])]:size-4",
className
)}
{...props}
>
<ChevronDownIcon />
</SelectPrimitive.ScrollDownArrow>
)
}
export {
Select,
SelectContent,
SelectGroup,
SelectItem,
SelectLabel,
SelectScrollDownButton,
SelectScrollUpButton,
SelectSeparator,
SelectTrigger,
SelectValue,
}
import { Separator as SeparatorPrimitive } from "@base-ui/react/separator"
import { cn } from "@/lib/utils"
function Separator({
className,
orientation = "horizontal",
...props
}: SeparatorPrimitive.Props) {
return (
<SeparatorPrimitive
data-slot="separator"
orientation={orientation}
className={cn(
"shrink-0 bg-border data-horizontal:h-px data-horizontal:w-full data-vertical:w-px data-vertical:self-stretch",
className
)}
{...props}
/>
)
}
export { Separator }
import { Switch as SwitchPrimitive } from "@base-ui/react/switch"
import { cn } from "@/lib/utils"
function Switch({
className,
size = "default",
...props
}: SwitchPrimitive.Root.Props & {
size?: "sm" | "default"
}) {
return (
<SwitchPrimitive.Root
data-slot="switch"
data-size={size}
className={cn(
"peer group/switch relative inline-flex shrink-0 items-center rounded-full border border-transparent transition-all outline-none after:absolute after:-inset-x-3 after:-inset-y-2 focus-visible:border-ring focus-visible:ring-3 focus-visible:ring-ring/50 aria-invalid:border-destructive aria-invalid:ring-3 aria-invalid:ring-destructive/20 data-[size=default]:h-[18.4px] data-[size=default]:w-[32px] data-[size=sm]:h-[14px] data-[size=sm]:w-[24px] dark:aria-invalid:border-destructive/50 dark:aria-invalid:ring-destructive/40 data-checked:bg-primary data-unchecked:bg-input dark:data-unchecked:bg-input/80 data-disabled:cursor-not-allowed data-disabled:opacity-50",
className
)}
{...props}
>
<SwitchPrimitive.Thumb
data-slot="switch-thumb"
className="pointer-events-none block rounded-full bg-background ring-0 transition-transform group-data-[size=default]/switch:size-4 group-data-[size=sm]/switch:size-3 group-data-[size=default]/switch:data-checked:translate-x-[calc(100%-2px)] group-data-[size=sm]/switch:data-checked:translate-x-[calc(100%-2px)] dark:data-checked:bg-primary-foreground group-data-[size=default]/switch:data-unchecked:translate-x-0 group-data-[size=sm]/switch:data-unchecked:translate-x-0 dark:data-unchecked:bg-foreground"
/>
</SwitchPrimitive.Root>
)
}
export { Switch }
"use client"
import * as React from "react"
import { cn } from "@/lib/utils"
function Table({ className, ...props }: React.ComponentProps<"table">) {
return (
<div
data-slot="table-container"
className="relative w-full overflow-x-auto"
>
<table
data-slot="table"
className={cn("w-full caption-bottom text-sm", className)}
{...props}
/>
</div>
)
}
function TableHeader({ className, ...props }: React.ComponentProps<"thead">) {
return (
<thead
data-slot="table-header"
className={cn("[&_tr]:border-b", className)}
{...props}
/>
)
}
function TableBody({ className, ...props }: React.ComponentProps<"tbody">) {
return (
<tbody
data-slot="table-body"
className={cn("[&_tr:last-child]:border-0", className)}
{...props}
/>
)
}
function TableFooter({ className, ...props }: React.ComponentProps<"tfoot">) {
return (
<tfoot
data-slot="table-footer"
className={cn(
"border-t bg-muted/50 font-medium [&>tr]:last:border-b-0",
className
)}
{...props}
/>
)
}
function TableRow({ className, ...props }: React.ComponentProps<"tr">) {
return (
<tr
data-slot="table-row"
className={cn(
"border-b transition-colors hover:bg-muted/50 has-aria-expanded:bg-muted/50 data-[state=selected]:bg-muted",
className
)}
{...props}
/>
)
}
function TableHead({ className, ...props }: React.ComponentProps<"th">) {
return (
<th
data-slot="table-head"
className={cn(
"h-10 px-2 text-left align-middle font-medium whitespace-nowrap text-foreground [&:has([role=checkbox])]:pr-0",
className
)}
{...props}
/>
)
}
function TableCell({ className, ...props }: React.ComponentProps<"td">) {
return (
<td
data-slot="table-cell"
className={cn(
"p-2 align-middle whitespace-nowrap [&:has([role=checkbox])]:pr-0",
className
)}
{...props}
/>
)
}
function TableCaption({
className,
...props
}: React.ComponentProps<"caption">) {
return (
<caption
data-slot="table-caption"
className={cn("mt-4 text-sm text-muted-foreground", className)}
{...props}
/>
)
}
export {
Table,
TableHeader,
TableBody,
TableFooter,
TableHead,
TableRow,
TableCell,
TableCaption,
}
import * as React from "react"
import { cn } from "@/lib/utils"
function Textarea({ className, ...props }: React.ComponentProps<"textarea">) {
return (
<textarea
data-slot="textarea"
className={cn(
"flex field-sizing-content min-h-16 w-full rounded-lg border border-input bg-transparent px-2.5 py-2 text-base transition-colors outline-none placeholder:text-muted-foreground focus-visible:border-ring focus-visible:ring-3 focus-visible:ring-ring/50 disabled:cursor-not-allowed disabled:bg-input/50 disabled:opacity-50 aria-invalid:border-destructive aria-invalid:ring-3 aria-invalid:ring-destructive/20 md:text-sm dark:bg-input/30 dark:disabled:bg-input/80 dark:aria-invalid:border-destructive/50 dark:aria-invalid:ring-destructive/40",
className
)}
{...props}
/>
)
}
export { Textarea }
@import "tailwindcss";
@import "tw-animate-css";
@import "shadcn/tailwind.css";
@import "@fontsource-variable/geist";
@import "@fontsource-variable/geist-mono";
@custom-variant dark (&:is(.dark *));
@theme inline {
--font-heading: var(--font-sans);
--font-sans: 'Geist Variable', ui-sans-serif, system-ui, -apple-system, sans-serif;
--font-mono: 'Geist Mono Variable', ui-monospace, SFMono-Regular, Menlo, monospace;
--color-sidebar-ring: var(--sidebar-ring);
--color-sidebar-border: var(--sidebar-border);
--color-sidebar-accent-foreground: var(--sidebar-accent-foreground);
--color-sidebar-accent: var(--sidebar-accent);
--color-sidebar-primary-foreground: var(--sidebar-primary-foreground);
--color-sidebar-primary: var(--sidebar-primary);
--color-sidebar-foreground: var(--sidebar-foreground);
--color-sidebar: var(--sidebar);
--color-chart-5: var(--chart-5);
--color-chart-4: var(--chart-4);
--color-chart-3: var(--chart-3);
--color-chart-2: var(--chart-2);
--color-chart-1: var(--chart-1);
--color-ring: var(--ring);
--color-input: var(--input);
--color-border: var(--border);
--color-destructive: var(--destructive);
--color-accent-foreground: var(--accent-foreground);
--color-accent: var(--accent);
--color-muted-foreground: var(--muted-foreground);
--color-muted: var(--muted);
--color-secondary-foreground: var(--secondary-foreground);
--color-secondary: var(--secondary);
--color-primary-foreground: var(--primary-foreground);
--color-primary: var(--primary);
--color-popover-foreground: var(--popover-foreground);
--color-popover: var(--popover);
--color-card-foreground: var(--card-foreground);
--color-card: var(--card);
--color-foreground: var(--foreground);
--color-background: var(--background);
--radius-sm: calc(var(--radius) * 0.6);
--radius-md: calc(var(--radius) * 0.8);
--radius-lg: var(--radius);
--radius-xl: calc(var(--radius) * 1.4);
--radius-2xl: calc(var(--radius) * 1.8);
--radius-3xl: calc(var(--radius) * 2.2);
--radius-4xl: calc(var(--radius) * 2.6);
}
:root {
--background: oklch(0.995 0 0);
--foreground: oklch(0.18 0 0);
--card: oklch(1 0 0);
--card-foreground: oklch(0.18 0 0);
--popover: oklch(1 0 0);
--popover-foreground: oklch(0.18 0 0);
--primary: oklch(0.22 0 0);
--primary-foreground: oklch(0.98 0 0);
--secondary: oklch(0.97 0 0);
--secondary-foreground: oklch(0.22 0 0);
--muted: oklch(0.97 0 0);
--muted-foreground: oklch(0.52 0 0);
--accent: oklch(0.96 0 0);
--accent-foreground: oklch(0.22 0 0);
--destructive: oklch(0.577 0.245 27.325);
--border: oklch(0.92 0 0);
--input: oklch(0.92 0 0);
--ring: oklch(0.55 0 0);
--chart-1: oklch(0.28 0 0);
--chart-2: oklch(0.45 0 0);
--chart-3: oklch(0.6 0 0);
--chart-4: oklch(0.75 0 0);
--chart-5: oklch(0.87 0 0);
--radius: 0.5rem;
--sidebar: oklch(0.98 0 0);
--sidebar-foreground: oklch(0.18 0 0);
--sidebar-primary: oklch(0.22 0 0);
--sidebar-primary-foreground: oklch(0.98 0 0);
--sidebar-accent: oklch(0.96 0 0);
--sidebar-accent-foreground: oklch(0.22 0 0);
--sidebar-border: oklch(0.92 0 0);
--sidebar-ring: oklch(0.55 0 0);
}
.dark {
--background: oklch(0.135 0 0);
--foreground: oklch(0.98 0 0);
--card: oklch(0.175 0 0);
--card-foreground: oklch(0.98 0 0);
--popover: oklch(0.175 0 0);
--popover-foreground: oklch(0.98 0 0);
--primary: oklch(0.93 0 0);
--primary-foreground: oklch(0.18 0 0);
--secondary: oklch(0.24 0 0);
--secondary-foreground: oklch(0.98 0 0);
--muted: oklch(0.22 0 0);
--muted-foreground: oklch(0.68 0 0);
--accent: oklch(0.24 0 0);
--accent-foreground: oklch(0.98 0 0);
--destructive: oklch(0.704 0.191 22.216);
--border: oklch(1 0 0 / 10%);
--input: oklch(1 0 0 / 14%);
--ring: oklch(0.6 0 0);
--chart-1: oklch(0.92 0 0);
--chart-2: oklch(0.78 0 0);
--chart-3: oklch(0.6 0 0);
--chart-4: oklch(0.45 0 0);
--chart-5: oklch(0.32 0 0);
--sidebar: oklch(0.175 0 0);
--sidebar-foreground: oklch(0.98 0 0);
--sidebar-primary: oklch(0.93 0 0);
--sidebar-primary-foreground: oklch(0.18 0 0);
--sidebar-accent: oklch(0.24 0 0);
--sidebar-accent-foreground: oklch(0.98 0 0);
--sidebar-border: oklch(1 0 0 / 10%);
--sidebar-ring: oklch(0.6 0 0);
}
@layer base {
* {
@apply border-border outline-ring/50;
}
html {
@apply font-sans antialiased;
font-feature-settings: 'ss01', 'cv11';
text-rendering: optimizeLegibility;
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
body {
@apply bg-background text-foreground;
font-feature-settings: 'ss01', 'cv11', 'tnum';
}
code, pre, kbd, samp {
font-family: var(--font-mono);
font-feature-settings: 'ss02', 'ss03';
}
/* Tabular numerals on any element that opts in */
.tabular-nums {
font-variant-numeric: tabular-nums;
}
/* Tighter focus ring across inputs */
input, textarea, select {
font-feature-settings: 'ss01';
}
}
const BASE = import.meta.env.BASE_URL.replace(/\/$/, '');
export async function apiFetch<T>(path: string, options?: RequestInit): Promise<T> {
  const res = await fetch(`${BASE}${path}`, {
    ...options,
    // Spread options first so the merged headers below always win; spreading
    // options after would let options.headers replace the merge wholesale and
    // silently drop the Content-Type default.
    headers: { 'Content-Type': 'application/json', ...options?.headers },
  });
if (!res.ok) {
const body = await res.json().catch(() => ({ error: { message: res.statusText } }));
throw new Error(body.error?.message ?? `HTTP ${res.status}`);
}
return res.json();
}
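A minimal sketch of why spread order matters when merging fetch options (object and key names here are illustrative, not part of the proxy's API):

```typescript
// Object merge is last-wins: whichever `headers` lands later replaces the
// earlier one wholesale. Object.assign stands in for the literal-then-spread
// order, which TypeScript itself flags as always-overwritten (ts2783).
const defaults = { 'Content-Type': 'application/json' };
const callerOptions = { method: 'POST', headers: { Authorization: 'Bearer abc' } };

// headers written first, caller merged after: caller's headers win wholesale.
const clobbered = Object.assign(
  { headers: { ...defaults, ...callerOptions.headers } },
  callerOptions
);

// Caller spread first, merged headers written last: both keys survive.
const merged = { ...callerOptions, headers: { ...defaults, ...callerOptions.headers } };

console.log('Content-Type' in clobbered.headers, 'Content-Type' in merged.headers);
// → false true
```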
import { clsx, type ClassValue } from "clsx"
import { twMerge } from "tailwind-merge"
export function cn(...inputs: ClassValue[]) {
return twMerge(clsx(inputs))
}
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import './index.css'
import App from './App'
createRoot(document.getElementById('root')!).render(
<StrictMode>
<App />
</StrictMode>,
)
import { useState } from 'react'
import { useQuery } from '@tanstack/react-query'
import {
BarChart, Bar, XAxis, YAxis, CartesianGrid, Tooltip, ResponsiveContainer,
LineChart, Line, Legend,
} from 'recharts'
import { apiFetch } from '@/lib/api'
import { Button } from '@/components/ui/button'
import { Table, TableBody, TableCell, TableHead, TableHeader, TableRow } from '@/components/ui/table'
import { PageHeader } from '@/components/page-header'
type TimeRange = '24h' | '7d' | '30d'
function formatTokens(n?: number): string {
if (!n) return '0'
if (n >= 1_000_000) return `${(n / 1_000_000).toFixed(1)}M`
if (n >= 1_000) return `${(n / 1_000).toFixed(1)}K`
return String(n)
}
function Stat({ label, value, className }: { label: string; value: string | number; className?: string }) {
return (
<div className="rounded-lg border bg-card px-4 py-3">
<p className="text-[11px] text-muted-foreground uppercase tracking-wider">{label}</p>
<p className={`text-xl font-semibold tabular-nums mt-1 ${className ?? ''}`}>{value}</p>
</div>
)
}
function Panel({ title, children }: { title: string; children: React.ReactNode }) {
return (
<div className="rounded-lg border bg-card">
<div className="px-4 py-3 border-b">
<h3 className="text-sm font-medium">{title}</h3>
</div>
<div className="p-4">{children}</div>
</div>
)
}
const axisStyle = { fontSize: 11, fill: 'var(--muted-foreground)' } as const
const gridStyle = 'var(--border)'
const primaryFill = 'var(--foreground)'
export default function AnalyticsPage() {
const [range, setRange] = useState<TimeRange>('7d')
const { data: summary } = useQuery({
queryKey: ['analytics', 'summary', range],
queryFn: () => apiFetch<any>(`/api/analytics/summary?range=${range}`),
})
const { data: byPlatform = [] } = useQuery({
queryKey: ['analytics', 'by-platform', range],
queryFn: () => apiFetch<any[]>(`/api/analytics/by-platform?range=${range}`),
})
const { data: timeline = [] } = useQuery({
queryKey: ['analytics', 'timeline', range],
queryFn: () => apiFetch<any[]>(`/api/analytics/timeline?range=${range}`),
})
const { data: byModel = [] } = useQuery({
queryKey: ['analytics', 'by-model', range],
queryFn: () => apiFetch<any[]>(`/api/analytics/by-model?range=${range}`),
})
const { data: errors = [] } = useQuery({
queryKey: ['analytics', 'errors', range],
queryFn: () => apiFetch<any[]>(`/api/analytics/errors?range=${range}`),
})
const { data: errorDist } = useQuery({
queryKey: ['analytics', 'error-distribution', range],
queryFn: () => apiFetch<{ byCategory: any[]; byPlatform: any[]; detailed: any[] }>(`/api/analytics/error-distribution?range=${range}`),
})
return (
<div>
<PageHeader
title="Analytics"
description="Request volume, latency, token usage, and failures."
actions={
<div className="flex gap-1 rounded-md border p-0.5">
{(['24h', '7d', '30d'] as TimeRange[]).map(r => (
<Button
key={r}
variant={range === r ? 'secondary' : 'ghost'}
size="xs"
onClick={() => setRange(r)}
>
{r}
</Button>
))}
</div>
}
/>
<div className="space-y-6">
{/* Summary stats */}
<div className="grid grid-cols-2 sm:grid-cols-3 lg:grid-cols-6 gap-3">
<Stat label="Requests" value={summary?.totalRequests ?? 0} />
<Stat label="Success rate" value={`${summary?.successRate ?? 0}%`} />
<Stat label="Input tokens" value={formatTokens(summary?.totalInputTokens)} />
<Stat label="Output tokens" value={formatTokens(summary?.totalOutputTokens)} />
<Stat label="Avg latency" value={`${summary?.avgLatencyMs ?? 0} ms`} />
<Stat label="Est. savings" value={`$${summary?.estimatedCostSavings ?? '0.00'}`} />
</div>
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
<Panel title="Requests by provider">
{byPlatform.length === 0 ? (
<p className="text-sm text-muted-foreground text-center py-8">No data yet</p>
) : (
<ResponsiveContainer width="100%" height={240}>
<BarChart data={byPlatform} margin={{ top: 6, right: 6, left: -12, bottom: 0 }}>
<CartesianGrid strokeDasharray="2 4" stroke={gridStyle} />
<XAxis dataKey="platform" tick={axisStyle} tickLine={false} axisLine={{ stroke: gridStyle }} />
<YAxis tick={axisStyle} tickLine={false} axisLine={false} />
<Tooltip contentStyle={{ backgroundColor: 'var(--popover)', border: '1px solid var(--border)', borderRadius: 8, fontSize: 12 }} />
<Bar dataKey="requests" fill={primaryFill} radius={[3, 3, 0, 0]} />
</BarChart>
</ResponsiveContainer>
)}
</Panel>
<Panel title="Avg latency by provider">
{byPlatform.length === 0 ? (
<p className="text-sm text-muted-foreground text-center py-8">No data yet</p>
) : (
<ResponsiveContainer width="100%" height={240}>
<BarChart data={byPlatform} margin={{ top: 6, right: 6, left: -12, bottom: 0 }}>
<CartesianGrid strokeDasharray="2 4" stroke={gridStyle} />
<XAxis dataKey="platform" tick={axisStyle} tickLine={false} axisLine={{ stroke: gridStyle }} />
<YAxis unit="ms" tick={axisStyle} tickLine={false} axisLine={false} />
<Tooltip contentStyle={{ backgroundColor: 'var(--popover)', border: '1px solid var(--border)', borderRadius: 8, fontSize: 12 }} />
<Bar dataKey="avgLatencyMs" name="Latency (ms)" fill="var(--muted-foreground)" radius={[3, 3, 0, 0]} />
</BarChart>
</ResponsiveContainer>
)}
</Panel>
<div className="lg:col-span-2">
<Panel title="Requests over time">
{timeline.length === 0 ? (
<p className="text-sm text-muted-foreground text-center py-8">No data yet</p>
) : (
<ResponsiveContainer width="100%" height={240}>
<LineChart data={timeline} margin={{ top: 6, right: 6, left: -12, bottom: 0 }}>
<CartesianGrid strokeDasharray="2 4" stroke={gridStyle} />
<XAxis dataKey="timestamp" tick={axisStyle} tickLine={false} axisLine={{ stroke: gridStyle }} />
<YAxis tick={axisStyle} tickLine={false} axisLine={false} />
<Tooltip contentStyle={{ backgroundColor: 'var(--popover)', border: '1px solid var(--border)', borderRadius: 8, fontSize: 12 }} />
<Legend wrapperStyle={{ fontSize: 12 }} iconType="line" />
<Line type="monotone" dataKey="successCount" name="Success" stroke={primaryFill} strokeWidth={1.5} dot={false} />
<Line type="monotone" dataKey="failureCount" name="Failures" stroke="var(--destructive)" strokeWidth={1.5} dot={false} />
</LineChart>
</ResponsiveContainer>
)}
</Panel>
</div>
<div className="lg:col-span-2">
<Panel title="Per-model breakdown">
{byModel.length === 0 ? (
<p className="text-sm text-muted-foreground text-center py-8">No data yet</p>
) : (
<div className="max-h-[360px] overflow-y-auto -mx-4">
<Table>
<TableHeader>
<TableRow>
<TableHead className="pl-4">Model</TableHead>
<TableHead>Provider</TableHead>
<TableHead className="text-right">Requests</TableHead>
<TableHead className="text-right">Success</TableHead>
<TableHead className="text-right">Latency</TableHead>
<TableHead className="text-right">In tokens</TableHead>
<TableHead className="text-right pr-4">Out tokens</TableHead>
</TableRow>
</TableHeader>
<TableBody>
{byModel.map((m: any, i: number) => (
<TableRow key={i}>
<TableCell className="pl-4 text-sm font-medium">{m.displayName}</TableCell>
<TableCell className="text-xs text-muted-foreground">{m.platform}</TableCell>
<TableCell className="text-right tabular-nums">{m.requests}</TableCell>
<TableCell className="text-right tabular-nums">{m.successRate}%</TableCell>
<TableCell className="text-right tabular-nums">{m.avgLatencyMs} ms</TableCell>
<TableCell className="text-right tabular-nums">{formatTokens(m.totalInputTokens)}</TableCell>
<TableCell className="text-right tabular-nums pr-4">{formatTokens(m.totalOutputTokens)}</TableCell>
</TableRow>
))}
</TableBody>
</Table>
</div>
)}
</Panel>
</div>
<Panel title="Errors by provider">
{!errorDist?.byPlatform?.length ? (
<p className="text-sm text-muted-foreground text-center py-8">No errors</p>
) : (
<ResponsiveContainer width="100%" height={240}>
<BarChart data={errorDist.byPlatform} margin={{ top: 6, right: 6, left: -12, bottom: 0 }}>
<CartesianGrid strokeDasharray="2 4" stroke={gridStyle} />
<XAxis dataKey="platform" tick={axisStyle} tickLine={false} axisLine={{ stroke: gridStyle }} />
<YAxis tick={axisStyle} tickLine={false} axisLine={false} />
<Tooltip contentStyle={{ backgroundColor: 'var(--popover)', border: '1px solid var(--border)', borderRadius: 8, fontSize: 12 }} />
<Bar dataKey="count" fill="var(--destructive)" radius={[3, 3, 0, 0]} />
</BarChart>
</ResponsiveContainer>
)}
</Panel>
<Panel title="Recent errors">
{errors.length === 0 ? (
<p className="text-sm text-muted-foreground text-center py-8">No errors</p>
) : (
<div className="max-h-[240px] overflow-y-auto -mx-4">
<Table>
<TableHeader>
<TableRow>
<TableHead className="pl-4">Provider</TableHead>
<TableHead>Message</TableHead>
<TableHead className="text-right pr-4">Time</TableHead>
</TableRow>
</TableHeader>
<TableBody>
{errors.slice(0, 20).map((e: any) => (
<TableRow key={e.id}>
<TableCell className="pl-4 text-xs">{e.platform}</TableCell>
<TableCell className="text-xs max-w-[200px] truncate">{e.error}</TableCell>
<TableCell className="text-right text-xs text-muted-foreground tabular-nums pr-4">
{new Date(e.createdAt).toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' })}
</TableCell>
</TableRow>
))}
</TableBody>
</Table>
</div>
)}
</Panel>
</div>
</div>
</div>
)
}
import { useState } from 'react'
import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query'
import {
DndContext,
closestCenter,
KeyboardSensor,
PointerSensor,
useSensor,
useSensors,
type DragEndEvent,
} from '@dnd-kit/core'
import {
arrayMove,
SortableContext,
sortableKeyboardCoordinates,
useSortable,
verticalListSortingStrategy,
} from '@dnd-kit/sortable'
import { CSS } from '@dnd-kit/utilities'
import { apiFetch } from '@/lib/api'
import { Button } from '@/components/ui/button'
import { Switch } from '@/components/ui/switch'
import { PageHeader } from '@/components/page-header'
interface FallbackEntry {
modelDbId: number
priority: number
effectivePriority: number
penalty: number
rateLimitHits: number
enabled: boolean
platform: string
modelId: string
displayName: string
intelligenceRank: number
speedRank: number
sizeLabel: string
rpmLimit: number | null
rpdLimit: number | null
monthlyTokenBudget: string
keyCount: number
}
function formatTokens(n: number): string {
if (n >= 1_000_000_000) return `${(n / 1_000_000_000).toFixed(1)}B`
if (n >= 1_000_000) return `${(n / 1_000_000).toFixed(1)}M`
if (n >= 1_000) return `${(n / 1_000).toFixed(1)}K`
return String(n)
}
interface TokenUsageData {
totalBudget: number
totalUsed: number
models: { displayName: string; platform: string; budget: number }[]
}
const platformColors: Record<string, string> = {
google: '#4285f4',
groq: '#f55036',
cerebras: '#8b5cf6',
sambanova: '#14b8a6',
nvidia: '#76b900',
mistral: '#f59e0b',
openrouter: '#ec4899',
github: '#6e7b8b',
huggingface: '#ffd21e',
cohere: '#d946ef',
cloudflare: '#f38020',
zhipu: '#06b6d4',
moonshot: '#4f46e5',
minimax: '#a855f7',
}
function TokenUsageBar({ data }: { data: TokenUsageData }) {
const { totalBudget, totalUsed, models } = data
const remaining = Math.max(0, totalBudget - totalUsed)
const remainingPct = totalBudget > 0 ? Math.round((remaining / totalBudget) * 100) : 0
// Scale each model's segment proportionally so the colored portion of the
// bar sums to `remaining`; the grey tail represents what's been used.
const modelsWithWidth = models.map(m => ({
...m,
remainingTokens: totalBudget > 0 ? (m.budget / totalBudget) * remaining : 0,
widthPct: totalBudget > 0 ? (m.budget / totalBudget) * (remaining / totalBudget) * 100 : 0,
}))
const usedPct = totalBudget > 0 ? (totalUsed / totalBudget) * 100 : 0
return (
<section className="rounded-lg border bg-card p-5">
<div className="flex items-baseline justify-between mb-3">
<h2 className="text-sm font-medium">Monthly token budget</h2>
<span className="text-xs text-muted-foreground tabular-nums">
<span className="text-foreground font-medium">{formatTokens(remaining)}</span> remaining
<span className="mx-1.5">·</span>
{remainingPct}% of {formatTokens(totalBudget)}
</span>
</div>
<div className="flex h-2.5 rounded-full overflow-hidden bg-muted">
{modelsWithWidth.map((m, i) => (
<div
key={i}
title={`${m.displayName} (${m.platform}) — ${formatTokens(m.remainingTokens)} remaining`}
style={{
width: `${m.widthPct}%`,
backgroundColor: platformColors[m.platform] ?? '#94a3b8',
}}
/>
))}
{totalUsed > 0 && (
<div
title={`Used — ${formatTokens(totalUsed)}`}
className="bg-muted-foreground/30"
style={{ width: `${usedPct}%` }}
/>
)}
</div>
<div className="mt-4 grid grid-cols-1 sm:grid-cols-2 lg:grid-cols-3 gap-x-5 gap-y-1.5 text-xs tabular-nums">
{modelsWithWidth.map((m, i) => (
<div key={i} className="flex items-center gap-2 min-w-0">
<span
className="size-2 rounded-sm flex-shrink-0"
style={{ backgroundColor: platformColors[m.platform] ?? '#94a3b8' }}
/>
<span className="truncate">{m.displayName}</span>
<span className="flex-1" />
<span className="font-mono text-muted-foreground">{formatTokens(m.remainingTokens)}</span>
</div>
))}
</div>
</section>
)
}
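The segment math above can be checked with hypothetical numbers: a 1,000-token budget with 250 used leaves colored segments summing to 75% of the bar, with the grey tail filling the rest.

```typescript
// Hypothetical budgets; mirrors the widthPct formula in TokenUsageBar.
const totalBudget = 1_000;
const totalUsed = 250;
const remaining = totalBudget - totalUsed; // 750 tokens left this month
const budgets = [400, 600]; // two models sharing the budget

// Each colored segment: the model's share of the budget, scaled by the
// fraction of the budget still remaining.
const widths = budgets.map(b => (b / totalBudget) * (remaining / totalBudget) * 100);
const usedPct = (totalUsed / totalBudget) * 100; // grey "used" tail

// widths ≈ [30, 45]; 30 + 45 + 25 fills the whole bar.
console.log(widths.map(w => Math.round(w)), Math.round(usedPct));
```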
function SortableModelRow({
entry,
index,
onToggle,
}: {
entry: FallbackEntry
index: number
onToggle: (modelDbId: number, enabled: boolean) => void
}) {
const { attributes, listeners, setNodeRef, transform, transition, isDragging } = useSortable({
id: entry.modelDbId,
})
const style = {
transform: CSS.Transform.toString(transform),
transition,
}
return (
<div
ref={setNodeRef}
style={style}
className={`group flex items-center gap-3 px-4 py-3 bg-card ${isDragging ? 'opacity-50' : ''} ${entry.enabled ? '' : 'opacity-50'}`}
>
<button
{...attributes}
{...listeners}
className="cursor-grab active:cursor-grabbing text-muted-foreground/50 hover:text-foreground transition-colors"
aria-label="Drag to reorder"
>
<svg width="14" height="14" viewBox="0 0 24 24" fill="currentColor">
<circle cx="9" cy="6" r="1.5" /><circle cx="15" cy="6" r="1.5" />
<circle cx="9" cy="12" r="1.5" /><circle cx="15" cy="12" r="1.5" />
<circle cx="9" cy="18" r="1.5" /><circle cx="15" cy="18" r="1.5" />
</svg>
</button>
<span className="text-xs font-mono text-muted-foreground w-5 tabular-nums">{index + 1}</span>
<div className="flex-1 min-w-0">
<div className="flex items-center gap-2 flex-wrap">
<span className="font-medium text-sm">{entry.displayName}</span>
<span className="text-xs text-muted-foreground">{entry.platform}</span>
{entry.penalty > 0 && (
<span className="text-xs text-amber-600 dark:text-amber-400">
{entry.penalty} penalty
</span>
)}
</div>
<div className="flex gap-3 mt-0.5 text-xs text-muted-foreground tabular-nums">
<span>Intel #{entry.intelligenceRank}</span>
<span>Speed #{entry.speedRank}</span>
{entry.rpmLimit && <span>{entry.rpmLimit} rpm</span>}
{entry.rpdLimit && <span>{entry.rpdLimit} rpd</span>}
<span>{entry.monthlyTokenBudget} tok/mo</span>
</div>
</div>
<Switch
checked={entry.enabled}
onCheckedChange={(checked) => onToggle(entry.modelDbId, checked)}
/>
</div>
)
}
export default function FallbackPage() {
const queryClient = useQueryClient()
const [localEntries, setLocalEntries] = useState<FallbackEntry[] | null>(null)
const { data: entries = [], isLoading } = useQuery<FallbackEntry[]>({
queryKey: ['fallback'],
queryFn: () => apiFetch('/api/fallback'),
})
const { data: tokenUsage } = useQuery<TokenUsageData>({
queryKey: ['fallback', 'token-usage'],
queryFn: () => apiFetch('/api/fallback/token-usage'),
})
const saveMutation = useMutation({
mutationFn: (data: { modelDbId: number; priority: number; enabled: boolean }[]) =>
apiFetch('/api/fallback', { method: 'PUT', body: JSON.stringify(data) }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['fallback'] })
setLocalEntries(null)
},
})
const sortMutation = useMutation({
mutationFn: (preset: string) =>
apiFetch(`/api/fallback/sort/${preset}`, { method: 'POST' }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['fallback'] })
setLocalEntries(null)
},
})
const allEntries = localEntries ?? entries
const displayEntries = allEntries.filter(e => e.keyCount > 0)
const unconfiguredPlatforms = [...new Set(allEntries.filter(e => e.keyCount === 0).map(e => e.platform))]
const sensors = useSensors(
useSensor(PointerSensor),
useSensor(KeyboardSensor, { coordinateGetter: sortableKeyboardCoordinates }),
)
function handleDragEnd(event: DragEndEvent) {
const { active, over } = event
if (!over || active.id === over.id) return
const oldIndex = displayEntries.findIndex(e => e.modelDbId === active.id)
const newIndex = displayEntries.findIndex(e => e.modelDbId === over.id)
const reorderedVisible = arrayMove(displayEntries, oldIndex, newIndex)
const unconfigured = allEntries.filter(e => e.keyCount === 0)
const merged = [
...reorderedVisible.map((e, i) => ({ ...e, priority: i + 1 })),
...unconfigured.map((e, i) => ({ ...e, priority: reorderedVisible.length + i + 1 })),
]
setLocalEntries(merged)
}
function handleToggle(modelDbId: number, enabled: boolean) {
const updated = allEntries.map(e =>
e.modelDbId === modelDbId ? { ...e, enabled } : e
)
setLocalEntries(updated)
}
function handleSave() {
if (!localEntries) return
saveMutation.mutate(
localEntries.map(e => ({
modelDbId: e.modelDbId,
priority: e.priority,
enabled: e.enabled,
}))
)
}
const hasChanges = localEntries !== null
return (
<div>
<PageHeader
title="Fallback chain"
description="Drag to reorder. Requests try models top-to-bottom until one succeeds."
actions={
<>
<Button variant="outline" size="sm" onClick={() => sortMutation.mutate('intelligence')} disabled={sortMutation.isPending}>
Sort by intelligence
</Button>
<Button variant="outline" size="sm" onClick={() => sortMutation.mutate('speed')} disabled={sortMutation.isPending}>
Sort by speed
</Button>
<Button variant="outline" size="sm" onClick={() => sortMutation.mutate('budget')} disabled={sortMutation.isPending}>
Sort by budget
</Button>
</>
}
/>
<div className="space-y-6">
{tokenUsage && tokenUsage.totalBudget > 0 && (
<TokenUsageBar data={tokenUsage} />
)}
{isLoading ? (
<p className="text-sm text-muted-foreground">Loading…</p>
) : displayEntries.length === 0 ? (
<div className="rounded-lg border border-dashed p-8 text-center">
<p className="text-sm text-muted-foreground">
No models available. Add API keys on the <a href="/keys" className="underline text-foreground">Keys page</a> first.
</p>
</div>
) : (
<>
<div className="rounded-lg border divide-y overflow-hidden">
<DndContext
sensors={sensors}
collisionDetection={closestCenter}
onDragEnd={handleDragEnd}
>
<SortableContext
items={displayEntries.map(e => e.modelDbId)}
strategy={verticalListSortingStrategy}
>
{displayEntries.map((entry, index) => (
<SortableModelRow
key={entry.modelDbId}
entry={entry}
index={index}
onToggle={handleToggle}
/>
))}
</SortableContext>
</DndContext>
</div>
{hasChanges && (
<div className="flex justify-end gap-2">
<Button variant="outline" size="sm" onClick={() => setLocalEntries(null)}>
Discard
</Button>
<Button size="sm" onClick={handleSave} disabled={saveMutation.isPending}>
{saveMutation.isPending ? 'Saving…' : 'Save order'}
</Button>
</div>
)}
{unconfiguredPlatforms.length > 0 && (
<p className="text-xs text-muted-foreground">
Hidden (no keys): {unconfiguredPlatforms.join(', ')}
</p>
)}
</>
)}
</div>
</div>
)
}
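The reorder-then-renumber step in `handleDragEnd` can be sketched in isolation (a local stand-in for @dnd-kit's `arrayMove`; the model names are hypothetical):

```typescript
// Minimal re-implementation of arrayMove, for illustration only.
function moveItem<T>(arr: T[], from: number, to: number): T[] {
  const copy = arr.slice();
  const [item] = copy.splice(from, 1);
  copy.splice(to, 0, item);
  return copy;
}

// Dragging the first visible model to the end, then renumbering 1..n,
// exactly as the merged fallback chain does.
const visible = ['gemini', 'llama', 'mistral'];
const reordered = moveItem(visible, 0, 2); // ['llama', 'mistral', 'gemini']
const withPriority = reordered.map((id, i) => ({ id, priority: i + 1 }));

console.log(withPriority.map(e => `${e.priority}:${e.id}`).join(' '));
// → "1:llama 2:mistral 3:gemini"
```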
import { useState } from 'react'
import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query'
import { apiFetch } from '@/lib/api'
import { Button } from '@/components/ui/button'
import { Input } from '@/components/ui/input'
import { Label } from '@/components/ui/label'
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from '@/components/ui/select'
import { PageHeader } from '@/components/page-header'
import type { ApiKey, Platform } from '../../../shared/types'
const PLATFORMS: { value: Platform; label: string }[] = [
{ value: 'google', label: 'Google AI Studio' },
{ value: 'groq', label: 'Groq' },
{ value: 'cerebras', label: 'Cerebras' },
{ value: 'sambanova', label: 'SambaNova' },
{ value: 'nvidia', label: 'NVIDIA NIM' },
{ value: 'mistral', label: 'Mistral' },
{ value: 'openrouter', label: 'OpenRouter' },
{ value: 'github', label: 'GitHub Models' },
{ value: 'huggingface', label: 'Hugging Face' },
{ value: 'cohere', label: 'Cohere' },
{ value: 'cloudflare', label: 'Cloudflare Workers AI' },
{ value: 'zhipu', label: 'Zhipu AI (Z.ai)' },
{ value: 'moonshot', label: 'Moonshot (Kimi)' },
{ value: 'minimax', label: 'MiniMax' },
]
const statusDot: Record<string, string> = {
healthy: 'bg-emerald-500',
rate_limited: 'bg-amber-500',
invalid: 'bg-rose-500',
error: 'bg-rose-500',
unknown: 'bg-muted-foreground/40',
}
const statusLabel: Record<string, string> = {
healthy: 'healthy',
rate_limited: 'rate-limited',
invalid: 'invalid',
error: 'error',
unknown: 'unchecked',
}
interface HealthPlatform {
platform: string
totalKeys: number
healthyKeys: number
rateLimitedKeys: number
invalidKeys: number
errorKeys: number
unknownKeys: number
}
interface HealthData {
platforms: HealthPlatform[]
keys: { id: number; platform: string; status: string; lastCheckedAt: string | null }[]
}
function UnifiedKeySection() {
const queryClient = useQueryClient()
const [showKey, setShowKey] = useState(false)
const [copied, setCopied] = useState(false)
const { data } = useQuery<{ apiKey: string }>({
queryKey: ['unified-key'],
queryFn: () => apiFetch('/api/settings/api-key'),
})
const regenerate = useMutation({
mutationFn: () => apiFetch('/api/settings/api-key/regenerate', { method: 'POST' }),
onSuccess: () => queryClient.invalidateQueries({ queryKey: ['unified-key'] }),
})
const apiKey = data?.apiKey ?? ''
const masked = apiKey ? apiKey.slice(0, 13) + '•'.repeat(32) : '…'
function copy() {
  if (!apiKey) return
  navigator.clipboard.writeText(apiKey).catch(() => {})
  setCopied(true)
  setTimeout(() => setCopied(false), 1500)
}
return (
<section className="rounded-lg border bg-card p-5">
<div className="flex items-start justify-between gap-4 mb-3">
<div>
<h2 className="text-sm font-medium">Your unified API key</h2>
<p className="text-xs text-muted-foreground mt-0.5">
Use this as your OpenAI <code className="font-mono">api_key</code>; it authenticates requests to this proxy.
</p>
</div>
<Button
variant="ghost"
size="sm"
onClick={() => regenerate.mutate()}
disabled={regenerate.isPending}
>
Regenerate
</Button>
</div>
<div className="flex items-center gap-2">
<code className="flex-1 font-mono text-xs bg-muted px-3 py-2 rounded-md select-all truncate tabular-nums">
{showKey ? apiKey : masked}
</code>
<Button variant="outline" size="sm" onClick={() => setShowKey(!showKey)}>
{showKey ? 'Hide' : 'Show'}
</Button>
<Button variant="outline" size="sm" onClick={copy}>
{copied ? 'Copied' : 'Copy'}
</Button>
</div>
<div className="mt-4 grid grid-cols-[auto_1fr] gap-x-4 gap-y-1.5 text-xs">
<span className="text-muted-foreground">Base URL</span>
<code className="font-mono">http://localhost:3001/v1</code>
<span className="text-muted-foreground">Endpoint</span>
<code className="font-mono">/v1/chat/completions</code>
</div>
</section>
)
}
export default function KeysPage() {
const queryClient = useQueryClient()
const [platform, setPlatform] = useState<Platform | ''>('')
const [apiKey, setApiKey] = useState('')
const [accountId, setAccountId] = useState('')
const [label, setLabel] = useState('')
const { data: keys = [], isLoading } = useQuery<ApiKey[]>({
queryKey: ['keys'],
queryFn: () => apiFetch('/api/keys'),
})
const { data: healthData } = useQuery<HealthData>({
queryKey: ['health'],
queryFn: () => apiFetch('/api/health'),
refetchInterval: 30000,
})
const addKey = useMutation({
mutationFn: (body: { platform: string; key: string; label?: string }) =>
apiFetch('/api/keys', { method: 'POST', body: JSON.stringify(body) }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['keys'] })
queryClient.invalidateQueries({ queryKey: ['health'] })
queryClient.invalidateQueries({ queryKey: ['fallback'] })
setPlatform('')
setApiKey('')
setAccountId('')
setLabel('')
},
})
const deleteKey = useMutation({
mutationFn: (id: number) => apiFetch(`/api/keys/${id}`, { method: 'DELETE' }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['keys'] })
queryClient.invalidateQueries({ queryKey: ['health'] })
},
})
const checkAll = useMutation({
mutationFn: () => apiFetch('/api/health/check-all', { method: 'POST' }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['health'] })
queryClient.invalidateQueries({ queryKey: ['keys'] })
},
})
const checkKey = useMutation({
mutationFn: (keyId: number) => apiFetch(`/api/health/check/${keyId}`, { method: 'POST' }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['health'] })
queryClient.invalidateQueries({ queryKey: ['keys'] })
},
})
const needsAccountId = platform === 'cloudflare'
const handleSubmit = (e: React.FormEvent) => {
e.preventDefault()
if (!platform || !apiKey) return
if (needsAccountId && !accountId) return
const key = needsAccountId ? `${accountId}:${apiKey}` : apiKey
addKey.mutate({ platform, key, label: label || undefined })
}
const healthKeyMap = new Map<number, { status: string; lastCheckedAt: string | null }>()
for (const k of healthData?.keys ?? []) healthKeyMap.set(k.id, k)
const grouped = PLATFORMS.map(p => ({
...p,
keys: keys.filter(k => k.platform === p.value),
})).filter(p => p.keys.length > 0)
return (
<div>
<PageHeader
title="Keys"
description="Provider credentials and the unified API key your apps connect with."
actions={
keys.length > 0 && (
<Button variant="outline" size="sm" onClick={() => checkAll.mutate()} disabled={checkAll.isPending}>
{checkAll.isPending ? 'Checking…' : 'Check all'}
</Button>
)
}
/>
<div className="space-y-8">
<UnifiedKeySection />
<section>
<h2 className="text-sm font-medium mb-3">Add a provider key</h2>
<form onSubmit={handleSubmit} className="flex flex-wrap items-end gap-3 rounded-lg border p-4 bg-card">
<div className="space-y-1.5">
<Label className="text-xs">Platform</Label>
<Select value={platform} onValueChange={(v) => setPlatform(v as Platform)}>
<SelectTrigger className="w-[220px]">
<SelectValue placeholder="Select provider" />
</SelectTrigger>
<SelectContent>
{PLATFORMS.map(p => (
<SelectItem key={p.value} value={p.value}>{p.label}</SelectItem>
))}
</SelectContent>
</Select>
</div>
{needsAccountId && (
<div className="space-y-1.5">
<Label className="text-xs">Account ID</Label>
<Input
value={accountId}
onChange={e => setAccountId(e.target.value)}
placeholder="a1b2c3d4…"
className="w-[200px] font-mono text-xs"
/>
</div>
)}
<div className="space-y-1.5 flex-1 min-w-[240px]">
<Label className="text-xs">{needsAccountId ? 'API token' : 'API key'}</Label>
<Input
type="password"
value={apiKey}
onChange={e => setApiKey(e.target.value)}
placeholder={needsAccountId ? 'Bearer token' : 'Paste key here'}
className="font-mono text-xs"
/>
</div>
<div className="space-y-1.5">
<Label className="text-xs">Label</Label>
<Input
value={label}
onChange={e => setLabel(e.target.value)}
placeholder="optional"
className="w-[160px]"
/>
</div>
<Button type="submit" size="sm" disabled={!platform || !apiKey || (needsAccountId && !accountId) || addKey.isPending}>
{addKey.isPending ? 'Adding…' : 'Add key'}
</Button>
</form>
{addKey.isError && (
<p className="text-destructive text-xs mt-2">{(addKey.error as Error).message}</p>
)}
</section>
<section>
<h2 className="text-sm font-medium mb-3">Configured providers</h2>
{isLoading ? (
<p className="text-sm text-muted-foreground">Loading…</p>
) : keys.length === 0 ? (
<div className="rounded-lg border border-dashed p-8 text-center">
<p className="text-sm text-muted-foreground">
No provider keys yet. Add one above to start routing.
</p>
</div>
) : (
<div className="space-y-6">
{grouped.map(group => (
<div key={group.value}>
<div className="flex items-baseline justify-between mb-2">
<h3 className="text-sm font-medium">{group.label}</h3>
<span className="text-xs text-muted-foreground tabular-nums">
{group.keys.length} key{group.keys.length === 1 ? '' : 's'}
</span>
</div>
<div className="rounded-lg border divide-y bg-card overflow-hidden">
{group.keys.map(k => {
const h = healthKeyMap.get(k.id)
const status = h?.status ?? k.status
const lastChecked = h?.lastCheckedAt
return (
<div key={k.id} className="flex items-center gap-3 px-4 py-3 hover:bg-muted/40 transition-colors">
<span className={`size-1.5 rounded-full flex-shrink-0 ${statusDot[status] ?? statusDot.unknown}`} />
<code className="text-xs font-mono flex-shrink-0">{k.maskedKey}</code>
{k.label && <span className="text-xs text-muted-foreground">{k.label}</span>}
<span className="text-xs text-muted-foreground">{statusLabel[status] ?? status}</span>
<div className="flex-1" />
{lastChecked && (
<span className="text-[11px] text-muted-foreground tabular-nums">
{new Date(lastChecked).toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' })}
</span>
)}
<Button variant="ghost" size="xs" onClick={() => checkKey.mutate(k.id)} disabled={checkKey.isPending}>
Check
</Button>
<Button variant="ghost" size="xs" className="text-muted-foreground hover:text-destructive" onClick={() => deleteKey.mutate(k.id)} disabled={deleteKey.isPending}>
Remove
</Button>
</div>
)
})}
</div>
</div>
))}
</div>
)}
</section>
</div>
</div>
)
}
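The Keys page above surfaces a base URL and a unified bearer key for OpenAI-compatible clients. A minimal request-builder sketch of what a consumer would send (the helper name and key value are hypothetical; the URL and header shape mirror what the playground uses):

```typescript
// Hypothetical helper mirroring what the dashboard's playground sends.
// Base URL matches the default shown on the Keys page; the key is a placeholder.
const BASE_URL = 'http://localhost:3001/v1'

function buildChatRequest(apiKey: string, prompt: string, model?: string) {
  const body: Record<string, unknown> = {
    messages: [{ role: 'user', content: prompt }],
  }
  // Omitting "model" lets the proxy walk its fallback chain ("auto" mode).
  if (model) body.model = model
  return {
    url: `${BASE_URL}/chat/completions`,
    method: 'POST' as const,
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  }
}
```

Any OpenAI-compatible SDK can be pointed at the same base URL instead of hand-building requests.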
import { useState, useRef, useEffect } from 'react'
import { useQuery } from '@tanstack/react-query'
import { apiFetch } from '@/lib/api'
import { Button } from '@/components/ui/button'
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from '@/components/ui/select'
import { PageHeader } from '@/components/page-header'
interface FallbackEntry {
modelDbId: number
priority: number
enabled: boolean
platform: string
modelId: string
displayName: string
sizeLabel: string
keyCount: number
}
interface ChatMessage {
role: 'user' | 'assistant'
content: string
meta?: {
platform?: string
model?: string
latency?: number
fallbackAttempts?: number
}
}
export default function PlaygroundPage() {
const [messages, setMessages] = useState<ChatMessage[]>([])
const [input, setInput] = useState('')
const [loading, setLoading] = useState(false)
const [selectedModel, setSelectedModel] = useState<string>('auto')
const messagesEndRef = useRef<HTMLDivElement>(null)
const inputRef = useRef<HTMLTextAreaElement>(null)
const { data: keyData } = useQuery<{ apiKey: string }>({
queryKey: ['unified-key'],
queryFn: () => apiFetch('/api/settings/api-key'),
})
const { data: fallbackEntries = [] } = useQuery<FallbackEntry[]>({
queryKey: ['fallback'],
queryFn: () => apiFetch('/api/fallback'),
})
const availableModels = fallbackEntries.filter(e => e.keyCount > 0 && e.enabled)
useEffect(() => {
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' })
}, [messages])
const handleSend = async () => {
const text = input.trim()
if (!text || loading) return
const userMsg: ChatMessage = { role: 'user', content: text }
const newMessages = [...messages, userMsg]
setMessages(newMessages)
setInput('')
setLoading(true)
inputRef.current?.focus()
try {
const headers: Record<string, string> = { 'Content-Type': 'application/json' }
if (keyData?.apiKey) headers['Authorization'] = `Bearer ${keyData.apiKey}`
const body: any = {
messages: newMessages.map(m => ({ role: m.role, content: m.content })),
}
if (selectedModel !== 'auto') body.model = selectedModel
const base = import.meta.env.BASE_URL.replace(/\/$/, '')
const start = Date.now()
const res = await fetch(`${base}/v1/chat/completions`, {
method: 'POST',
headers,
body: JSON.stringify(body),
})
const latency = Date.now() - start
const routedVia = res.headers.get('X-Routed-Via')
const fallbackAttempts = res.headers.get('X-Fallback-Attempts')
if (!res.ok) {
const err = await res.json().catch(() => ({ error: { message: `HTTP ${res.status}` } }))
setMessages([...newMessages, {
role: 'assistant',
content: `Error: ${err.error?.message ?? 'Unknown error'}`,
}])
return
}
const data = await res.json()
const content = data.choices?.[0]?.message?.content ?? JSON.stringify(data, null, 2)
const via = data._routed_via ?? (routedVia ? {
platform: routedVia.split('/')[0],
model: routedVia.split('/').slice(1).join('/'),
} : undefined)
setMessages([...newMessages, {
role: 'assistant',
content,
meta: {
platform: via?.platform,
model: via?.model,
latency,
fallbackAttempts: fallbackAttempts ? parseInt(fallbackAttempts, 10) : undefined,
},
}])
} catch (err: any) {
setMessages([...newMessages, {
role: 'assistant',
content: `Error: ${err.message}`,
}])
} finally {
setLoading(false)
setTimeout(() => inputRef.current?.focus(), 0)
}
}
const handleKeyDown = (e: React.KeyboardEvent) => {
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault()
handleSend()
}
}
const handleClear = () => {
setMessages([])
inputRef.current?.focus()
}
const activeModelLabel = selectedModel === 'auto'
? 'Auto (fallback chain)'
: availableModels.find(m => m.modelId === selectedModel)?.displayName ?? selectedModel
return (
<div className="flex flex-col h-[calc(100vh-8rem)]">
<PageHeader
title="Playground"
description="Send a chat completion through the router and see which provider serves it."
actions={
<>
<Select value={selectedModel} onValueChange={(v) => setSelectedModel(v ?? 'auto')}>
<SelectTrigger className="w-[260px]">
<SelectValue />
</SelectTrigger>
<SelectContent>
<SelectItem value="auto">Auto (fallback chain)</SelectItem>
{availableModels.map(m => (
<SelectItem key={m.modelDbId} value={m.modelId}>
<span className="flex items-center gap-2">
<span>{m.displayName}</span>
<span className="text-xs text-muted-foreground">{m.platform}</span>
</span>
</SelectItem>
))}
</SelectContent>
</Select>
{messages.length > 0 && (
<Button variant="outline" size="sm" onClick={handleClear}>
Clear
</Button>
)}
</>
}
/>
<div className="flex-1 flex flex-col rounded-lg border bg-card overflow-hidden min-h-0">
<div className="flex-1 overflow-y-auto p-6 space-y-4">
{messages.length === 0 ? (
<div className="flex items-center justify-center h-full text-center">
<div className="space-y-2 max-w-sm">
<p className="text-base font-medium">Send a message to get started.</p>
<p className="text-sm text-muted-foreground">
Using <span className="text-foreground">{activeModelLabel}</span>. Switch models in the selector above.
</p>
</div>
</div>
) : (
<>
{messages.map((msg, i) => (
<div key={i} className={`flex ${msg.role === 'user' ? 'justify-end' : 'justify-start'}`}>
<div
className={`max-w-[78%] rounded-2xl px-4 py-2.5 text-sm leading-relaxed ${
msg.role === 'user'
? 'bg-primary text-primary-foreground'
: 'bg-muted'
}`}
>
<div className="whitespace-pre-wrap">{msg.content}</div>
{msg.meta && (
<div className="flex items-center gap-2 mt-2 flex-wrap text-[11px] opacity-70 tabular-nums">
{msg.meta.platform && <span>{msg.meta.platform}</span>}
{msg.meta.model && <span className="font-mono">· {msg.meta.model}</span>}
{msg.meta.latency != null && <span>· {msg.meta.latency} ms</span>}
{msg.meta.fallbackAttempts != null && msg.meta.fallbackAttempts > 0 && (
<span>· {msg.meta.fallbackAttempts} fallback{msg.meta.fallbackAttempts > 1 ? 's' : ''}</span>
)}
</div>
)}
</div>
</div>
))}
{loading && (
<div className="flex justify-start">
<div className="bg-muted rounded-2xl px-4 py-3">
<div className="flex gap-1">
<span className="size-1.5 rounded-full bg-muted-foreground/50 animate-bounce" style={{ animationDelay: '0ms' }} />
<span className="size-1.5 rounded-full bg-muted-foreground/50 animate-bounce" style={{ animationDelay: '150ms' }} />
<span className="size-1.5 rounded-full bg-muted-foreground/50 animate-bounce" style={{ animationDelay: '300ms' }} />
</div>
</div>
</div>
)}
<div ref={messagesEndRef} />
</>
)}
</div>
<div className="border-t bg-background/50 p-3">
<div className="flex gap-2 items-end">
<textarea
ref={inputRef}
value={input}
onChange={e => setInput(e.target.value)}
onKeyDown={handleKeyDown}
placeholder="Type a message… (⏎ to send, ⇧⏎ for newline)"
rows={1}
className="flex-1 resize-none rounded-md border bg-background px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-ring/50 min-h-[40px] max-h-[160px]"
style={{ height: 'auto', overflow: 'hidden' }}
onInput={e => {
const el = e.target as HTMLTextAreaElement
el.style.height = 'auto'
el.style.height = Math.min(el.scrollHeight, 160) + 'px'
}}
/>
<Button onClick={handleSend} disabled={loading || !input.trim()} size="default">
{loading ? 'Sending…' : 'Send'}
</Button>
</div>
</div>
</div>
</div>
)
}
{
"compilerOptions": {
"tsBuildInfoFile": "./node_modules/.tmp/tsconfig.app.tsbuildinfo",
"target": "es2023",
"lib": ["ES2023", "DOM", "DOM.Iterable"],
"module": "esnext",
"types": ["vite/client"],
"skipLibCheck": true,
/* Bundler mode */
"moduleResolution": "bundler",
"allowImportingTsExtensions": true,
"verbatimModuleSyntax": true,
"moduleDetection": "force",
"noEmit": true,
"jsx": "react-jsx",
/* Linting */
"noUnusedLocals": true,
"noUnusedParameters": true,
"erasableSyntaxOnly": true,
"noFallthroughCasesInSwitch": true,
"paths": {
"@/*": ["./src/*"]
}
},
"include": ["src"]
}
{
"files": [],
"references": [
{ "path": "./tsconfig.app.json" },
{ "path": "./tsconfig.node.json" }
],
"compilerOptions": {
"baseUrl": ".",
"paths": {
"@/*": ["./src/*"]
}
}
}
{
"compilerOptions": {
"tsBuildInfoFile": "./node_modules/.tmp/tsconfig.node.tsbuildinfo",
"target": "es2023",
"lib": ["ES2023"],
"module": "esnext",
"types": ["node"],
"skipLibCheck": true,
/* Bundler mode */
"moduleResolution": "bundler",
"allowImportingTsExtensions": true,
"verbatimModuleSyntax": true,
"moduleDetection": "force",
"noEmit": true,
/* Linting */
"noUnusedLocals": true,
"noUnusedParameters": true,
"erasableSyntaxOnly": true,
"noFallthroughCasesInSwitch": true
},
"include": ["vite.config.ts"]
}
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'
import path from 'path'
export default defineConfig({
plugins: [react(), tailwindcss()],
base: process.env.VITE_BASE ?? '/',
resolve: {
alias: {
'@': path.resolve(__dirname, './src'),
},
},
server: {
proxy: {
'/api': 'http://localhost:3001',
'/v1': 'http://localhost:3001',
},
},
})
#set page(margin: (x: 1.2in, y: 1in), numbering: "1")
#set text(font: "New Computer Modern", size: 10.5pt)
#set heading(numbering: "1.1")
#set par(justify: true, leading: 0.65em)
#set table(stroke: 0.5pt + luma(180))
#show heading.where(level: 1): it => {
v(0.8em)
text(size: 16pt, weight: "bold", it)
v(0.4em)
}
#show heading.where(level: 2): it => {
v(0.6em)
text(size: 13pt, weight: "bold", it)
v(0.3em)
}
#show heading.where(level: 3): it => {
v(0.4em)
text(size: 11pt, weight: "bold", it)
v(0.2em)
}
#show table: set text(size: 9pt)
// Title page
#align(center)[
#v(2in)
#text(size: 26pt, weight: "bold")[Free AI API Platforms]
#v(0.2em)
#text(size: 14pt, fill: luma(80))[Ongoing Free Tiers for a Unified LLM Routing Service]
#v(1em)
#line(length: 40%, stroke: 0.5pt + luma(120))
#v(0.5em)
#text(size: 11pt, fill: luma(100))[April 2026]
#v(0.3em)
#text(size: 10pt, fill: luma(130))[Only platforms with ongoing monthly free access --- no expiring trial credits.]
]
#pagebreak()
#outline(title: "Contents", indent: 1.5em)
#pagebreak()
= Executive Summary
This report catalogs every major platform offering *ongoing* free API access to LLMs (not one-time expiring trial credits). The goal: a service where users contribute their free API keys, and a unified endpoint routes requests to the best available free LLM, ranked by intelligence.
*Key Findings:*
- *13 platforms* offer genuinely ongoing free tiers. None require a credit card.
- *Google AI Studio* (Gemini 2.5 Pro) offers the highest-intelligence model for free.
- *Cerebras* and *NVIDIA NIM* offer the most generous throughput.
- *Groq* and *Cerebras* offer the fastest inference speeds.
= Platform-by-Platform Analysis
== Google AI Studio (Gemini API)
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing, no expiration],
[*Credit Card*], [No],
[*Best Free Model*], [Gemini 2.5 Pro],
[*Other Models*], [Gemini 2.5 Flash, Gemini 2.5 Flash-Lite],
)
*Rate Limits:*
#table(
columns: (auto, 0.5in, 0.5in, 0.7in),
align: (left, center, center, center),
[*Model*], [*RPM*], [*RPD*], [*TPM*],
[Gemini 2.5 Pro], [5], [100], [250,000],
[Gemini 2.5 Flash], [10], [250], [250,000],
[Gemini 2.5 Flash-Lite], [15], [1,000], [250,000],
)
*Monthly Token Budget:* ~12M tokens (Pro), ~30M (Flash), ~120M (Flash-Lite)
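The budget estimates above are consistent with assuming roughly 4,000 tokens consumed per request over a 30-day month; a quick sanity check under that assumption (the per-request figure is an assumption here, not a number Google publishes):

```typescript
// Assumption: ~4,000 tokens per request, 30 days per month.
// Monthly budget ≈ requests/day × days × tokens/request.
const monthlyTokens = (rpd: number, tokensPerReq = 4_000): number =>
  rpd * 30 * tokensPerReq

// Gemini 2.5 Pro (100 RPD), Flash (250 RPD), Flash-Lite (1,000 RPD)
const estimates = [100, 250, 1_000].map(rpd => monthlyTokens(rpd))
// → [12_000_000, 30_000_000, 120_000_000], matching the ~12M / ~30M / ~120M figures
```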
*Benchmarks (Gemini 2.5 Pro):*
#table(
columns: (auto, auto),
align: (left, center),
[*Benchmark*], [*Score*],
[Global MMLU], [89.8%],
[MMLU-Pro], [86.0%],
[AIME], [88.0%],
[GPQA], [84.0%],
[SWE-Bench Verified], [63.8%],
[Chatbot Arena ELO], [~1450+],
)
*Speed:* ~80--150 tokens/sec
*Limitations:* Free tier data may be used for training. Rate limits reduced 50--80% in Dec 2025 due to abuse. Limits are per-project.
== Groq
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing, no expiration],
[*Credit Card*], [No],
[*Best Free Model*], [Llama 3.3 70B Versatile],
[*Other Models*], [Llama 4 Scout, Qwen3 32B, Llama 3.1 8B, Kimi K2, 15+ more],
)
*Rate Limits:*
#table(
columns: (auto, 0.4in, 0.55in, 0.55in, 0.65in),
align: (left, center, center, center, center),
[*Model*], [*RPM*], [*RPD*], [*TPM*], [*TPD*],
[Llama 3.3 70B], [30], [1,000], [6,000], [~500K],
[Llama 4 Scout 17B], [30], [1,000], [30,000], [~1M],
[Qwen3 32B], [60], [~1,000], [~6,000], [~500K],
[Llama 3.1 8B], [30], [14,400], [6,000], [500K],
)
*Monthly Token Budget:* ~15M/month per model, ~45--60M combined
*Benchmarks (Llama 3.3 70B):* MMLU 82.0%, HumanEval 88.4%, Arena ELO ~1250
*Speed:* 276--316 tok/sec (standard), up to 1,665 tok/sec (speculative decoding)
*Limitations:* Only open-source models are offered. (Cached tokens don't count toward limits, which works in Groq's favor.)
== Cerebras
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing, no expiration],
[*Credit Card*], [No],
[*Best Free Model*], [Qwen3 235B-A22B Instruct],
[*Other Models*], [Llama 3.1 8B/70B, Llama 4 Scout, GPT-OSS 120B],
)
*Rate Limits:*
#table(
columns: (auto, auto),
align: (left, center),
[RPM], [30],
[TPM], [60,000],
[Tokens/Day], [1,000,000],
[Context Window (free)], [8,192 tokens],
)
*Monthly Token Budget:* ~30M tokens/month
*Benchmarks (Qwen3 235B):* MMLU 88.4%, HumanEval 79.2%, AIME '24 85.7%, Arena ELO 1422
*Speed:* ~1,400 tok/sec (Qwen3 235B), ~2,600 tok/sec (Scout), ~1,800 tok/sec (8B)
*Limitations:* Context window capped at 8,192 tokens on free tier (major limitation).
== SambaNova
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing (after initial \$5 credit expires)],
[*Credit Card*], [No],
[*Best Free Model*], [Llama 3.1 405B / MiniMax-M2.5],
[*Other Models*], [Llama 3.3 70B, Qwen3 32B, DeepSeek V3.1, Llama 4 Maverick],
)
*Rate Limits:*
#table(
columns: (auto, 0.5in, 0.7in),
align: (left, center, center),
[*Model*], [*RPM*], [*TPD*],
[Llama 3.1 405B], [10], [~200K],
[Llama 3.3 70B], [20], [~200K],
[Llama 3.1 8B], [30], [~200K],
)
*Monthly Token Budget:* ~6M tokens/month
*Benchmarks (Llama 3.1 405B):* MMLU 88.6%, HumanEval 89.0%, MATH 73.8%, Arena ELO ~1320
*Speed:* ~114 tok/sec (405B)
== NVIDIA NIM
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing, rate-limited (no token cap)],
[*Credit Card*], [No (requires NVIDIA Developer signup)],
[*Best Free Model*], [Nemotron 3 Super 120B, Kimi K2.5, GLM-5 744B, DeepSeek-R1 671B],
[*Catalog*], [100+ models],
)
*Rate Limits:* 40 RPM, no daily token cap
*Monthly Token Budget:* ~50--100M tokens/month in practice (rate-limited rather than token-capped)
*Speed:* Varies by model; NIM-optimized for throughput
*Limitations:* Intended for prototyping/evaluation. Heavy models may be slow at peak times.
== Mistral (Experiment Plan)
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing (Experiment plan)],
[*Credit Card*], [No (requires phone verification)],
[*Best Free Model*], [Mistral Large 3],
[*Other Models*], [Codestral, Mistral Small, all Mistral models],
)
*Rate Limits:* 2 RPM, 500K TPM, 1B monthly token cap
*Monthly Token Budget:* ~50--100M tokens/month (2 RPM is the bottleneck)
*Benchmarks (Mistral Large 3):* MMLU 85.5%, Arena ELO ~1280
*Limitations:* Only 2 RPM is extremely restrictive. Data may be used for training.
== OpenRouter (Free Models)
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing, free model variants],
[*Credit Card*], [No],
[*Best Free Model*], [DeepSeek R1 (free), Qwen3 Coder 480B (free)],
[*Free Models*], [29 total, including Gemma 3, Nemotron 3 Super],
)
*Rate Limits:*
#table(
columns: (auto, 0.5in, 0.5in),
align: (left, center, center),
[*Tier*], [*RPM*], [*RPD*],
[No credits purchased], [20], [50],
[\$10+ credits purchased], [20], [1,000],
)
*Monthly Token Budget:* ~6M (no credits) / ~120M (\$10 purchase)
*Benchmarks (DeepSeek R1 free):* MMLU 90.8%, AIME '24 79.8%, Arena ELO 1398
== GitHub Models
#table(
columns: (1.3in, auto),
[*Free Tier Type*], [Ongoing],
[*Credit Card*], [No (requires GitHub account)],
[*Best Free Model*], [GPT-4o, DeepSeek-R1, Llama 3.3 70B],
)
*Rate Limits:*
#table(
columns: (auto, 0.4in, 0.4in, 0.8in, 0.8in),
align: (left, center, center, center, center),
[*Tier*], [*RPM*], [*RPD*], [*Input Tok/Req*], [*Output Tok/Req*],
[High (GPT-4o)], [10], [50], [8,000], [4,000],
[Low (smaller)], [15], [150], [8,000], [4,000],
)
*Monthly Token Budget:* ~18M (high), ~54M (low)
*Benchmarks (GPT-4o):* MMLU 88.7%, HumanEval 90.2%, Arena ELO ~1350
== Other Platforms
#table(
columns: (auto, auto, auto, auto),
align: (left, left, center, left),
[*Platform*], [*Best Free Model*], [*Monthly Tokens*], [*Notes*],
[Hugging Face], [Various (1000s)], [~5--10M], [100K inference credits/mo],
[Cohere], [Command R+], [~4M], [1,000 calls/mo, 20 RPM],
[Cloudflare Workers AI], [Llama 3.1 70B], [~18--45M], [10K neurons/day],
[Fireworks AI], [Open-source], [~5--10M], [10 RPM (after \$1 credit)],
)
#pagebreak()
= Comprehensive Rankings
== By Intelligence (Best Free Model Per Platform)
#table(
columns: (0.3in, auto, auto, 0.5in, 0.55in, 0.55in, auto),
align: (center, left, left, center, center, center, left),
[*\#*], [*Platform*], [*Best Free Model*], [*MMLU*], [*Human\ Eval*], [*Arena\ ELO*], [*Tier*],
[1], [Google AI Studio], [Gemini 2.5 Pro], [89.8%], [~92%], [~1450], [Frontier],
[2], [OpenRouter], [DeepSeek R1 (free)], [90.8%], [~85%], [1398], [Frontier],
[3], [Cerebras], [Qwen3 235B], [88.4%], [79.2%], [1422], [Near-Frontier],
[4], [SambaNova], [Llama 3.1 405B], [88.6%], [89.0%], [~1320], [Near-Frontier],
[5], [GitHub Models], [GPT-4o], [88.7%], [90.2%], [~1350], [Near-Frontier],
[6], [Cohere], [Command R+], [88.2%], [--], [~1200], [Strong],
[7], [Mistral], [Mistral Large 3], [85.5%], [--], [~1280], [Strong],
[8], [NVIDIA NIM], [Nemotron 3 / GLM-5], [--], [--], [~1300], [Strong],
[9], [Groq], [Llama 3.3 70B], [82.0%], [88.4%], [~1250], [Good],
[10], [Cloudflare], [Llama 3.1 70B], [82.0%], [88.4%], [~1250], [Good],
)
== By Monthly Token Budget
#table(
columns: (0.3in, auto, auto, auto),
align: (center, left, right, left),
[*\#*], [*Platform*], [*Est. Monthly Tokens*], [*Budget Tier*],
[1], [NVIDIA NIM], [~50--100M+], [Excellent],
[2], [Mistral], [~50--100M], [Excellent],
[3], [Google AI Studio (Flash-Lite)], [~120M], [Excellent],
[4], [Cloudflare Workers AI], [~18--45M], [Very Good],
[5], [Cerebras], [~30M], [Very Good],
[6], [GitHub Models], [~18--54M], [Good],
[7], [Groq], [~15--60M], [Good],
[8], [Hugging Face], [~5--10M], [Moderate],
[9], [SambaNova], [~6M], [Moderate],
[10], [OpenRouter (no credits)], [~6M], [Moderate],
[11], [Fireworks AI], [~5--10M], [Moderate],
[12], [Cohere], [~4M], [Limited],
)
== Final Composite Ranking
Scoring: Intelligence (0--40) + Generosity (0--30) + Usability (0--20) + Reliability (0--10)
#table(
columns: (0.3in, auto, auto, 0.4in, 0.45in, 0.45in, 0.4in, 0.45in),
align: (center, left, left, center, center, center, center, center),
[*\#*], [*Platform*], [*Best Model*], [*Intel*], [*Gener.*], [*Usab.*], [*Rel.*], [*Total*],
[1], [*Google AI Studio*], [Gemini 2.5 Pro], [40], [18], [14], [8], [*80*],
[2], [*Cerebras*], [Qwen3 235B], [35], [22], [16], [7], [*80*],
[3], [*NVIDIA NIM*], [100+ models], [32], [28], [15], [5], [*80*],
[4], [*Groq*], [Llama 3.3 70B], [28], [20], [20], [8], [*76*],
[5], [*Cloudflare*], [Llama 3.1 70B], [28], [22], [12], [6], [*68*],
[6], [*OpenRouter*], [DeepSeek R1], [38], [12], [12], [6], [*68*],
[7], [*GitHub Models*], [GPT-4o], [34], [16], [10], [7], [*67*],
[8], [*Mistral*], [Mistral Large 3], [30], [24], [6], [6], [*66*],
[9], [*SambaNova*], [Llama 3.1 405B], [34], [10], [12], [7], [*63*],
[10], [*Cohere*], [Command R+], [30], [8], [14], [7], [*59*],
[11], [*Hugging Face*], [Various], [25], [12], [10], [5], [*52*],
[12], [*Fireworks AI*], [Open-source], [25], [10], [10], [5], [*50*],
)
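The composite totals are plain sums of the four sub-scores; sketched as a check against the table:

```typescript
// Composite score = Intelligence (0-40) + Generosity (0-30)
//                 + Usability (0-20) + Reliability (0-10)
interface SubScores { intel: number; gener: number; usab: number; rel: number }
const composite = (s: SubScores): number => s.intel + s.gener + s.usab + s.rel

// Rows from the table above:
const googleTotal = composite({ intel: 40, gener: 18, usab: 14, rel: 8 }) // 80
const groqTotal = composite({ intel: 28, gener: 20, usab: 20, rel: 8 })   // 76
```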
#pagebreak()
= Architecture for Unified Routing Service
== Routing Priority (by intelligence)
+ *Gemini 2.5 Pro* (Google AI Studio) --- highest intelligence, 100 RPD/key
+ *DeepSeek R1* (OpenRouter free) --- near-frontier reasoning, 50 RPD/key
+ *Qwen3 235B* (Cerebras) --- near-frontier, 1M tokens/day, 8K context limit
+ *GPT-4o* (GitHub Models) --- strong, 50 RPD/key
+ *Llama 3.1 405B* (SambaNova) --- strong, 10 RPM
+ *Mistral Large 3* (Mistral) --- good, 2 RPM bottleneck
+ *Llama 3.3 70B* (Groq) --- good intelligence, fastest speed, 1,000 RPD
+ *Any NIM model* (NVIDIA) --- huge variety, no daily token cap
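The priority order above can be sketched as an ordered chain the router walks, skipping platforms whose keys are exhausted (the entry shape and function names are illustrative, not the server's actual schema):

```typescript
// Illustrative fallback chain; per-key limits are the figures cited above.
interface ChainEntry {
  platform: string
  model: string
  perKeyLimit: string
}

const fallbackChain: ChainEntry[] = [
  { platform: 'google', model: 'gemini-2.5-pro', perKeyLimit: '100 RPD' },
  { platform: 'openrouter', model: 'deepseek-r1:free', perKeyLimit: '50 RPD' },
  { platform: 'cerebras', model: 'qwen3-235b', perKeyLimit: '1M tokens/day' },
  { platform: 'github', model: 'gpt-4o', perKeyLimit: '50 RPD' },
  { platform: 'sambanova', model: 'llama-3.1-405b', perKeyLimit: '10 RPM' },
  { platform: 'mistral', model: 'mistral-large-3', perKeyLimit: '2 RPM' },
  { platform: 'groq', model: 'llama-3.3-70b', perKeyLimit: '1,000 RPD' },
  { platform: 'nvidia', model: '*', perKeyLimit: '40 RPM' },
]

// The router takes the first entry whose platform still has usable keys.
function nextCandidate(exhausted: Set<string>): ChainEntry | undefined {
  return fallbackChain.find(e => !exhausted.has(e.platform))
}
```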
== Key Pooling Multiplier
If 100 users each contribute one API key per platform:
#table(
columns: (auto, auto, auto),
align: (left, center, center),
[*Platform*], [*Per Key RPD*], [*100 Keys RPD*],
[Google AI Studio (Pro)], [100], [*10,000*],
[Groq (70B)], [1,000], [*100,000*],
[Cerebras], [~33K tok/hr], [*3.3M tok/hr*],
[OpenRouter (R1)], [50], [*5,000*],
[GitHub Models (GPT-4o)], [50], [*5,000*],
[NVIDIA NIM], [40 RPM], [*4,000 RPM*],
)
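The pooling multiplier is linear in the number of contributed keys; a quick check against the table (per-key RPD figures from above):

```typescript
// Aggregate daily request capacity scales linearly with pooled keys.
const perKeyRpd: Record<string, number> = {
  'google-pro': 100,
  'groq-70b': 1_000,
  'openrouter-r1': 50,
  'github-gpt4o': 50,
}

const pooledRpd = (platform: string, keys: number): number =>
  perKeyRpd[platform] * keys

// pooledRpd('google-pro', 100) → 10_000; pooledRpd('groq-70b', 100) → 100_000
```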
== Recommended Architecture
*"Quality burst" backends* (highest intelligence, low per-key limits):
- Gemini 2.5 Pro, DeepSeek R1, GPT-4o
*"Workhorse" backends* (high throughput, good intelligence):
- Cerebras Qwen3 235B (30M tok/mo/key)
- NVIDIA NIM (no daily cap, 100+ models)
- Groq Llama 3.3 70B (fast, reliable)
*"Speed" backends* (real-time chat):
- Groq: 276--1,665 tok/sec
- Cerebras: 1,400--2,600 tok/sec
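One way to read the three tiers above is as a routing hint: serve a request from its preferred tier, falling through to another tier when every backend in it is saturated. A sketch under that reading (tier names and membership mirror the bullets above; the selection order is a hypothetical policy, not the project's actual router):

```typescript
// Tiers from the report; membership lists mirror the bullets above.
type Tier = 'quality' | 'workhorse' | 'speed'

const tiers: Record<Tier, string[]> = {
  quality: ['gemini-2.5-pro', 'deepseek-r1', 'gpt-4o'],
  workhorse: ['cerebras/qwen3-235b', 'nvidia-nim', 'groq/llama-3.3-70b'],
  speed: ['groq/llama-3.3-70b', 'cerebras/qwen3-235b'],
}

// Hypothetical policy: prefer the requested tier, then fall through
// to the remaining tiers in a fixed order.
function pickBackend(intent: Tier, saturated: Set<string>): string | undefined {
  const order: Tier[] = [intent, 'workhorse', 'quality', 'speed']
  for (const t of order) {
    const hit = tiers[t].find(b => !saturated.has(b))
    if (hit) return hit
  }
  return undefined
}
```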
== Excluded Platforms
#table(
columns: (auto, auto),
align: (left, left),
[*Platform*], [*Reason*],
[OpenAI], [One-time \$5 trial credit, expires],
[Anthropic], [One-time trial credits only],
[Together AI], [\$25 signup credit, no confirmed ongoing free tier],
[DeepSeek], [5M free tokens expire in 30 days (but API is near-free at \$0.28/M)],
)
#v(1em)
#line(length: 100%, stroke: 0.5pt + luma(150))
#text(size: 8.5pt, fill: luma(100))[
_Free tier details change frequently. Verify current limits on each platform's pricing page. Benchmark scores from published papers, LMSYS Chatbot Arena, and OpenLLM Leaderboard as of April 2026._
]
{
"name": "freellmapi",
"private": true,
"workspaces": [
"shared",
"server",
"client"
],
"scripts": {
"dev": "concurrently \"npm run dev -w server\" \"npm run dev -w client\"",
"test": "npm run test -w server && npm run test -w client",
"build": "npm run build -w server && npm run build -w client",
"build:server": "npm run build -w server"
},
"devDependencies": {
"concurrently": "^9.1.2"
}
}
{
"name": "@freellmapi/server",
"version": "0.1.0",
"private": true,
"type": "module",
"scripts": {
"dev": "tsx watch src/index.ts",
"build": "tsc",
"start": "node dist/index.js",
"test": "vitest run",
"test:watch": "vitest"
},
"dependencies": {
"@freellmapi/shared": "*",
"better-sqlite3": "^11.8.1",
"cors": "^2.8.5",
"drizzle-orm": "^0.44.2",
"express": "^5.1.0",
"helmet": "^8.1.0",
"zod": "^3.24.4"
},
"devDependencies": {
"@types/better-sqlite3": "^7.6.13",
"@types/cors": "^2.8.17",
"@types/express": "^5.0.2",
"@types/node": "^22.15.3",
"drizzle-kit": "^0.31.1",
"tsx": "^4.19.4",
"typescript": "^5.8.3",
"vitest": "^3.1.3"
}
}
import { describe, it, expect, beforeAll, vi } from 'vitest';
import type { Express } from 'express';
import { createApp } from '../../app.js';
import { initDb, getDb } from '../../db/index.js';
async function req(app: Express, method: string, path: string, body?: any) {
const server = app.listen(0);
const addr = server.address() as any;
const url = `http://127.0.0.1:${addr.port}${path}`;
const res = await fetch(url, {
method,
headers: body ? { 'Content-Type': 'application/json' } : {},
body: body ? JSON.stringify(body) : undefined,
});
const data = await res.text();
server.close();
let json: any = null;
try { json = JSON.parse(data); } catch {}
return { status: res.status, body: json, headers: res.headers, raw: data };
}
describe('Full Integration Flow', () => {
let app: Express;
beforeAll(() => {
process.env.ENCRYPTION_KEY = '0'.repeat(64);
initDb(':memory:');
app = createApp();
// Clean
const db = getDb();
db.prepare('DELETE FROM api_keys').run();
db.prepare('DELETE FROM requests').run();
});
it('Step 1: Verify models are seeded', async () => {
const { status, body } = await req(app, 'GET', '/api/models');
expect(status).toBe(200);
expect(body.length).toBeGreaterThanOrEqual(14);
expect(body[0]).toHaveProperty('modelId');
expect(body[0]).toHaveProperty('hasProvider');
// All should have providers
for (const m of body) {
expect(m.hasProvider).toBe(true);
}
});
it('Step 2: Verify fallback chain is populated', async () => {
const { status, body } = await req(app, 'GET', '/api/fallback');
expect(status).toBe(200);
expect(body.length).toBeGreaterThanOrEqual(14);
expect(body[0]).toHaveProperty('priority');
expect(body[0]).toHaveProperty('enabled');
});
it('Step 3: Proxy returns 429 with no keys', async () => {
const { status, body } = await req(app, 'POST', '/v1/chat/completions', {
messages: [{ role: 'user', content: 'hello' }],
});
// 429 (all exhausted) or 502 (provider error) or 503 (no route)
expect([429, 502, 503]).toContain(status);
expect(body.error).toBeDefined();
});
it('Step 4: Add a Groq key', async () => {
const { status, body } = await req(app, 'POST', '/api/keys', {
platform: 'groq',
key: 'gsk_integration_test_key',
label: 'Integration Test',
});
expect(status).toBe(201);
expect(body.platform).toBe('groq');
expect(body.maskedKey).toContain('...');
});
it('Step 5: Proxy routes to Groq and handles provider error gracefully', async () => {
// Mock fetch to simulate a Groq API error
const origFetch = global.fetch;
vi.spyOn(global, 'fetch').mockImplementation(async (url, init) => {
const urlStr = typeof url === 'string' ? url : url.toString();
// If it's calling the Groq API, return an error
if (urlStr.includes('api.groq.com')) {
return {
ok: false,
status: 401,
statusText: 'Unauthorized',
json: () => Promise.resolve({ error: { message: 'Invalid API Key' } }),
} as any;
}
// Otherwise pass through (for our test server)
return origFetch(url, init);
});
const { status, body } = await req(app, 'POST', '/v1/chat/completions', {
messages: [{ role: 'user', content: 'hello' }],
});
// 502 (provider error) or 429 (all exhausted after retries)
expect([502, 429]).toContain(status);
expect(body.error).toBeDefined();
vi.restoreAllMocks();
});
it('Step 6: Error was logged in analytics', async () => {
const { status, body } = await req(app, 'GET', '/api/analytics/summary?range=24h');
expect(status).toBe(200);
// May or may not have logged depending on retry behavior
expect(body.totalRequests).toBeGreaterThanOrEqual(0);
});
it('Step 7: Sort fallback by speed', async () => {
const { status } = await req(app, 'POST', '/api/fallback/sort/speed');
expect(status).toBe(200);
const { body } = await req(app, 'GET', '/api/fallback');
expect(body[0].speedRank).toBe(1);
});
it('Step 8: Health endpoint works', async () => {
const { status, body } = await req(app, 'GET', '/api/health');
expect(status).toBe(200);
expect(body).toHaveProperty('platforms');
expect(body).toHaveProperty('keys');
});
it('Step 9: Delete a key if any exist', async () => {
// Add a fresh key to ensure we have one to delete
await req(app, 'POST', '/api/keys', {
platform: 'groq', key: 'gsk_delete_test', label: 'delete-test',
});
const { body: keys } = await req(app, 'GET', '/api/keys');
const target = keys.find((k: any) => k.label === 'delete-test');
expect(target).toBeDefined();
const { status } = await req(app, 'DELETE', `/api/keys/${target.id}`);
expect(status).toBe(200);
});
it('Step 10: Validate request schema', async () => {
const { status } = await req(app, 'POST', '/v1/chat/completions', {
messages: [], // empty
});
expect(status).toBe(400);
const { status: s2 } = await req(app, 'POST', '/v1/chat/completions', {
// missing messages entirely
});
expect(s2).toBe(400);
});
});
import { describe, it, expect, beforeAll } from 'vitest';
import { initDb } from '../../db/index.js';
import { encrypt, decrypt, maskKey } from '../../lib/crypto.js';
describe('Crypto', () => {
beforeAll(() => {
process.env.ENCRYPTION_KEY = '0'.repeat(64);
initDb(':memory:');
});
it('should encrypt and decrypt a key round-trip', () => {
const original = 'gsk_test1234567890abcdef';
const { encrypted, iv, authTag } = encrypt(original);
const decrypted = decrypt(encrypted, iv, authTag);
expect(decrypted).toBe(original);
});
it('should produce different ciphertext for same input (random IV)', () => {
const original = 'same-key';
const a = encrypt(original);
const b = encrypt(original);
expect(a.encrypted).not.toBe(b.encrypted);
expect(a.iv).not.toBe(b.iv);
});
it('should fail to decrypt with wrong auth tag', () => {
const { encrypted, iv } = encrypt('test-key');
expect(() => decrypt(encrypted, iv, 'a'.repeat(32))).toThrow();
});
describe('maskKey', () => {
it('should mask long keys', () => {
expect(maskKey('gsk_test1234567890abcdef')).toBe('gsk_...cdef');
});
it('should mask short keys', () => {
expect(maskKey('abcd')).toBe('****abcd');
});
});
});
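The round-trip behavior these tests assert can be sketched with Node's built-in AES-256-GCM primitives. This is an illustration of the expected interface, not the actual `lib/crypto.js` implementation — the hex encodings and 12-byte IV are assumptions inferred from the test expectations:

```typescript
import crypto from 'node:crypto';

// 32-byte key from a 64-char hex string, mirroring ENCRYPTION_KEY in the tests.
const key = Buffer.from('0'.repeat(64), 'hex');

function encrypt(plain: string) {
  const iv = crypto.randomBytes(12); // 96-bit IV, the GCM-recommended size
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const encrypted = Buffer.concat([cipher.update(plain, 'utf8'), cipher.final()]).toString('hex');
  // getAuthTag() is only valid after final(); a random IV makes ciphertexts differ per call
  return { encrypted, iv: iv.toString('hex'), authTag: cipher.getAuthTag().toString('hex') };
}

function decrypt(encrypted: string, ivHex: string, authTagHex: string): string {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, Buffer.from(ivHex, 'hex'));
  decipher.setAuthTag(Buffer.from(authTagHex, 'hex')); // a tampered tag makes final() throw
  return Buffer.concat([decipher.update(Buffer.from(encrypted, 'hex')), decipher.final()]).toString('utf8');
}
```

The auth-tag check is what makes the "wrong auth tag" test above meaningful: GCM authenticates the ciphertext, so decryption fails loudly rather than returning garbage.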
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { CerebrasProvider } from '../../providers/cerebras.js';
describe('CerebrasProvider', () => {
let provider: CerebrasProvider;
beforeEach(() => {
provider = new CerebrasProvider();
});
it('should have correct platform and name', () => {
expect(provider.platform).toBe('cerebras');
expect(provider.name).toBe('Cerebras');
});
it('should call Cerebras API with OpenAI-compatible format', async () => {
const mockResponse = {
id: 'chatcmpl-456',
object: 'chat.completion',
created: 1234567890,
model: 'qwen3-235b',
choices: [{
index: 0,
message: { role: 'assistant', content: 'Response from Cerebras' },
finish_reason: 'stop',
}],
usage: { prompt_tokens: 8, completion_tokens: 4, total_tokens: 12 },
};
let capturedUrl = '';
vi.spyOn(global, 'fetch').mockImplementation(async (url, _init) => {
capturedUrl = String(url); // fetch may receive string | URL | Request
return {
ok: true,
json: () => Promise.resolve(mockResponse),
} as any;
});
const result = await provider.chatCompletion(
'csk_test456',
[{ role: 'user', content: 'Hello' }],
'qwen3-235b',
);
expect(capturedUrl).toContain('api.cerebras.ai');
expect(result.choices[0].message.content).toBe('Response from Cerebras');
expect(result._routed_via?.platform).toBe('cerebras');
expect(result._routed_via?.model).toBe('qwen3-235b');
});
it('should validate key', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: true } as any);
expect(await provider.validateKey('valid')).toBe(true);
});
});
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { CloudflareProvider } from '../../providers/cloudflare.js';
describe('CloudflareProvider', () => {
let provider: CloudflareProvider;
beforeEach(() => {
provider = new CloudflareProvider();
});
it('should have correct platform and name', () => {
expect(provider.platform).toBe('cloudflare');
expect(provider.name).toBe('Cloudflare Workers AI');
});
it('should parse account_id:token key format', async () => {
let capturedUrl = '';
let capturedHeaders: Record<string, string> = {};
vi.spyOn(global, 'fetch').mockImplementation(async (url, init) => {
capturedUrl = String(url); // fetch may receive string | URL | Request
capturedHeaders = (init as any).headers;
return {
ok: true,
json: () => Promise.resolve({ result: { response: 'Hello from CF!' } }),
} as any;
});
const result = await provider.chatCompletion(
'abc123:my-token-here',
[{ role: 'user', content: 'Hi' }],
'@cf/meta/llama-3.1-70b-instruct',
);
expect(capturedUrl).toContain('abc123');
expect(capturedUrl).toContain('@cf/meta/llama-3.1-70b-instruct');
expect(capturedHeaders['Authorization']).toBe('Bearer my-token-here');
expect(result.choices[0].message.content).toBe('Hello from CF!');
});
it('should throw if key format is wrong', async () => {
await expect(
provider.chatCompletion('no-colon-here', [{ role: 'user', content: 'Hi' }], 'model')
).rejects.toThrow(/account_id:api_token/);
});
});
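The key-format handling these tests imply can be sketched as a small parser. This is an assumption about what `CloudflareProvider` does internally, not the shipped code; splitting on the first colon lets the API token itself contain colons:

```typescript
// Parse a combined "account_id:api_token" credential into its two parts.
function parseCloudflareKey(key: string): { accountId: string; token: string } {
  const idx = key.indexOf(':');
  if (idx === -1) {
    throw new Error('Cloudflare key must be in account_id:api_token format');
  }
  return { accountId: key.slice(0, idx), token: key.slice(idx + 1) };
}
```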
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { CohereProvider } from '../../providers/cohere.js';
describe('CohereProvider', () => {
let provider: CohereProvider;
beforeEach(() => {
provider = new CohereProvider();
});
it('should have correct platform and name', () => {
expect(provider.platform).toBe('cohere');
expect(provider.name).toBe('Cohere');
});
it('should translate response to OpenAI format', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({
ok: true,
json: () => Promise.resolve({
id: 'cohere-123',
message: { content: [{ type: 'text', text: 'Hello from Cohere!' }] },
finish_reason: 'COMPLETE',
usage: { tokens: { input_tokens: 10, output_tokens: 5 } },
}),
} as any);
const result = await provider.chatCompletion(
'test-key',
[{ role: 'user', content: 'Hi' }],
'command-r-plus-08-2024',
);
expect(result.object).toBe('chat.completion');
expect(result.choices[0].message.content).toBe('Hello from Cohere!');
expect(result.usage.prompt_tokens).toBe(10);
expect(result.usage.completion_tokens).toBe(5);
expect(result._routed_via?.platform).toBe('cohere');
});
it('should validate key', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: true } as any);
expect(await provider.validateKey('valid')).toBe(true);
});
});
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { GoogleProvider } from '../../providers/google.js';
describe('GoogleProvider', () => {
let provider: GoogleProvider;
beforeEach(() => {
provider = new GoogleProvider();
});
it('should have correct platform and name', () => {
expect(provider.platform).toBe('google');
expect(provider.name).toBe('Google AI Studio');
});
it('should call Gemini API and return OpenAI-compatible response', async () => {
const mockResponse = {
candidates: [{
content: { parts: [{ text: 'Hello from Gemini!' }] },
finishReason: 'STOP',
}],
usageMetadata: {
promptTokenCount: 10,
candidatesTokenCount: 5,
totalTokenCount: 15,
},
};
vi.spyOn(global, 'fetch').mockResolvedValueOnce({
ok: true,
json: () => Promise.resolve(mockResponse),
} as any);
const result = await provider.chatCompletion(
'test-key',
[{ role: 'user', content: 'Hi' }],
'gemini-2.5-pro',
);
expect(result.object).toBe('chat.completion');
expect(result.choices[0].message.content).toBe('Hello from Gemini!');
expect(result.choices[0].message.role).toBe('assistant');
expect(result.usage.prompt_tokens).toBe(10);
expect(result.usage.completion_tokens).toBe(5);
expect(result._routed_via?.platform).toBe('google');
});
it('should throw on API error', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({
ok: false,
status: 429,
statusText: 'Too Many Requests',
json: () => Promise.resolve({ error: { message: 'Rate limit exceeded' } }),
} as any);
await expect(
provider.chatCompletion('test-key', [{ role: 'user', content: 'Hi' }], 'gemini-2.5-pro')
).rejects.toThrow(/Rate limit exceeded/);
});
it('should validate key via models endpoint', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: true } as any);
expect(await provider.validateKey('valid-key')).toBe(true);
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: false, status: 401 } as any);
expect(await provider.validateKey('invalid-key')).toBe(false);
});
it('should translate system messages to systemInstruction', async () => {
let capturedBody: any;
vi.spyOn(global, 'fetch').mockImplementation(async (_url, init) => {
capturedBody = JSON.parse((init as any).body);
return {
ok: true,
json: () => Promise.resolve({
candidates: [{ content: { parts: [{ text: 'ok' }] }, finishReason: 'STOP' }],
usageMetadata: { promptTokenCount: 1, candidatesTokenCount: 1, totalTokenCount: 2 },
}),
} as any;
});
await provider.chatCompletion(
'test-key',
[
{ role: 'system', content: 'You are helpful' },
{ role: 'user', content: 'Hi' },
],
'gemini-2.5-pro',
);
expect(capturedBody.systemInstruction).toEqual({ parts: [{ text: 'You are helpful' }] });
expect(capturedBody.contents).toHaveLength(1);
expect(capturedBody.contents[0].role).toBe('user');
});
});
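The OpenAI-to-Gemini translation these tests assert can be sketched as a pure function — an assumption inferred from the assertions above (system messages collapse into `systemInstruction`, `assistant` maps to Gemini's `model` role), not the real provider code:

```typescript
type ChatMsg = { role: 'system' | 'user' | 'assistant'; content: string };

// Translate an OpenAI-style messages array into a Gemini generateContent body.
function toGeminiBody(messages: ChatMsg[]) {
  const system = messages
    .filter((m) => m.role === 'system')
    .map((m) => m.content)
    .join('\n');
  const contents = messages
    .filter((m) => m.role !== 'system')
    .map((m) => ({
      role: m.role === 'assistant' ? 'model' : 'user', // Gemini has no 'assistant' role
      parts: [{ text: m.content }],
    }));
  return system
    ? { systemInstruction: { parts: [{ text: system }] }, contents }
    : { contents };
}
```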
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { GroqProvider } from '../../providers/groq.js';
describe('GroqProvider', () => {
let provider: GroqProvider;
beforeEach(() => {
provider = new GroqProvider();
});
it('should have correct platform and name', () => {
expect(provider.platform).toBe('groq');
expect(provider.name).toBe('Groq');
});
it('should call Groq API with OpenAI-compatible format', async () => {
const mockResponse = {
id: 'chatcmpl-123',
object: 'chat.completion',
created: 1234567890,
model: 'llama-3.3-70b-versatile',
choices: [{
index: 0,
message: { role: 'assistant', content: 'Hello!' },
finish_reason: 'stop',
}],
usage: { prompt_tokens: 5, completion_tokens: 2, total_tokens: 7 },
};
let capturedHeaders: Record<string, string> = {};
vi.spyOn(global, 'fetch').mockImplementation(async (_url, init) => {
capturedHeaders = Object.fromEntries(
Object.entries((init as any).headers)
);
return {
ok: true,
json: () => Promise.resolve(mockResponse),
} as any;
});
const result = await provider.chatCompletion(
'gsk_test123',
[{ role: 'user', content: 'Hi' }],
'llama-3.3-70b-versatile',
);
expect(capturedHeaders['Authorization']).toBe('Bearer gsk_test123');
expect(result.choices[0].message.content).toBe('Hello!');
expect(result._routed_via?.platform).toBe('groq');
});
it('should throw on API error', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({
ok: false,
status: 401,
statusText: 'Unauthorized',
json: () => Promise.resolve({ error: { message: 'Invalid API key' } }),
} as any);
await expect(
provider.chatCompletion('bad-key', [{ role: 'user', content: 'Hi' }], 'llama-3.3-70b-versatile')
).rejects.toThrow(/Invalid API key/);
});
it('should validate key', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: true } as any);
expect(await provider.validateKey('valid')).toBe(true);
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: false } as any);
expect(await provider.validateKey('invalid')).toBe(false);
});
});
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { OpenAICompatProvider } from '../../providers/openai-compat.js';
describe('OpenAICompatProvider', () => {
let provider: OpenAICompatProvider;
beforeEach(() => {
provider = new OpenAICompatProvider({
platform: 'groq',
name: 'TestProvider',
baseUrl: 'https://api.test.com/v1',
extraHeaders: { 'X-Custom': 'test' },
});
});
it('should set platform and name from config', () => {
expect(provider.platform).toBe('groq');
expect(provider.name).toBe('TestProvider');
});
it('should call API with correct URL and headers', async () => {
let capturedUrl = '';
let capturedHeaders: Record<string, string> = {};
vi.spyOn(global, 'fetch').mockImplementation(async (url, init) => {
capturedUrl = String(url); // fetch may receive string | URL | Request
capturedHeaders = (init as any).headers;
return {
ok: true,
json: () => Promise.resolve({
id: 'test-id',
object: 'chat.completion',
created: 123,
model: 'test-model',
choices: [{ index: 0, message: { role: 'assistant', content: 'hi' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 },
}),
} as any;
});
await provider.chatCompletion('my-key', [{ role: 'user', content: 'test' }], 'test-model');
expect(capturedUrl).toBe('https://api.test.com/v1/chat/completions');
expect(capturedHeaders['Authorization']).toBe('Bearer my-key');
expect(capturedHeaders['X-Custom']).toBe('test');
});
it('should throw on error response', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({
ok: false,
status: 429,
statusText: 'Rate Limited',
json: () => Promise.resolve({ error: { message: 'Too many requests' } }),
} as any);
await expect(
provider.chatCompletion('key', [{ role: 'user', content: 'hi' }], 'model')
).rejects.toThrow(/Too many requests/);
});
it('should validate key using models endpoint', async () => {
vi.spyOn(global, 'fetch').mockResolvedValueOnce({ ok: true } as any);
expect(await provider.validateKey('valid')).toBe(true);
});
});
describe('OpenAICompatProvider - platform instances', () => {
const platforms = [
{ platform: 'sambanova', name: 'SambaNova', baseUrl: 'https://api.sambanova.ai/v1' },
{ platform: 'nvidia', name: 'NVIDIA NIM', baseUrl: 'https://integrate.api.nvidia.com/v1' },
{ platform: 'mistral', name: 'Mistral', baseUrl: 'https://api.mistral.ai/v1' },
{ platform: 'openrouter', name: 'OpenRouter', baseUrl: 'https://openrouter.ai/api/v1' },
{ platform: 'github', name: 'GitHub Models', baseUrl: 'https://models.inference.ai.azure.com' },
{ platform: 'fireworks', name: 'Fireworks AI', baseUrl: 'https://api.fireworks.ai/inference/v1' },
] as const;
for (const p of platforms) {
it(`${p.name} provider should make requests to ${p.baseUrl}`, async () => {
const provider = new OpenAICompatProvider(p as any);
let capturedUrl = '';
vi.spyOn(global, 'fetch').mockImplementation(async (url) => {
capturedUrl = String(url); // fetch may receive string | URL | Request
return {
ok: true,
json: () => Promise.resolve({
id: 'id', object: 'chat.completion', created: 1, model: 'm',
choices: [{ index: 0, message: { role: 'assistant', content: 'ok' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
}),
} as any;
});
const result = await provider.chatCompletion('key', [{ role: 'user', content: 'hi' }], 'model');
expect(capturedUrl).toContain(p.baseUrl);
expect(result._routed_via?.platform).toBe(p.platform);
});
}
});
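The request shape the `OpenAICompatProvider` tests assert — base URL joined with `/chat/completions`, a bearer token, plus any configured extra headers — can be sketched as a builder. Illustrative only, under the assumption that the provider does a straightforward fetch with these pieces:

```typescript
type Message = { role: string; content: string };

// Build the URL and fetch init for an OpenAI-compatible chat completion call.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: Message[],
  extraHeaders: Record<string, string> = {},
) {
  return {
    url: `${baseUrl}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
        ...extraHeaders, // e.g. OpenRouter's attribution headers
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

Because every platform in the table above speaks this same wire format, one adapter class parameterized by `baseUrl` covers six providers.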
import { describe, it, expect, beforeAll } from 'vitest';
import type { Express } from 'express';
import { createApp } from '../../app.js';
import { initDb } from '../../db/index.js';
async function request(app: Express, method: string, path: string, body?: any) {
const server = app.listen(0);
const addr = server.address() as any;
const url = `http://127.0.0.1:${addr.port}${path}`;
const res = await fetch(url, {
method,
headers: body ? { 'Content-Type': 'application/json' } : {},
body: body ? JSON.stringify(body) : undefined,
});
const data = await res.json().catch(() => null);
server.close();
return { status: res.status, body: data };
}
describe('Fallback API', () => {
let app: Express;
beforeAll(() => {
process.env.ENCRYPTION_KEY = '0'.repeat(64);
initDb(':memory:');
app = createApp();
});
it('GET /api/fallback returns fallback chain', async () => {
const { status, body } = await request(app, 'GET', '/api/fallback');
expect(status).toBe(200);
expect(Array.isArray(body)).toBe(true);
expect(body.length).toBeGreaterThan(0);
// Should be sorted by priority
for (let i = 1; i < body.length; i++) {
expect(body[i].priority).toBeGreaterThanOrEqual(body[i - 1].priority);
}
});
it('GET /api/fallback entries have expected fields', async () => {
const { body } = await request(app, 'GET', '/api/fallback');
const first = body[0];
expect(first).toHaveProperty('modelDbId');
expect(first).toHaveProperty('priority');
expect(first).toHaveProperty('enabled');
expect(first).toHaveProperty('platform');
expect(first).toHaveProperty('displayName');
expect(first).toHaveProperty('intelligenceRank');
});
it('PUT /api/fallback updates order', async () => {
const { body: original } = await request(app, 'GET', '/api/fallback');
// Reverse the order
const reversed = original.map((e: any, i: number) => ({
modelDbId: e.modelDbId,
priority: original.length - i,
enabled: e.enabled,
}));
const { status } = await request(app, 'PUT', '/api/fallback', reversed);
expect(status).toBe(200);
// Verify order changed
const { body: after } = await request(app, 'GET', '/api/fallback');
expect(after[0].modelDbId).toBe(original[original.length - 1].modelDbId);
// Restore original order
const restore = original.map((e: any, i: number) => ({
modelDbId: e.modelDbId,
priority: i + 1,
enabled: e.enabled,
}));
await request(app, 'PUT', '/api/fallback', restore);
});
it('POST /api/fallback/sort/intelligence sorts by intelligence', async () => {
const { status } = await request(app, 'POST', '/api/fallback/sort/intelligence');
expect(status).toBe(200);
const { body } = await request(app, 'GET', '/api/fallback');
// Should be sorted ascending by intelligence rank
for (let i = 1; i < body.length; i++) {
expect(body[i].intelligenceRank).toBeGreaterThanOrEqual(body[i - 1].intelligenceRank);
}
});
it('POST /api/fallback/sort/speed sorts by speed', async () => {
const { status } = await request(app, 'POST', '/api/fallback/sort/speed');
expect(status).toBe(200);
const { body } = await request(app, 'GET', '/api/fallback');
// Should be sorted ascending by speed rank
for (let i = 1; i < body.length; i++) {
expect(body[i].speedRank).toBeGreaterThanOrEqual(body[i - 1].speedRank);
}
});
it('POST /api/fallback/sort/invalid returns 400', async () => {
const { status } = await request(app, 'POST', '/api/fallback/sort/invalid');
expect(status).toBe(400);
});
});
import { describe, it, expect, beforeAll, beforeEach } from 'vitest';
import type { Express } from 'express';
import { createApp } from '../../app.js';
import { initDb, getDb } from '../../db/index.js';
async function request(app: Express, method: string, path: string, body?: any) {
const server = app.listen(0);
const addr = server.address() as any;
const url = `http://127.0.0.1:${addr.port}${path}`;
const res = await fetch(url, {
method,
headers: body ? { 'Content-Type': 'application/json' } : {},
body: body ? JSON.stringify(body) : undefined,
});
const data = await res.json().catch(() => null);
server.close();
return { status: res.status, body: data };
}
describe('Keys API', () => {
let app: Express;
beforeAll(() => {
process.env.ENCRYPTION_KEY = '0'.repeat(64);
initDb(':memory:');
app = createApp();
});
beforeEach(() => {
const db = getDb();
db.prepare('DELETE FROM api_keys').run();
});
it('GET /api/keys returns empty array initially', async () => {
const { status, body } = await request(app, 'GET', '/api/keys');
expect(status).toBe(200);
expect(body).toEqual([]);
});
it('POST /api/keys creates a new key', async () => {
const { status, body } = await request(app, 'POST', '/api/keys', {
platform: 'groq',
key: 'gsk_test123456789',
label: 'My Groq Key',
});
expect(status).toBe(201);
expect(body.platform).toBe('groq');
expect(body.label).toBe('My Groq Key');
expect(body.maskedKey).toContain('...');
});
it('GET /api/keys returns the created key', async () => {
// First create a key
await request(app, 'POST', '/api/keys', {
platform: 'groq',
key: 'gsk_test123456789',
});
const { status, body } = await request(app, 'GET', '/api/keys');
expect(status).toBe(200);
expect(body).toHaveLength(1);
expect(body[0].platform).toBe('groq');
});
it('POST /api/keys rejects invalid platform', async () => {
const { status } = await request(app, 'POST', '/api/keys', {
platform: 'invalid_platform',
key: 'test',
});
expect(status).toBe(400);
});
it('POST /api/keys rejects missing key', async () => {
const { status } = await request(app, 'POST', '/api/keys', {
platform: 'groq',
});
expect(status).toBe(400);
});
it('DELETE /api/keys/:id removes a key', async () => {
const { body: created } = await request(app, 'POST', '/api/keys', {
platform: 'groq',
key: 'gsk_test123456789',
});
const { status } = await request(app, 'DELETE', `/api/keys/${created.id}`);
expect(status).toBe(200);
const { body: after } = await request(app, 'GET', '/api/keys');
expect(after).toHaveLength(0);
});
it('DELETE /api/keys/:id returns 404 for nonexistent key', async () => {
const { status } = await request(app, 'DELETE', '/api/keys/99999');
expect(status).toBe(404);
});
});
import { describe, it, expect, beforeEach } from 'vitest';
import {
canMakeRequest,
canUseTokens,
recordRequest,
recordTokens,
getRateLimitStatus,
} from '../../services/ratelimit.js';
describe('Rate Limiter', () => {
// Use unique identifiers per test to avoid cross-contamination
let testId: number;
beforeEach(() => {
testId = Math.floor(Math.random() * 1_000_000);
});
describe('canMakeRequest', () => {
it('should allow request when under RPM limit', () => {
expect(canMakeRequest('groq', 'llama-70b', testId, {
rpm: 30, rpd: null, tpm: null, tpd: null,
})).toBe(true);
});
it('should deny request when RPM limit reached', () => {
const limits = { rpm: 2, rpd: null, tpm: null, tpd: null };
recordRequest('groq', 'llama-70b', testId);
recordRequest('groq', 'llama-70b', testId);
expect(canMakeRequest('groq', 'llama-70b', testId, limits)).toBe(false);
});
it('should deny request when RPD limit reached', () => {
const limits = { rpm: null, rpd: 1, tpm: null, tpd: null };
recordRequest('google', 'gemini', testId);
expect(canMakeRequest('google', 'gemini', testId, limits)).toBe(false);
});
it('should allow request when limits are null (unlimited)', () => {
expect(canMakeRequest('nvidia', 'nemotron', testId, {
rpm: null, rpd: null, tpm: null, tpd: null,
})).toBe(true);
});
});
describe('canUseTokens', () => {
it('should allow tokens when under TPM limit', () => {
expect(canUseTokens('groq', 'llama-70b', testId, 500, {
tpm: 6000, tpd: null,
})).toBe(true);
});
it('should deny tokens when TPM limit would be exceeded', () => {
recordTokens('cerebras', 'qwen3', testId, 50000);
expect(canUseTokens('cerebras', 'qwen3', testId, 20000, {
tpm: 60000, tpd: null,
})).toBe(false);
});
it('should allow when limit is null', () => {
expect(canUseTokens('nvidia', 'nemotron', testId, 100000, {
tpm: null, tpd: null,
})).toBe(true);
});
});
describe('getRateLimitStatus', () => {
it('should return current usage counts', () => {
const limits = { rpm: 30, rpd: 1000, tpm: 6000, tpd: null };
recordRequest('groq', 'test-model', testId);
recordRequest('groq', 'test-model', testId);
recordTokens('groq', 'test-model', testId, 500);
const status = getRateLimitStatus('groq', 'test-model', testId, limits);
expect(status.rpm.used).toBe(2);
expect(status.rpm.limit).toBe(30);
expect(status.rpd.used).toBe(2);
expect(status.tpm.used).toBe(500);
});
});
});
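The semantics these tests rely on — per-(platform, model, key) counters, with a `null` limit meaning unlimited — can be sketched as a fixed-window counter. The real `services/ratelimit.js` may use a different window strategy; this is a sketch, not the shipped code:

```typescript
const requestCounts = new Map<string, number>();

// Bucket counts by wall-clock minute so the window resets automatically.
function minuteKey(platform: string, model: string, keyId: number): string {
  const minute = Math.floor(Date.now() / 60_000);
  return `${platform}:${model}:${keyId}:${minute}`;
}

function recordRequest(platform: string, model: string, keyId: number): void {
  const k = minuteKey(platform, model, keyId);
  requestCounts.set(k, (requestCounts.get(k) ?? 0) + 1);
}

function canMakeRequest(platform: string, model: string, keyId: number, rpm: number | null): boolean {
  if (rpm === null) return true; // null limit means unlimited
  return (requestCounts.get(minuteKey(platform, model, keyId)) ?? 0) < rpm;
}
```

A fixed window is the simplest scheme that passes tests like these; its known weakness is a burst straddling two windows, which a sliding window or token bucket would smooth out.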
import { describe, it, expect, beforeAll, beforeEach } from 'vitest';
import { initDb, getDb } from '../../db/index.js';
import { encrypt } from '../../lib/crypto.js';
import { routeRequest } from '../../services/router.js';
describe('Router', () => {
beforeAll(() => {
process.env.ENCRYPTION_KEY = '0'.repeat(64);
initDb(':memory:');
});
beforeEach(() => {
const db = getDb();
db.prepare('DELETE FROM api_keys').run();
// Reset fallback order to intelligence ranking
const models = db.prepare('SELECT id, intelligence_rank FROM models ORDER BY intelligence_rank ASC').all() as any[];
const update = db.prepare('UPDATE fallback_config SET priority = ? WHERE model_db_id = ?');
for (let i = 0; i < models.length; i++) {
update.run(i + 1, models[i].id);
}
});
it('should throw when no keys are configured', () => {
expect(() => routeRequest()).toThrow(/exhausted/i);
});
it('should route to highest priority model with available key', () => {
const db = getDb();
const { encrypted, iv, authTag } = encrypt('test-groq-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('groq', 'test', encrypted, iv, authTag, 'healthy', 1);
const result = routeRequest();
expect(result.platform).toBe('groq');
expect(result.apiKey).toBe('test-groq-key');
});
it('should prefer higher-priority model when keys exist for multiple platforms', () => {
const db = getDb();
const googleKey = encrypt('test-google-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('google', 'test', googleKey.encrypted, googleKey.iv, googleKey.authTag, 'healthy', 1);
const groqKey = encrypt('test-groq-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('groq', 'test', groqKey.encrypted, groqKey.iv, groqKey.authTag, 'healthy', 1);
const result = routeRequest();
expect(result.platform).toBe('google');
});
it('should skip disabled keys', () => {
const db = getDb();
const googleKey = encrypt('test-google-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('google', 'disabled', googleKey.encrypted, googleKey.iv, googleKey.authTag, 'healthy', 0);
const groqKey = encrypt('test-groq-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('groq', 'test', groqKey.encrypted, groqKey.iv, groqKey.authTag, 'healthy', 1);
const result = routeRequest();
expect(result.platform).toBe('groq');
});
it('should skip invalid keys', () => {
const db = getDb();
const invalidKey = encrypt('invalid-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('google', 'invalid', invalidKey.encrypted, invalidKey.iv, invalidKey.authTag, 'invalid', 1);
const groqKey = encrypt('test-groq-key');
db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run('groq', 'test', groqKey.encrypted, groqKey.iv, groqKey.authTag, 'healthy', 1);
const result = routeRequest();
expect(result.platform).toBe('groq');
});
});
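The routing rule these tests exercise — walk the fallback chain in priority order and pick the first model whose platform has an enabled, healthy key — can be sketched as a pure function. Names and shapes here are illustrative, not the real `services/router.js`:

```typescript
type ChainEntry = { platform: string; model: string; priority: number };
type StoredKey = { platform: string; apiKey: string; enabled: boolean; status: string };

// First usable (enabled + healthy) key along the priority-ordered chain wins.
function pickRoute(chain: ChainEntry[], keys: StoredKey[]): { platform: string; model: string; apiKey: string } {
  const sorted = [...chain].sort((a, b) => a.priority - b.priority);
  for (const entry of sorted) {
    const key = keys.find((k) => k.platform === entry.platform && k.enabled && k.status === 'healthy');
    if (key) return { platform: entry.platform, model: entry.model, apiKey: key.apiKey };
  }
  throw new Error('All providers exhausted: no usable key for any model in the chain');
}
```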
import express from 'express';
import cors from 'cors';
import helmet from 'helmet';
import path from 'path';
import { fileURLToPath } from 'url';
import { keysRouter } from './routes/keys.js';
import { modelsRouter } from './routes/models.js';
import { proxyRouter } from './routes/proxy.js';
import { fallbackRouter } from './routes/fallback.js';
import { analyticsRouter } from './routes/analytics.js';
import { healthRouter } from './routes/health.js';
import { settingsRouter } from './routes/settings.js';
import { errorHandler } from './middleware/errorHandler.js';
const __dirname = path.dirname(fileURLToPath(import.meta.url));
export function createApp() {
const app = express();
app.use(helmet({ contentSecurityPolicy: false, hsts: false }));
app.use(cors());
app.use(express.json({ limit: '1mb' }));
// API routes
app.use('/api/keys', keysRouter);
app.use('/api/models', modelsRouter);
app.use('/api/fallback', fallbackRouter);
app.use('/api/analytics', analyticsRouter);
app.use('/api/health', healthRouter);
app.use('/api/settings', settingsRouter);
// OpenAI-compatible proxy
app.use('/v1', proxyRouter);
// Health check
app.get('/api/ping', (_req, res) => {
res.json({ status: 'ok', timestamp: new Date().toISOString() });
});
// Error handler (for API routes)
app.use(errorHandler);
// Serve client static files (after API error handler)
const clientDist = path.resolve(__dirname, '../../client/dist');
app.use(express.static(clientDist));
// SPA fallback — serve index.html for non-API routes
app.use((req, res, next) => {
if (req.path.startsWith('/api/') || req.path.startsWith('/v1/')) {
next();
return;
}
res.sendFile(path.join(clientDist, 'index.html'));
});
return app;
}
import crypto from 'crypto';
import Database from 'better-sqlite3';
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
import { initEncryptionKey } from '../lib/crypto.js';
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const DB_PATH = path.resolve(__dirname, '../../data/freeapi.db');
let db: Database.Database;
export function getDb(): Database.Database {
if (!db) {
throw new Error('Database not initialized. Call initDb() first.');
}
return db;
}
export function initDb(dbPath?: string): Database.Database {
const resolvedPath = dbPath ?? DB_PATH;
const isMemory = resolvedPath === ':memory:';
if (!isMemory) {
const dataDir = path.dirname(resolvedPath);
if (!fs.existsSync(dataDir)) {
fs.mkdirSync(dataDir, { recursive: true });
}
}
db = new Database(resolvedPath);
if (!isMemory) db.pragma('journal_mode = WAL');
db.pragma('foreign_keys = ON');
createTables(db);
initEncryptionKey(db);
seedModels(db);
migrateModels(db);
migrateModelsV2(db);
migrateModelsV3Ranks(db);
ensureUnifiedKey(db);
console.log(`Database initialized at ${resolvedPath}`);
return db;
}
function createTables(db: Database.Database) {
db.exec(`
CREATE TABLE IF NOT EXISTS models (
id INTEGER PRIMARY KEY AUTOINCREMENT,
platform TEXT NOT NULL,
model_id TEXT NOT NULL,
display_name TEXT NOT NULL,
intelligence_rank INTEGER NOT NULL,
speed_rank INTEGER NOT NULL,
size_label TEXT NOT NULL DEFAULT '',
rpm_limit INTEGER,
rpd_limit INTEGER,
tpm_limit INTEGER,
tpd_limit INTEGER,
monthly_token_budget TEXT NOT NULL DEFAULT '',
context_window INTEGER,
enabled INTEGER NOT NULL DEFAULT 1,
UNIQUE(platform, model_id)
);
CREATE TABLE IF NOT EXISTS api_keys (
id INTEGER PRIMARY KEY AUTOINCREMENT,
platform TEXT NOT NULL,
label TEXT NOT NULL DEFAULT '',
encrypted_key TEXT NOT NULL,
iv TEXT NOT NULL,
auth_tag TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'unknown',
enabled INTEGER NOT NULL DEFAULT 1,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
last_checked_at TEXT
);
CREATE TABLE IF NOT EXISTS requests (
id INTEGER PRIMARY KEY AUTOINCREMENT,
platform TEXT NOT NULL,
model_id TEXT NOT NULL,
status TEXT NOT NULL,
input_tokens INTEGER NOT NULL DEFAULT 0,
output_tokens INTEGER NOT NULL DEFAULT 0,
latency_ms INTEGER NOT NULL DEFAULT 0,
error TEXT,
created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS fallback_config (
id INTEGER PRIMARY KEY AUTOINCREMENT,
model_db_id INTEGER NOT NULL REFERENCES models(id),
priority INTEGER NOT NULL,
enabled INTEGER NOT NULL DEFAULT 1,
UNIQUE(model_db_id)
);
CREATE TABLE IF NOT EXISTS settings (
key TEXT PRIMARY KEY,
value TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_requests_created_at ON requests(created_at);
CREATE INDEX IF NOT EXISTS idx_requests_platform ON requests(platform);
CREATE INDEX IF NOT EXISTS idx_api_keys_platform ON api_keys(platform);
`);
}
function seedModels(db: Database.Database) {
const count = db.prepare('SELECT COUNT(*) as cnt FROM models').get() as { cnt: number };
if (count.cnt > 0) return;
const insert = db.prepare(`
INSERT INTO models (platform, model_id, display_name, intelligence_rank, speed_rank, size_label, rpm_limit, rpd_limit, tpm_limit, tpd_limit, monthly_token_budget, context_window)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
// NOTE: Limits current as of April 2026. See migrateModels() for in-place updates.
const models = [
// Google — gemini-2.5-flash free quotas were cut Dec 2025 (now ~20 RPD, budget much lower than before)
['google', 'gemini-2.5-pro', 'Gemini 2.5 Pro', 1, 8, 'Frontier', 5, 100, 250000, null, '~12M', 1048576],
['google', 'gemini-2.5-flash', 'Gemini 2.5 Flash', 4, 5, 'Large', 10, 20, 250000, null, '~3M', 1048576],
['google', 'gemini-2.5-flash-lite', 'Gemini 2.5 Flash-Lite', 8, 3, 'Medium', 15, 1000, 250000, null, '~120M', 1048576],
// OpenRouter — upgraded DeepSeek R1 -> V3.1 (stronger reasoning); default RPD ~200
['openrouter', 'deepseek/deepseek-v3.1:free', 'DeepSeek V3.1 (free)', 2, 10, 'Frontier', 20, 200, null, null, '~6M', 131072],
['openrouter', 'moonshotai/kimi-k2:free', 'Kimi K2 (free)', 2, 9, 'Frontier', 20, 200, null, null, '~6M', 131072],
['openrouter', 'qwen/qwen3-coder:free', 'Qwen3 Coder (free)', 3, 9, 'Frontier', 20, 200, null, null, '~6M', 262144],
['openrouter', 'z-ai/glm-4.5-air:free', 'GLM-4.5 Air (free)', 4, 9, 'Large', 20, 200, null, null, '~6M', 131072],
// Cerebras — same 30 RPM / 1M TPD free pool; adding frontier coder, Llama 4 Maverick, GPT-OSS
['cerebras', 'qwen-3-coder-480b', 'Qwen3-Coder 480B', 2, 1, 'Frontier', 30, null, 60000, 1000000, '~30M', 131072],
['cerebras', 'llama-4-maverick-17b-128e-instruct', 'Llama 4 Maverick', 3, 1, 'Frontier', 30, null, 60000, 1000000, '~30M', 131072],
['cerebras', 'qwen3-235b', 'Qwen3 235B', 3, 1, 'Large', 30, null, 60000, 1000000, '~30M', 8192],
['cerebras', 'gpt-oss-120b', 'GPT-OSS 120B', 3, 1, 'Large', 30, null, 60000, 1000000, '~30M', 131072],
// GitHub Models — GPT-4o replaced with GPT-5 (same free tier key)
['github', 'openai/gpt-5', 'GPT-5 (GitHub)', 1, 7, 'Frontier', 10, 50, null, null, '~18M', 128000],
// SambaNova — 70B RPM bumped to 20
['sambanova', 'Meta-Llama-3.3-70B-Instruct', 'Llama 3.3 70B', 6, 9, 'Large', 20, null, null, 200000, '~6M', 8192],
// Mistral — Experiment pool ~1B tokens/mo shared across all models
['mistral', 'mistral-large-latest', 'Mistral Large 3', 7, 8, 'Large', 2, null, 500000, null, '~50-100M', 131072],
['mistral', 'magistral-medium-latest', 'Magistral Medium', 4, 8, 'Large', 2, null, 500000, null, '~50-100M', 40000],
['mistral', 'codestral-latest', 'Codestral', 6, 6, 'Medium', 2, null, 500000, null, '~50-100M', 32000],
// Groq — scout TPM corrected to 6k (not 30k)
['groq', 'llama-3.3-70b-versatile', 'Llama 3.3 70B', 9, 2, 'Medium', 30, 1000, 6000, 500000, '~15M', 131072],
['groq', 'llama-4-scout-17b-16e-instruct', 'Llama 4 Scout', 10, 2, 'Medium', 30, 1000, 6000, 1000000, '~30M', 131072],
// NVIDIA NIM — moved to credit-based model in 2025; no longer truly recurring monthly. Disabled by default.
['nvidia', 'meta/llama-3.1-70b-instruct', 'Llama 3.1 70B (NV)', 11, 6, 'Large', 40, null, null, null, 'credits-based', 131072],
// Cohere — trial tier is 1000 calls/mo total → realistic budget 1-2M
['cohere', 'command-r-plus-08-2024', 'Command R+ (08-2024)', 12, 11, 'Large', 20, 33, null, null, '~1-2M', 131072],
['cloudflare', '@cf/meta/llama-3.1-70b-instruct', 'Llama 3.1 70B (CF)', 13, 11, 'Medium', null, null, null, null, '~18-45M', 131072],
// Hugging Face — free Inference credits are ~$0.10/mo → budget closer to 1-3M on a 70B model
['huggingface', 'accounts/fireworks/models/llama-v3p3-70b-instruct', 'Llama 3.3 70B (HF)', 14, 11, 'Medium', null, null, null, null, '~1-3M', 131072],
// New providers — recurring monthly free tiers, no card required
['zhipu', 'glm-4.5-flash', 'GLM-4.5 Flash', 5, 4, 'Large', null, null, null, 1000000, '~30M', 131072],
['moonshot', 'kimi-latest', 'Kimi Latest', 4, 8, 'Large', 60, null, null, 500000, '~15M', 200000],
['minimax', 'MiniMax-M1', 'MiniMax M1', 5, 8, 'Large', 20, null, 1000000, null, '~30M', 200000],
];
const insertMany = db.transaction(() => {
for (const m of models) {
insert.run(...m);
}
});
insertMany();
// Seed default fallback config from models
const allModels = db.prepare('SELECT id, intelligence_rank FROM models ORDER BY intelligence_rank ASC').all() as { id: number; intelligence_rank: number }[];
const insertFallback = db.prepare('INSERT INTO fallback_config (model_db_id, priority, enabled) VALUES (?, ?, 1)');
const insertFallbacks = db.transaction(() => {
for (let i = 0; i < allModels.length; i++) {
insertFallback.run(allModels[i].id, i + 1);
}
});
insertFallbacks();
console.log(`Seeded ${models.length} models and fallback config`);
}
/**
* Idempotent migration to bring existing DBs up to the April 2026 pool.
* Covers: replaces outdated models (DeepSeek R1 → V3.1, GPT-4o → GPT-5),
* corrects stale rate-limits / monthly budgets, adds new smarter models
* and three new providers (Zhipu, Moonshot, MiniMax).
*/
function migrateModels(db: Database.Database) {
// 1) Replace outdated models in-place (preserves fallback_config & any references)
const renameStmt = db.prepare(`
UPDATE models
SET model_id = ?, display_name = ?, intelligence_rank = ?,
monthly_token_budget = ?, rpd_limit = COALESCE(?, rpd_limit),
context_window = COALESCE(?, context_window),
size_label = COALESCE(?, size_label)
WHERE platform = ? AND model_id = ?
`);
// DeepSeek R1 (free) -> DeepSeek V3.1 (free)
renameStmt.run('deepseek/deepseek-v3.1:free', 'DeepSeek V3.1 (free)', 2, '~6M', 200, 131072, 'Frontier', 'openrouter', 'deepseek/deepseek-r1:free');
// GitHub GPT-4o -> GPT-5
renameStmt.run('openai/gpt-5', 'GPT-5 (GitHub)', 1, '~18M', null, 128000, 'Frontier', 'github', 'gpt-4o');
// 2) Correct stale limits / budgets on existing rows
db.prepare(`UPDATE models SET rpd_limit = 20, monthly_token_budget = '~3M' WHERE platform = 'google' AND model_id = 'gemini-2.5-flash'`).run();
db.prepare(`UPDATE models SET rpm_limit = 20 WHERE platform = 'sambanova' AND model_id = 'Meta-Llama-3.3-70B-Instruct'`).run();
db.prepare(`UPDATE models SET tpm_limit = 6000 WHERE platform = 'groq' AND model_id = 'llama-4-scout-17b-16e-instruct'`).run();
db.prepare(`UPDATE models SET monthly_token_budget = '~1-2M' WHERE platform = 'cohere' AND model_id = 'command-r-plus-08-2024'`).run();
db.prepare(`UPDATE models SET monthly_token_budget = '~1-3M' WHERE platform = 'huggingface' AND model_id = 'accounts/fireworks/models/llama-v3p3-70b-instruct'`).run();
// NVIDIA moved to credit model — disable and label accordingly
db.prepare(`UPDATE models SET monthly_token_budget = 'credits-based', enabled = 0 WHERE platform = 'nvidia' AND model_id = 'meta/llama-3.1-70b-instruct'`).run();
// 3) Insert new models (UNIQUE(platform, model_id) makes this idempotent)
const insert = db.prepare(`
INSERT OR IGNORE INTO models (platform, model_id, display_name, intelligence_rank, speed_rank, size_label, rpm_limit, rpd_limit, tpm_limit, tpd_limit, monthly_token_budget, context_window)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
const newModels: Array<[string, string, string, number, number, string, number | null, number | null, number | null, number | null, string, number | null]> = [
// Cerebras — same free pool as qwen3-235b
['cerebras', 'qwen-3-coder-480b', 'Qwen3-Coder 480B', 2, 1, 'Frontier', 30, null, 60000, 1000000, '~30M', 131072],
['cerebras', 'llama-4-maverick-17b-128e-instruct', 'Llama 4 Maverick', 3, 1, 'Frontier', 30, null, 60000, 1000000, '~30M', 131072],
['cerebras', 'gpt-oss-120b', 'GPT-OSS 120B', 3, 1, 'Large', 30, null, 60000, 1000000, '~30M', 131072],
// OpenRouter free tier
['openrouter', 'deepseek/deepseek-v3.1:free', 'DeepSeek V3.1 (free)', 2, 10, 'Frontier', 20, 200, null, null, '~6M', 131072],
['openrouter', 'moonshotai/kimi-k2:free', 'Kimi K2 (free)', 2, 9, 'Frontier', 20, 200, null, null, '~6M', 131072],
['openrouter', 'qwen/qwen3-coder:free', 'Qwen3 Coder (free)', 3, 9, 'Frontier', 20, 200, null, null, '~6M', 262144],
['openrouter', 'z-ai/glm-4.5-air:free', 'GLM-4.5 Air (free)', 4, 9, 'Large', 20, 200, null, null, '~6M', 131072],
// Mistral Experiment pool — shared ~1B/mo across models
['mistral', 'magistral-medium-latest', 'Magistral Medium', 4, 8, 'Large', 2, null, 500000, null, '~50-100M', 40000],
['mistral', 'codestral-latest', 'Codestral', 6, 6, 'Medium', 2, null, 500000, null, '~50-100M', 32000],
// New providers
['zhipu', 'glm-4.5-flash', 'GLM-4.5 Flash', 5, 4, 'Large', null, null, null, 1000000, '~30M', 131072],
['moonshot', 'kimi-latest', 'Kimi Latest', 4, 8, 'Large', 60, null, null, 500000, '~15M', 200000],
['minimax', 'MiniMax-M1', 'MiniMax M1', 5, 8, 'Large', 20, null, 1000000, null, '~30M', 200000],
];
const apply = db.transaction(() => {
for (const m of newModels) insert.run(...m);
// Ensure every model has a fallback_config row (new inserts + any orphans)
const missing = db.prepare(`
SELECT m.id FROM models m
LEFT JOIN fallback_config f ON m.id = f.model_db_id
WHERE f.id IS NULL
ORDER BY m.intelligence_rank ASC
`).all() as { id: number }[];
if (missing.length > 0) {
const maxPriority = (db.prepare('SELECT COALESCE(MAX(priority), 0) AS mx FROM fallback_config').get() as { mx: number }).mx;
const addFallback = db.prepare('INSERT INTO fallback_config (model_db_id, priority, enabled) VALUES (?, ?, 1)');
for (let i = 0; i < missing.length; i++) {
addFallback.run(missing[i].id, maxPriority + i + 1);
}
}
});
apply();
}
/**
* Second-pass migration after live-testing every model against its provider.
* Corrects model IDs verified wrong, removes models not actually available on
* the current free tier, and adds real :free OpenRouter models found in the
* live catalog (April 2026).
*/
function migrateModelsV2(db: Database.Database) {
// Helper: delete a model and its fallback_config entry first (SQLite FKs default to NO ACTION, which blocks deleting a referenced row)
const deleteModel = db.prepare(`DELETE FROM models WHERE platform = ? AND model_id = ?`);
const deleteFallback = db.prepare(`
DELETE FROM fallback_config WHERE model_db_id IN (
SELECT id FROM models WHERE platform = ? AND model_id = ?
)
`);
const removals: Array<[string, string]> = [
// GitHub free tier does NOT include GPT-5 (only catalog-listed). Revert handled below.
// Cerebras: qwen-3-coder-480b and llama-4-maverick not on free tier; gpt-oss-120b is listed
// but requires special access — our key gets 404. Remove all three.
['cerebras', 'qwen-3-coder-480b'],
['cerebras', 'llama-4-maverick-17b-128e-instruct'],
['cerebras', 'gpt-oss-120b'],
// These OpenRouter :free variants do not exist in the live catalog (April 2026)
['openrouter', 'deepseek/deepseek-v3.1:free'],
['openrouter', 'moonshotai/kimi-k2:free'],
];
const applyRemovals = db.transaction(() => {
for (const [p, m] of removals) {
deleteFallback.run(p, m);
deleteModel.run(p, m);
}
});
applyRemovals();
// GitHub: gpt-5 is in the model catalog but returns "unavailable_model" on free tier
// inference. Revert to gpt-4o which works. This only runs if the gpt-5 row exists.
db.prepare(`
UPDATE models
SET model_id = 'gpt-4o', display_name = 'GPT-4o', intelligence_rank = 5,
size_label = 'Large', context_window = 8000, monthly_token_budget = '~18M'
WHERE platform = 'github' AND model_id = 'openai/gpt-5'
`).run();
// Groq: scout requires the meta-llama/ publisher prefix
db.prepare(`
UPDATE models SET model_id = 'meta-llama/llama-4-scout-17b-16e-instruct'
WHERE platform = 'groq' AND model_id = 'llama-4-scout-17b-16e-instruct'
`).run();
// Add real OpenRouter :free models that exist in the live catalog
const insert = db.prepare(`
INSERT OR IGNORE INTO models (platform, model_id, display_name, intelligence_rank, speed_rank, size_label, rpm_limit, rpd_limit, tpm_limit, tpd_limit, monthly_token_budget, context_window)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
`);
const additions: Array<[string, string, string, number, number, string, number | null, number | null, number | null, number | null, string, number | null]> = [
// Frontier-tier free models verified in OR catalog 2026-04
['openrouter', 'nvidia/nemotron-3-super-120b-a12b:free', 'Nemotron 3 Super 120B (free)', 2, 9, 'Frontier', 20, 200, null, null, '~6M', 262144],
['openrouter', 'qwen/qwen3-next-80b-a3b-instruct:free', 'Qwen3-Next 80B (free)', 3, 9, 'Large', 20, 200, null, null, '~6M', 262144],
['openrouter', 'minimax/minimax-m2.5:free', 'MiniMax M2.5 (free)', 3, 9, 'Large', 20, 200, null, null, '~6M', 196608],
['openrouter', 'google/gemma-4-31b-it:free', 'Gemma 4 31B (free)', 5, 9, 'Medium', 20, 200, null, null, '~6M', 262144],
];
const applyAdditions = db.transaction(() => {
for (const a of additions) insert.run(...a);
// Fallback entries for new models
const missing = db.prepare(`
SELECT m.id FROM models m
LEFT JOIN fallback_config f ON m.id = f.model_db_id
WHERE f.id IS NULL ORDER BY m.intelligence_rank ASC
`).all() as { id: number }[];
if (missing.length > 0) {
const maxPriority = (db.prepare('SELECT COALESCE(MAX(priority), 0) AS mx FROM fallback_config').get() as { mx: number }).mx;
const addFb = db.prepare('INSERT INTO fallback_config (model_db_id, priority, enabled) VALUES (?, ?, 1)');
for (let i = 0; i < missing.length; i++) addFb.run(missing[i].id, maxPriority + i + 1);
}
});
applyAdditions();
}
/**
* Re-rank intelligence based on April 2026 coding + agentic tool-use benchmarks:
* SWE-bench Verified, Terminal-Bench 2, TAU-Bench, Aider Polyglot.
* Higher rank = weaker. Ties are allowed (same weights across providers).
*/
function migrateModelsV3Ranks(db: Database.Database) {
const setRank = db.prepare(`UPDATE models SET intelligence_rank = ? WHERE platform = ? AND model_id = ?`);
const ranks: Array<[number, string, string]> = [
// #1-10 frontier coders / agents
[1, 'openrouter', 'minimax/minimax-m2.5:free'], // SWE-V ~80%, TB2 ~57%
[2, 'openrouter', 'qwen/qwen3-coder:free'], // SWE-V ~70%
[3, 'openrouter', 'qwen/qwen3-next-80b-a3b-instruct:free'], // SWE-V ~70.6%
[4, 'moonshot', 'kimi-latest'], // K2: SWE-V ~71%
[5, 'cerebras', 'qwen-3-235b-a22b-instruct-2507'], // SWE-V ~65-72%
[6, 'google', 'gemini-2.5-pro'], // SWE-V 63.8%, Aider 83%
[7, 'openrouter', 'z-ai/glm-4.5-air:free'], // ~58% SWE-V (distill of 4.5)
[8, 'openrouter', 'openai/gpt-oss-120b:free'], // SWE-V 62.4%
[9, 'openrouter', 'nvidia/nemotron-3-super-120b-a12b:free'], // SWE-V 53.7%
[10, 'minimax', 'MiniMax-M1'], // M1 predecessor, ~45-55%
// #11-15 mid-tier specialists
[11, 'mistral', 'codestral-latest'], // HumanEval 86.6%
[12, 'mistral', 'mistral-large-latest'],
[13, 'mistral', 'magistral-medium-latest'], // reasoning, not code-tuned
[14, 'google', 'gemini-2.5-flash'],
[15, 'zhipu', 'glm-4.5-flash'],
// #16 Llama 3.3 70B — identical weights across providers (tie)
[16, 'groq', 'llama-3.3-70b-versatile'],
[16, 'sambanova', 'Meta-Llama-3.3-70B-Instruct'],
[16, 'openrouter', 'meta-llama/llama-3.3-70b-instruct:free'],
[16, 'huggingface', 'accounts/fireworks/models/llama-v3p3-70b-instruct'],
// #17-23 weaker
[17, 'openrouter', 'nousresearch/hermes-3-llama-3.1-405b:free'], // L3.1 base with tool-use tune
[18, 'groq', 'meta-llama/llama-4-scout-17b-16e-instruct'], // multimodal focus
[19, 'openrouter', 'google/gemma-4-31b-it:free'],
[20, 'google', 'gemini-2.5-flash-lite'],
[21, 'github', 'gpt-4o'], // Aug 2024, SWE-V ~33%
[22, 'nvidia', 'meta/llama-3.1-70b-instruct'], // older Llama 3.1 tune
[22, 'cloudflare', '@cf/meta/llama-3.1-70b-instruct'], // same base weights
[23, 'cohere', 'command-r-plus-08-2024'], // RAG-focused, weakest on code
];
const apply = db.transaction(() => {
for (const [rank, platform, modelId] of ranks) {
setRank.run(rank, platform, modelId);
}
});
apply();
}
function ensureUnifiedKey(db: Database.Database) {
const existing = db.prepare("SELECT value FROM settings WHERE key = 'unified_api_key'").get() as { value: string } | undefined;
if (!existing) {
const key = `freellmapi-${crypto.randomBytes(24).toString('hex')}`;
db.prepare("INSERT INTO settings (key, value) VALUES ('unified_api_key', ?)").run(key);
console.log(`\n Your unified API key: ${key}\n`);
}
}
export function getUnifiedApiKey(): string {
const db = getDb();
const row = db.prepare("SELECT value FROM settings WHERE key = 'unified_api_key'").get() as { value: string };
return row.value;
}
export function regenerateUnifiedKey(): string {
const db = getDb();
const key = `freellmapi-${crypto.randomBytes(24).toString('hex')}`;
db.prepare("UPDATE settings SET value = ? WHERE key = 'unified_api_key'").run(key);
return key;
}
import { createApp } from './app.js';
import { initDb } from './db/index.js';
import { startHealthChecker } from './services/health.js';
const PORT = process.env.PORT ?? 3001;
async function main() {
initDb();
const app = createApp();
app.listen(Number(PORT), '0.0.0.0', () => {
console.log(`Server running on http://0.0.0.0:${PORT}`);
console.log(`Proxy endpoint: http://0.0.0.0:${PORT}/v1/chat/completions`);
startHealthChecker();
});
}
main().catch(console.error);
import crypto from 'crypto';
import Database from 'better-sqlite3';
const ALGORITHM = 'aes-256-gcm';
let cachedKey: Buffer | null = null;
/**
* Initialize encryption key from env, DB, or generate a new one.
* Must be called after DB is initialized.
*/
export function initEncryptionKey(db: Database.Database): void {
// 1. Check env var
const envKey = process.env.ENCRYPTION_KEY;
if (envKey && envKey !== 'your-64-char-hex-key-here') {
const buf = Buffer.from(envKey, 'hex');
if (buf.length !== 32) {
throw new Error('ENCRYPTION_KEY must be 64 hex characters (32 bytes)');
}
cachedKey = buf;
return;
}
// 2. Check DB for persisted key
const row = db.prepare("SELECT value FROM settings WHERE key = 'encryption_key'").get() as { value: string } | undefined;
if (row) {
cachedKey = Buffer.from(row.value, 'hex');
return;
}
// 3. Generate and persist
cachedKey = crypto.randomBytes(32);
db.prepare("INSERT INTO settings (key, value) VALUES ('encryption_key', ?)").run(cachedKey.toString('hex'));
}
function getEncryptionKey(): Buffer {
if (!cachedKey) {
throw new Error('Encryption key not initialized. Call initEncryptionKey() first.');
}
return cachedKey;
}
export function encrypt(text: string): { encrypted: string; iv: string; authTag: string } {
const key = getEncryptionKey();
const iv = crypto.randomBytes(16);
const cipher = crypto.createCipheriv(ALGORITHM, key, iv);
let encrypted = cipher.update(text, 'utf8', 'hex');
encrypted += cipher.final('hex');
const authTag = cipher.getAuthTag().toString('hex');
return {
encrypted,
iv: iv.toString('hex'),
authTag,
};
}
export function decrypt(encrypted: string, iv: string, authTag: string): string {
const key = getEncryptionKey();
const decipher = crypto.createDecipheriv(ALGORITHM, key, Buffer.from(iv, 'hex'));
decipher.setAuthTag(Buffer.from(authTag, 'hex'));
let decrypted = decipher.update(encrypted, 'hex', 'utf8');
decrypted += decipher.final('utf8');
return decrypted;
}
export function maskKey(key: string): string {
if (key.length <= 8) return '****' + key.slice(-4);
return key.slice(0, 4) + '...' + key.slice(-4);
}
import type { Request, Response, NextFunction } from 'express';
export function errorHandler(err: Error, _req: Request, res: Response, _next: NextFunction) {
console.error('[Error]', err.message);
const status = (err as any).status ?? 500;
res.status(status).json({
error: {
message: err.message,
type: err.name ?? 'server_error',
},
});
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
Platform,
} from '@freellmapi/shared/types.js';
export interface CompletionOptions {
model?: string;
temperature?: number;
max_tokens?: number;
top_p?: number;
}
export abstract class BaseProvider {
abstract readonly platform: Platform;
abstract readonly name: string;
abstract chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse>;
abstract streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk>;
abstract validateKey(apiKey: string): Promise<boolean>;
protected async fetchWithTimeout(
url: string,
init: RequestInit,
timeoutMs = 15000,
): Promise<Response> {
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), timeoutMs);
try {
return await fetch(url, { ...init, signal: controller.signal });
} finally {
clearTimeout(timeout);
}
}
protected makeId(): string {
return `chatcmpl-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
}
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
const API_BASE = 'https://api.cerebras.ai/v1';
export class CerebrasProvider extends BaseProvider {
readonly platform = 'cerebras' as const;
readonly name = 'Cerebras';
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Cerebras API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const data = await res.json() as ChatCompletionResponse;
data._routed_via = { platform: 'cerebras', model: modelId };
return data;
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Cerebras API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const data = trimmed.slice(6);
if (data === '[DONE]') return;
try {
yield JSON.parse(data) as ChatCompletionChunk;
} catch { /* skip malformed or keep-alive lines */ }
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const res = await this.fetchWithTimeout(`${API_BASE}/models`, {
method: 'GET',
headers: { 'Authorization': `Bearer ${apiKey}` },
}, 10000);
return res.ok;
} catch {
return false;
}
}
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
/**
* Cloudflare Workers AI provider.
* API key format expected: "account_id:api_token"
* The account_id is extracted from the key to build the URL.
*/
export class CloudflareProvider extends BaseProvider {
readonly platform = 'cloudflare' as const;
readonly name = 'Cloudflare Workers AI';
private parseKey(apiKey: string): { accountId: string; token: string } {
const sep = apiKey.indexOf(':');
if (sep === -1) throw new Error('Cloudflare key must be in format "account_id:api_token"');
return { accountId: apiKey.slice(0, sep), token: apiKey.slice(sep + 1) };
}
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const { accountId, token } = this.parseKey(apiKey);
const url = `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${modelId}`;
const res = await this.fetchWithTimeout(url, {
method: 'POST',
headers: {
'Authorization': `Bearer ${token}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
messages,
max_tokens: options?.max_tokens,
temperature: options?.temperature,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
const errors = (err as any).errors;
throw new Error(`Cloudflare API error ${res.status}: ${errors?.[0]?.message ?? res.statusText}`);
}
const data = await res.json() as any;
const text = data.result?.response ?? '';
return {
id: this.makeId(),
object: 'chat.completion',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{
index: 0,
message: { role: 'assistant', content: text },
finish_reason: 'stop',
}],
usage: {
// Workers AI does not return OpenAI-style token counts on this endpoint
prompt_tokens: 0,
completion_tokens: 0,
total_tokens: 0,
},
_routed_via: { platform: 'cloudflare', model: modelId },
};
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const { accountId, token } = this.parseKey(apiKey);
const url = `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${modelId}`;
const res = await this.fetchWithTimeout(url, {
method: 'POST',
headers: {
'Authorization': `Bearer ${token}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
messages,
max_tokens: options?.max_tokens,
temperature: options?.temperature,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Cloudflare API error ${res.status}: ${(err as any).errors?.[0]?.message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
const id = this.makeId();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const data = trimmed.slice(6);
if (data === '[DONE]') return;
try {
const parsed = JSON.parse(data);
if (parsed.response) {
yield {
id,
object: 'chat.completion.chunk',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{ index: 0, delta: { content: parsed.response }, finish_reason: null }],
};
}
} catch { /* skip */ }
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const { token } = this.parseKey(apiKey);
const res = await this.fetchWithTimeout(
'https://api.cloudflare.com/client/v4/user/tokens/verify',
{ method: 'GET', headers: { 'Authorization': `Bearer ${token}` } },
10000,
);
if (!res.ok) return false;
const data = await res.json() as any;
return data.success === true && data.result?.status === 'active';
} catch {
return false;
}
}
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
const API_BASE = 'https://api.cohere.com/v2';
interface CohereResponse {
id: string;
message?: { content?: { type: string; text: string }[] };
finish_reason?: string;
usage?: {
tokens?: { input_tokens?: number; output_tokens?: number };
};
}
export class CohereProvider extends BaseProvider {
readonly platform = 'cohere' as const;
readonly name = 'Cohere';
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const cohereMessages = messages.map(m => ({
role: m.role === 'system' ? 'system' as const : m.role === 'assistant' ? 'assistant' as const : 'user' as const,
content: m.content,
}));
const res = await this.fetchWithTimeout(`${API_BASE}/chat`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages: cohereMessages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
p: options?.top_p,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Cohere API error ${res.status}: ${(err as any).message ?? res.statusText}`);
}
const data = await res.json() as CohereResponse;
const text = data.message?.content?.[0]?.text ?? '';
return {
id: data.id ?? this.makeId(),
object: 'chat.completion',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{
index: 0,
message: { role: 'assistant', content: text },
finish_reason: data.finish_reason === 'MAX_TOKENS' ? 'length' : 'stop',
}],
usage: {
prompt_tokens: data.usage?.tokens?.input_tokens ?? 0,
completion_tokens: data.usage?.tokens?.output_tokens ?? 0,
total_tokens: (data.usage?.tokens?.input_tokens ?? 0) + (data.usage?.tokens?.output_tokens ?? 0),
},
_routed_via: { platform: 'cohere', model: modelId },
};
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const cohereMessages = messages.map(m => ({
role: m.role === 'system' ? 'system' as const : m.role === 'assistant' ? 'assistant' as const : 'user' as const,
content: m.content,
}));
const res = await this.fetchWithTimeout(`${API_BASE}/chat`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages: cohereMessages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Cohere API error ${res.status}: ${(err as any).message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
const id = this.makeId();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || trimmed.startsWith('event:')) continue;
// Cohere v2 streams SSE lines; strip the "data: " prefix before parsing the JSON event
const payload = trimmed.startsWith('data: ') ? trimmed.slice(6) : trimmed;
try {
const event = JSON.parse(payload);
if (event.type === 'content-delta') {
const text = event.delta?.message?.content?.text ?? '';
if (text) {
yield {
id,
object: 'chat.completion.chunk',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{ index: 0, delta: { content: text }, finish_reason: null }],
};
}
} else if (event.type === 'message-end') {
yield {
id,
object: 'chat.completion.chunk',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{ index: 0, delta: {}, finish_reason: 'stop' }],
};
}
} catch {
// Skip malformed lines
}
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const res = await this.fetchWithTimeout(`${API_BASE}/models`, {
method: 'GET',
headers: { 'Authorization': `Bearer ${apiKey}` },
}, 10000);
return res.ok;
} catch {
return false;
}
}
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
TokenUsage,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
const API_BASE = 'https://generativelanguage.googleapis.com/v1beta';
// Translate OpenAI messages to Gemini format
function toGeminiContents(messages: ChatMessage[]) {
const systemInstruction = messages.find(m => m.role === 'system');
const contents = messages
.filter(m => m.role !== 'system')
.map(m => ({
role: m.role === 'assistant' ? 'model' : 'user',
parts: [{ text: m.content }],
}));
return {
contents,
systemInstruction: systemInstruction
? { parts: [{ text: systemInstruction.content }] }
: undefined,
};
}
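For reference, a self-contained sketch of what the translation above produces for a typical conversation (a re-statement of the same logic, since the module itself is not importable here):

```typescript
// Minimal re-statement of the OpenAI -> Gemini message translation.
type Msg = { role: 'system' | 'user' | 'assistant'; content: string };

function toGeminiContentsSketch(messages: Msg[]) {
  const system = messages.find(m => m.role === 'system');
  const contents = messages
    .filter(m => m.role !== 'system')
    .map(m => ({
      // Gemini uses 'model' for assistant turns; everything else maps to 'user'.
      role: m.role === 'assistant' ? 'model' : 'user',
      parts: [{ text: m.content }],
    }));
  return {
    contents,
    // The system prompt moves out of the message list into systemInstruction.
    systemInstruction: system ? { parts: [{ text: system.content }] } : undefined,
  };
}

const out = toGeminiContentsSketch([
  { role: 'system', content: 'Be terse.' },
  { role: 'user', content: 'Hi' },
  { role: 'assistant', content: 'Hello' },
]);
// out.contents has two entries (user, model); the system prompt is lifted out.
```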
interface GeminiResponse {
candidates?: {
content?: { parts?: { text?: string }[] };
finishReason?: string;
}[];
usageMetadata?: {
promptTokenCount?: number;
candidatesTokenCount?: number;
totalTokenCount?: number;
};
}
export class GoogleProvider extends BaseProvider {
readonly platform = 'google' as const;
readonly name = 'Google AI Studio';
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const { contents, systemInstruction } = toGeminiContents(messages);
const body: Record<string, unknown> = {
contents,
generationConfig: {
temperature: options?.temperature,
maxOutputTokens: options?.max_tokens,
topP: options?.top_p,
},
};
if (systemInstruction) body.systemInstruction = systemInstruction;
const url = `${API_BASE}/models/${modelId}:generateContent?key=${apiKey}`;
const res = await this.fetchWithTimeout(url, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Google API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const data = await res.json() as GeminiResponse;
const text = data.candidates?.[0]?.content?.parts?.[0]?.text ?? '';
const usage: TokenUsage = {
prompt_tokens: data.usageMetadata?.promptTokenCount ?? 0,
completion_tokens: data.usageMetadata?.candidatesTokenCount ?? 0,
total_tokens: data.usageMetadata?.totalTokenCount ?? 0,
};
return {
id: this.makeId(),
object: 'chat.completion',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{
index: 0,
message: { role: 'assistant', content: text },
finish_reason: data.candidates?.[0]?.finishReason?.toLowerCase() === 'max_tokens' ? 'length' : 'stop',
}],
usage,
_routed_via: { platform: 'google', model: modelId },
};
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const { contents, systemInstruction } = toGeminiContents(messages);
const body: Record<string, unknown> = {
contents,
generationConfig: {
temperature: options?.temperature,
maxOutputTokens: options?.max_tokens,
topP: options?.top_p,
},
};
if (systemInstruction) body.systemInstruction = systemInstruction;
const url = `${API_BASE}/models/${modelId}:streamGenerateContent?alt=sse&key=${apiKey}`;
const res = await this.fetchWithTimeout(url, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Google API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
const id = this.makeId();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const raw = trimmed.slice(6);
if (raw === '[DONE]') return;
let chunk: GeminiResponse;
try {
chunk = JSON.parse(raw) as GeminiResponse;
} catch {
continue; // Skip malformed SSE lines
}
const text = chunk.candidates?.[0]?.content?.parts?.[0]?.text ?? '';
if (!text) continue;
yield {
id,
object: 'chat.completion.chunk',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{
index: 0,
delta: { content: text },
finish_reason: null,
}],
};
}
}
// Final chunk
yield {
id,
object: 'chat.completion.chunk',
created: Math.floor(Date.now() / 1000),
model: modelId,
choices: [{
index: 0,
delta: {},
finish_reason: 'stop',
}],
};
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const res = await this.fetchWithTimeout(
`${API_BASE}/models?key=${apiKey}`,
{ method: 'GET' },
10000,
);
return res.ok;
} catch {
return false;
}
}
}
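The streaming loops in these adapters all share one pattern: accumulate bytes into a string buffer, split on newlines, and carry the (possibly incomplete) last line over to the next network chunk. A standalone sketch of that buffering, fed with chunks that split a line mid-word:

```typescript
// Sketch of the SSE line-buffering loop shared by the streaming adapters.
function* sseDataLines(chunks: string[]): Generator<string> {
  let buffer = '';
  for (const chunk of chunks) {
    buffer += chunk;
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep the incomplete tail for the next chunk
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.startsWith('data: ')) yield trimmed.slice(6);
    }
  }
}

// 'data: [DONE]' arrives split across two chunks; the buffer reassembles it.
const payloads = [...sseDataLines(['data: {"a":1}\nda', 'ta: [DONE]\n'])];
// payloads === ['{"a":1}', '[DONE]']
```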
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
const API_BASE = 'https://api.groq.com/openai/v1';
export class GroqProvider extends BaseProvider {
readonly platform = 'groq' as const;
readonly name = 'Groq';
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Groq API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const data = await res.json() as ChatCompletionResponse;
data._routed_via = { platform: 'groq', model: modelId };
return data;
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`Groq API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const data = trimmed.slice(6);
if (data === '[DONE]') return;
yield JSON.parse(data) as ChatCompletionChunk;
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const res = await this.fetchWithTimeout(`${API_BASE}/models`, {
method: 'GET',
headers: { 'Authorization': `Bearer ${apiKey}` },
}, 10000);
return res.ok;
} catch {
return false;
}
}
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
const API_BASE = 'https://router.huggingface.co/fireworks-ai/inference/v1';
export class HuggingFaceProvider extends BaseProvider {
readonly platform = 'huggingface' as const;
readonly name = 'Hugging Face';
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
// The HF router endpoint (Fireworks AI backend) exposes an OpenAI-compatible chat endpoint
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`HuggingFace API error ${res.status}: ${(err as any).error ?? res.statusText}`);
}
const data = await res.json() as ChatCompletionResponse;
data._routed_via = { platform: 'huggingface', model: modelId };
return data;
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const res = await this.fetchWithTimeout(`${API_BASE}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`HuggingFace API error ${res.status}: ${(err as any).error ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const data = trimmed.slice(6);
if (data === '[DONE]') return;
try {
yield JSON.parse(data) as ChatCompletionChunk;
} catch { /* skip */ }
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const res = await this.fetchWithTimeout('https://huggingface.co/api/whoami-v2', {
method: 'GET',
headers: { 'Authorization': `Bearer ${apiKey}` },
}, 10000);
return res.ok;
} catch {
return false;
}
}
}
import type { Platform } from '@freellmapi/shared/types.js';
import type { BaseProvider } from './base.js';
import { GoogleProvider } from './google.js';
import { OpenAICompatProvider } from './openai-compat.js';
import { CohereProvider } from './cohere.js';
import { CloudflareProvider } from './cloudflare.js';
import { HuggingFaceProvider } from './huggingface.js';
const providers = new Map<Platform, BaseProvider>();
function register(provider: BaseProvider) {
providers.set(provider.platform, provider);
}
// Google - unique Gemini API format
register(new GoogleProvider());
// Groq - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'groq',
name: 'Groq',
baseUrl: 'https://api.groq.com/openai/v1',
}));
// Cerebras - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'cerebras',
name: 'Cerebras',
baseUrl: 'https://api.cerebras.ai/v1',
}));
// SambaNova - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'sambanova',
name: 'SambaNova',
baseUrl: 'https://api.sambanova.ai/v1',
}));
// NVIDIA NIM - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'nvidia',
name: 'NVIDIA NIM',
baseUrl: 'https://integrate.api.nvidia.com/v1',
}));
// Mistral - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'mistral',
name: 'Mistral',
baseUrl: 'https://api.mistral.ai/v1',
}));
// OpenRouter - OpenAI-compatible with extra headers
register(new OpenAICompatProvider({
platform: 'openrouter',
name: 'OpenRouter',
baseUrl: 'https://openrouter.ai/api/v1',
extraHeaders: {
'HTTP-Referer': 'http://localhost:3001',
'X-Title': 'FreeLLMAPI',
},
}));
// GitHub Models - OpenAI-compatible via Azure endpoint
register(new OpenAICompatProvider({
platform: 'github',
name: 'GitHub Models',
baseUrl: 'https://models.inference.ai.azure.com',
}));
// Cohere - unique API format
register(new CohereProvider());
// Cloudflare Workers AI - unique API format (key = "account_id:token")
register(new CloudflareProvider());
// Hugging Face - OpenAI-compatible per-model endpoint
register(new HuggingFaceProvider());
// Zhipu (Z.ai / bigmodel.cn) - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'zhipu',
name: 'Zhipu AI',
baseUrl: 'https://open.bigmodel.cn/api/paas/v4',
}));
// Moonshot (Kimi) - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'moonshot',
name: 'Moonshot',
baseUrl: 'https://api.moonshot.ai/v1',
}));
// MiniMax - OpenAI-compatible
register(new OpenAICompatProvider({
platform: 'minimax',
name: 'MiniMax',
baseUrl: 'https://api.minimax.io/v1',
}));
export function getProvider(platform: Platform): BaseProvider | undefined {
return providers.get(platform);
}
export function getAllProviders(): BaseProvider[] {
return Array.from(providers.values());
}
export function hasProvider(platform: Platform): boolean {
return providers.has(platform);
}
import type {
ChatMessage,
ChatCompletionResponse,
ChatCompletionChunk,
Platform,
} from '@freellmapi/shared/types.js';
import { BaseProvider, type CompletionOptions } from './base.js';
/**
* Generic provider for platforms that use an OpenAI-compatible API.
 * Covers: Groq, Cerebras, SambaNova, NVIDIA NIM, Mistral, OpenRouter,
 * GitHub Models, Zhipu, Moonshot, and MiniMax.
*/
export class OpenAICompatProvider extends BaseProvider {
readonly platform: Platform;
readonly name: string;
private readonly baseUrl: string;
private readonly extraHeaders: Record<string, string>;
private readonly validateUrl?: string;
constructor(opts: {
platform: Platform;
name: string;
baseUrl: string;
extraHeaders?: Record<string, string>;
validateUrl?: string;
}) {
super();
this.platform = opts.platform;
this.name = opts.name;
this.baseUrl = opts.baseUrl;
this.extraHeaders = opts.extraHeaders ?? {};
this.validateUrl = opts.validateUrl;
}
async chatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): Promise<ChatCompletionResponse> {
const res = await this.fetchWithTimeout(`${this.baseUrl}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
...this.extraHeaders,
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`${this.name} API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const data = await res.json() as ChatCompletionResponse;
data._routed_via = { platform: this.platform, model: modelId };
return data;
}
async *streamChatCompletion(
apiKey: string,
messages: ChatMessage[],
modelId: string,
options?: CompletionOptions,
): AsyncGenerator<ChatCompletionChunk> {
const res = await this.fetchWithTimeout(`${this.baseUrl}/chat/completions`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
...this.extraHeaders,
},
body: JSON.stringify({
model: modelId,
messages,
temperature: options?.temperature,
max_tokens: options?.max_tokens,
top_p: options?.top_p,
stream: true,
}),
});
if (!res.ok) {
const err = await res.json().catch(() => ({}));
throw new Error(`${this.name} API error ${res.status}: ${(err as any).error?.message ?? res.statusText}`);
}
const reader = res.body?.getReader();
if (!reader) throw new Error('No response body');
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop() ?? '';
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed || !trimmed.startsWith('data: ')) continue;
const data = trimmed.slice(6);
if (data === '[DONE]') return;
try {
yield JSON.parse(data) as ChatCompletionChunk;
} catch {
// Skip malformed chunks
}
}
}
}
async validateKey(apiKey: string): Promise<boolean> {
try {
const url = this.validateUrl ?? `${this.baseUrl}/models`;
const res = await this.fetchWithTimeout(url, {
method: 'GET',
headers: {
'Authorization': `Bearer ${apiKey}`,
...this.extraHeaders,
},
}, 10000);
// 401/403 = bad key, anything else (200, 404, etc) = key is valid
return res.status !== 401 && res.status !== 403;
} catch {
return false;
}
}
}
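The key check above is deliberately lenient: only 401/403 condemn a key, because some OpenAI-compatible providers lack a /models endpoint and may 404 even for a valid key. The predicate, isolated:

```typescript
// Lenient key validation: only explicit auth failures mean a bad key.
// A 404 (no /models endpoint) or 200 both count as "key looks valid".
const keyLooksValid = (status: number): boolean =>
  status !== 401 && status !== 403;
```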
import { Router } from 'express';
import type { Request, Response } from 'express';
import { getDb } from '../db/index.js';
export const analyticsRouter = Router();
function getTimeFilter(range: string): string {
switch (range) {
case '24h': return "datetime('now', '-1 day')";
case '7d': return "datetime('now', '-7 days')";
case '30d': return "datetime('now', '-30 days')";
default: return "datetime('now', '-7 days')";
}
}
// Summary stats
analyticsRouter.get('/summary', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const since = getTimeFilter(range);
const db = getDb();
const stats = db.prepare(`
SELECT
COUNT(*) as total_requests,
SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) as success_count,
SUM(input_tokens) as total_input_tokens,
SUM(output_tokens) as total_output_tokens,
AVG(latency_ms) as avg_latency_ms
FROM requests
WHERE created_at >= ${since}
`).get() as any;
const totalRequests = stats.total_requests ?? 0;
const successRate = totalRequests > 0 ? (stats.success_count / totalRequests) * 100 : 0;
// Estimate cost savings at rough frontier-model list prices (~$3/M input + $15/M output)
const inputCost = ((stats.total_input_tokens ?? 0) / 1_000_000) * 3;
const outputCost = ((stats.total_output_tokens ?? 0) / 1_000_000) * 15;
res.json({
totalRequests,
successRate: Math.round(successRate * 10) / 10,
totalInputTokens: stats.total_input_tokens ?? 0,
totalOutputTokens: stats.total_output_tokens ?? 0,
avgLatencyMs: Math.round(stats.avg_latency_ms ?? 0),
estimatedCostSavings: Math.round((inputCost + outputCost) * 100) / 100,
});
});
// Stats grouped by model
analyticsRouter.get('/by-model', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const since = getTimeFilter(range);
const db = getDb();
const rows = db.prepare(`
SELECT
r.platform,
r.model_id,
m.display_name,
COUNT(*) as requests,
SUM(CASE WHEN r.status = 'success' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) as success_rate,
AVG(r.latency_ms) as avg_latency_ms,
SUM(r.input_tokens) as total_input_tokens,
SUM(r.output_tokens) as total_output_tokens
FROM requests r
LEFT JOIN models m ON m.platform = r.platform AND m.model_id = r.model_id
WHERE r.created_at >= ${since}
GROUP BY r.platform, r.model_id
ORDER BY requests DESC
`).all() as any[];
res.json(rows.map(r => ({
platform: r.platform,
modelId: r.model_id,
displayName: r.display_name ?? r.model_id,
requests: r.requests,
successRate: Math.round(r.success_rate * 10) / 10,
avgLatencyMs: Math.round(r.avg_latency_ms),
totalInputTokens: r.total_input_tokens ?? 0,
totalOutputTokens: r.total_output_tokens ?? 0,
})));
});
// Stats grouped by platform
analyticsRouter.get('/by-platform', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const since = getTimeFilter(range);
const db = getDb();
const rows = db.prepare(`
SELECT
platform,
COUNT(*) as requests,
SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) as success_rate,
AVG(latency_ms) as avg_latency_ms,
SUM(input_tokens) as total_input_tokens,
SUM(output_tokens) as total_output_tokens
FROM requests
WHERE created_at >= ${since}
GROUP BY platform
ORDER BY requests DESC
`).all() as any[];
res.json(rows.map(r => ({
platform: r.platform,
requests: r.requests,
successRate: Math.round(r.success_rate * 10) / 10,
avgLatencyMs: Math.round(r.avg_latency_ms),
totalInputTokens: r.total_input_tokens ?? 0,
totalOutputTokens: r.total_output_tokens ?? 0,
})));
});
// Timeline data
analyticsRouter.get('/timeline', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const interval = (req.query.interval as string) ?? (range === '24h' ? 'hour' : 'day');
const since = getTimeFilter(range);
const db = getDb();
const dateFormat = interval === 'hour' ? '%Y-%m-%dT%H:00:00' : '%Y-%m-%d';
const rows = db.prepare(`
SELECT
strftime('${dateFormat}', created_at) as timestamp,
COUNT(*) as requests,
SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) as success_count,
SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) as failure_count
FROM requests
WHERE created_at >= ${since}
GROUP BY strftime('${dateFormat}', created_at)
ORDER BY timestamp ASC
`).all() as any[];
res.json(rows.map(r => ({
timestamp: r.timestamp,
requests: r.requests,
successCount: r.success_count,
failureCount: r.failure_count,
})));
});
// Error distribution (grouped by error type and platform)
analyticsRouter.get('/error-distribution', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const since = getTimeFilter(range);
const db = getDb();
// Group errors by category (extract the key part of the error message)
const rows = db.prepare(`
SELECT
platform,
model_id,
CASE
WHEN error LIKE '%429%' OR error LIKE '%rate limit%' OR error LIKE '%too many%' OR error LIKE '%quota%' THEN 'Rate Limited (429)'
WHEN error LIKE '%401%' OR error LIKE '%unauthorized%' OR error LIKE '%invalid%key%' THEN 'Auth Error (401)'
WHEN error LIKE '%403%' OR error LIKE '%forbidden%' THEN 'Forbidden (403)'
WHEN error LIKE '%404%' OR error LIKE '%not found%' THEN 'Not Found (404)'
WHEN error LIKE '%timeout%' OR error LIKE '%ETIMEDOUT%' OR error LIKE '%ECONNREFUSED%' THEN 'Timeout/Connection'
WHEN error LIKE '%500%' OR error LIKE '%internal server%' THEN 'Server Error (500)'
WHEN error LIKE '%503%' OR error LIKE '%unavailable%' THEN 'Unavailable (503)'
ELSE 'Other'
END as error_category,
COUNT(*) as count
FROM requests
WHERE status = 'error' AND created_at >= ${since}
GROUP BY platform, error_category
ORDER BY count DESC
`).all() as any[];
// Also get totals by category
const byCategory = db.prepare(`
SELECT
CASE
WHEN error LIKE '%429%' OR error LIKE '%rate limit%' OR error LIKE '%too many%' OR error LIKE '%quota%' THEN 'Rate Limited (429)'
WHEN error LIKE '%401%' OR error LIKE '%unauthorized%' OR error LIKE '%invalid%key%' THEN 'Auth Error (401)'
WHEN error LIKE '%403%' OR error LIKE '%forbidden%' THEN 'Forbidden (403)'
WHEN error LIKE '%404%' OR error LIKE '%not found%' THEN 'Not Found (404)'
WHEN error LIKE '%timeout%' OR error LIKE '%ETIMEDOUT%' OR error LIKE '%ECONNREFUSED%' THEN 'Timeout/Connection'
WHEN error LIKE '%500%' OR error LIKE '%internal server%' THEN 'Server Error (500)'
WHEN error LIKE '%503%' OR error LIKE '%unavailable%' THEN 'Unavailable (503)'
ELSE 'Other'
END as category,
COUNT(*) as count
FROM requests
WHERE status = 'error' AND created_at >= ${since}
GROUP BY category
ORDER BY count DESC
`).all() as any[];
// Errors by platform
const byPlatform = db.prepare(`
SELECT platform, COUNT(*) as count
FROM requests
WHERE status = 'error' AND created_at >= ${since}
GROUP BY platform
ORDER BY count DESC
`).all() as any[];
res.json({
byCategory,
byPlatform,
detailed: rows,
});
});
// Recent errors
analyticsRouter.get('/errors', (req: Request, res: Response) => {
const range = (req.query.range as string) ?? '7d';
const since = getTimeFilter(range);
const db = getDb();
const rows = db.prepare(`
SELECT id, platform, model_id, error, latency_ms, created_at
FROM requests
WHERE status = 'error' AND created_at >= ${since}
ORDER BY created_at DESC
LIMIT 50
`).all() as any[];
res.json(rows.map(r => ({
id: r.id,
platform: r.platform,
modelId: r.model_id,
error: r.error,
latencyMs: r.latency_ms,
createdAt: r.created_at,
})));
});
import { Router } from 'express';
import type { Request, Response } from 'express';
import { z } from 'zod';
import { getDb } from '../db/index.js';
import { getAllPenalties } from '../services/router.js';
export const fallbackRouter = Router();
// Get fallback chain (with dynamic penalties)
fallbackRouter.get('/', (_req: Request, res: Response) => {
const db = getDb();
const rows = db.prepare(`
SELECT fc.model_db_id, fc.priority, fc.enabled,
m.platform, m.model_id, m.display_name, m.intelligence_rank,
m.speed_rank, m.size_label, m.rpm_limit, m.rpd_limit,
m.monthly_token_budget
FROM fallback_config fc
JOIN models m ON m.id = fc.model_db_id
ORDER BY fc.priority ASC
`).all() as any[];
// Count enabled keys per platform
const keyCounts = db.prepare(`
SELECT platform, COUNT(*) as count
FROM api_keys WHERE enabled = 1
GROUP BY platform
`).all() as { platform: string; count: number }[];
const keyCountMap = new Map(keyCounts.map(k => [k.platform, k.count]));
// Get current dynamic penalties
const penalties = getAllPenalties();
const penaltyMap = new Map(penalties.map(p => [p.modelDbId, p]));
res.json(rows.map(r => {
const penalty = penaltyMap.get(r.model_db_id);
return {
modelDbId: r.model_db_id,
priority: r.priority,
effectivePriority: r.priority + (penalty?.penalty ?? 0),
penalty: penalty?.penalty ?? 0,
rateLimitHits: penalty?.count ?? 0,
enabled: r.enabled === 1,
platform: r.platform,
modelId: r.model_id,
displayName: r.display_name,
intelligenceRank: r.intelligence_rank,
speedRank: r.speed_rank,
sizeLabel: r.size_label,
rpmLimit: r.rpm_limit,
rpdLimit: r.rpd_limit,
monthlyTokenBudget: r.monthly_token_budget,
keyCount: keyCountMap.get(r.platform) ?? 0,
};
}));
});
const updateSchema = z.array(z.object({
modelDbId: z.number(),
priority: z.number(),
enabled: z.boolean(),
}));
// Update fallback chain (full replace)
fallbackRouter.put('/', (req: Request, res: Response) => {
const parsed = updateSchema.safeParse(req.body);
if (!parsed.success) {
res.status(400).json({ error: { message: parsed.error.errors.map(e => e.message).join(', ') } });
return;
}
const db = getDb();
const update = db.prepare(`
UPDATE fallback_config SET priority = ?, enabled = ? WHERE model_db_id = ?
`);
const updateAll = db.transaction(() => {
for (const entry of parsed.data) {
update.run(entry.priority, entry.enabled ? 1 : 0, entry.modelDbId);
}
});
updateAll();
res.json({ success: true });
});
// Sort presets
fallbackRouter.post('/sort/:preset', (req: Request, res: Response) => {
const { preset } = req.params;
const db = getDb();
let orderBy: string;
switch (preset) {
case 'intelligence':
orderBy = 'm.intelligence_rank ASC';
break;
case 'speed':
orderBy = 'm.speed_rank ASC';
break;
case 'budget':
orderBy = "CASE m.monthly_token_budget WHEN '~120M' THEN 1 WHEN '~50-100M' THEN 2 WHEN '~30M' THEN 3 WHEN '~18-45M' THEN 4 WHEN '~18M' THEN 5 WHEN '~15M' THEN 6 WHEN '~12M' THEN 7 WHEN '~6M' THEN 8 WHEN '~5-10M' THEN 9 WHEN '~4M' THEN 10 ELSE 11 END ASC";
break;
default:
res.status(400).json({ error: { message: `Unknown preset: ${preset}. Use: intelligence, speed, budget` } });
return;
}
const models = db.prepare(`
SELECT m.id FROM models m ORDER BY ${orderBy}
`).all() as { id: number }[];
const update = db.prepare('UPDATE fallback_config SET priority = ? WHERE model_db_id = ?');
const reorder = db.transaction(() => {
for (let i = 0; i < models.length; i++) {
update.run(i + 1, models[i].id);
}
});
reorder();
res.json({ success: true, preset });
});
// Token usage per model for the stacked bar
fallbackRouter.get('/token-usage', (_req: Request, res: Response) => {
const db = getDb();
// Get platforms that have enabled keys
const platforms = db.prepare(`
SELECT DISTINCT ak.platform
FROM api_keys ak
WHERE ak.enabled = 1
`).all() as { platform: string }[];
const platformSet = new Set(platforms.map(p => p.platform));
// Get monthly budget per model, ordered by fallback priority
const models = db.prepare(`
SELECT m.platform, m.model_id, m.display_name, m.monthly_token_budget,
fc.priority
FROM models m
JOIN fallback_config fc ON fc.model_db_id = m.id
WHERE m.enabled = 1
ORDER BY fc.priority ASC
`).all() as { platform: string; model_id: string; display_name: string; monthly_token_budget: string; priority: number }[];
function parseBudget(s: string): number {
const m = s.match(/~?([\d.]+)(?:-([\d.]+))?([MK])?/);
if (!m) return 0;
const high = parseFloat(m[2] ?? m[1]);
const unit = m[3] === 'M' ? 1_000_000 : m[3] === 'K' ? 1_000 : 1;
return high * unit;
}
// Build per-model breakdown (only platforms with keys)
const modelBudgets = models
.filter(m => platformSet.has(m.platform))
.map(m => ({
displayName: m.display_name,
platform: m.platform,
budget: parseBudget(m.monthly_token_budget),
}));
const totalBudget = modelBudgets.reduce((s, m) => s + m.budget, 0);
// Tokens used this month
const usage = db.prepare(`
SELECT
COALESCE(SUM(input_tokens + output_tokens), 0) as total_used
FROM requests
WHERE created_at >= datetime('now', 'start of month')
`).get() as { total_used: number };
res.json({
totalBudget,
totalUsed: usage.total_used,
models: modelBudgets,
});
});
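The parseBudget helper in the route above resolves budget labels like "~18-45M" to the high end of the range, in absolute tokens. Restated standalone so its behavior is easy to check:

```typescript
// Restatement of parseBudget: "~18-45M" -> 45,000,000 (high end of range).
function parseBudgetSketch(s: string): number {
  const m = s.match(/~?([\d.]+)(?:-([\d.]+))?([MK])?/);
  if (!m) return 0; // unparseable labels contribute nothing to the budget
  const high = parseFloat(m[2] ?? m[1]); // prefer the upper bound of a range
  const unit = m[3] === 'M' ? 1_000_000 : m[3] === 'K' ? 1_000 : 1;
  return high * unit;
}

const single = parseBudgetSketch('~120M'); // 120,000,000
const range = parseBudgetSketch('~18-45M'); // 45,000,000
```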
import { Router } from 'express';
import type { Request, Response } from 'express';
import { getDb } from '../db/index.js';
import { checkKeyHealth, checkAllKeys } from '../services/health.js';
import { hasProvider } from '../providers/index.js';
export const healthRouter = Router();
// Get health status for all platforms
healthRouter.get('/', (_req: Request, res: Response) => {
const db = getDb();
const platforms = db.prepare(`
SELECT
platform,
COUNT(*) as total_keys,
SUM(CASE WHEN status = 'healthy' THEN 1 ELSE 0 END) as healthy_keys,
SUM(CASE WHEN status = 'rate_limited' THEN 1 ELSE 0 END) as rate_limited_keys,
SUM(CASE WHEN status = 'invalid' THEN 1 ELSE 0 END) as invalid_keys,
SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) as error_keys,
SUM(CASE WHEN status = 'unknown' THEN 1 ELSE 0 END) as unknown_keys,
SUM(CASE WHEN enabled = 1 THEN 1 ELSE 0 END) as enabled_keys
FROM api_keys
GROUP BY platform
`).all() as any[];
const keys = db.prepare(`
SELECT id, platform, label, status, enabled, created_at, last_checked_at
FROM api_keys
ORDER BY platform, created_at DESC
`).all() as any[];
res.json({
platforms: platforms.map(p => ({
platform: p.platform,
hasProvider: hasProvider(p.platform),
totalKeys: p.total_keys,
healthyKeys: p.healthy_keys,
rateLimitedKeys: p.rate_limited_keys,
invalidKeys: p.invalid_keys,
errorKeys: p.error_keys,
unknownKeys: p.unknown_keys,
enabledKeys: p.enabled_keys,
})),
keys: keys.map(k => ({
id: k.id,
platform: k.platform,
label: k.label,
status: k.status,
enabled: k.enabled === 1,
createdAt: k.created_at,
lastCheckedAt: k.last_checked_at,
})),
});
});
// Check a specific key
healthRouter.post('/check/:keyId', async (req: Request, res: Response) => {
const keyId = parseInt(req.params.keyId as string, 10);
if (isNaN(keyId)) {
res.status(400).json({ error: { message: 'Invalid key ID' } });
return;
}
const status = await checkKeyHealth(keyId);
res.json({ keyId, status });
});
// Check all keys
healthRouter.post('/check-all', async (_req: Request, res: Response) => {
await checkAllKeys();
res.json({ success: true });
});
import { Router } from 'express';
import type { Request, Response } from 'express';
import { z } from 'zod';
import { getDb } from '../db/index.js';
import { encrypt, decrypt, maskKey } from '../lib/crypto.js';
export const keysRouter = Router();
const PLATFORMS = [
'google', 'groq', 'cerebras', 'sambanova', 'nvidia', 'mistral',
'openrouter', 'github', 'huggingface', 'cohere', 'cloudflare',
'zhipu', 'moonshot', 'minimax',
] as const;
const addKeySchema = z.object({
platform: z.enum(PLATFORMS),
key: z.string().min(1),
label: z.string().optional(),
});
// List all keys (masked)
keysRouter.get('/', (_req: Request, res: Response) => {
const db = getDb();
const rows = db.prepare('SELECT * FROM api_keys ORDER BY created_at DESC').all() as any[];
const keys = rows.map(row => {
let maskedKey = '****';
try {
const realKey = decrypt(row.encrypted_key, row.iv, row.auth_tag);
maskedKey = maskKey(realKey);
} catch {
maskedKey = '[decrypt failed]';
}
return {
id: row.id,
platform: row.platform,
label: row.label,
maskedKey,
status: row.status,
enabled: row.enabled === 1,
createdAt: row.created_at,
lastCheckedAt: row.last_checked_at,
};
});
res.json(keys);
});
// Add a key
keysRouter.post('/', (req: Request, res: Response) => {
const parsed = addKeySchema.safeParse(req.body);
if (!parsed.success) {
res.status(400).json({ error: { message: parsed.error.errors.map(e => e.message).join(', ') } });
return;
}
const { platform, key, label } = parsed.data;
const { encrypted, iv, authTag } = encrypt(key);
const db = getDb();
const result = db.prepare(`
INSERT INTO api_keys (platform, label, encrypted_key, iv, auth_tag, status, enabled)
VALUES (?, ?, ?, ?, ?, 'unknown', 1)
`).run(platform, label ?? '', encrypted, iv, authTag);
res.status(201).json({
id: result.lastInsertRowid,
platform,
label: label ?? '',
maskedKey: maskKey(key),
status: 'unknown',
enabled: true,
});
});
// Delete a key
keysRouter.delete('/:id', (req: Request, res: Response) => {
const id = parseInt(req.params.id as string, 10);
if (isNaN(id)) {
res.status(400).json({ error: { message: 'Invalid key ID' } });
return;
}
const db = getDb();
const result = db.prepare('DELETE FROM api_keys WHERE id = ?').run(id);
if (result.changes === 0) {
res.status(404).json({ error: { message: 'Key not found' } });
return;
}
res.json({ success: true });
});
// Toggle enable/disable
keysRouter.patch('/:id', (req: Request, res: Response) => {
const id = parseInt(req.params.id as string, 10);
if (isNaN(id)) {
res.status(400).json({ error: { message: 'Invalid key ID' } });
return;
}
const { enabled } = req.body;
if (typeof enabled !== 'boolean') {
res.status(400).json({ error: { message: 'enabled must be a boolean' } });
return;
}
const db = getDb();
const result = db.prepare('UPDATE api_keys SET enabled = ? WHERE id = ?').run(enabled ? 1 : 0, id);
if (result.changes === 0) {
res.status(404).json({ error: { message: 'Key not found' } });
return;
}
res.json({ success: true, enabled });
});
import { Router } from 'express';
import type { Request, Response } from 'express';
import { getDb } from '../db/index.js';
import { hasProvider } from '../providers/index.js';
export const modelsRouter = Router();
// List all models with availability info
modelsRouter.get('/', (_req: Request, res: Response) => {
const db = getDb();
const models = db.prepare(`
SELECT m.*, fc.priority, fc.enabled as fallback_enabled
FROM models m
LEFT JOIN fallback_config fc ON fc.model_db_id = m.id
ORDER BY COALESCE(fc.priority, m.intelligence_rank) ASC
`).all() as any[];
// Count keys per platform
const keyCounts = db.prepare(`
SELECT platform, COUNT(*) as count
FROM api_keys
WHERE enabled = 1
GROUP BY platform
`).all() as { platform: string; count: number }[];
const keyCountMap = new Map(keyCounts.map(k => [k.platform, k.count]));
const result = models.map(m => ({
id: m.id,
platform: m.platform,
modelId: m.model_id,
displayName: m.display_name,
intelligenceRank: m.intelligence_rank,
speedRank: m.speed_rank,
sizeLabel: m.size_label,
rpmLimit: m.rpm_limit,
rpdLimit: m.rpd_limit,
tpmLimit: m.tpm_limit,
tpdLimit: m.tpd_limit,
monthlyTokenBudget: m.monthly_token_budget,
contextWindow: m.context_window,
enabled: m.enabled === 1,
priority: m.priority,
fallbackEnabled: m.fallback_enabled === 1,
hasProvider: hasProvider(m.platform),
keyCount: keyCountMap.get(m.platform) ?? 0,
}));
res.json(result);
});
import { Router } from 'express';
import type { Request, Response } from 'express';
import { z } from 'zod';
import { routeRequest, recordRateLimitHit, recordSuccess, type RouteResult } from '../services/router.js';
import { recordRequest, recordTokens, setCooldown } from '../services/ratelimit.js';
import { getDb, getUnifiedApiKey } from '../db/index.js';
export const proxyRouter = Router();
// Sticky sessions: track which model served each "session".
// Key: prefix of the first user message → model_db_id.
// Keeping one model per conversation avoids mid-conversation model switches,
// which tend to produce incoherent or contradictory replies.
const stickySessionMap = new Map<string, { modelDbId: number; lastUsed: number }>();
const STICKY_TTL_MS = 30 * 60 * 1000; // 30 min session TTL
function getSessionKey(messages: { role: string; content: string }[]): string {
// Use the first user message as session identifier
// Hermes sends the full conversation each time, so first user msg is stable
const firstUser = messages.find(m => m.role === 'user');
if (!firstUser) return '';
// Identifier: first 100 chars of first user message + a single/multi flag
return `${firstUser.content.slice(0, 100)}:${messages.length > 2 ? 'multi' : 'single'}`;
}
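The session-key scheme above can be sketched standalone (a re-implementation for illustration; the message contents are made up):

```typescript
// Illustrative re-implementation of the session-key scheme: a conversation is
// identified by a prefix of its first user message plus a single/multi flag.
type Msg = { role: string; content: string };

function sessionKey(messages: Msg[]): string {
  const firstUser = messages.find(m => m.role === 'user');
  if (!firstUser) return '';
  return `${firstUser.content.slice(0, 100)}:${messages.length > 2 ? 'multi' : 'single'}`;
}

const turn1: Msg[] = [{ role: 'user', content: 'Hello' }];
const turn2: Msg[] = [
  { role: 'user', content: 'Hello' },
  { role: 'assistant', content: 'Hi!' },
  { role: 'user', content: 'Tell me more' },
];
// turn1 → "Hello:single", turn2 → "Hello:multi"
```

Because the client resends the whole conversation each turn, both keys share the same prefix, so the follow-up turn maps back to the model that served the first one.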
function getStickyModel(messages: { role: string; content: string }[]): number | undefined {
// Only apply sticky for multi-turn (has assistant messages = continuation)
const hasAssistant = messages.some(m => m.role === 'assistant');
if (!hasAssistant) return undefined;
const key = getSessionKey(messages);
if (!key) return undefined;
const entry = stickySessionMap.get(key);
if (!entry) return undefined;
if (Date.now() - entry.lastUsed > STICKY_TTL_MS) {
stickySessionMap.delete(key);
return undefined;
}
return entry.modelDbId;
}
function setStickyModel(messages: { role: string; content: string }[], modelDbId: number) {
const key = getSessionKey(messages);
if (!key) return;
stickySessionMap.set(key, { modelDbId, lastUsed: Date.now() });
// Cleanup old entries
if (stickySessionMap.size > 500) {
const now = Date.now();
for (const [k, v] of stickySessionMap) {
if (now - v.lastUsed > STICKY_TTL_MS) stickySessionMap.delete(k);
}
}
}
// OpenAI-compatible /models endpoint (used by Hermes for metadata)
proxyRouter.get('/models', (_req: Request, res: Response) => {
const db = getDb();
const models = db.prepare('SELECT platform, model_id, display_name, context_window FROM models WHERE enabled = 1 ORDER BY intelligence_rank').all() as any[];
res.json({
object: 'list',
data: models.map(m => ({
id: m.model_id,
object: 'model',
created: 0,
owned_by: m.platform,
name: m.display_name,
context_window: m.context_window,
})),
});
});
const MAX_RETRIES = 20;
const chatCompletionSchema = z.object({
messages: z.array(z.object({
role: z.enum(['system', 'user', 'assistant']),
content: z.string(),
})).min(1),
model: z.string().optional(),
temperature: z.number().min(0).max(2).optional(),
max_tokens: z.number().int().positive().optional(),
top_p: z.number().min(0).max(1).optional(),
stream: z.boolean().optional(),
});
function isRetryableError(err: any): boolean {
const msg = (err.message ?? '').toLowerCase();
return msg.includes('429') || msg.includes('rate limit') || msg.includes('too many requests')
|| msg.includes('quota') || msg.includes('resource_exhausted')
|| msg.includes('aborted') || msg.includes('timeout') || msg.includes('etimedout')
|| msg.includes('econnrefused') || msg.includes('econnreset')
|| msg.includes('503') || msg.includes('unavailable')
|| msg.includes('500') || msg.includes('internal server error');
}
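The substring classification above can be exercised standalone (re-implemented here for illustration; the sample messages are made up):

```typescript
// Illustrative re-implementation of the retryable-error test: any message
// containing a rate-limit, timeout, or 5xx marker triggers fallback.
function retryable(message: string): boolean {
  const msg = message.toLowerCase();
  return ['429', 'rate limit', 'too many requests', 'quota', 'resource_exhausted',
    'aborted', 'timeout', 'etimedout', 'econnrefused', 'econnreset',
    '503', 'unavailable', '500', 'internal server error'].some(s => msg.includes(s));
}

retryable('HTTP 429 Too Many Requests'); // true  → skip key, set cooldown, try next
retryable('401 Unauthorized');           // false → fail fast with 502
```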
proxyRouter.post('/chat/completions', async (req: Request, res: Response) => {
const start = Date.now();
// Authenticate with the unified API key (auth is skipped only for local requests)
const authHeader = req.headers.authorization;
const isLocal = req.ip === '127.0.0.1' || req.ip === '::1' || req.ip === '::ffff:127.0.0.1';
if (!isLocal) {
const token = authHeader?.replace(/^Bearer\s+/i, '');
const unifiedKey = getUnifiedApiKey();
if (!token || token !== unifiedKey) {
res.status(401).json({
error: { message: 'Missing or invalid API key', type: 'authentication_error' },
});
return;
}
}
// Validate request
const parsed = chatCompletionSchema.safeParse(req.body);
if (!parsed.success) {
res.status(400).json({
error: {
message: `Invalid request: ${parsed.error.errors.map(e => e.message).join(', ')}`,
type: 'invalid_request_error',
},
});
return;
}
const { messages, temperature, max_tokens, top_p, stream } = parsed.data;
const estimatedInputTokens = messages.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0); // rough ~4 chars/token heuristic
const estimatedTotal = estimatedInputTokens + (max_tokens ?? 1000);
// Sticky session: prefer the same model for multi-turn conversations
const preferredModel = getStickyModel(messages);
// Retry loop: on 429/rate limit, skip that model+key and try the next one
const skipKeys = new Set<string>();
let lastError: any = null;
for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
let route: RouteResult;
try {
route = routeRequest(estimatedTotal, skipKeys.size > 0 ? skipKeys : undefined, preferredModel);
} catch (err: any) {
// No more models available
if (lastError) {
res.status(429).json({
error: {
message: `All models rate-limited. Last error: ${lastError.message}`,
type: 'rate_limit_error',
},
});
} else {
res.status(err.status ?? 503).json({
error: { message: err.message, type: 'routing_error' },
});
}
return;
}
recordRequest(route.platform, route.modelId, route.keyId);
try {
if (stream) {
// Streaming - can't retry once we start writing
res.setHeader('Content-Type', 'text/event-stream');
res.setHeader('Cache-Control', 'no-cache');
res.setHeader('Connection', 'keep-alive');
res.setHeader('X-Routed-Via', `${route.platform}/${route.modelId}`);
if (attempt > 0) res.setHeader('X-Fallback-Attempts', String(attempt));
let totalOutputTokens = 0;
const gen = route.provider.streamChatCompletion(
route.apiKey, messages, route.modelId,
{ temperature, max_tokens, top_p },
);
for await (const chunk of gen) {
const text = chunk.choices[0]?.delta?.content ?? '';
totalOutputTokens += Math.ceil(text.length / 4);
res.write(`data: ${JSON.stringify(chunk)}\n\n`);
}
res.write('data: [DONE]\n\n');
res.end();
recordTokens(route.platform, route.modelId, route.keyId, estimatedInputTokens + totalOutputTokens);
recordSuccess(route.modelDbId);
setStickyModel(messages, route.modelDbId);
logRequest(route.platform, route.modelId, 'success', estimatedInputTokens, totalOutputTokens, Date.now() - start, null);
return;
} else {
const result = await route.provider.chatCompletion(
route.apiKey, messages, route.modelId,
{ temperature, max_tokens, top_p },
);
const totalTokens = result.usage?.total_tokens ?? 0;
recordTokens(route.platform, route.modelId, route.keyId, totalTokens);
recordSuccess(route.modelDbId);
setStickyModel(messages, route.modelDbId);
res.setHeader('X-Routed-Via', `${route.platform}/${route.modelId}`);
if (attempt > 0) res.setHeader('X-Fallback-Attempts', String(attempt));
res.json(result);
logRequest(
route.platform, route.modelId, 'success',
result.usage?.prompt_tokens ?? 0,
result.usage?.completion_tokens ?? 0,
Date.now() - start, null,
);
return;
}
} catch (err: any) {
const latency = Date.now() - start;
logRequest(route.platform, route.modelId, 'error', estimatedInputTokens, 0, latency, err.message);
if (isRetryableError(err) && !res.headersSent) {
// Put this model+key on cooldown and try the next one
const skipId = `${route.platform}:${route.modelId}:${route.keyId}`;
skipKeys.add(skipId);
setCooldown(route.platform, route.modelId, route.keyId, 120_000);
recordRateLimitHit(route.modelDbId);
lastError = err;
console.log(`[Proxy] ${String(err.message ?? err).slice(0, 60)} from ${route.displayName}, falling back (attempt ${attempt + 1}/${MAX_RETRIES})`);
continue;
}
// Non-retryable error (auth, 4xx, etc.) or a stream that already started: don't retry
if (res.headersSent) {
// SSE headers/chunks are already on the wire; the most we can do is close the stream
res.end();
return;
}
res.status(502).json({
error: {
message: `Provider error (${route.displayName}): ${err.message}`,
type: 'provider_error',
},
});
return;
}
}
// Exhausted all retries
res.status(429).json({
error: {
message: `All models rate-limited after ${MAX_RETRIES} attempts. Last: ${lastError?.message}`,
type: 'rate_limit_error',
},
});
});
function logRequest(
platform: string,
modelId: string,
status: string,
inputTokens: number,
outputTokens: number,
latencyMs: number,
error: string | null,
) {
try {
const db = getDb();
db.prepare(`
INSERT INTO requests (platform, model_id, status, input_tokens, output_tokens, latency_ms, error)
VALUES (?, ?, ?, ?, ?, ?, ?)
`).run(platform, modelId, status, inputTokens, outputTokens, latencyMs, error);
} catch (e) {
console.error('Failed to log request:', e);
}
}
import { Router } from 'express';
import type { Request, Response } from 'express';
import { getUnifiedApiKey, regenerateUnifiedKey } from '../db/index.js';
export const settingsRouter = Router();
// Get the unified API key
settingsRouter.get('/api-key', (_req: Request, res: Response) => {
res.json({ apiKey: getUnifiedApiKey() });
});
// Regenerate the unified API key
settingsRouter.post('/api-key/regenerate', (_req: Request, res: Response) => {
const newKey = regenerateUnifiedKey();
res.json({ apiKey: newKey });
});
/**
* Probe every enabled model with a minimal request to find broken model IDs.
* Usage: npx tsx src/scripts/test-all-models.ts
*/
import { initDb, getDb } from '../db/index.js';
import { decrypt } from '../lib/crypto.js';
import { getProvider } from '../providers/index.js';
initDb();
const db = getDb();
interface Row {
id: number;
platform: string;
model_id: string;
display_name: string;
}
interface Key {
encrypted_key: string;
iv: string;
auth_tag: string;
}
const models = db.prepare(`
SELECT m.id, m.platform, m.model_id, m.display_name
FROM models m
WHERE m.enabled = 1
AND EXISTS (SELECT 1 FROM api_keys k WHERE k.platform = m.platform AND k.enabled = 1)
ORDER BY m.intelligence_rank, m.platform
`).all() as Row[];
const keyStmt = db.prepare(`
SELECT encrypted_key, iv, auth_tag FROM api_keys
WHERE platform = ? AND enabled = 1 ORDER BY id LIMIT 1
`);
const results: { row: Row; ok: boolean; ms: number; error?: string; reply?: string }[] = [];
for (const row of models) {
const keyRow = keyStmt.get(row.platform) as Key | undefined;
if (!keyRow) { results.push({ row, ok: false, ms: 0, error: 'no key' }); continue; }
const apiKey = decrypt(keyRow.encrypted_key, keyRow.iv, keyRow.auth_tag);
const provider = getProvider(row.platform as any);
if (!provider) { results.push({ row, ok: false, ms: 0, error: 'no provider' }); continue; }
const start = Date.now();
try {
const res = await provider.chatCompletion(apiKey, [{ role: 'user', content: 'hi' }], row.model_id, { max_tokens: 5 });
const reply = res.choices?.[0]?.message?.content?.slice(0, 40) ?? '';
results.push({ row, ok: true, ms: Date.now() - start, reply });
} catch (err: any) {
results.push({ row, ok: false, ms: Date.now() - start, error: String(err?.message ?? err).slice(0, 200) });
}
}
console.log('\n=== Results ===\n');
const pad = (s: string, n: number) => s.length > n ? s.slice(0, n - 1) + '…' : s.padEnd(n);
for (const r of results) {
const status = r.ok ? '✓' : '✗';
console.log(`${status} ${pad(r.row.platform, 12)} ${pad(r.row.model_id, 52)} ${String(r.ms).padStart(5)}ms ${r.ok ? `"${r.reply}"` : r.error}`);
}
const okCount = results.filter(r => r.ok).length;
console.log(`\n${okCount}/${results.length} models working\n`);
process.exit(0);
import { getDb } from '../db/index.js';
import { getProvider } from '../providers/index.js';
import { decrypt } from '../lib/crypto.js';
import type { Platform, KeyStatus } from '@freellmapi/shared/types.js';
const CHECK_INTERVAL_MS = 5 * 60 * 1000; // 5 minutes
const CONSECUTIVE_FAILURES_TO_DISABLE = 3;
// Track consecutive failures per key
const failureCount = new Map<number, number>();
export async function checkKeyHealth(keyId: number): Promise<KeyStatus> {
const db = getDb();
const row = db.prepare('SELECT * FROM api_keys WHERE id = ?').get(keyId) as any;
if (!row) return 'error';
const provider = getProvider(row.platform as Platform);
if (!provider) return 'error';
try {
const apiKey = decrypt(row.encrypted_key, row.iv, row.auth_tag);
const isValid = await provider.validateKey(apiKey);
const status: KeyStatus = isValid ? 'healthy' : 'invalid';
db.prepare("UPDATE api_keys SET status = ?, last_checked_at = datetime('now') WHERE id = ?")
.run(status, keyId);
if (isValid) {
failureCount.delete(keyId);
} else {
const count = (failureCount.get(keyId) ?? 0) + 1;
failureCount.set(keyId, count);
if (count >= CONSECUTIVE_FAILURES_TO_DISABLE) {
db.prepare('UPDATE api_keys SET enabled = 0 WHERE id = ?').run(keyId);
console.log(`[Health] Auto-disabled key ${keyId} after ${count} consecutive failures`);
}
}
return status;
} catch (err: any) {
console.error(`[Health] Key ${keyId} check error:`, err.message);
db.prepare("UPDATE api_keys SET status = ?, last_checked_at = datetime('now') WHERE id = ?")
.run('error', keyId);
const count = (failureCount.get(keyId) ?? 0) + 1;
failureCount.set(keyId, count);
if (count >= CONSECUTIVE_FAILURES_TO_DISABLE) {
db.prepare('UPDATE api_keys SET enabled = 0 WHERE id = ?').run(keyId);
}
return 'error';
}
}
export async function checkAllKeys(): Promise<void> {
const db = getDb();
const keys = db.prepare('SELECT id, platform FROM api_keys WHERE enabled = 1').all() as { id: number; platform: string }[];
console.log(`[Health] Checking ${keys.length} keys...`);
for (const key of keys) {
await checkKeyHealth(key.id);
}
console.log(`[Health] Check complete.`);
}
let intervalId: ReturnType<typeof setInterval> | null = null;
export function startHealthChecker(): void {
if (intervalId) return;
console.log(`[Health] Starting health checker (every ${CHECK_INTERVAL_MS / 1000}s)`);
intervalId = setInterval(() => {
checkAllKeys().catch(err => console.error('[Health] Check failed:', err));
}, CHECK_INTERVAL_MS);
}
export function stopHealthChecker(): void {
if (intervalId) {
clearInterval(intervalId);
intervalId = null;
}
}
// In-memory sliding window rate limit tracker
interface Window {
timestamps: number[];
tokenTimestamps: { ts: number; tokens: number }[];
}
// Key format: "platform:modelId:keyId:type" where type is rpm|rpd|tpm|tpd
const windows = new Map<string, Window>();
function getWindow(key: string): Window {
let w = windows.get(key);
if (!w) {
w = { timestamps: [], tokenTimestamps: [] };
windows.set(key, w);
}
return w;
}
function pruneTimestamps(timestamps: number[], windowMs: number, now: number): number[] {
const cutoff = now - windowMs;
return timestamps.filter(ts => ts > cutoff);
}
const MINUTE = 60 * 1000;
const DAY = 24 * 60 * MINUTE;
export function canMakeRequest(
platform: string,
modelId: string,
keyId: number,
limits: { rpm: number | null; rpd: number | null; tpm: number | null; tpd: number | null },
): boolean {
const now = Date.now();
if (limits.rpm !== null) {
const key = `${platform}:${modelId}:${keyId}:rpm`;
const w = getWindow(key);
w.timestamps = pruneTimestamps(w.timestamps, MINUTE, now);
if (w.timestamps.length >= limits.rpm) return false;
}
if (limits.rpd !== null) {
const key = `${platform}:${modelId}:${keyId}:rpd`;
const w = getWindow(key);
w.timestamps = pruneTimestamps(w.timestamps, DAY, now);
if (w.timestamps.length >= limits.rpd) return false;
}
return true;
}
export function canUseTokens(
platform: string,
modelId: string,
keyId: number,
estimatedTokens: number,
limits: { tpm: number | null; tpd: number | null },
): boolean {
const now = Date.now();
if (limits.tpm !== null) {
const key = `${platform}:${modelId}:${keyId}:tpm`;
const w = getWindow(key);
w.tokenTimestamps = w.tokenTimestamps.filter(t => t.ts > now - MINUTE);
const used = w.tokenTimestamps.reduce((sum, t) => sum + t.tokens, 0);
if (used + estimatedTokens > limits.tpm) return false;
}
if (limits.tpd !== null) {
const key = `${platform}:${modelId}:${keyId}:tpd`;
const w = getWindow(key);
w.tokenTimestamps = w.tokenTimestamps.filter(t => t.ts > now - DAY);
const used = w.tokenTimestamps.reduce((sum, t) => sum + t.tokens, 0);
if (used + estimatedTokens > limits.tpd) return false;
}
return true;
}
export function recordRequest(platform: string, modelId: string, keyId: number) {
const now = Date.now();
const rpmKey = `${platform}:${modelId}:${keyId}:rpm`;
getWindow(rpmKey).timestamps.push(now);
const rpdKey = `${platform}:${modelId}:${keyId}:rpd`;
getWindow(rpdKey).timestamps.push(now);
}
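The rpm/rpd tracking above boils down to a sliding-window counter: prune entries older than the window, refuse if the remainder is at the limit, record otherwise. A minimal standalone sketch (the limit and window values are arbitrary, and the clock is passed in explicitly for determinism):

```typescript
// Minimal sliding-window limiter mirroring the rpm logic above.
function makeLimiter(limit: number, windowMs: number) {
  let timestamps: number[] = [];
  return {
    tryAcquire(now: number): boolean {
      timestamps = timestamps.filter(ts => ts > now - windowMs); // prune expired
      if (timestamps.length >= limit) return false;              // window full
      timestamps.push(now);                                      // record hit
      return true;
    },
  };
}

const limiter = makeLimiter(2, 60_000); // 2 requests per minute
const t0 = 1_000_000;                   // arbitrary fixed clock
const r1 = limiter.tryAcquire(t0);          // true  (1/2 used)
const r2 = limiter.tryAcquire(t0 + 1);      // true  (2/2 used)
const r3 = limiter.tryAcquire(t0 + 2);      // false (window full)
const r4 = limiter.tryAcquire(t0 + 61_000); // true  (old entries pruned)
```

The production code keeps check and record as separate calls (`canMakeRequest` / `recordRequest`) so routing can test several candidates before committing to one.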
export function recordTokens(
platform: string,
modelId: string,
keyId: number,
tokens: number,
) {
const now = Date.now();
const tpmKey = `${platform}:${modelId}:${keyId}:tpm`;
getWindow(tpmKey).tokenTimestamps.push({ ts: now, tokens });
const tpdKey = `${platform}:${modelId}:${keyId}:tpd`;
getWindow(tpdKey).tokenTimestamps.push({ ts: now, tokens });
}
// Cooldown: when a provider returns 429, block that model+key for a period
const cooldowns = new Map<string, number>(); // key -> expiry timestamp
export function setCooldown(platform: string, modelId: string, keyId: number, durationMs = 60_000) {
const key = `${platform}:${modelId}:${keyId}:cooldown`;
cooldowns.set(key, Date.now() + durationMs);
}
export function isOnCooldown(platform: string, modelId: string, keyId: number): boolean {
const key = `${platform}:${modelId}:${keyId}:cooldown`;
const expiry = cooldowns.get(key);
if (!expiry) return false;
if (Date.now() > expiry) {
cooldowns.delete(key);
return false;
}
return true;
}
export function getRateLimitStatus(
platform: string,
modelId: string,
keyId: number,
limits: { rpm: number | null; rpd: number | null; tpm: number | null; tpd: number | null },
) {
const now = Date.now();
const rpmW = getWindow(`${platform}:${modelId}:${keyId}:rpm`);
rpmW.timestamps = pruneTimestamps(rpmW.timestamps, MINUTE, now);
const rpdW = getWindow(`${platform}:${modelId}:${keyId}:rpd`);
rpdW.timestamps = pruneTimestamps(rpdW.timestamps, DAY, now);
const tpmW = getWindow(`${platform}:${modelId}:${keyId}:tpm`);
tpmW.tokenTimestamps = tpmW.tokenTimestamps.filter(t => t.ts > now - MINUTE);
const tpmUsed = tpmW.tokenTimestamps.reduce((sum, t) => sum + t.tokens, 0);
return {
rpm: { used: rpmW.timestamps.length, limit: limits.rpm },
rpd: { used: rpdW.timestamps.length, limit: limits.rpd },
tpm: { used: tpmUsed, limit: limits.tpm },
};
}
import { getDb } from '../db/index.js';
import { getProvider } from '../providers/index.js';
import { decrypt } from '../lib/crypto.js';
import { canMakeRequest, canUseTokens, isOnCooldown } from './ratelimit.js';
import type { BaseProvider } from '../providers/base.js';
interface ModelRow {
id: number;
platform: string;
model_id: string;
display_name: string;
rpm_limit: number | null;
rpd_limit: number | null;
tpm_limit: number | null;
tpd_limit: number | null;
}
interface KeyRow {
id: number;
platform: string;
encrypted_key: string;
iv: string;
auth_tag: string;
status: string;
enabled: number;
}
interface FallbackRow {
model_db_id: number;
priority: number;
enabled: number;
}
export interface RouteResult {
provider: BaseProvider;
modelId: string;
modelDbId: number;
apiKey: string;
keyId: number;
platform: string;
displayName: string;
}
// Round-robin index per platform
const roundRobinIndex = new Map<string, number>();
// ── Dynamic priority: track 429s per model and demote accordingly ──
// Key: model_db_id → { count, lastHit, penalty }
const rateLimitPenalties = new Map<number, { count: number; lastHit: number; penalty: number }>();
// Penalty decays over time so models recover
const PENALTY_PER_429 = 3; // each 429 adds this many priority positions
const MAX_PENALTY = 10; // cap so a model doesn't sink forever
const DECAY_INTERVAL_MS = 2 * 60 * 1000; // penalty decays every 2 minutes
const DECAY_AMOUNT = 1; // remove this much penalty per decay interval
/**
* Record a 429 for a model — increases its penalty so it sinks in priority.
*/
export function recordRateLimitHit(modelDbId: number) {
const existing = rateLimitPenalties.get(modelDbId);
const now = Date.now();
if (existing) {
existing.count++;
existing.lastHit = now;
existing.penalty = Math.min(existing.penalty + PENALTY_PER_429, MAX_PENALTY);
} else {
rateLimitPenalties.set(modelDbId, { count: 1, lastHit: now, penalty: PENALTY_PER_429 });
}
}
/**
* Record a success for a model — reduces its penalty so it rises back up.
*/
export function recordSuccess(modelDbId: number) {
const existing = rateLimitPenalties.get(modelDbId);
if (existing) {
existing.penalty = Math.max(0, existing.penalty - 1);
if (existing.penalty === 0) {
rateLimitPenalties.delete(modelDbId);
}
}
}
/**
* Get the current penalty for a model (with time-based decay).
*/
function getPenalty(modelDbId: number): number {
const entry = rateLimitPenalties.get(modelDbId);
if (!entry) return 0;
// Apply time-based decay
const now = Date.now();
const elapsed = now - entry.lastHit;
const decaySteps = Math.floor(elapsed / DECAY_INTERVAL_MS);
if (decaySteps > 0) {
entry.penalty = Math.max(0, entry.penalty - (decaySteps * DECAY_AMOUNT));
entry.lastHit = now; // reset so we don't double-decay
if (entry.penalty === 0) {
rateLimitPenalties.delete(modelDbId);
return 0;
}
}
return entry.penalty;
}
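The penalty arithmetic works out as follows (a worked sketch using the constants defined above; the 429 sequence and base priority are hypothetical):

```typescript
// Worked example of the dynamic-priority penalty (constants copied from above).
const PENALTY_PER_429 = 3;
const MAX_PENALTY = 10;
const DECAY_INTERVAL_MS = 2 * 60 * 1000;
const DECAY_AMOUNT = 1;

let penalty = 0;
// two consecutive 429s:
penalty = Math.min(penalty + PENALTY_PER_429, MAX_PENALTY); // 3
penalty = Math.min(penalty + PENALTY_PER_429, MAX_PENALTY); // 6
// four minutes idle → two decay steps:
const decaySteps = Math.floor((4 * 60 * 1000) / DECAY_INTERVAL_MS); // 2
penalty = Math.max(0, penalty - decaySteps * DECAY_AMOUNT); // 4
// a model with base priority 2 now sorts at effective priority 2 + 4 = 6
```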
/**
* Get current penalties for all models (for the API/dashboard).
*/
export function getAllPenalties(): Array<{ modelDbId: number; count: number; penalty: number }> {
const result: Array<{ modelDbId: number; count: number; penalty: number }> = [];
for (const [modelDbId, entry] of rateLimitPenalties) {
const penalty = getPenalty(modelDbId);
if (penalty > 0) {
result.push({ modelDbId, count: entry.count, penalty });
}
}
return result.sort((a, b) => b.penalty - a.penalty);
}
/**
* Route a request to the best available model.
* Models are sorted by (base_priority + rate_limit_penalty) so frequently
* rate-limited models automatically sink below working ones.
*
* If preferredModelDbId is set, that model gets tried FIRST (sticky sessions).
* This prevents hallucination from model switching mid-conversation.
*
* @param estimatedTokens - estimated total tokens for rate limit check
* @param skipKeys - set of "platform:modelId:keyId" to skip (failed on this request)
* @param preferredModelDbId - try this model first (sticky session)
*/
export function routeRequest(estimatedTokens = 1000, skipKeys?: Set<string>, preferredModelDbId?: number): RouteResult {
const db = getDb();
// Get fallback chain ordered by priority
const fallbackChain = db.prepare(`
SELECT fc.model_db_id, fc.priority, fc.enabled
FROM fallback_config fc
ORDER BY fc.priority ASC
`).all() as FallbackRow[];
// Apply dynamic penalties: sort by (base priority + penalty)
const sortedChain = fallbackChain.map(entry => ({
...entry,
effectivePriority: entry.priority + getPenalty(entry.model_db_id),
})).sort((a, b) => a.effectivePriority - b.effectivePriority);
// Sticky session: move preferred model to front of chain
if (preferredModelDbId) {
const idx = sortedChain.findIndex(e => e.model_db_id === preferredModelDbId);
if (idx > 0) {
const [preferred] = sortedChain.splice(idx, 1);
sortedChain.unshift(preferred);
}
}
for (const entry of sortedChain) {
if (!entry.enabled) continue;
// Get model details
const model = db.prepare('SELECT * FROM models WHERE id = ? AND enabled = 1').get(entry.model_db_id) as ModelRow | undefined;
if (!model) continue;
// Check if we have a provider for this platform
const provider = getProvider(model.platform as any);
if (!provider) continue;
// Get all healthy, enabled keys for this platform
const keys = db.prepare(
'SELECT * FROM api_keys WHERE platform = ? AND enabled = 1 AND status != ?'
).all(model.platform, 'invalid') as KeyRow[];
if (keys.length === 0) continue;
// Round-robin across keys
const rrKey = `${model.platform}:${model.model_id}`;
let idx = roundRobinIndex.get(rrKey) ?? 0;
for (let attempt = 0; attempt < keys.length; attempt++) {
const key = keys[idx % keys.length];
idx++;
const skipId = `${model.platform}:${model.model_id}:${key.id}`;
if (skipKeys?.has(skipId)) continue;
// Check cooldown (from previous 429s)
if (isOnCooldown(model.platform, model.model_id, key.id)) continue;
const limits = {
rpm: model.rpm_limit,
rpd: model.rpd_limit,
tpm: model.tpm_limit,
tpd: model.tpd_limit,
};
if (!canMakeRequest(model.platform, model.model_id, key.id, limits)) continue;
if (!canUseTokens(model.platform, model.model_id, key.id, estimatedTokens, limits)) continue;
roundRobinIndex.set(rrKey, idx);
const decryptedKey = decrypt(key.encrypted_key, key.iv, key.auth_tag);
return {
provider,
modelId: model.model_id,
modelDbId: model.id,
apiKey: decryptedKey,
keyId: key.id,
platform: model.platform,
displayName: model.display_name,
};
}
roundRobinIndex.set(rrKey, idx);
}
const err = new Error('All models exhausted. Add more API keys or wait for rate limits to reset.') as any;
err.status = 429;
throw err;
}
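The key rotation inside `routeRequest` reduces to a persistent modulo index per platform+model; a standalone sketch (the key IDs are made up):

```typescript
// Round-robin sketch: a persistent index cycles across the available keys,
// mirroring the `roundRobinIndex` loop above.
const keyIds = [101, 102, 103];
let rrIdx = 0;
const picks: number[] = [];
for (let i = 0; i < 4; i++) {
  picks.push(keyIds[rrIdx % keyIds.length]);
  rrIdx++; // index persists across requests, so load spreads evenly
}
// picks → [101, 102, 103, 101]
```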
{
"compilerOptions": {
"target": "ES2022",
"module": "ES2022",
"moduleResolution": "bundler",
"lib": ["ES2022"],
"outDir": "./dist",
"rootDir": "./src",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true,
"declaration": true,
"declarationMap": true,
"sourceMap": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "dist", "src/__tests__"]
}
import { defineConfig } from 'vitest/config';
export default defineConfig({
test: {
globals: true,
environment: 'node',
include: ['src/__tests__/**/*.test.ts'],
},
});
{
"name": "@freellmapi/shared",
"version": "0.1.0",
"private": true,
"main": "./types.ts",
"types": "./types.ts"
}
// ---- Platform & Model Types ----
export type Platform =
| 'google'
| 'groq'
| 'cerebras'
| 'sambanova'
| 'nvidia'
| 'mistral'
| 'openrouter'
| 'github'
| 'huggingface'
| 'cohere'
| 'cloudflare'
| 'zhipu'
| 'moonshot'
| 'minimax';
export interface Model {
id: number;
platform: Platform;
modelId: string;
displayName: string;
intelligenceRank: number;
speedRank: number;
sizeLabel: string;
rpmLimit: number | null;
rpdLimit: number | null;
tpmLimit: number | null;
tpdLimit: number | null;
monthlyTokenBudget: string;
contextWindow: number | null;
enabled: boolean;
}
export type KeyStatus = 'healthy' | 'rate_limited' | 'invalid' | 'error' | 'unknown';
export interface ApiKey {
id: number;
platform: Platform;
label: string;
maskedKey: string;
status: KeyStatus;
enabled: boolean;
createdAt: string;
lastCheckedAt: string | null;
}
export interface ApiKeyCreate {
platform: Platform;
key: string;
label?: string;
}
// ---- Fallback Config ----
export interface FallbackEntry {
modelId: number;
platform: Platform;
displayName: string;
intelligenceRank: number;
speedRank: number;
priority: number;
enabled: boolean;
}
// ---- OpenAI-Compatible Types ----
export interface ChatMessage {
role: 'system' | 'user' | 'assistant';
content: string;
}
export interface ChatCompletionRequest {
model?: string;
messages: ChatMessage[];
temperature?: number;
max_tokens?: number;
stream?: boolean;
top_p?: number;
}
export interface ChatCompletionChoice {
index: number;
message: ChatMessage;
finish_reason: string | null;
}
export interface TokenUsage {
prompt_tokens: number;
completion_tokens: number;
total_tokens: number;
}
export interface ChatCompletionResponse {
id: string;
object: 'chat.completion';
created: number;
model: string;
choices: ChatCompletionChoice[];
usage: TokenUsage;
_routed_via?: {
platform: Platform;
model: string;
};
}
export interface ChatCompletionChunk {
id: string;
object: 'chat.completion.chunk';
created: number;
model: string;
choices: {
index: number;
delta: Partial<ChatMessage>;
finish_reason: string | null;
}[];
}
// ---- Analytics Types ----
export interface AnalyticsSummary {
totalRequests: number;
successRate: number;
totalInputTokens: number;
totalOutputTokens: number;
avgLatencyMs: number;
estimatedCostSavings: number;
}
export interface PlatformStats {
platform: Platform;
requests: number;
successRate: number;
avgLatencyMs: number;
totalInputTokens: number;
totalOutputTokens: number;
}
export interface TimelinePoint {
timestamp: string;
requests: number;
successCount: number;
failureCount: number;
}
export interface RequestLog {
id: number;
platform: Platform;
modelId: string;
status: 'success' | 'error';
inputTokens: number;
outputTokens: number;
latencyMs: number;
error: string | null;
createdAt: string;
}
// ---- Rate Limit Types ----
export interface RateLimitStatus {
platform: Platform;
modelId: string;
rpm: { used: number; limit: number | null };
rpd: { used: number; limit: number | null };
tpm: { used: number; limit: number | null };
available: boolean;
nextResetAt: string | null;
}