Your simple prompts are burning premium tokens
Every "write a test" or "fix this typo" burns gpt-5.2 credits. nadirclaw routes simple prompts to cheaper models automatically. Save on every call that doesn't need your most expensive model.
Two commands. Zero configuration.
Most prompts don't need your best model
Code formatting, basic questions, simple edits: they don't need gpt-5.2 at $1.75/$14 (input/output) per 1M tokens or opus-4.6 at $15/$75, but that's what you're paying for.
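The math is easy to check yourself. A minimal sketch using the input-token prices listed above; the 2,000-token prompt size is a hypothetical example, not a measured figure:

```python
# Input-token prices ($ per 1M tokens) as listed on this page.
INPUT_PRICE = {
    "gpt-5.2": 1.75,
    "opus-4.6": 15.00,
    "gpt-5-mini": 0.25,
    "haiku-4.5": 1.00,
}

def input_cost(model: str, tokens: int) -> float:
    """Dollar cost of the input side of one call."""
    return tokens / 1_000_000 * INPUT_PRICE[model]

# A 2,000-token "fix this typo" prompt (hypothetical size):
print(input_cost("gpt-5.2", 2000))     # 0.0035
print(input_cost("gpt-5-mini", 2000))  # 0.0005
```

Same prompt, same answer quality for a trivial task, 7x cheaper on the input side alone.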
You're blind to where money goes
API bills show totals, not breakdowns. You have no idea which prompts cost $0.001 and which cost $0.50.
Changing models breaks workflow
Switching between gpt-5.2 and gpt-5-mini in your editor kills momentum. So you just pay more.
How it works
Install once. Route forever.
Start the router
Run NadirClaw locally. It sits between your app and OpenAI's API. No cloud services, no signup, no tracking.
Point your tools to localhost
Change your base URL from api.openai.com to localhost:8000. Works with Claude Code, Cursor, Aider, or any OpenAI-compatible client.
Watch costs drop
NadirClaw classifies every prompt and routes it to the cheapest model that can handle it. Check the dashboard to see where you're actually spending.
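The routing idea can be sketched in a few lines. This is a toy heuristic for illustration only, not NadirClaw's actual classifier; the keyword list and length threshold are assumptions:

```python
# Toy sketch of complexity-based routing (NOT NadirClaw's real classifier):
# score the prompt, then pick the cheapest model that can handle it.
CHEAP, PREMIUM = "gpt-5-mini", "gpt-5.2"

def classify(prompt: str) -> str:
    """Toy heuristic: long prompts or 'hard' keywords count as complex."""
    hard_words = {"architecture", "prove", "refactor", "design", "debug"}
    words = prompt.lower().split()
    if len(words) > 200 or any(w in hard_words for w in words):
        return "complex"
    return "simple"

def route(prompt: str) -> str:
    """Return the cheapest model adequate for the prompt."""
    return PREMIUM if classify(prompt) == "complex" else CHEAP

print(route("fix this typo"))                        # gpt-5-mini
print(route("design a sharded cache architecture"))  # gpt-5.2
```

The real value is that this decision happens per request, automatically, instead of you hardcoding one model for everything.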
See exactly where your money goes
Full request logs, per-model costs, latency tracking. Know which features of your app are expensive before the bill arrives.
One line changes everything
Literally one URL swap. That's it.
Before:

```python
import openai

client = openai.OpenAI(
    base_url="https://api.openai.com/v1",
    api_key="sk-..."
)
```

After:

```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8000",
    api_key="sk-..."
)
```
Simple prompts now hit gpt-5-mini ($0.25/1M) or haiku-4.5 ($1/1M) instead of gpt-5.2 ($1.75/$14) or opus-4.6 ($15/$75). Run `nadirclaw report` to see your real breakdown.
Observability built in
Every request through nadirclaw is logged automatically. No SDK changes, no decorators, no instrumentation.
Cost per request
See exactly what each prompt costs. Break down spend by model, by task, by user. Find the $5 prompt hiding in your $200 bill.
Full request logs
Every prompt and response captured. Debug weird agent behavior by reading the actual conversation, not guessing.
Latency tracking
p50, p95, p99 per model. See which calls are slow. Spot timeouts before they become a problem.
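For reference, here is how those percentiles are typically computed from raw per-request latencies (nearest-rank method; the sample numbers below are made up):

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    xs = sorted(samples)
    k = math.ceil(p / 100 * len(xs)) - 1
    return xs[max(k, 0)]

# Hypothetical per-request latencies (ms) for one model:
latencies_ms = [120, 95, 110, 480, 130, 105, 2100, 115, 125, 100]
for p in (50, 95, 99):
    print(f"p{p}: {percentile(latencies_ms, p)} ms")
# p50: 115 ms, p95: 2100 ms, p99: 2100 ms
```

Note how the p95/p99 expose the 2100 ms outlier that an average would smooth over; that is the call you want to find before it becomes a timeout.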
Error rates and retries
How often are calls failing? Which models have the highest error rates? Are you retrying intelligently or burning money?
Classification breakdown
See what percentage of your traffic is simple vs complex. Understand your actual usage patterns, not assumptions.
Zero instrumentation
Other tools require decorators, SDK wrappers, or OpenTelemetry setup. nadirclaw logs everything at the proxy layer. Point your app at it and you're done.
How we compare
Verified against each product's public docs and pricing pages.
| Feature | nadirclaw | OpenRouter | Helicone | Portkey | vLLM Router | Langfuse | LangSmith |
|---|---|---|---|---|---|---|---|
| Auto Classification | ✓ Complexity classifier | — | — | — | — | — | — |
| Cost-Based Routing | ✓ Automatic | Manual model selection | Fallbacks only | ✓ Rule-based | Load balancing only | — | — |
| Observability | ✓ Built-in | Basic usage stats | ✓ Full suite | ✓ Full suite | — | ✓ Full suite | ✓ Full suite |
| Open Source | ✓ MIT | — | ✓ Apache 2.0 | Gateway only | ✓ Apache 2.0 | ✓ MIT | — |
| Pricing | Free | Per-token markup | Free 10K req, Pro $79/mo | Free 10K req, Pro undisclosed | Free | Free hobby, Pro $59/mo | Free 5K traces, Plus $39/seat |
nadirclaw's differentiator: automatic prompt complexity classification. Other routers need manual rules. We analyze the prompt and route it for you.
Get started now
Two commands. Seriously.