Sample audit output
What a 48h LLM Cost & Routing Audit looks like.
This sample uses a coding-agent workflow, but the format also works for RAG, chat, support, batch jobs and agent products. The real audit uses your providers, models, usage and failure examples.
Route map sample
Where cost leaks usually hide
The point is not to find one magic cheap model. It is to stop sending every workflow step through the same route.
| Workflow step | Current route | Likely leak | Route to test |
| --- | --- | --- | --- |
| Repo scan / context loading | Premium model sees large project context before the task is scoped. | Repeated context becomes a standing tax, especially after resumed sessions. | Use cached repo summary or cheaper discovery route; escalate only when ambiguity is detected. |
| Routine implementation | Same high-cost route used for first-pass edits and final judgment. | Premium tokens are spent on reversible draft work. | Cheaper coding route with diff-size guardrails and a hard retry budget. |
| Failed test / command recovery | Full context and logs are sent back through the strongest model. | The same failure is paid for twice, sometimes with bigger context. | Classify failure first, then send only command, error, and relevant diff to recovery route. |
| Final review | Review is mixed into generation and recovery turns. | No clear boundary for judgment-heavy work. | Keep the strongest model here; review final diff, risk, and test evidence only. |
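The recovery row can be illustrated with a minimal sketch: classify the failure first, then forward only the command, the tail of the error output, and the relevant diff. The failure classes, regex patterns, route labels, and the 40-line cap are all assumptions made for illustration; a real audit would derive them from your own failure examples.

```python
import re
from dataclasses import dataclass


@dataclass
class FailedCommand:
    command: str  # the command that failed, e.g. a test runner invocation
    output: str   # combined stdout/stderr from the failed run
    diff: str     # diff of the files touched in this turn only


# Hypothetical failure classes and patterns; tune these to your own logs.
PATTERNS = {
    "missing_dependency": re.compile(r"ModuleNotFoundError|Cannot find module|command not found"),
    "syntax_error": re.compile(r"SyntaxError|unexpected token", re.IGNORECASE),
    "test_failure": re.compile(r"AssertionError|FAILED|\d+ failed"),
}


def classify_failure(output: str) -> str:
    """Return the first matching failure class, or 'ambiguous' if none match."""
    for label, pattern in PATTERNS.items():
        if pattern.search(output):
            return label
    return "ambiguous"


def build_recovery_payload(failure: FailedCommand, max_output_lines: int = 40) -> dict:
    """Build a scoped payload instead of resending full context and logs."""
    label = classify_failure(failure.output)
    # Keep only the tail of the output, where the actual error usually sits.
    tail = "\n".join(failure.output.splitlines()[-max_output_lines:])
    # Ambiguous failures escalate to the strongest model; known classes do not.
    route = "strongest" if label == "ambiguous" else "recovery"
    return {
        "route": route,
        "failure_class": label,
        "command": failure.command,
        "error": tail,
        "diff": failure.diff,
    }
```

Regex classification is the cheapest possible first pass. A small classifier model could replace it, but the payload stays the same shape either way.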
Recommended route
- Keep the strongest model for architecture, final review, security-sensitive changes and ambiguous failures.
- Move repo discovery, routine transforms and low-risk first-pass edits to a cheaper route.
- Separate recovery from generation: failed command handling should have a smaller context and a stop condition.
- Track cost by workflow bucket for 7 days before making a permanent provider switch.
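Tracking cost by workflow bucket can be as simple as summing estimated token cost per tagged turn. The sketch below assumes each turn is logged as a dict with `bucket`, `model`, `input_tokens` and `output_tokens` keys. The prices are placeholder numbers, not real provider rates.

```python
from collections import defaultdict

# Placeholder prices in USD per million tokens; replace with your provider's actual rates.
PRICES = {
    "strongest-reasoning-model": {"input": 10.0, "output": 30.0},
    "cheaper-coding-model": {"input": 1.0, "output": 3.0},
}


def turn_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of a single turn in USD."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def cost_by_bucket(turns: list[dict]) -> dict[str, float]:
    """Sum estimated cost per workflow bucket (context, routine edit, recovery, review, ...)."""
    totals: dict[str, float] = defaultdict(float)
    for turn in turns:
        totals[turn["bucket"]] += turn_cost(
            turn["model"], turn["input_tokens"], turn["output_tokens"]
        )
    return dict(totals)
```

Running this daily over the same 7-day window makes it visible which bucket dominates spend before any provider switch is made.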
Config-shaped notes
routine_route:
  model: cheaper-coding-model
  max_retries: 1
  context: scoped_files_only
review_route:
  model: strongest-reasoning-model
  input: final_diff_and_test_evidence
  retry_on: security_or_architecture_risk
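One way these notes might be consumed is sketched below: the config is mirrored as a plain dict, a picker sends judgment-heavy steps to the review route, and the routine route enforces its retry budget. The step labels, `pick_route` and `run_with_retry_budget` are hypothetical names, and reading `max_retries: 1` as one initial attempt plus one retry is an assumption.

```python
# Mirror of the config-shaped notes as a plain dict (model names are placeholders).
ROUTES = {
    "routine_route": {
        "model": "cheaper-coding-model",
        "max_retries": 1,
        "context": "scoped_files_only",
    },
    "review_route": {
        "model": "strongest-reasoning-model",
        "input": "final_diff_and_test_evidence",
        "retry_on": "security_or_architecture_risk",
    },
}

# Assumed labels for steps that always need the strongest model.
REVIEW_STEPS = {"architecture", "final_review"}


def pick_route(step: str, security_sensitive: bool = False, ambiguous_failure: bool = False) -> str:
    """Return the route name for a workflow step."""
    if step in REVIEW_STEPS or security_sensitive or ambiguous_failure:
        return "review_route"
    return "routine_route"


def run_with_retry_budget(attempt_fn, max_retries: int) -> tuple[bool, int]:
    """Call attempt_fn until it returns True or the budget is spent.

    Returns (succeeded, attempts_made). The budget is a hard stop condition:
    one initial attempt plus at most max_retries retries.
    """
    attempts = 0
    while attempts <= max_retries:
        attempts += 1
        if attempt_fn():
            return True, attempts
    return False, attempts
```

When the budget is exhausted, the caller can escalate to the review route rather than looping on the cheaper one.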
7-day validation plan
1. Pick one representative workflow from the last week.
2. Tag turns as context, routine edit, command output, retry/recovery or final review.
3. Run one cheaper routine route while preserving the premium review route.
4. Compare success rate, retry count, wall time and estimated cost per completed task.
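The comparison in the last step could be computed along these lines. The `TaskRun` record and its field names are assumptions about how runs are logged. Cost per completed task divides total spend, including failed runs, by the number of successful tasks, so a cheap route that fails often does not look better than it is.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    succeeded: bool      # did the task end with passing tests / accepted output?
    retries: int         # retry or recovery turns used
    wall_seconds: float  # end-to-end wall time
    cost_usd: float      # estimated cost of all turns in the task


def summarize(runs: list[TaskRun]) -> dict[str, float]:
    """Aggregate the four comparison metrics for one route."""
    if not runs:
        raise ValueError("need at least one run to summarize")
    completed = sum(1 for r in runs if r.succeeded)
    total_cost = sum(r.cost_usd for r in runs)
    return {
        "success_rate": completed / len(runs),
        "mean_retries": sum(r.retries for r in runs) / len(runs),
        "mean_wall_seconds": sum(r.wall_seconds for r in runs) / len(runs),
        # Failed runs still cost money, so they count toward the numerator.
        "cost_per_completed_task": total_cost / completed if completed else float("inf"),
    }
```

Calling `summarize` once on the baseline runs and once on the cheaper-route runs gives two dicts that can be compared key by key.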
Turn this into your stack
Send providers, models, rough spend and one expensive workflow. The paid audit returns a route map like this, grounded in your actual setup.