Sample audit output

What a 48h LLM Cost & Routing Audit looks like.

This sample uses a coding-agent workflow, but the format also works for RAG, chat, support, batch jobs and agent products. The real audit uses your providers, models, usage and failure examples.

Book 100 EUR audit Coding-agent version

Route map sample

Where cost leaks usually hide

The point is not to find one magic cheap model. It is to stop treating every workflow step as the same route.

Workflow step: Repo scan / context loading
Current route: Premium model sees large project context before the task is scoped.
Likely leak: Repeated context becomes a standing tax, especially after resumed sessions.
Route to test: Use cached repo summary or cheaper discovery route; escalate only when ambiguity is detected.

Workflow step: Routine implementation
Current route: Same high-cost route used for first-pass edits and final judgment.
Likely leak: Premium tokens are spent on reversible draft work.
Route to test: Cheaper coding route with diff-size guardrails and a hard retry budget.

Workflow step: Failed test / command recovery
Current route: Full context and logs are sent back through the strongest model.
Likely leak: The same failure is paid for twice, sometimes with bigger context.
Route to test: Classify failure first, then send only command, error, and relevant diff to recovery route.

Workflow step: Final review
Current route: Review is mixed into generation and recovery turns.
Likely leak: No clear boundary for judgment-heavy work.
Route to test: Keep the strongest model here; review final diff, risk, and test evidence only.
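
The "classify failure first" idea for the recovery step can be sketched in a few lines of Python. This is only an illustration under assumptions: the failure classes, the regex patterns, the `RecoveryPayload` type and the 20-line error tail are hypothetical and are not part of the audit itself.

```python
import re
from dataclasses import dataclass

# Hypothetical failure classes and patterns; tune them to your own logs.
FAILURE_PATTERNS = {
    "missing_dependency": re.compile(
        r"ModuleNotFoundError|command not found|Cannot find module"
    ),
    "syntax_error": re.compile(r"SyntaxError|IndentationError"),
    "test_assertion": re.compile(r"AssertionError|FAILED"),
}


@dataclass
class RecoveryPayload:
    """Everything the recovery route sees: no repo context, no full logs."""
    failure_class: str
    command: str
    error: str
    diff: str


def classify_failure(stderr: str) -> str:
    """Return the first matching failure class, or 'ambiguous' if none match."""
    for name, pattern in FAILURE_PATTERNS.items():
        if pattern.search(stderr):
            return name
    return "ambiguous"


def build_recovery_payload(
    command: str, stderr: str, diff: str, max_error_lines: int = 20
) -> RecoveryPayload:
    """Keep only the command, the tail of the error output and the relevant diff."""
    tail = "\n".join(stderr.strip().splitlines()[-max_error_lines:])
    return RecoveryPayload(classify_failure(stderr), command, tail, diff)
```

A payload classified as `ambiguous` is the natural candidate for the stronger route; everything else stays on the smaller recovery context.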

Recommended route

Keep the strongest model for architecture, final review, security-sensitive changes and ambiguous failures.
Move repo discovery, routine transforms and low-risk first-pass edits to a cheaper route.
Separate recovery from generation: failed command handling should have a smaller context and a stop condition.
Track cost by workflow bucket for 7 days before making a permanent provider switch.
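
As a rough sketch of these recommendations, the snippet below maps workflow steps to routes and sums cost per workflow bucket. The step names, the `bucket` and `cost_eur` fields and the fallback to the strongest model for unknown steps are assumptions made for illustration; the two model names are the placeholders used in the config notes.

```python
from collections import defaultdict

STRONGEST = "strongest-reasoning-model"
CHEAPER = "cheaper-coding-model"

# Hypothetical step names; map your own workflow steps here.
ROUTES = {
    "architecture": STRONGEST,
    "final_review": STRONGEST,
    "security_change": STRONGEST,
    "ambiguous_failure": STRONGEST,
    "repo_discovery": CHEAPER,
    "routine_transform": CHEAPER,
    "first_pass_edit": CHEAPER,
}


def pick_route(step: str) -> str:
    """Return the route for a step; unknown steps fall back to the strongest model."""
    return ROUTES.get(step, STRONGEST)


def cost_by_bucket(turns: list[dict]) -> dict[str, float]:
    """Sum estimated cost per workflow bucket from a list of tagged turns."""
    totals: defaultdict[str, float] = defaultdict(float)
    for turn in turns:
        totals[turn["bucket"]] += turn["cost_eur"]
    return dict(totals)
```

Falling back to the strongest model for untagged steps is the conservative choice here: a missing tag costs money rather than quality.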

Config-shaped notes

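# Cheaper route for reversible first-pass work.
routine_route:
  model: cheaper-coding-model
  max_retries: 1  # hard retry budget
  context: scoped_files_only

# Strongest model, reserved for judgment-heavy review.
review_route:
  model: strongest-reasoning-model
  input: final_diff_and_test_evidence
  retry_on: security_or_architecture_risk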
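
One way the `max_retries` budget could be enforced in code is sketched below. The `attempt` callback and the returned status strings are hypothetical; what happens once the budget is exhausted (escalate, stop, or ask a human) is left to the caller.

```python
from typing import Callable


def run_with_retry_budget(
    attempt: Callable[[str], bool],
    model: str = "cheaper-coding-model",
    max_retries: int = 1,
) -> dict:
    """Call attempt(model) once, plus at most max_retries retries.

    attempt is a hypothetical callback that returns True when the step
    succeeded (for example, the edit applied and the tests passed).
    """
    attempts = 0
    for _ in range(1 + max_retries):
        attempts += 1
        if attempt(model):
            return {"status": "done", "attempts": attempts}
    # Hard stop: never loop past the budget on the same route.
    return {"status": "budget_exhausted", "attempts": attempts}
```

With `max_retries: 1`, a failing step is tried at most twice on the routine route before control returns to the caller.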

7-day validation plan

1. Pick one representative workflow from the last week.
2. Tag turns as context, routine edit, command output, retry/recovery or final review.
3. Run one cheaper routine route while preserving the premium review route.
4. Compare success rate, retry count, wall time and estimated cost per completed task.
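
The comparison in the last step could be computed along these lines. The `TaskRun` record and its field names are assumptions; cost per completed task here divides total spend, including failed runs, by the number of completed tasks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRun:
    """One workflow run on a given route (hypothetical record shape)."""
    completed: bool
    retries: int
    wall_seconds: float
    cost_eur: float


def summarize(runs: list[TaskRun]) -> dict[str, Optional[float]]:
    """Aggregate the four comparison metrics for one route."""
    n = len(runs)
    if n == 0:
        return {
            "success_rate": None,
            "avg_retries": None,
            "avg_wall_seconds": None,
            "cost_per_completed_task": None,
        }
    completed = sum(1 for r in runs if r.completed)
    total_cost = sum(r.cost_eur for r in runs)
    return {
        "success_rate": completed / n,
        "avg_retries": sum(r.retries for r in runs) / n,
        "avg_wall_seconds": sum(r.wall_seconds for r in runs) / n,
        # Failed runs still cost money, so they stay in the numerator.
        "cost_per_completed_task": total_cost / completed if completed else None,
    }
```

Running `summarize` once for the current route and once for the cheaper routine route gives two dictionaries that can be compared field by field.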

Turn this into your stack

Send your providers, models, rough spend and one expensive workflow. The paid audit returns a route map like this one, grounded in your actual setup.
