Weave displays cost and token usage estimates based on data captured from your LLM calls. Discrepancies between Weave’s numbers and your provider’s invoice or dashboard are common and have several causes.
Token counts come from the provider response, not Weave
For supported integrations (OpenAI, Anthropic, Google, etc.), Weave reads token usage directly from the API response object, the same usage field your code receives. If your provider reports a different count on their billing page, the difference is on the provider side (for example, they may aggregate tokens across streaming chunks differently than per-call reporting).
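To cross-check a count yourself, you can read the same usage field from the response your code receives. The sketch below uses plain dictionaries as stand-ins for provider responses; `extract_usage` is a hypothetical helper for illustration, not part of Weave, and it only covers the OpenAI-style (`prompt_tokens`, `completion_tokens`) and Anthropic-style (`input_tokens`, `output_tokens`) field names.

```python
def extract_usage(response: dict) -> dict:
    """Return input/output token counts from a provider-style response dict."""
    usage = response.get("usage", {})
    # OpenAI-style responses use prompt_tokens / completion_tokens;
    # Anthropic-style responses use input_tokens / output_tokens.
    input_tokens = usage.get("prompt_tokens", usage.get("input_tokens", 0))
    output_tokens = usage.get("completion_tokens", usage.get("output_tokens", 0))
    return {"input_tokens": input_tokens, "output_tokens": output_tokens}


# Stand-in responses with made-up counts, for illustration only.
openai_style = {"usage": {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}}
anthropic_style = {"usage": {"input_tokens": 12, "output_tokens": 30}}

print(extract_usage(openai_style))
print(extract_usage(anthropic_style))
```

If the counts you read this way match what Weave shows but differ from the billing page, the difference most likely originates on the provider side.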
Weave cost estimates use a static pricing table
Weave calculates estimated cost by multiplying token counts by known per-token prices for each model. This table is updated periodically but may lag behind provider pricing changes. If a provider recently changed pricing for a model, Weave’s estimate will be stale until the next SDK release that updates the table.
To check the model pricing Weave is using, see the pricing reference in the Weave source.
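The calculation described above can be sketched as follows. The model name and all prices are made up for illustration and do not come from Weave’s pricing table; `estimate_cost` is a hypothetical helper, not a Weave function. The second call shows how a stale table produces a different figure from the provider’s current price.

```python
# Hypothetical prices in USD per 1M tokens; not Weave's actual table.
PRICING_PER_MILLION = {
    "example-model": {"prompt": 2.50, "completion": 10.00},
}


def estimate_cost(model, prompt_tokens, completion_tokens, table=PRICING_PER_MILLION):
    """Return estimated USD cost: token count times per-token price, per category."""
    prices = table[model]
    prompt_cost = prompt_tokens * prices["prompt"] / 1_000_000
    completion_cost = completion_tokens * prices["completion"] / 1_000_000
    return prompt_cost + completion_cost


# The same usage priced with a stale table and with an assumed newer price.
stale = estimate_cost("example-model", 120_000, 30_000)
current_table = {"example-model": {"prompt": 2.00, "completion": 8.00}}
current = estimate_cost("example-model", 120_000, 30_000, table=current_table)
print(f"stale estimate: ${stale:.2f}, at current price: ${current:.2f}")
```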
Custom or fine-tuned models may not have pricing entries
If you use a fine-tuned model or a model identifier that is not in Weave’s pricing table, the cost column will show — or $0.00. You can see token counts but cost will not be estimated for unknown models.
Sampling reduces total captured tokens
If you set tracing_sample_rate on an op, only a fraction of calls are traced. The token totals in Weave reflect only the sampled calls, not your full usage.
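To get a rough sense of full usage from sampled totals, you can divide by the sample rate. This is a minimal sketch: `extrapolate_tokens` is a hypothetical helper, not part of Weave, and because sampling is random and calls vary in size, the result is only an approximation.

```python
def extrapolate_tokens(sampled_tokens: int, sample_rate: float) -> float:
    """Scale a sampled token total up to an estimate of full usage."""
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in (0, 1]")
    return sampled_tokens / sample_rate


# With 25% sampling, 42,000 traced tokens suggest roughly 168,000 in total.
estimated_total = extrapolate_tokens(42_000, 0.25)
print(round(estimated_total))
```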
Cached tokens and batch requests
Weave records the usage object as returned by the provider, which should reflect cached tokens if the provider reports them in the response. However, Weave’s static pricing table reflects standard (non-cached) prices for each token category. If you use prompt caching heavily, the gap between Weave’s estimate and your actual bill will be larger.
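The size of that gap can be illustrated with a small calculation. Both prices below are assumptions chosen for the example (including the premise that cached input tokens are billed at a discount); actual cached-token pricing differs by provider and model, and `input_cost_gap` is a hypothetical helper.

```python
# Hypothetical USD prices per 1M input tokens; real prices vary by provider.
STANDARD_INPUT_PRICE = 3.00
CACHED_INPUT_PRICE = 0.30  # assumed discounted rate for cached input tokens


def input_cost_gap(total_input_tokens: int, cached_tokens: int) -> tuple[float, float]:
    """Return (standard-price estimate, cache-aware estimate) in USD."""
    uncached = total_input_tokens - cached_tokens
    standard = total_input_tokens * STANDARD_INPUT_PRICE / 1_000_000
    cache_aware = (
        uncached * STANDARD_INPUT_PRICE + cached_tokens * CACHED_INPUT_PRICE
    ) / 1_000_000
    return standard, cache_aware


# 1M input tokens, 80% of them served from cache.
standard, cache_aware = input_cost_gap(1_000_000, 800_000)
print(f"standard-price estimate: ${standard:.2f}, cache-aware: ${cache_aware:.2f}")
```

Under these assumed prices, the more of your input that is cached, the further a standard-price estimate drifts from the billed amount.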
Batch API requests may report token usage differently than real-time requests; verify that your batch responses include standard usage fields if you expect Weave to capture them.