mirror of https://github.com/QwenLM/qwen-code.git
synced 2025-12-19 09:33:53 +00:00
feat: update docs
@@ -1,14 +1,29 @@
 # Token Caching and Cost Optimization
 
-Qwen Code automatically optimizes API costs through token caching when using API key [authentication](/users/configuration/settings#environment-variables-for-api-access) (e.g., OpenAI-compatible providers). This feature reuses previous system instructions and context to reduce the number of tokens processed in subsequent requests.
+Qwen Code automatically optimizes API costs through token caching when using API key authentication. This feature stores frequently used content like system instructions and conversation history to reduce the number of tokens processed in subsequent requests.
 
-**Token caching is available for:**
+## How It Benefits You
 
-- API key users (Qwen API key)
-- Vertex AI users (with project and location setup)
+- **Cost reduction**: Less tokens mean lower API costs
+- **Faster responses**: Cached content is retrieved more quickly
+- **Automatic optimization**: No configuration needed - it works behind the scenes
 
-**Token caching is not available for:**
+## Token caching is available for
 
-- OAuth users (Google Personal/Enterprise accounts) - the Code Assist API does not support cached content creation at this time
+- API key users (Qwen API key, OpenAI-compatible providers)
 
-You can view your token usage and cached token savings using the `/stats` command. When cached tokens are available, they will be displayed in the stats output.
+## Monitoring Your Savings
+
+Use the `/stats` command to see your cached token savings:
+
+- When active, the stats display shows how many tokens were served from cache
+- You'll see both the absolute number and percentage of cached tokens
+- Example: "10,500 (90.4%) of input tokens were served from the cache, reducing costs."
+
+This information is only displayed when cached tokens are being used, which occurs with API key authentication but not with OAuth authentication.
+
+## Example Stats Display
+
+[image: example `/stats` command output]
+
+The above image shows an example of the `/stats` command output, highlighting the cached token savings information.
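The cached-token line quoted in the `/stats` example ("10,500 (90.4%) of input tokens were served from the cache") is a count and its share of the total input tokens. The sketch below shows that arithmetic only. It is not Qwen Code's implementation: `formatCacheSavings` is a hypothetical helper, and the total of 11,615 input tokens is an assumed figure chosen because it is consistent with the quoted 90.4%.

```typescript
// Hypothetical helper for illustration; not part of Qwen Code's API.
// Returns the savings line, or null when there is nothing to report
// (the docs say the line is only shown when cached tokens are in use).
function formatCacheSavings(
  cachedTokens: number,
  inputTokens: number,
): string | null {
  if (cachedTokens <= 0 || inputTokens <= 0) {
    return null;
  }
  // Share of input tokens that came from the cache, as a percentage.
  const percent = (cachedTokens / inputTokens) * 100;
  const count = cachedTokens.toLocaleString('en-US');
  return `${count} (${percent.toFixed(1)}%) of input tokens were served from the cache, reducing costs.`;
}

// Assumed totals: 10,500 cached out of 11,615 input tokens.
console.log(formatCacheSavings(10500, 11615));
// No cached tokens (for example, OAuth authentication): nothing is shown.
console.log(formatCacheSavings(0, 11615));
```

Returning `null` rather than a "0 (0.0%)" line mirrors the documented behavior that the information is omitted when no cached tokens are in use.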