API pricing
Kimi API billing is based on token consumption, with model-specific and feature-specific pricing.
Billing basics
- Per-token billing: Each API call is billed separately for input tokens and output tokens
- Token unit: 1M = 1,000,000 tokens
- Model-specific pricing: Higher-capability models have higher per-token costs — choose the model that best fits your use case
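The per-token rule above reduces to simple arithmetic. The following is a minimal sketch, assuming prices are quoted per 1M tokens. The function name and the prices in the example are hypothetical placeholders, not actual Kimi API rates.

```python
TOKENS_PER_MILLION = 1_000_000


def estimate_call_cost(input_tokens, output_tokens,
                       input_price_per_m, output_price_per_m):
    """Estimate the cost of one API call.

    Input and output tokens are billed separately, each at its own
    per-1M-token price (prices here are caller-supplied assumptions).
    """
    input_cost = input_tokens / TOKENS_PER_MILLION * input_price_per_m
    output_cost = output_tokens / TOKENS_PER_MILLION * output_price_per_m
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    cost = estimate_call_cost(
        input_tokens=12_000,
        output_tokens=800,
        input_price_per_m=1.00,
        output_price_per_m=4.00,
    )
    print(f"Estimated cost: {cost:.4f}")
```

Substitute the real per-model prices from the official pricing table to get a usable estimate.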
Additional feature billing
Context caching
Context Caching allows you to cache frequently used context content (such as system prompts and reference documents). Tokens that hit the cache are billed at a discounted rate, effectively reducing costs for repetitive context.
Refer to the official documentation for detailed Context Caching pricing.
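To see how the cache discount affects input cost, here is a rough sketch. It assumes cached tokens are billed at a lower per-1M-token price than uncached input tokens, and it ignores any caching-related charges other than that per-token rate. The function name and all prices are hypothetical, not published Kimi API rates.

```python
def estimate_input_cost_with_cache(input_tokens, cached_tokens,
                                   input_price_per_m, cached_price_per_m):
    """Estimate input cost when part of the input hits the cache.

    cached_tokens are billed at cached_price_per_m; the remaining
    input tokens are billed at the regular input_price_per_m.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (uncached_tokens * input_price_per_m
             + cached_tokens * cached_price_per_m)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical prices: 1.00 regular, 0.25 for cache hits (per 1M tokens).
    without_cache = estimate_input_cost_with_cache(10_000, 0, 1.00, 0.25)
    with_cache = estimate_input_cost_with_cache(10_000, 8_000, 1.00, 0.25)
    print(f"Without cache: {without_cache:.4f}")
    print(f"With cache:    {with_cache:.4f}")
```

The larger the share of repeated context (system prompts, reference documents) in each request, the larger the saving under this model.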
Pricing details
For the complete model pricing table and billing rules, refer to the official documentation.
Cost optimization tips
- Set the max_tokens parameter appropriately to avoid unnecessarily long outputs
- Use Context Caching for repetitive system prompts and context
- Choose the right model for the task complexity — use lightweight models for simple tasks
- Streamline your prompt design to minimize unnecessary input tokens