What is Kimi K2.7 Code?
Kimi K2.7 Code is an open-source, coding-focused agentic model developed by Moonshot AI. It delivers stronger coding and agent performance, with substantial improvements in real-world long-horizon coding tasks. These gains translate into higher end-to-end task success rates across complex software engineering workflows. K2.7 Code also improves reasoning efficiency, reducing thinking-token usage by approximately 30% compared with K2.6.
Benchmark performance
Kimi K2.7 Code was evaluated against K2.6 on a combination of internal and external benchmarks covering two dimensions: coding capability and agentic task execution.
On coding benchmarks, K2.7 Code shows substantial gains over K2.6: +21.8% on Kimi Code Bench v2 (62.0 vs 50.9), +11.0% on Program Bench (53.6 vs 48.3), and +31.5% on MLS Bench Lite (35.1 vs 26.7).
Stronger coding capability also translates into stronger agentic performance. On Kimi Claw 24/7 Bench, MCP Atlas, and MCP Mark Verified — benchmarks that measure autonomous agent task execution — K2.7 Code improves by roughly 10% over K2.6.
Coding:
| Benchmark | Kimi K2.6 | Kimi K2.7 Code | GPT-5.5 | Claude Opus 4.8 |
|---|---|---|---|---|
| Kimi Code Bench v2 | 50.9 | 62.0 | 69.0 | 67.4 |
| Program Bench | 48.3 | 53.6 | 69.1 | 63.8 |
| MLS Bench Lite | 26.7 | 35.1 | 35.5 | 42.8 |
Agentic:
| Benchmark | Kimi K2.6 | Kimi K2.7 Code | GPT-5.5 | Claude Opus 4.8 |
|---|---|---|---|---|
| Kimi Claw 24/7 Bench | 42.9 | 46.9 | 52.8 | 50.4 |
| MCP Atlas | 69.4 | 76.0 | 79.4 | 81.3 |
| MCP Mark Verified | 72.8 | 81.1 | 92.9 | 76.4 |
Kimi Code Bench v2 is an in-house benchmark developed by Moonshot AI, and Kimi Claw 24/7 Bench is an in-house benchmark for agentic evaluation. Kimi K2.7 Code and K2.6 were tested via Kimi Code CLI with thinking enabled (temperature 1.0, top-p 0.95, 262,144-token context), while GPT-5.5 was evaluated in Codex (xhigh) and Opus 4.8 in Claude Code (xhigh). Per-benchmark exceptions and full methodology are detailed in the Hugging Face model card.
Built for long-horizon coding
Real-world software engineering rarely ends in a single step. Tasks like refactoring a codebase, implementing a feature across multiple files, or debugging over long agent sessions require a model to follow instructions reliably across extended contexts, and to carry a task through to completion.
Kimi K2.7 Code is optimized for these long-horizon scenarios. Compared with K2.6, it follows instructions more reliably in long contexts and achieves higher end-to-end task success rates, making it better suited for complex software engineering workflows.
Optimized reasoning efficiency
Reasoning models tend to overthink, spending thousands of tokens deliberating on problems that don't need it. Kimi K2.7 Code significantly reduces this tendency: it cuts thinking-token usage by approximately 30% on average compared with K2.6.
Across Kimi Code Bench v2, Program Bench, and MLS Bench Lite, Kimi K2.7 Code achieves higher scores than K2.6 while consuming fewer tokens on each benchmark.
For developers, this efficiency compounds across every task: faster responses in interactive coding sessions, lower API costs in production, and agent workflows that complete more work within the same context budget.
Model architecture
Kimi K2.7 Code is built on a Mixture-of-Experts (MoE) architecture with 1 trillion total parameters and 32 billion activated parameters per token. The model supports a 256K context length and uses Multi-head Latent Attention (MLA). It also includes MoonViT, a 400M-parameter vision encoder.
The full model weights are open-sourced and available on Hugging Face.
Choosing between Kimi K2.7 Code and K2.6
Kimi K2.7 Code is purpose-built for coding tasks. For general-purpose work such as writing, analysis, and conversation, we recommend K2.6, which offers more well-rounded capabilities.
How to access Kimi K2.7 Code
Where to use it
Kimi K2.7 Code is available through:
Kimi Code (https://www.kimi.com/code). Kimi K2.7 Code is now the default model, with thinking mode enabled by default. To get started, follow the setup instructions on the page.
Kimi API on the open platform (https://platform.kimi.ai/). Developers can call Kimi K2.7 Code via the Kimi API and integrate it into their own coding workflows, agents, and developer tools.
Thinking mode requirement
Kimi K2.7 Code does not support non-thinking mode. It always runs with thinking enabled, on both the Kimi API and Kimi Code. In Kimi Code, requests made with thinking disabled are automatically served by K2.6 instead.
Kimi K2.7 Code pricing
Kimi Code Plans
Kimi K2.7 Code is included in Kimi Code as part of the Kimi membership, with plans starting at $19/month. All plans include weekly refreshed usage quotas, and higher tiers offer larger weekly limits and higher concurrency caps, making them suitable for more intensive development workflows, complex projects, and large codebases. See the Kimi Code page for current plans, pricing, and quota details.
Kimi API pricing
Kimi K2.7 Code is available through the Kimi API with usage-based, per-token billing:
| Model | Unit | Input Price (Cache Hit) | Input Price (Cache Miss) | Output Price | Context Window |
|---|---|---|---|---|---|
| kimi-k2.7-code | 1M tokens | $0.19 | $0.95 | $4.00 | 262,144 tokens |
The API supports automatic context caching, which lowers the input cost for reused context (cache hit $0.19 vs cache miss $0.95 per million tokens). Prices exclude applicable taxes. See the official pricing documentation for the latest rates.