Kimi K2.7 Code

An open-source, coding-focused agentic model built for long-horizon software engineering

8 min read2026-06-13
Kimi K2.7 Code

What is Kimi K2.7 Code?

Kimi K2.7 Code is an open-source, coding-focused agentic model developed by Moonshot AI. It delivers stronger coding and agent performance, with substantial improvements in real-world long-horizon coding tasks. These gains translate into higher end-to-end task success rates across complex software engineering workflows. K2.7 Code also improves reasoning efficiency, reducing thinking-token usage by approximately 30% compared with K2.6.

Benchmark performance

Kimi K2.7 Code was evaluated against K2.6 on a combination of internal and external benchmarks covering two dimensions: coding capability and agentic task execution.

Benchmark comparison of Kimi K2.7 Code, Kimi K2.6, GPT-5.5, and Claude Opus 4.8 across six coding and agentic benchmarks

On coding benchmarks, K2.7 Code shows substantial gains over K2.6: +21.8% on Kimi Code Bench v2 (62.0 vs 50.9), +11.0% on Program Bench (53.6 vs 48.3), and +31.5% on MLS Bench Lite (35.1 vs 26.7).

Stronger coding capability also translates into stronger agentic performance. On Kimi Claw 24/7 Bench, MCP Atlas, and MCP Mark Verified — benchmarks that measure autonomous agent task execution — K2.7 Code improves by roughly 10% over K2.6.

  • Coding:

BenchmarkKimi K2.6Kimi K2.7 CodeGPT-5.5Claude Opus 4.8
Kimi Code Bench v250.962.069.067.4
Program Bench48.353.669.163.8
MLS Bench Lite26.735.135.542.8
  • Agentic:

BenchmarkKimi K2.6Kimi K2.7 CodeGPT-5.5Claude Opus 4.8
Kimi Claw 24/7 Bench42.946.952.850.4
MCP Atlas69.476.079.481.3
MCP Mark Verified72.881.192.976.4

Kimi Code Bench v2 is an in-house benchmark developed by Moonshot AI, and Kimi Claw 24/7 Bench is an in-house benchmark for agentic evaluation. Kimi K2.7 Code and K2.6 were tested via Kimi Code CLI with thinking enabled (temperature 1.0, top-p 0.95, 262,144-token context), while GPT-5.5 was evaluated in Codex (xhigh) and Opus 4.8 in Claude Code (xhigh). Per-benchmark exceptions and full methodology are detailed in the Hugging Face model card.

Built for long-horizon coding

Real-world software engineering rarely ends in a single step. Tasks like refactoring a codebase, implementing a feature across multiple files, or debugging over long agent sessions require a model to follow instructions reliably across extended contexts, and to carry a task through to completion.

Kimi K2.7 Code is optimized for these long-horizon scenarios. Compared with K2.6, it follows instructions more reliably in long contexts and achieves higher end-to-end task success rates, making it better suited for complex software engineering workflows.

Optimized reasoning efficiency

Reasoning models tend to overthink, spending thousands of tokens deliberating on problems that don't need it. Kimi K2.7 Code significantly reduces this tendency: it cuts thinking-token usage by approximately 30% on average compared with K2.6.

Across Kimi Code Bench v2, Program Bench, and MLS Bench Lite, Kimi K2.7 Code achieves higher scores than K2.6 while consuming fewer tokens on each benchmark.

Performance vs Tokens of Kimi K2.7 Code

For developers, this efficiency compounds across every task: faster responses in interactive coding sessions, lower API costs in production, and agent workflows that complete more work within the same context budget.

Model architecture

Kimi K2.7 Code is built on a Mixture-of-Experts (MoE) architecture with 1 trillion total parameters and 32 billion activated parameters per token. The model supports a 256K context length and uses Multi-head Latent Attention (MLA). It also includes MoonViT, a 400M-parameter vision encoder.

model summary of Kimi K2.7 Code

The full model weights are open-sourced and available on Hugging Face.

Choosing between Kimi K2.7 Code and K2.6

Kimi K2.7 Code is purpose-built for coding tasks. For general-purpose work such as writing, analysis, and conversation, we recommend K2.6, which offers more well-rounded capabilities.

How to access Kimi K2.7 Code

Where to use it

Kimi K2.7 Code is available through:

  • Kimi Code (https://www.kimi.com/code). Kimi K2.7 Code is now the default model, with thinking mode enabled by default. To get started, follow the setup instructions on the page.

    interface of Kimi Code
  • Kimi API on the open platform (https://platform.kimi.ai/). Developers can call Kimi K2.7 Code via the Kimi API and integrate it into their own coding workflows, agents, and developer tools.

Thinking mode requirement

Kimi K2.7 Code does not support non-thinking mode. It always runs with thinking enabled, on both the Kimi API and Kimi Code. In Kimi Code, requests made with thinking disabled are automatically served by K2.6 instead.

Kimi K2.7 Code pricing

Kimi Code Plans

Kimi K2.7 Code is included in Kimi Code as part of the Kimi membership, with plans starting at $19/month. All plans include weekly refreshed usage quotas, and higher tiers offer larger weekly limits and higher concurrency caps, making them suitable for more intensive development workflows, complex projects, and large codebases. See the Kimi Code page for current plans, pricing, and quota details.

Kimi API pricing

Kimi K2.7 Code is available through the Kimi API with usage-based, per-token billing:

ModelUnitInput Price (Cache Hit)Input Price (Cache Miss)Output PriceContext Window
kimi-k2.7-code1M tokens$0.19$0.95$4.00262,144 tokens

The API supports automatic context caching, which lowers the input cost for reused context (cache hit $0.19 vs cache miss $0.95 per million tokens). Prices exclude applicable taxes. See the official pricing documentation for the latest rates.

FAQ

Is Kimi K2.7 Code open-source?
Yes. The model weights are open-sourced and available for download on Hugging Face, where you can also find deployment guides and full documentation.
What is the context window of Kimi K2.7 Code?
Kimi K2.7 Code supports a 256K context window (262,144 tokens), making it well-suited for repository-scale codebases and long, multi-turn coding sessions.
Does Kimi K2.7 Code support image and video input?
Yes. Kimi K2.7 Code uses a natively multimodal architecture that supports text, image, and video input, in addition to its coding and agentic capabilities.
Is thinking mode required to use Kimi K2.7 Code?
Yes. Kimi K2.7 Code does not support non-thinking mode and always runs with thinking enabled. In Kimi Code, requests made with thinking disabled are automatically served by K2.6 instead.