Kimi K2.7 Code

An open-source, coding-focused agentic model built for long-horizon software engineering.

8 min read2026-07-22

What is Kimi K2.7 Code?

Kimi K2.7 Code is an open-source, coding-focused agentic model developed by Moonshot AI. It delivers stronger coding and agent performance, with substantial improvements in real-world long-horizon coding tasks. These gains translate into higher end-to-end task success rates across complex software engineering workflows. K2.7 Code also improves reasoning efficiency, reducing thinking-token usage by approximately 30% compared with K2.6.

Benchmark performance

Kimi K2.7 Code was evaluated against K2.6 on a combination of internal and external benchmarks covering two dimensions: coding capability and agentic task execution.

Benchmark comparison of Kimi K2.7 Code, Kimi K2.6, GPT-5.5, and Claude Opus 4.8 across six coding and agentic benchmarks

On coding benchmarks, K2.7 Code shows substantial gains over K2.6: +21.8% on Kimi Code Bench v2 (62.0 vs 50.9), +11.0% on Program Bench (53.6 vs 48.3), and +31.5% on MLS Bench Lite (35.1 vs 26.7).

Stronger coding capability also translates into stronger agentic performance. On Kimi Claw 24/7 Bench, MCP Atlas, and MCP Mark Verified — benchmarks that measure autonomous agent task execution — K2.7 Code improves by roughly 10% over K2.6.

Coding:

Benchmark	Kimi K2.6	Kimi K2.7 Code	GPT-5.5	Claude Opus 4.8
Kimi Code Bench v2	50.9	62.0	69.0	67.4
Program Bench	48.3	53.6	69.1	63.8
MLS Bench Lite	26.7	35.1	35.5	42.8

Agentic:

Benchmark	Kimi K2.6	Kimi K2.7 Code	GPT-5.5	Claude Opus 4.8
Kimi Claw 24/7 Bench	42.9	46.9	52.8	50.4
MCP Atlas	69.4	76.0	79.4	81.3
MCP Mark Verified	72.8	81.1	92.9	76.4

Kimi Code Bench v2 is an in-house benchmark developed by Moonshot AI, and Kimi Claw 24/7 Bench is an in-house benchmark for agentic evaluation. Kimi K2.7 Code and K2.6 were tested via Kimi Code CLI with thinking enabled (temperature 1.0, top-p 0.95, 262,144-token context), while GPT-5.5 was evaluated in Codex (xhigh) and Opus 4.8 in Claude Code (xhigh). Per-benchmark exceptions and full methodology are detailed in the Hugging Face model card.

Built for long-horizon coding

Real-world software engineering rarely ends in a single step. Tasks like refactoring a codebase, implementing a feature across multiple files, or debugging over long agent sessions require a model to follow instructions reliably across extended contexts, and to carry a task through to completion.

Kimi K2.7 Code is optimized for these long-horizon scenarios. Compared with K2.6, it follows instructions more reliably in long contexts and achieves higher end-to-end task success rates, making it better suited for complex software engineering workflows.

Try in Kimi Code

Optimized reasoning efficiency

Reasoning models tend to overthink, spending thousands of tokens deliberating on problems that don't need it. Kimi K2.7 Code significantly reduces this tendency: it cuts thinking-token usage by approximately 30% on average compared with K2.6.

Across Kimi Code Bench v2, Program Bench, and MLS Bench Lite, Kimi K2.7 Code achieves higher scores than K2.6 while consuming fewer tokens on each benchmark.

For developers, this efficiency compounds across every task: faster responses in interactive coding sessions, lower API costs in production, and agent workflows that complete more work within the same context budget.

Try in Kimi Code

Model architecture

Kimi K2.7 Code is built on a Mixture-of-Experts (MoE) architecture with 1 trillion total parameters and 32 billion activated parameters per token. The model supports a 256K context length and uses Multi-head Latent Attention (MLA). It also includes MoonViT, a 400M-parameter vision encoder.

Parameter	Value
Architecture	Mixture-of-Experts (MoE)
Total Parameters	1T
Activated Parameters	32B
Number of Layers (Dense layer included)	61
Number of Dense Layers	1
Attention Hidden Dimension	7168
MoE Hidden Dimension (per Expert)	2048
Number of Attention Heads	64
Number of Experts	384
Selected Experts per Token	8
Number of Shared Experts	1
Vocabulary Size	160K
Context Length	256K
Attention Mechanism	MLA
Activation Function	SwiGLU
Vision Encoder	MoonViT
Parameters of Vision Encoder	400M

The full model weights are open-sourced and available on Hugging Face.

Choosing between Kimi K2.7 Code and K2.6

Kimi K2.7 Code is purpose-built for coding tasks. For general-purpose work such as writing, analysis, and conversation, we recommend K2.6, which offers more well-rounded capabilities.

How to access Kimi K2.7 Code

Where to use it

Kimi K2.7 Code is available through:

Kimi Code (https://www.kimi.com/code). Kimi K2.7 Code is now the default model, with thinking mode enabled by default. To get started, follow the setup instructions on the page.
Kimi API on the open platform (https://platform.kimi.ai/). Developers can call Kimi K2.7 Code via the Kimi API and integrate it into their own coding workflows, agents, and developer tools.

Thinking mode requirement

Kimi K2.7 Code does not support non-thinking mode. It always runs with thinking enabled, on both the Kimi API and Kimi Code. In Kimi Code, requests made with thinking disabled are automatically served by K2.6 instead.

Kimi K2.7 Code pricing

Kimi Code Plans

For users who want to experience Kimi K2.7 Code directly through Kimi Code, including terminal and IDE plugins, you can choose our Code plans. Prices shown below are monthly prices under annual billing:

Plan	Price	Best for
Moderato	$15 / month	Users who need weekly refreshed usage quotas and multi-device access for regular coding workflows
Allegretto	$31 / month	Advanced users who need larger weekly limits and increased concurrency caps
Allegro	$79 / month	Users working on intensive development tasks, complex projects, and larger workloads
Vivace	$159 / month	Users who need the highest weekly plan quotas for complex projects and large codebases

Each plan includes weekly refreshed usage limits. Higher-tier plans provide larger weekly limits and higher concurrency caps, making them suitable for more complex projects.For the latest plan details, see the official membership page.

Kimi API pricing

Kimi K2.7 Code is available through the Kimi API with usage-based, per-token billing:

Model	Unit	Input Price (Cache Hit)	Input Price (Cache Miss)	Output Price	Context Window
kimi-k2.7-code	1M tokens	$0.19	$0.95	$4.00	262,144 tokens

The API supports automatic context caching, which lowers the input cost for reused context (cache hit $0.19 vs cache miss $0.95 per million tokens). Prices exclude applicable taxes. See the official pricing documentation for the latest rates.

FAQ

Is Kimi K2.7 Code open-source?

Yes. The model weights are open-sourced and available for download on Hugging Face, where you can also find deployment guides and full documentation.

What is the context window of Kimi K2.7 Code?

Kimi K2.7 Code supports a 256K context window (262,144 tokens), making it well-suited for repository-scale codebases and long, multi-turn coding sessions.

Does Kimi K2.7 Code support image and video input?

Yes. Kimi K2.7 Code uses a natively multimodal architecture that supports text, image, and video input, in addition to its coding and agentic capabilities.

Is thinking mode required to use Kimi K2.7 Code?

Yes. Kimi K2.7 Code does not support non-thinking mode and always runs with thinking enabled. In Kimi Code, requests made with thinking disabled are automatically served by K2.6 instead.