Kimi K2.5 Fast

by Moonshot

Kimi K2.5 Fast is a speed-oriented variant of the full K2.5 model, built to deliver quicker responses while keeping the base model's native multimodal and agentic capabilities. It is based on the same 1-trillion-parameter mixture-of-experts (MoE) architecture with 32 billion active parameters, and it retains the vision-language integration and reasoning of the base model. The variant accepts text and image inputs and supports the agent swarm coordination paradigm, letting developers balance response speed against reasoning depth. It offers a 256K-token (262,144-token) context window and tool calling for autonomous agent workflows, which suits it to applications that need responsive visual reasoning, coding assistance, and tool-augmented tasks. Like the rest of the K2.5 line, it uses native INT4 quantization from quantization-aware training, which keeps quality close to full precision while lowering memory use.
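To put the figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in the description (1 trillion total parameters, 32 billion active, a 256K context window) plus two assumptions that are not from Moonshot: that every weight is stored at the stated precision (2 bytes for BF16, 0.5 bytes for INT4) and that "256K" means 256 × 1024 tokens. Real deployments also carry quantization scales, may keep some layers at higher precision, and need memory for the KV cache, so treat the results as order-of-magnitude estimates only.

```python
# Back-of-the-envelope figures derived from the numbers quoted in the description.
TOTAL_PARAMS = 1_000_000_000_000   # 1 trillion total parameters
ACTIVE_PARAMS = 32_000_000_000     # 32 billion parameters active per token
CONTEXT_TOKENS = 256 * 1024        # assumption: "256K" is read as 256 * 1024

# Assumption: idealized storage cost per weight, ignoring quantization
# scales, higher-precision layers, and KV-cache memory.
BYTES_PER_WEIGHT = {"bf16": 2.0, "int4": 0.5}


def weight_memory_gb(num_params: int, precision: str) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_WEIGHT[precision] / 1e9


active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
bf16_gb = weight_memory_gb(TOTAL_PARAMS, "bf16")
int4_gb = weight_memory_gb(TOTAL_PARAMS, "int4")

print(f"Active parameters per token: {active_fraction:.1%}")
print(f"Weights at BF16: ~{bf16_gb:,.0f} GB")
print(f"Weights at INT4: ~{int4_gb:,.0f} GB")
print(f"Context window:  {CONTEXT_TOKENS:,} tokens")
```

Under these assumptions, only about 3% of the weights take part in any single token's forward pass, and INT4 storage needs roughly a quarter of the memory that BF16 would, which is consistent with the 262K context figure listed under Key info.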

Key info

Context window: 262K tokens
Max output: 262K tokens

Available routes

No routes are currently available. Kimi K2.5 Fast isn't routed through the Opper gateway right now, though it may return.

