GPT OSS 120B Fast
GPT OSS 120B Fast is a latency-optimized configuration of GPT-OSS 120B, tuned for faster responses while keeping the same open-weight architecture and 117B-parameter Mixture-of-Experts design. It preserves the core strengths of GPT-OSS 120B, including reasoning, agentic tool use with function calling, and structured outputs, which makes it a good fit for latency-sensitive applications that still need the full model's capabilities.
Key info
Input
Output
Features
Context window: 131K tokens
Max output: 8K tokens
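The two limits above interact when sizing a request. Below is a minimal sketch of that arithmetic, under stated assumptions: the page gives only "131K" and "8K", so the exact values (131,072 and 8,192) are guesses, and the sketch assumes prompt and output tokens share one context window. The helper name `output_budget` is hypothetical and not part of any Opper API.

```python
# Assumed limits: the page states only "131K" and "8K"; exact values are guesses.
CONTEXT_WINDOW = 131_072  # assumption: 131K read as 128 * 1024 tokens
MAX_OUTPUT = 8_192        # assumption: 8K read as 8 * 1024 tokens


def output_budget(prompt_tokens: int, requested_output: int) -> int:
    """Return how many output tokens can be requested for a given prompt.

    Assumes prompt and output tokens share the same context window.
    Raises ValueError if the prompt alone leaves no room for output.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt fills or exceeds the context window")
    # The request is capped by both the output limit and the space left.
    return min(requested_output, MAX_OUTPUT, remaining)
```

For example, `output_budget(1000, 20000)` is capped at the assumed output limit, while a prompt close to the window size is capped by the remaining space instead.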
Available routes
No routes are currently available. GPT OSS 120B Fast isn't routed through the Opper gateway right now, though it may return.