GPT OSS 120B Fast

by OpenAI

GPT OSS 120B Fast is a latency-optimized configuration of GPT-OSS 120B, tuned for faster responses while keeping the same open-weight architecture and 117B-parameter Mixture-of-Experts design. It preserves the core strengths of GPT-OSS 120B, including reasoning, agentic tool use with function calling, and structured outputs, making it a fit for latency-sensitive applications that still need the full model's capability.