A new mystery model, dubbed Horizon, has recently appeared on OpenRouter, sparking significant speculation that it could be OpenAI’s long-anticipated open-weight model. This development is particularly intriguing given previous instances of OpenAI testing models on OpenRouter before their official release, such as Quasar Alpha (a version of GPT-4.1 with a one-million-token context window).

What We Know About Horizon Beta
Horizon is currently available in its beta version, following an earlier Horizon Alpha that was tested by users. While its origins remain officially unconfirmed by OpenAI, the model’s capabilities and the surrounding chatter suggest a major contender in the open-weight AI space.
Key Technical Details:
- Context Window: Horizon supports an impressive 256,000 tokens, allowing it to process and understand very long inputs.
- Max Output: It can generate up to 128,000 output tokens, making it well suited for tasks requiring extensive code or detailed responses.
- Throughput: The model delivers 37 to 140 tokens per second on OpenRouter, suggesting it may be a smaller model than the GPT-4 series.
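These figures can be put in perspective with simple arithmetic: at the observed throughput range, streaming the full 128,000-token maximum output would take roughly 15 minutes at the fast end and nearly an hour at the slow end. A quick sketch of that back-of-the-envelope calculation (the token and throughput numbers come from the OpenRouter listing above; the math is purely illustrative):

```python
# Back-of-the-envelope estimate: how long would Horizon take to emit
# its maximum output at the throughput range observed on OpenRouter?
MAX_OUTPUT_TOKENS = 128_000
THROUGHPUT_LOW = 37    # tokens/second (slow end observed)
THROUGHPUT_HIGH = 140  # tokens/second (fast end observed)

def generation_time_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to stream `tokens` output tokens at a constant rate."""
    return tokens / tokens_per_second

worst_case = generation_time_seconds(MAX_OUTPUT_TOKENS, THROUGHPUT_LOW)
best_case = generation_time_seconds(MAX_OUTPUT_TOKENS, THROUGHPUT_HIGH)

print(f"Full 128k output: {best_case/60:.0f} to {worst_case/60:.0f} minutes")
# → Full 128k output: 15 to 58 minutes
```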
Demonstrating Horizon’s Prowess
Tests conducted with Horizon Beta reveal a model with remarkable capabilities, particularly in coding and complex problem-solving.
Impressive Demos Include:
- Gemini 2.5 DeepMind Prompt: Responding to a showcase prompt associated with Google DeepMind’s Gemini 2.5, Horizon Beta produced a highly impressive output, even letting users zoom in and out and adding details like the time of day. It also offers the ability to regenerate outputs in different styles.
- Rubik’s Cube Solver: Given detailed instructions, Horizon generated a functional Rubik’s Cube solver based on Kociemba’s algorithm that worked flawlessly for smaller cube sizes. Although it only scrambled the outer layer of larger cubes, it could still solve them.
- SaaS Website Generation: The model did an excellent job creating a mock Software as a Service (SaaS) website landing page, incorporating desired features like animations, working buttons, and even a dark/light theme switch.
- Tower of Hanoi Solution: Horizon successfully implemented a solution for the Tower of Hanoi puzzle, demonstrating its handling of recursion by reaching the optimal 15-move solution for the standard four-disk setup. A minor visual glitch was noted where blocks appeared upside down.
- Planet Generation: The model could generate different planets with various biomes (ocean, forest, deserts, mountains) and allowed for quality adjustments, improving the output.
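The Tower of Hanoi result in particular is easy to verify: the standard recursive solution for n disks takes 2^n − 1 moves, so four disks need exactly 15. A minimal sketch of that recursion:

```python
def hanoi(n: int, source: str = "A", target: str = "C", spare: str = "B") -> list[tuple[str, str]]:
    """Return the optimal move list for n disks as (from_peg, to_peg) pairs."""
    if n == 0:
        return []
    # Move n-1 disks out of the way, move the largest disk, then
    # move the n-1 disks back on top of it.
    return (hanoi(n - 1, source, spare, target)
            + [(source, target)]
            + hanoi(n - 1, spare, target, source))

moves = hanoi(4)
print(len(moves))  # 2**4 - 1 = 15 moves
```

Any model that handles this recursion correctly will land on the 15-move optimum; the interesting part of the demo is the rendering, not the algorithm.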
Identified Failure Cases: Despite its strengths, Horizon Beta also exhibited some limitations in specific tasks:
- It failed to implement a heptagon with 20 bouncing balls that collide with the sides.
- It was unable to simulate letters falling under the influence of gravity.
Real-World Application: Beyond these toy examples, Horizon shows significant promise for practical applications, especially in generating structured output from unstructured text according to a given schema. This capability is crucial for building agentic systems that require organized data. Overall, it appears to be an excellent coding model, especially if it is indeed an open-weight model of 120 billion or 20 billion parameters.
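In practice, structured extraction usually means asking the model for JSON that matches a schema and then validating what comes back. Below is a minimal, library-free sketch of the validation half; the schema, fields, and sample response are invented for illustration:

```python
import json

# Hypothetical schema for extracting contact records from free text.
SCHEMA = {"name": str, "email": str, "age": int}

def validate(record: dict, schema: dict) -> bool:
    """Check that a parsed model response has every field with the right type."""
    return all(isinstance(record.get(field), expected)
               for field, expected in schema.items())

# Stand-in for the model's raw response to an extraction prompt.
raw_response = '{"name": "Ada Lovelace", "email": "ada@example.com", "age": 36}'
record = json.loads(raw_response)

print(validate(record, SCHEMA))  # True only if all fields are present and typed
```

A real agentic pipeline would retry or re-prompt on validation failure rather than crash, but the core loop — generate, parse, validate — is this simple.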
The Open-Weight Leak and Speculation
The discussion around Horizon has been fueled by alleged leaks and claims regarding its parameters and architecture.
Leak Claims:
- Jimmy Apples, a notable figure in the AI community, claimed to have found and saved configurations of an OpenAI open-source model shortly after it was briefly uploaded to Hugging Face via test accounts.
- According to Jimmy Apples, there are two different models: one with 120 billion parameters and another with 20 billion parameters.
- Information also suggests a Mixture-of-Experts (MoE) architecture with 128 experts, though the accuracy of this architectural detail is unconfirmed.
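If the 128-expert MoE claim holds, it would explain how a 120B-parameter model could still run cheaply: a learned router activates only a handful of experts per token, so most parameters sit idle on any given forward pass. A toy sketch of top-k gating (the 128-expert count comes from the leak claim; top_k=4 and the router scores are arbitrary illustrations):

```python
import math

def top_k_experts(logits: list[float], k: int) -> list[tuple[int, float]]:
    """Pick the k highest-scoring experts and softmax-normalize their gate weights."""
    top = sorted(enumerate(logits), key=lambda pair: pair[1], reverse=True)[:k]
    exps = [math.exp(score) for _, score in top]
    total = sum(exps)
    return [(idx, weight / total) for (idx, _), weight in zip(top, exps)]

# 128 router scores (one per expert); only the top_k experts run for this token.
router_logits = [math.sin(i * 0.7) for i in range(128)]  # stand-in scores
selected = top_k_experts(router_logits, k=4)
print(selected)  # 4 (expert_index, gate_weight) pairs, weights summing to ~1.0
```

With 4 of 128 experts active, only a small fraction of expert parameters participate in each token's computation, which is what makes large MoE models comparatively cheap to serve.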
Industry Confirmation (and Clarification): Yuchen Jin, the CTO of Hyperbolic, who had previously hinted at having access to these weights, corroborated some of these claims. While stating that OpenAI was not releasing GPT-5 or open-source models, he specifically mentioned “120B and 20B today” in reference to OS (open-source) models. He clarified that the leaked weights were quantized, not the original FP4 pre-trained weights. This suggests there is truth to the weight leak, and the models mentioned could be the actual weights, albeit in quantized form.
There is anticipation that an open-weight model from OpenAI could potentially be released in the near future.
The Broader Open-Weight Landscape
The potential release of an OpenAI open-weight model comes at a critical time in the AI industry.
- Currently, the open-weight model landscape is heavily dominated by models primarily originating from China.
- According to the Artificial Analysis Intelligence Index, Grok 4 is rated the best model overall, but nearly all of the top open-weight models on the list come from China.
- Out of the first 22 open-weight models, only two (or possibly three, counting a Nemotron model based on Qwen) are not Chinese, with models like Mistral Small and Meta’s Llama 4 Maverick appearing much lower on the list.
An open-weight model from OpenAI would significantly diversify this landscape. However, the licensing of such a model remains a key question, as it could be for research purposes, commercial use, or a combination.
The emergence of Horizon highlights the ongoing shift towards more accessible and potentially open-source AI models, challenging the current market dynamics and offering exciting possibilities for developers and researchers worldwide.