Alibaba’s Qwen team launches Qwen3-235B-A22B-Thinking-2507, a new open-source AI model built for heavy-duty reasoning.
This beast packs 235 billion parameters but uses a Mixture-of-Experts (MoE) setup, so only about 22 billion are active per token: like calling in the eight best specialists from a 128-person squad for each task.
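For a concrete picture of that routing, here is a toy top-k MoE layer in PyTorch. Everything about it (the layer sizes, the linear gate, the feed-forward experts) is a simplified stand-in for illustration, not Qwen3's actual architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyTopKMoE(nn.Module):
    """Toy sparse MoE layer: each token is routed to the top-k of n experts."""
    def __init__(self, d_model: int = 64, n_experts: int = 128, k: int = 8):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts, bias=False)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model)
        scores = self.gate(x)                           # (n_tokens, n_experts)
        top_scores, top_idx = scores.topk(self.k, -1)   # keep k experts per token
        weights = F.softmax(top_scores, dim=-1)         # renormalize over the k picks
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in top_idx[:, slot].unique():
                mask = top_idx[:, slot] == e            # tokens whose slot chose expert e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

moe = ToyTopKMoE()
tokens = torch.randn(10, 64)
print(moe(tokens).shape)  # torch.Size([10, 64])
```

The key point: the gate picks 8 of 128 experts per token, so the vast majority of the parameters sit idle on any given forward pass.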
Its benchmark scores are hard to beat: 92.3 on AIME25 (math), 74.1 on LiveCodeBench v6 (coding), and a solid 79.7 on Arena-Hard v2 (human preference alignment). It's clearly gunning for the top spot among open-source reasoning models.
Context is another headline figure: a native 262,144-token window, massive for handling huge information sets in one pass.
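To get a feel for that scale, a quick sketch with the model's tokenizer can check whether a document fits the window (the input file name here is hypothetical):

```python
from transformers import AutoTokenizer

# Tokenizer for the model; downloads from Hugging Face on first use.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-235B-A22B-Thinking-2507")

document = open("big_report.txt").read()  # hypothetical input file
n_tokens = len(tok(document)["input_ids"])
print(f"{n_tokens} tokens; fits in context: {n_tokens <= 262_144}")
```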
Developers can grab the model on Hugging Face. Deployments work with SGLang or vLLM, and the Qwen-Agent framework is recommended for leveraging its tool-calling skills.
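A minimal Transformers sketch for loading and querying the model might look like this; the API calls are standard Hugging Face usage, but the prompt and generation settings are illustrative, and hardware at this scale is left as an exercise:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-235B-A22B-Thinking-2507"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"  # shard across available GPUs
)

messages = [{"role": "user", "content": "Explain mixture-of-experts routing briefly."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=32768)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```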
For peak results, the team recommends an output length of 32,768 tokens for most queries, raised to 81,920 tokens for highly complex problems. For tricky math, they advise prompting the model to reason step by step and put its final answer within \boxed{}, which yields cleaner, checkable answers.
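In practice, against a local vLLM or SGLang server exposing the OpenAI-compatible API, those recommendations become request parameters; the endpoint URL and the example problem below are placeholders:

```python
from openai import OpenAI

# Assumes a local OpenAI-compatible server, e.g. one started with vLLM's `vllm serve`.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen3-235B-A22B-Thinking-2507",
    messages=[{
        "role": "user",
        "content": "Please reason step by step, and put your final answer "
                   "within \\boxed{}. What is the sum of the first 100 primes?",
    }],
    max_tokens=32768,  # raise toward 81920 for the hardest problems, per the team's guidance
)
print(response.choices[0].message.content)
```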
This isn’t just another open-source model—it’s built to challenge proprietary options on complex logic and coding.