Singapore – Alibaba, the Chinese tech company, has introduced the latest generation of its open-source large language model family, Qwen3.
The new Qwen3 series comprises six dense models and two Mixture-of-Experts (MoE) models, giving developers the versatility to build advanced applications for mobile devices, smart glasses, autonomous vehicles, robotics, and more.
The complete suite spans dense models at 0.6B, 1.7B, 4B, 8B, 14B, and 32B parameters, plus MoE models at 30B (3B active parameters) and 235B (22B active), all of which are now open-sourced and available worldwide.
The release marks the debut of hybrid reasoning models, which combine traditional LLM capabilities with advanced, dynamic reasoning. Qwen3 models can switch seamlessly between a thinking mode, for complex, multi-step tasks such as mathematics, coding, and logical deduction, and a non-thinking mode, for fast, general-purpose responses.
With this, developers using the Qwen3 API gain fine-grained control over how long a model reasons before answering, letting them trade response quality against latency and compute cost. The Qwen3-235B-A22B MoE model further reduces deployment costs by activating only a fraction of its parameters per token, underscoring Alibaba's commitment to accessible, high-performance AI.
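For illustration, here is a minimal sketch of how that toggle is exposed when running a Qwen3 checkpoint locally with Hugging Face transformers; the `enable_thinking` flag follows the usage shown on the Qwen3 model cards, while the model name, prompt, and generation settings are placeholders.

```python
# Minimal sketch: toggling Qwen3's thinking mode via Hugging Face transformers.
# The enable_thinking flag follows the Qwen3 model-card usage; the checkpoint
# and prompt below are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B"  # any dense or MoE Qwen3 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are there below 100?"}]

# enable_thinking=True lets the model emit an internal reasoning trace
# before its final answer; set it to False for fast, general-purpose
# responses that skip the reasoning step.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

The model cards also describe lightweight /think and /no_think tags that users can append to individual messages in multi-turn chats to switch modes on a per-turn basis.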
Trained on a massive dataset of 36 trillion tokens, the new generation also brings significant advances in reasoning, instruction following, tool use, and multilingual tasks.
Among the highlighted capabilities is multilingual mastery, with support for 119 languages and dialects and leading performance in translation and multilingual instruction following.
Another is advanced agent integration: Qwen3 natively supports the Model Context Protocol (MCP) and robust function calling, leading open-source models in complex agent-based tasks (a short sketch of the function-calling interface follows this list of capabilities).
Next is superior reasoning, surpassing previous Qwen models on mathematics, coding, and logical reasoning benchmarks. Lastly, it offers enhanced human alignment, delivering more natural creative writing, role-playing, and multi-turn dialogue for engaging conversations.
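As an illustration of the function-calling side of that agent integration, below is a hedged sketch against a Qwen3 model served behind an OpenAI-compatible endpoint, a common way to deploy open Qwen checkpoints (for example via vLLM); the base_url, model name, and get_weather tool are assumptions for demonstration, not part of the announcement.

```python
# Sketch: function calling with a Qwen3 model behind an OpenAI-compatible
# endpoint. The endpoint, model name, and tool schema are illustrative.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Declare a tool the model may choose to call (hypothetical example).
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="Qwen/Qwen3-32B",
    messages=[{"role": "user", "content": "What's the weather in Singapore?"}],
    tools=tools,
)

# If the model decides a tool is needed, it returns a structured call rather
# than plain text; the application executes it and sends the result back.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```

MCP builds on the same idea, standardizing how models discover and invoke external tools and data sources rather than relying on ad hoc tool schemas per application.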
Thanks to these advances in model design, expanded training data, and more effective training approaches, Qwen3 models post strong results on industry benchmarks such as AIME25 (mathematical reasoning), LiveCodeBench (coding), BFCL (function calling), and Arena-Hard (instruction following).
In addition, the hybrid reasoning capability was built through a four-stage training process: a chain-of-thought (CoT) cold start, reasoning-based reinforcement learning (RL), thinking-mode fusion, and general RL.
The Qwen model family has gained over 300 million downloads worldwide since its debut. With over 100,000 Qwen-based derivative models created on Hugging Face, Qwen has established itself as one of the leading open-source AI model series globally.