{"id":1495,"date":"2025-04-29T02:45:14","date_gmt":"2025-04-29T02:45:14","guid":{"rendered":"https:\/\/violethoward.com\/new\/alibaba-launches-open-source-qwen3-besting-openai-o1\/"},"modified":"2025-04-29T02:45:14","modified_gmt":"2025-04-29T02:45:14","slug":"alibaba-launches-open-source-qwen3-besting-openai-o1","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/alibaba-launches-open-source-qwen3-besting-openai-o1\/","title":{"rendered":"Alibaba launches open source Qwen3 besting OpenAI o1"},"content":{"rendered":" \r\n<br><div>\n\t\t\t\t<div id=\"boilerplate_2682874\" class=\"post-boilerplate boilerplate-before\">\n<p><em>Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More<\/em><\/p>\n\n\n\n<hr class=\"wp-block-separator has-css-opacity is-style-wide\"\/>\n<\/div><p>Chinese e-commerce and web giant Alibaba\u2019s Qwen team has officially launched a new series of open source AI large language multimodal models known as Qwen3 that appear to be among the state-of-the-art for open models, and approach performance of proprietary models from the likes of OpenAI and Google. <\/p>\n\n\n\n<p>The Qwen3 series features two \u201cmixture-of-experts\u201d models and six dense models for a total of eight (!) new models. The \u201cmixture-of-experts\u201d approach involves having several different specialty model types combined into one, with only those relevant models to the task at hand being activated when needed in the internal settings of the model (known as parameters). It was popularized by open source French AI startup Mistral. <\/p>\n\n\n\n<p>According to the team, the 235-billion parameter version of Qwen3 codenamed A22B outperforms DeepSeek\u2019s open source R1 and OpenAI\u2019s proprietary o1 on key third-party benchmarks including ArenaHard (with 500 user questions in software engineering and math) and nears the performance of the new, proprietary Google Gemini 2.5-Pro.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img fetchpriority=\"high\" decoding=\"async\" width=\"3413\" height=\"1920\" src=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Gppj9_kbEAAkO9U.jpg?w=800\" alt=\"\" class=\"wp-image-3005882\" srcset=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Gppj9_kbEAAkO9U.jpg 3413w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Gppj9_kbEAAkO9U.jpg?resize=300,169 300w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Gppj9_kbEAAkO9U.jpg?resize=768,432 768w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Gppj9_kbEAAkO9U.jpg?resize=800,450 800w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Gppj9_kbEAAkO9U.jpg?resize=1536,864 1536w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Gppj9_kbEAAkO9U.jpg?resize=2048,1152 2048w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Gppj9_kbEAAkO9U.jpg?resize=400,225 400w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Gppj9_kbEAAkO9U.jpg?resize=750,422 750w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Gppj9_kbEAAkO9U.jpg?resize=578,325 578w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Gppj9_kbEAAkO9U.jpg?resize=930,523 930w\" sizes=\"(max-width: 3413px) 100vw, 3413px\"\/><\/figure>\n\n\n\n<p>Overall, the benchmark data positions Qwen3-235B-A22B as one of the most powerful publicly available models, achieving parity or superiority relative to major industry offerings.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" 
id=\"h-hybrid-reasoning-theory\">Hybrid (reasoning) theory<\/h2>\n\n\n\n<p>The Qwen3 models are trained to provide so-called \u201chybrid reasoning\u201d or \u201cdynamic reasoning\u201d capabilities, allowing users to toggle between fast, accurate responses and more time-consuming and compute-intensive reasoning steps (similar to OpenAI\u2019s \u201co\u201d series) for more difficult queries in science, math, engineering and other specialized fields. This is an approach pioneered by Nous Research and other AI startups and research collectives. <\/p>\n\n\n\n<p>With Qwen3, users can engage the more intensive \u201cThinking Mode\u201d using the button marked as such on the Qwen Chat website or by embedding specific prompts like <code>\/think<\/code> or <code>\/no_think<\/code> when deploying the model locally or through the API, allowing for flexible use depending on the task complexity.<\/p>\n\n\n\n<p>Users can now access and deploy these models across platforms like Hugging Face, ModelScope, Kaggle, and GitHub, as well as interact with them directly via the Qwen Chat web interface and mobile applications. The release includes both Mixture of Experts (MoE) and dense models, all available under the Apache 2.0 open-source license. <\/p>\n\n\n\n<p>In my brief usage of the Qwen Chat website so far, it was able to generate imagery relatively rapidly and with decent prompt adherence \u2014 especially when incorporating text into the image natively while matching the style. However, it often prompted me to log in and was subject to the usual Chinese content restrictions (such as prohibiting prompts or responses related to the Tiananmen Square protests).<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"2302\" height=\"770\" src=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-28-at-6.31.44%E2%80%AFPM.png?w=800\" alt=\"\" class=\"wp-image-3005903\" style=\"width:840px;height:auto\" srcset=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-28-at-6.31.44\u202fPM.png 2302w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-28-at-6.31.44\u202fPM.png?resize=300,100 300w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-28-at-6.31.44\u202fPM.png?resize=768,257 768w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-28-at-6.31.44\u202fPM.png?resize=800,268 800w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-28-at-6.31.44\u202fPM.png?resize=1536,514 1536w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-28-at-6.31.44\u202fPM.png?resize=2048,685 2048w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-28-at-6.31.44\u202fPM.png?resize=400,134 400w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-28-at-6.31.44\u202fPM.png?resize=750,251 750w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-28-at-6.31.44\u202fPM.png?resize=578,193 578w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-28-at-6.31.44\u202fPM.png?resize=930,311 930w\" sizes=\"auto, (max-width: 2302px) 100vw, 2302px\"\/><\/figure>\n\n\n\n<p>In addition to the MoE offerings, Qwen3 includes dense models at different scales: Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B. 
Users can now access and deploy these models across platforms like Hugging Face, ModelScope, Kaggle, and GitHub, as well as interact with them directly via the Qwen Chat web interface and mobile applications. The release includes both the mixture-of-experts (MoE) and dense models, all available under the Apache 2.0 open source license.

In my brief usage of the Qwen Chat website so far, it was able to generate imagery relatively rapidly and with decent prompt adherence, especially when incorporating text into the image natively while matching the requested style. However, it often prompted me to log in, and it was subject to the usual Chinese content restrictions (such as prohibiting prompts or responses related to the Tiananmen Square protests).

[Figure: screenshot of the Qwen Chat web interface.]

In addition to the MoE offerings, Qwen3 includes dense models at different scales: Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B. These models vary in size and architecture, offering users options to fit diverse needs and computational budgets.

The Qwen3 models also significantly expand multilingual support, now covering 119 languages and dialects across major language families. This broadens the models' potential applications globally, facilitating research and deployment in a wide range of linguistic contexts.

## Model training and architecture

In terms of model training, Qwen3 represents a substantial step up from its predecessor, Qwen2.5. The pretraining dataset doubled in size, to approximately 36 trillion tokens.

The data sources include web crawls, PDF-like document extractions, and synthetic content generated using previous Qwen models focused on math and coding.

The training pipeline consisted of a three-stage pretraining process followed by a four-stage post-training refinement to enable the hybrid thinking and non-thinking capabilities. These training improvements allow the dense base models of Qwen3 to match or exceed the performance of much larger Qwen2.5 models.

Deployment options are versatile. Users can integrate Qwen3 models using inference frameworks such as SGLang and vLLM, both of which offer OpenAI-compatible endpoints.
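As a sketch of what that integration could look like, the snippet below points the standard `openai` Python client at a locally served model. The serve command, port, and model name are assumptions based on typical vLLM usage; SGLang exposes a similar OpenAI-compatible endpoint.

```python
# Assumes a local vLLM server started with something like:
#   vllm serve Qwen/Qwen3-30B-A3B --port 8000
# (model name and port are assumptions to adjust for your setup).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # key unused locally

response = client.chat.completions.create(
    model="Qwen/Qwen3-30B-A3B",
    # The /no_think soft switch requests a fast answer without a reasoning trace.
    messages=[{"role": "user", "content": "Summarize the Qwen3 release in one sentence. /no_think"}],
)
print(response.choices[0].message.content)
```

Because the endpoint speaks the OpenAI wire format, existing applications can often switch over by changing only the `base_url` and `model` values.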
For local usage, options like Ollama, LM Studio, MLX, llama.cpp, and KTransformers are recommended. Additionally, users interested in the models' agentic capabilities are encouraged to explore the Qwen-Agent toolkit, which simplifies tool-calling operations.

Junyang Lin, a member of the Qwen team, commented on X that building Qwen3 involved addressing critical but less glamorous technical challenges, such as scaling reinforcement learning stably, balancing multi-domain data, and expanding multilingual performance without sacrificing quality.

Lin also indicated that the team is shifting its focus toward training agents capable of long-horizon reasoning for real-world tasks.

## What it means for enterprise decision-makers

Engineering teams can point existing OpenAI-compatible endpoints to the new model in hours instead of weeks. The MoE checkpoints (235B parameters with 22B active, and 30B with 3B active) deliver GPT-4-class reasoning at roughly the GPU memory cost of a 20-30B dense model.

Official LoRA and QLoRA hooks allow private fine-tuning without sending proprietary data to a third-party vendor.
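As one hedged illustration of the idea, the sketch below wires a Qwen3 checkpoint into Hugging Face's generic PEFT LoRA tooling. This is not necessarily the Qwen team's official fine-tuning recipe, and the target module names are assumptions to check against the checkpoint's actual architecture.

```python
# Generic LoRA setup with Hugging Face PEFT: only small low-rank adapter
# matrices are trained, so the base weights (and your data) stay local.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B", torch_dtype="auto")
lora_config = LoraConfig(
    r=16,                                  # rank of the low-rank update matrices
    lora_alpha=32,                         # scaling factor for the updates
    target_modules=["q_proj", "v_proj"],   # assumed attention projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small LoRA adapters are trainable
```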
Dense variants from 0.6B to 32B make it easy to prototype on laptops and scale to multi-GPU clusters without rewriting prompts.

Running the weights on-premises means all prompts and outputs can be logged and inspected. MoE sparsity reduces the number of active parameters per call, cutting the inference attack surface.

The Apache 2.0 license removes usage-based legal hurdles, though organizations should still review the export-control and governance implications of using a model trained by a China-based vendor.

At the same time, Qwen3 offers a viable alternative to models from other Chinese players, including DeepSeek, Tencent, and ByteDance, as well as to the growing number of offerings from North American providers such as OpenAI, Google, Microsoft, Anthropic, Amazon, and Meta. The permissive Apache 2.0 license, which allows unlimited commercial usage, is also a big advantage over other open source players like Meta, whose licenses are more restrictive.

The release is a further sign that the race between AI providers to offer ever more powerful and accessible models remains highly competitive, and that savvy organizations looking to cut costs should stay flexible and open to evaluating new models for their AI agents and workflows.

## Looking ahead

The Qwen team positions Qwen3 not just as an incremental improvement but as a significant step toward its future goals of artificial general intelligence (AGI) and artificial superintelligence (ASI): AI significantly smarter than humans.

Plans for Qwen's next phase include scaling data and model size further, extending context lengths, broadening modality support, and enhancing reinforcement learning with environmental feedback mechanisms.

As the landscape of large-scale AI research continues to evolve, Qwen3's open-weight release under an accessible license marks another important milestone, lowering barriers for researchers, developers, and organizations aiming to innovate with state-of-the-art LLMs.

[Source link](https://venturebeat.com/ai/alibaba-launches-open-source-qwen3-model-that-surpasses-openai-o1-and-deepseek-r1/)