{"id":2738,"date":"2025-07-24T13:55:37","date_gmt":"2025-07-24T13:55:37","guid":{"rendered":"https:\/\/violethoward.com\/new\/qwen3-coder-480b-a35b-instruct-launches-and-it-might-be-the-best-coding-model-yet\/"},"modified":"2025-07-24T13:55:37","modified_gmt":"2025-07-24T13:55:37","slug":"qwen3-coder-480b-a35b-instruct-launches-and-it-might-be-the-best-coding-model-yet","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/qwen3-coder-480b-a35b-instruct-launches-and-it-might-be-the-best-coding-model-yet\/","title":{"rendered":"Qwen3-Coder-480B-A35B-Instruct launches and it &#8216;might be the best coding model yet&#8217;"},"content":{"rendered":" \r\n<br><div>\n\t\t\t\t<div id=\"boilerplate_2682874\" class=\"post-boilerplate boilerplate-before\">\n<p><em>Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders.<\/em> <em>Subscribe Now<\/em><\/p>\n\n\n\n<hr class=\"wp-block-separator has-css-opacity is-style-wide\"\/>\n<\/div><p>Chinese e-commerce giant Alibaba\u2019s \u201cQwen Team\u201d has done it again.<\/p>\n\n\n\n<p>Mere days after releasing, for free and under an open source license, the lengthily named Qwen3-235B-A22B-2507 \u2014 now the top-performing non-reasoning large language model (LLM) in the world, full stop, even compared to proprietary AI models from well-funded U.S. labs such as Google and OpenAI \u2014 this group of AI researchers has come out with yet another blockbuster model. <\/p>\n\n\n\n<p>That is <strong>Qwen3-Coder-480B-A35B-Instruct<\/strong>, a new open-source LLM focused on assisting with software development. 
It is designed to handle complex, multi-step coding workflows and can create full-fledged, functional applications in <em>seconds<\/em> or minutes.<\/p>\n\n\n\n<p>The model is positioned to compete with proprietary offerings like Claude Sonnet-4 in agentic coding tasks, and it sets new benchmark records among open models.<\/p>\n\n\n\n<p>It is available on Hugging Face, GitHub, Qwen Chat, via Alibaba\u2019s Qwen API, and a growing list of third-party vibe coding and AI tool platforms. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-open-sourcing-licensing-means-low-cost-and-high-optionality-for-enterprises\">Open source licensing means low cost and high optionality for enterprises<\/h2>\n\n\n\n<p>But unlike Claude and other proprietary models, Qwen3-Coder, as we\u2019ll call it for short, is available now under an open source Apache 2.0 license, meaning any enterprise is free to download, modify, deploy and use it in commercial applications for employees or end customers without paying Alibaba or anyone else a dime. <\/p>\n\n\n\n<p>It also performs so well on third-party benchmarks, and in anecdotal \u201cvibe coding\u201d usage among AI power users \u2014 coding using natural language, without formal development processes and steps \u2014 that LLM researcher Sebastian Raschka wrote on X: <em>\u201cThis might be the best coding model yet. General-purpose is cool, but if you want the best at coding, specialization wins. 
No free lunch.\u201d<\/em><\/p>\n\n\n\n<p>Developers and enterprises interested in downloading it can find the code on the AI code sharing repository Hugging Face.<\/p>\n\n\n\n<p>Enterprises that don\u2019t wish to host the model, or don\u2019t have the capacity to do so on their own or through various third-party cloud inference providers, can also use it directly through the Alibaba Cloud Qwen API, where pricing is tiered by context length: $1\/$5 per million tokens (mTok) for input\/output at up to 32,000 tokens, then $1.80\/$9 for up to 128,000, $3\/$15 for up to 256,000 and $6\/$60 for the full million.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img fetchpriority=\"high\" decoding=\"async\" height=\"394\" width=\"800\" src=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-at-2.38.14%E2%80%AFPM.png?w=800\" alt=\"\" class=\"wp-image-3014612\" style=\"width:840px;height:auto\" srcset=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-at-2.38.14\u202fPM.png 1162w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-at-2.38.14\u202fPM.png?resize=300,148 300w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-at-2.38.14\u202fPM.png?resize=768,378 768w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-at-2.38.14\u202fPM.png?resize=800,394 800w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-at-2.38.14\u202fPM.png?resize=100,50 100w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-at-2.38.14\u202fPM.png?resize=400,197 400w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-at-2.38.14\u202fPM.png?resize=750,369 750w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-at-2.38.14\u202fPM.png?resize=578,285 578w, 
https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-at-2.38.14\u202fPM.png?resize=930,458 930w\" sizes=\"(max-width: 800px) 100vw, 800px\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-model-architecture-and-capabilities\">Model architecture and capabilities<\/h2>\n\n\n\n<p>According to the documentation released by Qwen Team online, Qwen3-Coder is a Mixture-of-Experts (MoE) model with 480 billion total parameters, 35 billion active per query, and 8 active experts out of 160. <\/p>\n\n\n\n<p>It supports 256K token context lengths natively, with extrapolation up to 1 million tokens using YaRN (Yet another RoPE extrapolatioN \u2014 a technique used to extend a language model\u2019s context length beyond its original training limit by modifying the Rotary Positional Embeddings (RoPE) used during attention computation). This capacity enables the model to understand and manipulate entire repositories or lengthy documents in a single pass.<\/p>\n\n\n\n<p>Designed as a causal language model, it features 62 layers, 96 attention heads for queries, and 8 for key-value pairs. It is optimized for token-efficient, instruction-following tasks and omits support for <code>&lt;think&gt;<\/code> blocks by default, streamlining its outputs.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-high-performance\">High performance<\/h2>\n\n\n\n<p>Qwen3-Coder has achieved leading performance among open models on several agentic evaluation suites. On SWE-bench Verified, it scores:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Qwen3-Coder: 67.0% (standard), 69.6% (500-turn)<\/li>\n\n\n\n<li>GPT-4.1: 54.6%<\/li>\n\n\n\n<li>Gemini 2.5 Pro Preview: 49.0%<\/li>\n\n\n\n<li>Claude Sonnet-4: 70.4%<\/li>\n<\/ul>\n\n\n\n<p>The model also scores competitively across tasks such as agentic browser use, multi-language programming, and tool use. 
Visual benchmarks show progressive improvement across training iterations in categories like code generation, SQL programming, code editing, and instruction following.<\/p>\n\n\n\n\n\n\n\n<p>Alongside the model, Qwen has open-sourced Qwen Code, a CLI tool forked from Gemini Code. This interface supports function calling and structured prompting, making it easier to integrate Qwen3-Coder into coding workflows. Qwen Code supports Node.js environments and can be installed via npm or from source.<\/p>\n\n\n\n<p>Qwen3-Coder also integrates with developer platforms such as:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Claude Code (via DashScope proxy or router customization)<\/li>\n\n\n\n<li>Cline (as an OpenAI-compatible backend)<\/li>\n\n\n\n<li>Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers<\/li>\n<\/ul>\n\n\n\n<p>Developers can run Qwen3-Coder locally or connect via OpenAI-compatible APIs using endpoints hosted on Alibaba Cloud.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-post-training-techniques-code-rl-and-long-horizon-planning\">Post-training techniques: code RL and long-horizon planning<\/h2>\n\n\n\n<p>In addition to pretraining on 7.5 trillion tokens (70% code), Qwen3-Coder benefits from advanced post-training techniques:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Code RL (Reinforcement Learning): Emphasizes high-quality, execution-driven learning on diverse, verifiable code tasks<\/li>\n\n\n\n<li>Long-Horizon Agent RL: Trains the model to plan, use tools, and adapt over multi-turn interactions<\/li>\n<\/ul>\n\n\n\n<p>This phase simulates real-world software engineering challenges. 
To enable it, Qwen built a 20,000-environment system on Alibaba Cloud, offering the scale necessary for evaluating and training models on complex workflows like those found in SWE-bench.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-enterprise-implications-ai-for-engineering-and-devops-workflows\">Enterprise implications: AI for engineering and DevOps workflows<\/h2>\n\n\n\n<p>For enterprises, Qwen3-Coder offers an open, highly capable alternative to closed-source proprietary models. With strong results in coding execution and long-context reasoning, it is especially relevant for:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Codebase-level understanding:<\/strong> Ideal for AI systems that must comprehend large repositories, technical documentation, or architectural patterns<\/li>\n\n\n\n<li><strong>Automated pull request workflows:<\/strong> Its ability to plan and adapt across turns makes it suitable for auto-generating or reviewing pull requests<\/li>\n\n\n\n<li><strong>Tool integration and orchestration:<\/strong> Through its native tool-calling APIs and function interface, the model can be embedded in internal tooling and CI\/CD systems. 
This makes it especially viable for agentic workflows and products, i.e., those where the user triggers one or multiple tasks that they want the AI model to carry out autonomously, checking in only when finished or when questions arise.<\/li>\n\n\n\n<li><strong>Data residency and cost control: <\/strong>As an open model, enterprises can deploy Qwen3-Coder on their own infrastructure\u2014whether cloud-native or on-prem\u2014avoiding vendor lock-in and managing compute usage more directly<\/li>\n<\/ul>\n\n\n\n<p>Support for long contexts and modular deployment options across various dev environments makes Qwen3-Coder a candidate for production-grade AI pipelines in both large tech companies and smaller engineering teams.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-developer-access-and-best-practices\">Developer access and best practices<\/h2>\n\n\n\n<p>To use Qwen3-Coder optimally, Qwen recommends:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Sampling settings: temperature=0.7, top_p=0.8, top_k=20, repetition_penalty=1.05<\/li>\n\n\n\n<li>Output length: Up to 65,536 tokens<\/li>\n\n\n\n<li>Transformers version: 4.51.0 or later (older versions may throw errors due to qwen3_moe incompatibility)<\/li>\n<\/ul>\n\n\n\n<p>APIs and SDK examples are provided using OpenAI-compatible Python clients. 
<\/p>\n\n\n\n<p>Developers can define custom tools and let Qwen3-Coder dynamically invoke them during conversation or code generation tasks.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-warm-early-reception-from-ai-power-users\">Warm early reception from AI power users<\/h2>\n\n\n\n<p>Initial responses to Qwen3-Coder-480B-A35B-Instruct have been notably positive among AI researchers, engineers, and developers who have tested the model in real-world coding workflows.<\/p>\n\n\n\n<p>In addition to Raschka\u2019s lofty praise above, Wolfram Ravenwolf, an AI engineer and evaluator at EllamindAI, shared his experience integrating the model with Claude Code on X, stating, <em>\u201cThis is surely the best one currently.\u201d<\/em> <\/p>\n\n\n\n<p>After testing several integration proxies, Ravenwolf said he ultimately built his own using LiteLLM to ensure optimal performance, demonstrating the model\u2019s appeal to hands-on practitioners focused on toolchain customization.<\/p>\n\n\n\n<p>Educator and AI tinkerer Kevin Nelson also weighed in on X after using the model for simulation tasks. 
<\/p>\n\n\n\n<p><em>\u201cQwen 3 Coder is on another level,\u201d<\/em> he posted, noting that the model not only executed on provided scaffolds but even embedded a message within the output of the simulation \u2014 an unexpected but welcome sign of the model\u2019s awareness of task context.<\/p>\n\n\n\n<p>Even Jack Dorsey, co-founder of Twitter and founder of Square (now called \u201cBlock\u201d), posted an X message in praise of the model, writing: \u201c<em>Goose + qwen3-coder = wow,<\/em>\u201d in reference to Block\u2019s open source AI agent framework Goose, which VentureBeat covered back in January 2025.<\/p>\n\n\n\n<p>These responses suggest Qwen3-Coder is resonating with a technically savvy user base seeking performance, adaptability, and deeper integration with existing development stacks.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-looking-ahead-more-sizes-more-use-cases\">Looking ahead: more sizes, more use cases<\/h2>\n\n\n\n<p>While this release focuses on the most powerful variant, Qwen3-Coder-480B-A35B-Instruct, the Qwen team indicates that additional model sizes are in development. <\/p>\n\n\n\n<p>These will aim to offer similar capabilities with lower deployment costs, broadening accessibility.<\/p>\n\n\n\n<p>Future work also includes exploring self-improvement, as the team investigates whether agentic models can iteratively refine their own performance through real-world use.<\/p>\n<div id=\"boilerplate_2660155\" class=\"post-boilerplate boilerplate-after\"><div class=\"Boilerplate__newsletter-container vb\">\n<div class=\"Boilerplate__newsletter-main\">\n<p><strong>Daily insights on business use cases with VB Daily<\/strong><\/p>\n<p class=\"copy\">If you want to impress your boss, VB Daily has you covered. 
We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.<\/p>\n<p class=\"Form__newsletter-legal\">Read our Privacy Policy<\/p>\n<p class=\"Form__success\" id=\"boilerplateNewsletterConfirmation\">\n\t\t\t\t\tThanks for subscribing. Check out more VB newsletters here.\n\t\t\t\t<\/p>\n<p class=\"Form__error\">An error occurred.<\/p>\n<\/div>\n<div class=\"image-container\">\n\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/venturebeat.com\/wp-content\/themes\/vb-news\/brand\/img\/vb-daily-phone.png\" alt=\"\"\/>\n\t\t\t\t<\/div>\n<\/div>\n<\/div>\t\t\t<\/div>\r\n<br>\r\n<br><a href=\"https:\/\/venturebeat.com\/programming-development\/qwen3-coder-480b-a35b-instruct-launches-and-it-might-be-the-best-coding-model-yet\/\">Source link <\/a>","protected":false},"excerpt":{"rendered":"<p>Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now Chinese e-commerce giant Alibaba\u2019s \u201cQwen Team\u201d has done it again. 
Mere days after releasing for free and with open source licensing what is now the top performing non-reasoning large [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2739,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[33],"tags":[],"class_list":["post-2738","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-automation"],"aioseo_notices":[],"jetpack_featured_media_url":"https:\/\/violethoward.com\/new\/wp-content\/uploads\/2025\/07\/Screenshot-2025-07-23-at-2.38.14E280AFPM.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/2738","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/comments?post=2738"}],"version-history":[{"count":0,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/2738\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media\/2739"}],"wp:attachment":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media?parent=2738"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/categories?post=2738"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/tags?post=2738"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}