<h1>Ai2&rsquo;s Olmo 3 family challenges Qwen and Llama with efficient, open reasoning and customization</h1>
<p>The <u>Allen Institute for AI (Ai2)</u> hopes its latest release will capitalize on growing demand for customized models and on enterprises seeking more transparency from AI models.</p>
<p>Ai2 made the latest addition to its Olmo family of large language models available to organizations, continuing its focus on openness and customization.</p>
<p>Olmo 3 has a longer context window, more reasoning traces and better coding performance than its predecessor. Like the other Olmo releases, this version is open-sourced under the Apache 2.0 license, giving enterprises complete transparency into, and control over, the training data and checkpointing.</p>
<p>Ai2 will release three versions of Olmo 3:</p>
<ul>
<li>
<p>Olmo 3-Think, in both 7B and 32B, the flagship reasoning models for advanced research</p>
</li>
<li>
<p>Olmo 3-Base, also in both sizes, which is ideal for programming, comprehension, math and long-context reasoning. Ai2 said this version is &ldquo;ideal for continued pre-training or fine-tuning&rdquo;</p>
</li>
<li>
<p>Olmo 3-Instruct, a 7B model optimized for instruction following, multi-turn dialogue and tool use</p>
</li>
</ul>
<p>The company said Olmo 3-Think is the &ldquo;first-ever fully open 32B thinking model that generates explicit reasoning-chain-style content.&rdquo; Olmo 3-Think also has a long context window of 65,000 tokens, well suited to longer-running agentic projects and reasoning over longer documents.</p>
<p>Noah Smith, Ai2&rsquo;s senior director of NLP research, told VentureBeat in an interview that many of its customers, from regulated enterprises to research institutions, want models that give them assurance about what went into the training.</p>
<p>&ldquo;The releases from our friends in the tech world are very cool and super exciting, but there are a lot of people for whom data privacy, control over what goes into the model, how the models train and other constraints on how the model can be used are front of mind,&rdquo; said Smith.</p>
<p>Developers can access the models on Hugging Face and the Ai2 Playground.</p>
<h2>Transparency and customization</h2>
<p>Smith said the company believes any organization using models like Olmo 3 should be able to control them and mold them in the way that works best for it.</p>
<p>&ldquo;We don&rsquo;t believe in one-size-fits-all solutions,&rdquo; Smith said. &ldquo;It&rsquo;s a known thing in the world of machine learning that if you try and build a model that solves all the problems, it ends up not really being the best model for any one problem. There aren&rsquo;t formal proofs of that, but it&rsquo;s a thing that old-timers like me have kind of observed.&rdquo;</p>
<p>He added that models with the ability to specialize &ldquo;are maybe not as flashy as getting high scores on math exams&rdquo; but offer more flexibility for enterprises.</p>
<p>Olmo 3 allows enterprises to essentially retrain the model by adding to the data mix it learns from. The idea is that businesses can bring in their proprietary sources to guide the model in answering company-specific queries. To help enterprises during this process, Ai2 added checkpoints from every major training phase.</p>
<p>Demand for model customization has grown as enterprises that cannot build their own LLMs look to create company-specific or industry-focused models. Startups like <u>Arcee</u> have <u>begun offering</u> enterprise-focused, customizable small models.</p>
<p>Models like Olmo 3, Smith said, also give enterprises more confidence in the technology. Because Olmo 3 ships with its training data, he said, enterprises can trust that the model did not ingest anything it shouldn&rsquo;t have.</p>
<p>Ai2 has long positioned itself as committed to transparency, even launching a tool called <u>OlmoTrace in April</u> that can trace a model&rsquo;s output directly back to the original training data. The company releases open-sourced models and posts its code to repositories like GitHub for anyone to use.</p>
<p>Competitors like Google and OpenAI have <u>faced criticism from developers</u> over hiding raw reasoning tokens in favor of summaries, with developers saying that without that transparency they are left &ldquo;debugging blind.&rdquo;</p>
<p>Ai2 pretrained Olmo 3 on Dolma 3, a six-trillion-token open-source dataset encompassing web data, scientific literature and code. Smith said the team optimized Olmo 3 for code, compared with the focus on math for Olmo 2.</p>
<h2>How it stacks up</h2>
<p>Ai2 claims that the Olmo 3 family represents a significant leap for truly open-source models, at least among open-source LLMs developed outside China. The base Olmo 3 model was trained &ldquo;with roughly 2.5x greater compute efficiency as measured by GPU-hours per token,&rdquo; meaning it consumed less energy and cost less during pre-training.</p>
<p>The company said the Olmo 3 models outperformed other open models, such as Marin from Stanford, LLM360&rsquo;s K2 and Apertus, though Ai2 did not provide figures for the benchmark testing.</p>
<p>&ldquo;Of note, Olmo 3-Think (32B) is the strongest fully open reasoning model, narrowing the gap to the best open-weight models of similar scale, such as the Qwen 3-32B-Thinking series of models across our suite of reasoning benchmarks, all while being trained on 6x fewer tokens,&rdquo; Ai2 said in a press release.</p>
<p>The company added that Olmo 3-Instruct performed better than Qwen 2.5, Gemma 3 and Llama 3.1.</p>
<p><a href="https://venturebeat.com/ai/ai2s-olmo-3-family-challenges-qwen-and-llama-with-efficient-open-reasoning">Source link</a></p>