{"id":1330,"date":"2025-04-21T08:33:41","date_gmt":"2025-04-21T08:33:41","guid":{"rendered":"https:\/\/violethoward.com\/new\/openais-new-gpt-4-1-models-can-process-a-million-tokens-and-solve-coding-problems-better-than-ever\/"},"modified":"2025-04-21T08:33:41","modified_gmt":"2025-04-21T08:33:41","slug":"openais-new-gpt-4-1-models-can-process-a-million-tokens-and-solve-coding-problems-better-than-ever","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/openais-new-gpt-4-1-models-can-process-a-million-tokens-and-solve-coding-problems-better-than-ever\/","title":{"rendered":"OpenAI&#8217;s new GPT-4.1 models can process a million tokens and solve coding problems better than ever"},"content":{"rendered":" \r\n<br><div>\n\t\t\t\t<p>OpenAI launched a new family of AI models this morning that significantly improve coding abilities while cutting costs, responding directly to growing competition in the enterprise AI market.<\/p>\n\n\n\n<p>The San Francisco-based AI company introduced three models \u2014 GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano \u2014 all available immediately through its API. The new lineup performs better at software engineering tasks, follows instructions more precisely, and can process up to one million tokens of context, equivalent to about 750,000 words.<\/p>\n\n\n\n<p>\u201cGPT-4.1 offers exceptional performance at a lower cost,\u201d said Kevin Weil, chief product officer at OpenAI, during Monday\u2019s announcement. 
\u201cThese models are better than GPT-4o on just about every dimension.\u201d<\/p>\n\n\n\n<p>Perhaps most significant for enterprise customers is the pricing: GPT-4.1 will cost 26% less than its predecessor, while the lightweight nano version becomes OpenAI\u2019s most affordable offering at just 12 cents per million tokens.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><p>\n<iframe loading=\"lazy\" title=\"GPT 4.1 in the API\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/kA-P9ood-cE?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/p><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-gpt-4-1-s-improvements-target-enterprise-developers-biggest-pain-points\">How GPT-4.1\u2019s improvements target enterprise developers\u2019 biggest pain points<\/h2>\n\n\n\n<p>In a candid interview with VentureBeat, Michelle Pokrass, post-training research lead at OpenAI, emphasized that practical business applications drove the development process.<\/p>\n\n\n\n<p>\u201cGPT-4.1 was trained with one goal: being useful for developers,\u201d Pokrass told VentureBeat. \u201cWe\u2019ve found GPT-4.1 is much better at following the kinds of instructions that enterprises use in practice, which makes it much easier to deploy production-ready applications.\u201d<\/p>\n\n\n\n<p>This focus on real-world utility is reflected in benchmark results. On SWE-bench Verified, which measures software engineering capabilities, GPT-4.1 scored 54.6% \u2014 a substantial 21.4 percentage point improvement over GPT-4o.<\/p>\n\n\n\n<p>For businesses developing AI agents that work independently on complex tasks, the improvements in instruction following are particularly valuable. 
On Scale\u2019s MultiChallenge benchmark, GPT-4.1 scored 38.3%, outperforming GPT-4o by 10.5 percentage points.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-why-openai-s-three-tiered-model-strategy-challenges-competitors-like-google-and-anthropic\">Why OpenAI\u2019s three-tiered model strategy challenges competitors like Google and Anthropic<\/h2>\n\n\n\n<p>The introduction of three distinct models at different price points addresses the diversifying AI marketplace. The flagship GPT-4.1 targets complex enterprise applications, while mini and nano versions address use cases where speed and cost efficiency are priorities.<\/p>\n\n\n\n<p>\u201cNot all tasks need the most intelligence or top capabilities,\u201d Pokrass told VentureBeat. \u201cNano is going to be a workhorse model for use cases like autocomplete, classification, data extraction, or anything else where speed is the top concern.\u201d<\/p>\n\n\n\n<p>Simultaneously, OpenAI announced plans to deprecate GPT-4.5 Preview \u2014 its largest and most expensive model released just two months ago \u2014 from its API by July 14. 
The company positioned GPT-4.1 as a more cost-effective replacement that delivers \u201cimproved or similar performance on many key capabilities at much lower cost and latency.\u201d<\/p>\n\n\n\n<p>This move allows OpenAI to reclaim computing resources while providing developers a more efficient alternative to its costliest offering, which had been priced at $75 per million input tokens and $150 per million output tokens.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-real-world-results-how-thomson-reuters-carlyle-and-windsurf-are-leveraging-gpt-4-1\">Real-world results: How Thomson Reuters, Carlyle and Windsurf are leveraging GPT-4.1<\/h2>\n\n\n\n<p>Several enterprise customers who tested the models prior to launch reported substantial improvements in their specific domains.<\/p>\n\n\n\n<p>Thomson Reuters saw a 17% improvement in multi-document review accuracy when using GPT-4.1 with its legal AI assistant, CoCounsel. This enhancement is particularly valuable for complex legal workflows involving lengthy documents with nuanced relationships between clauses.<\/p>\n\n\n\n<p>Financial firm Carlyle reported 50% better performance on extracting granular financial data from dense documents \u2014 a critical capability for investment analysis and decision-making.<\/p>\n\n\n\n<p>Varun Mohan, CEO of coding tool provider Windsurf (formerly Codeium), shared detailed performance metrics during the announcement.<\/p>\n\n\n\n<p>\u201cWe found that GPT-4.1 reduces the number of times that it needs to read unnecessary files by 40% compared to other leading models, and also modifies unnecessary files 70% less,\u201d Mohan said. 
\u201cThe model is also surprisingly less verbose\u2026 GPT-4.1 is 50% less verbose than other leading models.\u201d<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-million-token-context-what-businesses-can-do-with-8x-more-processing-capacity\">Million-token context: What businesses can do with 8x more processing capacity<\/h2>\n\n\n\n<p>All three models feature a context window of one million tokens \u2014 eight times larger than GPT-4o\u2019s 128,000 token limit. This expanded capacity allows the models to process multiple lengthy documents or entire codebases at once.<\/p>\n\n\n\n<p>In a demonstration, OpenAI showed GPT-4.1 analyzing a 450,000-token NASA server log file from 1995, identifying an anomalous entry hiding deep within the data. This capability is particularly valuable for tasks involving large datasets, such as code repositories or corporate document collections.<\/p>\n\n\n\n<p>However, OpenAI acknowledges performance degradation with extremely large inputs. On its internal OpenAI-MRCR test, accuracy dropped from around 84% with 8,000 tokens to 50% with one million tokens.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-the-enterprise-ai-landscape-is-shifting-as-google-anthropic-and-openai-compete-for-developers\">How the enterprise AI landscape is shifting as Google, Anthropic and OpenAI compete for developers<\/h2>\n\n\n\n<p>The release comes as competition in the enterprise AI space heats up. 
Google recently launched Gemini 2.5 Pro with a comparable one-million-token context window, while Anthropic\u2019s Claude 3.7 Sonnet has gained traction with businesses seeking alternatives to OpenAI\u2019s offerings.<\/p>\n\n\n\n<p>Chinese AI startup DeepSeek also recently upgraded its models, putting additional pressure on OpenAI to maintain its leadership position.<\/p>\n\n\n\n<p>\u201cIt\u2019s been really cool to see how improvements in long context understanding have translated into better performance on specific verticals like legal analysis and extracting financial data,\u201d Pokrass said. \u201cWe\u2019ve found it\u2019s critical to test our models beyond the academic benchmarks and make sure they perform well with enterprises and developers.\u201d<\/p>\n\n\n\n\n\n\n\n<p>By releasing these models specifically through its API rather than ChatGPT, OpenAI signals its commitment to developers and enterprise customers. The company plans to gradually incorporate features from GPT-4.1 into ChatGPT over time, but the primary focus remains on providing robust tools for businesses building specialized applications.<\/p>\n\n\n\n<p>To encourage further research in long-context processing, OpenAI is releasing two evaluation datasets: OpenAI-MRCR for testing multi-round coreference abilities and Graphwalks for evaluating complex reasoning across lengthy documents.<\/p>\n\n\n\n<p>For enterprise decision-makers, the GPT-4.1 family offers a more practical, cost-effective approach to AI implementation. As organizations continue integrating AI into their operations, these improvements in reliability, specificity, and efficiency could accelerate adoption across industries still weighing implementation costs against potential benefits.<\/p>\n\n\n\n<p>While competitors chase larger, costlier models, OpenAI\u2019s strategic pivot with GPT-4.1 suggests the future of AI may not belong to the biggest models, but to the most efficient ones. 
The real breakthrough may not be in the benchmarks, but in bringing enterprise-grade AI within reach of more businesses than ever before.<\/p>\n\t\t\t<\/div>\r\n<br>\r\n<br><a href=\"https:\/\/venturebeat.com\/security\/openais-new-gpt-4-1-models-can-process-a-million-tokens-and-solve-coding-problems-better-than-ever\/\">Source link <\/a>","protected":false},"excerpt":{"rendered":"<p>OpenAI launched a new family of AI models this morning that significantly improve coding abilities while cutting costs, responding directly to growing competition in the enterprise AI market. The San Francisco-based AI company introduced three models \u2014 GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano \u2014 all available immediately through its API. 
The new lineup performs better [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1331,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[33],"tags":[],"class_list":["post-1330","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-automation"],"aioseo_notices":[],"jetpack_featured_media_url":"https:\/\/violethoward.com\/new\/wp-content\/uploads\/2025\/04\/nuneybits_Vector_art_of_a_retro_computer_screen_on_the_screen_i_cbb588f4-05a8-43cc-9e55-c08b617871ae.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/1330","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/comments?post=1330"}],"version-history":[{"count":0,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/1330\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media\/1331"}],"wp:attachment":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media?parent=1330"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/categories?post=1330"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/tags?post=1330"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}