{"id":949,"date":"2025-04-03T14:46:32","date_gmt":"2025-04-03T14:46:32","guid":{"rendered":"https:\/\/violethoward.com\/new\/what-you-need-to-know-about-amazon-nova-act-the-new-ai-agent-sdk-challenging-openai-microsoft-salesforce\/"},"modified":"2025-04-03T14:46:32","modified_gmt":"2025-04-03T14:46:32","slug":"what-you-need-to-know-about-amazon-nova-act-the-new-ai-agent-sdk-challenging-openai-microsoft-salesforce","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/what-you-need-to-know-about-amazon-nova-act-the-new-ai-agent-sdk-challenging-openai-microsoft-salesforce\/","title":{"rendered":"What you need to know about Amazon Nova Act: the new AI agent SDK challenging OpenAI, Microsoft, Salesforce"},"content":{"rendered":" \r\n<br><div>\n\t\t\t\t<p>The sleeping giant has awoken!<\/p>\n\n\n\n<p>For a while, it seemed like Amazon was playing catchup in the race to offer its users \u2014 particularly the millions of developers building atop Amazon Web Services (AWS)\u2019s cloud infrastructure \u2014 compelling first-party AI models and tools. <\/p>\n\n\n\n<p>But in late 2024, it debuted its own internal foundation model family, Amazon Nova, with text, image and even video generation capabilities, and last month saw a new Amazon Alexa voice assistant powered in part by Anthropic\u2019s Claude family of models. 
<\/p>\n\n\n\n<p>Then, on Monday, the e-commerce and cloud giant\u2019s artificial general intelligence division, Amazon AGI, announced the release of Amazon Nova Act, an experimental developer kit for building AI agents that can navigate the web and complete tasks autonomously, powered by a custom, proprietary version of Amazon\u2019s Nova large language model (LLM). Oh, and the software development kit (SDK) is open source under a permissive Apache 2.0 license, though the SDK is designed to work only with Amazon\u2019s in-house custom Nova model, not any third-party ones.<\/p>\n\n\n\n<p>The goal is to enable third-party developers to build AI agents capable of reliably performing tasks within web browsers. <\/p>\n\n\n\n<p>But how does Amazon\u2019s Nova Act stack up to other agent-building platforms on the market, such as Microsoft\u2019s AutoGen, Salesforce\u2019s Agentforce, and of course, OpenAI\u2019s recently released open source Agents SDK? <\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-a-different-more-thoughtful-approach-to-ai-agents\">A different, more thoughtful approach to AI agents<\/h2>\n\n\n\n<p>Since the public rise of large language models (LLMs), most \u201cagent\u201d systems have been limited to responding in natural language or providing information by querying knowledge bases. <\/p>\n\n\n\n<p>Nova Act is part of the larger industry shift toward action-based agents\u2014systems that can complete actual tasks across digital environments on behalf of the user. OpenAI\u2019s new Responses API, which gives users access to its autonomous browser navigator, is one leading example, and developers can integrate it into AI agents through the OpenAI Agents SDK.<\/p>\n\n\n\n<p>Amazon AGI emphasizes that current agent systems, while promising, struggle with reliability and often require human supervision, especially when handling multi-step or complex workflows. 
<\/p>\n\n\n\n<p>Nova Act is specifically designed to address these limitations by providing a set of atomic, prescriptive commands that can be chained together into reliable workflows.<\/p>\n\n\n\n<p>Deniz Birlikci, a Member of Technical Staff at Amazon, described the broader vision in a video introducing Nova Act: soon, there will be more AI agents than people browsing the web, carrying out tasks on behalf of users.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><p>\n<iframe loading=\"lazy\" title=\"Introducing Amazon Nova Act\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/JLLapxWmalU?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/p><\/figure>\n\n\n\n<p>David Luan, VP of Amazon\u2019s Autonomy Team and Head of AGI SF Lab, framed the mission more directly in a recent video call interview with VentureBeat: \u201cWe\u2019ve created this new experimental AI model that is trained to perform actions in a web browser. Fundamentally, we think that agents are the building block of computing,\u201d he said. <\/p>\n\n\n\n<p>Luan, formerly a co-founder and CEO of Adept AI, joined Amazon in 2024 as part of an acqui-hire. Luan said he has long been a proponent of AI agents. \u201cWith Adept, we were the first company to really start working on AI agents. At this point, everybody knows how important agents are. It was pretty cool to be a bit ahead of our time,\u201d he added.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-what-nova-act-offers-devs\">What Nova Act offers devs<\/h2>\n\n\n\n<p>The Nova Act SDK provides developers with a framework for constructing web-based automation agents using natural language prompts broken down into clear, manageable steps. 
<\/p>\n\n\n\n<p>Unlike typical LLM-powered agents that attempt entire workflows from a single prompt\u2014often resulting in unreliable behavior\u2014Nova Act is designed to incrementally execute smaller, verifiable tasks.<\/p>\n\n\n\n<p>Some of the key features of Nova Act include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Fine-Grained Task Decomposition:<\/strong> Developers can break down complex digital workflows into smaller act() calls, each guiding the agent to perform specific UI interactions.<\/li>\n\n\n\n<li><strong>Direct Browser Manipulation via Playwright:<\/strong> Nova Act integrates with <strong>Playwright<\/strong>, an open-source browser automation framework developed by <strong>Microsoft<\/strong>. Playwright allows developers to control web browsers programmatically\u2014clicking elements, filling forms, or navigating pages\u2014without relying solely on AI predictions. This integration is particularly useful for handling sensitive tasks such as entering passwords or credit card details. For example, instead of sending sensitive information to the model, developers can instruct Nova Act to focus on a password field and then use Playwright APIs to securely enter the password without the model ever \u201cseeing\u201d it. 
This approach helps strengthen security and privacy when automating web interactions.<\/li>\n\n\n\n<li><strong>Python Integration:<\/strong> The SDK allows developers to interleave Python code with Nova Act commands, including standard Python tools such as breakpoints, assertions, or thread pooling for parallel execution.<\/li>\n\n\n\n<li><strong>Structured Information Extraction:<\/strong> The SDK supports structured data extraction through Pydantic schemas, allowing agents to convert screen content into structured formats.<\/li>\n\n\n\n<li><strong>Parallelization and Scheduling:<\/strong> Developers can run multiple Nova Act instances concurrently and schedule automated workflows without the need for continuous human oversight.<\/li>\n<\/ul>\n\n\n\n<p>Luan emphasized that Nova Act is a tool for developers rather than a general-purpose chatbot. \u201cNova Act is built for developers. It\u2019s not a chatbot you talk to for fun. It\u2019s designed to let developers start building useful products,\u201d he said.<\/p>\n\n\n\n<p>For example, one of the sample workflows demonstrated in Amazon\u2019s documentation shows how Nova Act can automate apartment searches by scraping rental listings and calculating biking distance to train stations, then sorting the results in a structured table.<\/p>\n\n\n\n<p>Another showcased example uses Nova Act to order a specific salad from Sweetgreen every Tuesday, entirely hands-free and on a schedule, illustrating how developers can automate repeatable digital tasks in a way that feels reliable and customizable.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-benchmark-performance-and-a-focus-on-reliability\">Benchmark performance and a focus on reliability<\/h2>\n\n\n\n<p>A central message in Amazon\u2019s announcement is that reliability, not just intelligence, is the key barrier to widespread agent adoption. 
<\/p>\n\n\n\n<p>Current state-of-the-art models are actually quite brittle at powering AI agents, with agents typically achieving 30% to 60% success rates on browser-based multi-step tasks, according to Amazon.<\/p>\n\n\n\n<p>Nova Act, however, emphasizes a building-block approach, scoring over 90% on internal evaluations of tasks that challenge other models\u2014such as interacting with dropdowns, date pickers, or pop-ups.<\/p>\n\n\n\n<p>Luan underscored why that reliability focus matters. \u201cWhat we\u2019ve really focused on is how do you actually make agents reliable? If you ask it to update a record in Salesforce and it deletes your database one out of ten times, you\u2019re probably never going to use it again,\u201d he said.<\/p>\n\n\n\n<p>Amazon AGI benchmarked Nova Act against competing models including Anthropic\u2019s Claude 3.7 Sonnet and OpenAI\u2019s CUA model. On the ScreenSpot Web Text benchmark, which tests instruction-following on textual screen elements, Nova Act achieved a score of 0.939, outperforming Claude 3.7 Sonnet (0.900) and OpenAI CUA (0.883). 
<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img fetchpriority=\"high\" decoding=\"async\" width=\"775\" height=\"597\" src=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-02-at-3.10.09%E2%80%AFPM.png\" alt=\"\" class=\"wp-image-3003017\" style=\"width:840px;height:auto\" srcset=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-02-at-3.10.09\u202fPM.png 775w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-02-at-3.10.09\u202fPM.png?resize=300,231 300w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-02-at-3.10.09\u202fPM.png?resize=768,592 768w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-02-at-3.10.09\u202fPM.png?resize=400,308 400w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-02-at-3.10.09\u202fPM.png?resize=750,578 750w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/04\/Screenshot-2025-04-02-at-3.10.09\u202fPM.png?resize=578,445 578w\" sizes=\"(max-width: 775px) 100vw, 775px\"\/><figcaption class=\"wp-element-caption\">Amazon Nova Act benchmarks. Credit: Amazon<\/figcaption><\/figure>\n\n\n\n<p>On the ScreenSpot Web Icon benchmark, which focuses on visual UI elements, Nova Act scored 0.879, again ahead of the other models. <\/p>\n\n\n\n<p>However, on the GroundUI Web benchmark, which tests general UI interaction, Nova Act scored 0.805, slightly behind its competitors.<\/p>\n\n\n\n<p>These scores were measured internally by Amazon using consistent prompts and evaluation criteria.<\/p>\n\n\n\n<p>Amazon also highlighted early results in Nova Act\u2019s ability to generalize beyond standard environments. 
<\/p>\n\n\n\n<p>For instance, team member Rick Liu demonstrated how the agent, without explicit training, successfully interacted with a pigeon-themed web game\u2014assigning stats, battling opponents, and progressing in the game.<\/p>\n\n\n\n<p>According to Luan, that ability to generalize is central to the long-term vision. \u201cOur goal with Nova Act is to be a universal browser-use solution. We want an agent that can do anything you want to do on a computer for you,\u201d he said.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-flexible-for-use-in-different-clouds-but-locked-to-amazon-s-nova-model\">Flexible for use in different clouds, but locked to Amazon\u2019s Nova model<\/h2>\n\n\n\n<p>While Nova Act is accessible to developers globally through nova.amazon.com, Luan clarified that the system is tightly coupled to Amazon\u2019s in-house Nova foundation models. <\/p>\n\n\n\n<p>Developers cannot plug in external LLMs such as OpenAI\u2019s GPT-4o or Anthropic\u2019s Claude 3.7 Sonnet, unlike with OpenAI\u2019s Agents SDK, and to a lesser extent, Microsoft\u2019s AutoGen and Salesforce\u2019s Agentforce platforms (which allow switching to a few different provider companies and model families).<\/p>\n\n\n\n<p>\u201cNova Act is a custom trained version of the Nova model,\u201d he said. \u201cIt\u2019s not just a scaffolding over a generic LLM. It\u2019s natively trained to act on the internet on your behalf.\u201d<\/p>\n\n\n\n<p>However, Nova Act is not restricted to AWS environments. Developers can download the SDK and run it locally, in the cloud, or wherever they choose. \u201cYou don\u2019t need to be on AWS to use it,\u201d Luan stated.<\/p>\n\n\n\n<p>Thus, for businesses looking for maximum underlying model flexibility for their agents, Nova Act is probably not the best choice. 
However, for those looking for a purpose-built model specifically designed to navigate the web and perform actions across a wide variety of websites with very different user interfaces (UIs), it\u2019s probably worth a look \u2014 especially if you\u2019re already in the Amazon or AWS developer ecosystem. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-security-licensing-and-pricing\">Security, licensing and pricing<\/h2>\n\n\n\n<p>The Nova Act SDK is released under the Apache License, Version 2.0 (January 2004), an open source license. However, this applies only to the SDK software. <\/p>\n\n\n\n<p>The Nova Act model itself, along with its weights and training data, is proprietary and remains closed-source. The approach is intentional, according to Luan, who explained that the model is tightly integrated and co-trained with the SDK to achieve reliability.<\/p>\n\n\n\n<p>At launch, Nova Act is offered as a free research preview. There is no announced pricing for production use yet. <\/p>\n\n\n\n<p>Luan described this phase as an opportunity for developers to experiment and build with the technology. \u201cOur belief is that the majority of the most useful agent products have not yet been built. We want to enable anybody to build a really useful agent, whether for themselves or as a product,\u201d he said.<\/p>\n\n\n\n<p>Longer term, Amazon plans to introduce production-grade terms, including usage-based billing and scaling guarantees, but those are not yet available.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-what-s-next-for-nova-act\">What\u2019s next for Nova Act?<\/h2>\n\n\n\n<p>The release of Nova Act reflects Amazon\u2019s broader ambition to make action-oriented AI agents a foundational component of computing. 
<\/p>\n\n\n\n<p>Luan summed up the opportunity ahead: \u201cMy personal dream is that agents become the building block of computing, and the coolest new startups and products get built on top of what our team is developing.\u201d<\/p>\n\n\n\n<p>The Nova Act SDK is available now for experimentation and prototyping on Amazon\u2019s website and on GitHub.<\/p>\n\t\t\t<\/div>\r\n<br>\r\n<br><a href=\"https:\/\/venturebeat.com\/ai\/what-you-need-to-know-about-amazon-nova-act-the-new-ai-agent-sdk-challenging-openai-microsoft-salesforce\/\">Source link <\/a>","protected":false},"excerpt":{"rendered":"<p>The sleeping giant has awoken! 
For a while, it seemed like Amazon was playing catchup in the race to offer its users \u2014 particularly the millions of developers building atop Amazon Web Services (AWS)\u2019s cloud infrastructure [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":950,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[33],"tags":[],"class_list":["post-949","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-automation"],"aioseo_notices":[],"jetpack_featured_media_url":"https:\/\/violethoward.com\/new\/wp-content\/uploads\/2025\/04\/cfr0z3n_a_diverse_group_of_scientists_in_white_lab_coats_stan_ac9142cc-1eb4-4e10-a891-29182615ad56_0.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/949","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/comments?post=949"}],"version-history":[{"count":0,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/949\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media\/950"}],"wp:attachment":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media?parent=949"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/categories?post=949"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/tags?post=949"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}