{"id":2154,"date":"2025-06-28T17:57:56","date_gmt":"2025-06-28T17:57:56","guid":{"rendered":"https:\/\/violethoward.com\/new\/ai-agents-are-hitting-a-liability-wall-mixus-has-a-plan-to-overcome-it-using-human-overseers-on-high-risk-workflows\/"},"modified":"2025-06-28T17:57:56","modified_gmt":"2025-06-28T17:57:56","slug":"ai-agents-are-hitting-a-liability-wall-mixus-has-a-plan-to-overcome-it-using-human-overseers-on-high-risk-workflows","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/ai-agents-are-hitting-a-liability-wall-mixus-has-a-plan-to-overcome-it-using-human-overseers-on-high-risk-workflows\/","title":{"rendered":"AI agents are hitting a liability wall. Mixus has a plan to overcome it using human overseers on high-risk workflows"},"content":{"rendered":" \r\n<br><div>\n\t\t\t\t<p>While enterprises face the challenges of deploying AI agents in critical applications, a new, more pragmatic model is emerging that puts humans back in control as a strategic safeguard against AI failure.\u00a0<\/p>\n\n\n\n<p>One such example is Mixus, a platform that uses a \u201ccolleague-in-the-loop\u201d approach to make AI agents reliable for mission-critical work. <\/p>\n\n\n\n<p>This approach is a response to the growing evidence that fully autonomous agents are a high-stakes gamble.\u00a0<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-high-cost-of-unchecked-ai\">The high cost of unchecked AI<\/h2>\n\n\n\n<p>The problem of AI hallucinations has become a tangible risk as companies explore AI applications. 
In a recent incident, the AI-powered code editor Cursor saw its own support bot invent a fake policy restricting subscriptions, sparking a wave of public customer cancellations.\u00a0<\/p>\n\n\n\n<p>Similarly, the fintech company Klarna famously reversed course on replacing customer service agents with AI after admitting the move resulted in lower quality. In a more alarming case, New York City\u2019s AI-powered business chatbot advised entrepreneurs to engage in illegal practices, highlighting the catastrophic compliance risks of unmonitored agents.<\/p>\n\n\n\n<p>These incidents are symptoms of a larger capability gap. According to a May 2025 Salesforce research paper, today\u2019s leading agents succeed only 58% of the time on single-step tasks and just 35% of the time on multi-step ones, highlighting \u201ca significant gap between current LLM capabilities and the multifaceted demands of real-world enterprise scenarios.\u201d\u00a0<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-colleague-in-the-loop-model\">The colleague-in-the-loop model<\/h2>\n\n\n\n<p>To bridge this gap, a new approach focuses on structured human oversight. \u201cAn AI agent should act at your direction and on your behalf,\u201d Mixus co-founder Elliot Katz told VentureBeat. \u201cBut without built-in organizational oversight, fully autonomous agents often create more problems than they solve.\u201d\u00a0<\/p>\n\n\n\n<p>This philosophy underpins Mixus\u2019s colleague-in-the-loop model, which embeds human verification directly into automated workflows. For example, a large retailer might receive weekly reports from thousands of stores that contain critical operational data (e.g., sales volumes, labor hours, productivity ratios, compensation requests from headquarters). Human analysts must spend hours manually reviewing the data and making decisions based on heuristics. 
With Mixus, the AI agent automates the heavy lifting, analyzing complex patterns and flagging anomalies like unusually high salary requests or productivity outliers.\u00a0<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img fetchpriority=\"high\" decoding=\"async\" height=\"469\" width=\"800\" src=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/image_ec02d1.png?w=800\" alt=\"\" class=\"wp-image-3013517\" srcset=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/image_ec02d1.png 1612w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/image_ec02d1.png?resize=300,176 300w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/image_ec02d1.png?resize=768,451 768w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/image_ec02d1.png?resize=800,469 800w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/image_ec02d1.png?resize=1536,901 1536w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/image_ec02d1.png?resize=400,235 400w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/image_ec02d1.png?resize=750,440 750w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/image_ec02d1.png?resize=578,339 578w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/image_ec02d1.png?resize=930,546 930w\" sizes=\"(max-width: 800px) 100vw, 800px\"\/><\/figure>\n\n\n\n<p>For high-stakes decisions like payment authorizations or policy violations \u2014 workflows defined by a human user as \u201chigh-risk\u201d \u2014 the agent pauses and requires human approval before proceeding. The division of labor between AI and humans has been integrated into the agent creation process.<\/p>\n\n\n\n<p>\u201cThis approach means humans only get involved when their expertise actually adds value \u2014 typically the critical 5-10% of decisions that could have significant impact \u2014 while the remaining 90-95% of routine tasks flow through automatically,\u201d Katz said. 
\u201cYou get the speed of full automation for standard operations, but human oversight kicks in precisely when context, judgment, and accountability matter most.\u201d<\/p>\n\n\n\n<p>In a demo that the Mixus team showed to VentureBeat, creating an agent is an intuitive process that can be done with plain-text instructions. To build a fact-checking agent for reporters, for example, co-founder Shai Magzimof simply described the multi-step process in natural language and instructed the platform to embed human verification steps with specific thresholds, such as when a claim is high-risk and can result in reputational damage or legal consequences.\u00a0<\/p>\n\n\n\n<p>One of the platform\u2019s core strengths is its integrations with tools like Google Drive, email, and Slack, allowing enterprise users to bring their own data sources into workflows and interact with agents directly from their communication platform of choice, without having to switch contexts or learn a new interface (for example, the fact-checking agent was instructed to send approval requests to the editor\u2019s email).<\/p>\n\n\n\n<p>The platform\u2019s integration capabilities extend further to meet specific enterprise needs. Mixus supports the Model Context Protocol (MCP), which enables businesses to connect agents to their bespoke tools and APIs, avoiding the need to reinvent the wheel for existing internal systems. Combined with integrations for other enterprise software like Jira and Salesforce, this allows agents to perform complex, cross-platform tasks, such as checking on open engineering tickets and reporting the status back to a manager on Slack.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-human-oversight-as-a-strategic-multiplier\">Human oversight as a strategic multiplier<\/h2>\n\n\n\n<p>The enterprise AI space is currently undergoing a reality check as companies move from experimentation to production. 
The consensus among many industry leaders is that humans in the loop are a practical necessity for agents to perform reliably.\u00a0<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"500\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">AI Agents will likely follow a self driving trajectory, where you need a human in the loop for a long tail of tasks for a while. The big difference is we\u2019ll get a growing number of autonomous agents along the way, where full self driving is an all or nothing proposition. https:\/\/t.co\/5dR7cGS7jn<\/p>\u2014 Aaron Levie (@levie) <a href=\"https:\/\/twitter.com\/levie\/status\/1935898443794583950?ref_src=twsrc%5Etfw\">June 20, 2025<\/a><\/blockquote>\n<\/div><\/figure>\n\n\n\n<p>Mixus\u2019s collaborative model changes the economics of scaling AI. Mixus predicts that by 2030, agent deployment may grow 1000x and each human overseer will become 50x more efficient as AI agents become more reliable. 
But the total need for human oversight will still grow.\u00a0<\/p>\n\n\n\n<p>\u201cEach human overseer manages exponentially more AI work over time, but you still need more total oversight as AI deployment explodes across your organization,\u201d Katz said.\u00a0<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" height=\"372\" width=\"800\" src=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/Screenshot-2025-06-27-at-7.12.13%E2%80%AFAM.png?w=800\" alt=\"\" class=\"wp-image-3013518\" srcset=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/Screenshot-2025-06-27-at-7.12.13\u202fAM.png 1346w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/Screenshot-2025-06-27-at-7.12.13\u202fAM.png?resize=300,140 300w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/Screenshot-2025-06-27-at-7.12.13\u202fAM.png?resize=768,357 768w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/Screenshot-2025-06-27-at-7.12.13\u202fAM.png?resize=800,372 800w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/Screenshot-2025-06-27-at-7.12.13\u202fAM.png?resize=400,186 400w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/Screenshot-2025-06-27-at-7.12.13\u202fAM.png?resize=750,349 750w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/Screenshot-2025-06-27-at-7.12.13\u202fAM.png?resize=578,269 578w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/06\/Screenshot-2025-06-27-at-7.12.13\u202fAM.png?resize=930,433 930w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\"\/><\/figure>\n\n\n\n<p>For enterprise leaders, this means human skills will evolve rather than disappear. Instead of being replaced by AI, experts will be promoted to roles where they orchestrate fleets of AI agents and handle the high-stakes decisions flagged for their review. 
<\/p>\n\n\n\n<p>In this framework, building a strong human oversight function becomes a competitive advantage, allowing companies to deploy AI more aggressively and safely than their rivals.<\/p>\n\n\n\n<p>\u201cCompanies that master this multiplication will dominate their industries, while those chasing full automation will struggle with reliability, compliance, and trust,\u201d Katz said.<\/p>\n\t\t\t<\/div>\r\n<br>\r\n<br><a href=\"https:\/\/venturebeat.com\/ai\/ai-agents-are-hitting-a-liability-wall-mixus-has-a-plan-to-overcome-it-using-human-overseers-on-high-risk-workflows\/\">Source link <\/a>","protected":false},"excerpt":{"rendered":"<p>Join the event trusted by enterprise leaders for nearly two decades. 
VB Transform brings together the people building real enterprise AI strategy.\u00a0Learn more While enterprises face the challenges of deploying AI agents in critical applications, a new, more pragmatic model is emerging that puts humans back in control as a strategic safeguard against AI failure.\u00a0 [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2155,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[33],"tags":[],"class_list":["post-2154","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-automation"],"aioseo_notices":[],"jetpack_featured_media_url":"https:\/\/violethoward.com\/new\/wp-content\/uploads\/2025\/06\/ChatGPT-Image-Jun-27-2025-10_10_45-PM.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/2154","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/comments?post=2154"}],"version-history":[{"count":0,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/2154\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media\/2155"}],"wp:attachment":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media?parent=2154"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/categories?post=2154"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/tags?post=2154"}],"curies":[{"name":"wp","href":"h
ttps:\/\/api.w.org\/{rel}","templated":true}]}}