{"id":3395,"date":"2025-08-27T21:18:26","date_gmt":"2025-08-27T21:18:26","guid":{"rendered":"https:\/\/violethoward.com\/new\/enterprise-leaders-say-recipe-for-ai-agents-is-matching-them-to-existing-processes-not-the-other-way-around\/"},"modified":"2025-08-27T21:18:26","modified_gmt":"2025-08-27T21:18:26","slug":"enterprise-leaders-say-recipe-for-ai-agents-is-matching-them-to-existing-processes-not-the-other-way-around","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/enterprise-leaders-say-recipe-for-ai-agents-is-matching-them-to-existing-processes-not-the-other-way-around\/","title":{"rendered":"Enterprise leaders say recipe for AI agents is matching them to existing processes \u2014 not the other way around"},"content":{"rendered":" \r\n
There’s no question that AI agents — those that can work autonomously and asynchronously behind the scenes in enterprise workflows — are the topic du jour in enterprise right now.

But there’s increasing concern that it’s all just that — talk, mostly hype, without much substance behind it.

Gartner, for one, observes that enterprises are at the “peak of inflated expectations,” a period just before disillusionment sets in because vendors haven’t backed up their talk with tangible, real-world use cases.

Still, that’s not to say that enterprises aren’t experimenting with AI agents and seeing early return on investment (ROI); global enterprises Block and GlaxoSmithKline (GSK), for their parts, are exploring proofs of concept in financial services and drug discovery.

Working with a single colleague, not a swarm of bots

“Multi-agent is absolutely what’s next, but we’re figuring out what that looks like in a way that meets the human, makes it convenient,” Brad Axen, Block’s tech lead for AI and data platforms, told VentureBeat CEO and editor-in-chief Matt Marshall at a recent SAP-sponsored AI Impact event this month.

Block, the 10,000-employee parent company of Square, Cash App and Afterpay, considers itself in full discovery mode, having rolled out an interoperable AI agent framework, codenamed goose, in January.

Goose was initially introduced for software engineering tasks and is now used by 4,000 engineers, with adoption doubling monthly, Axen explained. The platform writes about 90% of code and has saved engineers an estimated 10 hours of work per week by automating code generation, debugging and information filtering.

In addition to writing code, Goose acts as a “digital teammate” of sorts, compressing Slack and email streams, integrating across company tools and spawning new agents when tasks demand more throughput and broader scope.

Axen emphasized that Block is focused on creating one interface that feels like working with a single colleague, not a swarm of bots. “We want you to feel like you’re working with one person, but they’re acting on your behalf in many places in many different ways,” he explained.

Goose operates in real time in the development environment, searching, navigating and writing code based on large language model (LLM) output, while also autonomously reading and writing files, running code and tests, refining outputs and installing dependencies.

Essentially, anyone can build and operate a system on their preferred LLM, with Goose serving as the application layer. It has a built-in desktop application and command-line interface, but devs can also build custom UIs.
The platform is built on Anthropic’s Model Context Protocol (MCP), an increasingly popular open-source standard that defines the APIs and endpoints connecting agents to data repositories, tools and development environments.

Goose has been released under the open-source Apache License 2.0 (ASL2), meaning anyone can freely use, modify and distribute it, even for commercial purposes. Users can access Databricks databases and make SQL calls or queries without needing technical knowledge.

“We really want to come up with a process that lets people get value out of the system without having to be an expert,” Axen explained.

For instance, in coding, users can say what they want in natural language and the framework will interpret that into thousands of lines of code that devs can then read and sift through. Block is seeing value in compression tasks, too, such as Goose reading through Slack, email and other channels and summarizing information for users. Further, in sales or marketing, agents can gather relevant information on a potential client and port it into a database.
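As a rough illustration of how such capabilities get wired in over MCP, the sketch below uses the FastMCP helper from the official MCP Python SDK to expose a single read-only query tool that an MCP-compatible agent such as Goose could call. The server name, the tool and the local SQLite database are hypothetical stand-ins for illustration, not Block’s actual setup.

```python
# Minimal MCP server exposing one read-only SQL tool (illustrative sketch).
# Assumes the official MCP Python SDK is installed: pip install mcp
# The server name, tool and local SQLite database below are hypothetical.
import sqlite3

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("crm-data")  # hypothetical server name


@mcp.tool()
def run_query(sql: str) -> list[dict]:
    """Run a read-only SELECT against the local CRM database and return rows."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("Only SELECT statements are allowed")
    conn = sqlite3.connect("file:crm.db?mode=ro", uri=True)  # hypothetical database
    conn.row_factory = sqlite3.Row
    try:
        return [dict(row) for row in conn.execute(sql).fetchall()]
    finally:
        conn.close()


if __name__ == "__main__":
    mcp.run(transport="stdio")  # stdio is what desktop agent hosts typically launch
```

An agent host pointed at a server like this sees run_query in its tool list, so a natural-language request can be translated into a SELECT statement and executed on the user’s behalf, mirroring the Databricks access described above.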
Process has been the biggest bottleneck, Axen noted. You can’t just give people a tool and tell them to make it work for them; agents need to reflect the processes that employees are already engaged with. Human users aren’t worried about the technical backbone; they care about the work they’re trying to accomplish.

Builders, therefore, need to look at what employees are trying to do and design the tools to be “as literally that as possible,” said Axen. Then they can chain those tools together to tackle bigger and bigger problems.

AI agents underutilized, but human domain expertise still necessary

“I think we’re hugely underusing what they can do,” Axen said of agents. “It’s the people and the process because we can’t keep up with the technology. There’s a huge gap between the technology and the opportunity.”

And, when the industry bridges that gap, will there still be room for human domain expertise? Of course, Axen says. For instance, particularly in financial services, code must be reliable, compliant and secure to protect the company and users; therefore, it must be reviewed by human eyes.

“We still see a really critical role for human experts in every part of operating our company,” he said. “It doesn’t necessarily change what expertise means as an individual. It just gives you a new tool to express it.”

The human UI is one of the most difficult elements of AI agents, Axen noted; the goal is to make interfaces simple to use while the AI proactively takes action in the background.

Block built on an open-source backbone

It would be helpful, Axen noted, if more industry players incorporated MCP-like standards. For instance, “I would love for Google to just go and have a public MCP for Gmail,” he said. “That would make my life a lot easier.”

When asked about Block’s commitment to open source, he noted, “we’ve always had an open-source backbone,” adding that over the last year the company has been “renewing” its investment in open technologies.

“In a space that’s moving this fast, we’re hoping we can set up open-source governance so that you can have this be the tool that keeps up with you even as new models and new products come out.”

GSK’s experience with multi-agent systems in drug discovery

GSK is a leading pharmaceutical developer with a particular focus on vaccines, infectious diseases and oncology research. Now, the company is starting to apply multi-agent architectures to accelerate drug discovery.

Kim Branson, GSK’s SVP and global head of AI and ML, said agents are beginning to transform the company’s products and are “absolutely core to our business.”

GSK’s scientists are combining domain-specific LLMs with ontologies (structured collections of subject-matter concepts and categories that capture their properties and the relations between them), toolchains and rigorous testing frameworks, Branson explained.

This helps them query gigantic scientific datasets, plan out experiments (even when there is no ground truth) and assemble evidence across genomics (the study of DNA), proteomics (the study of proteins) and clinical data. Agents can surface hypotheses, validate data joins and compress research cycles.

Branson noted that scientific discovery has come a long way; sequencing times have come down, and proteomics research is much faster. At the same time, though, discovery becomes ever more difficult as more and more data is amassed, particularly through devices and wearables. As Branson put it: “We have more continuous pulse data on people than we’ve ever had before as a species.”

It can be almost impossible for humans to analyze all that data, so GSK’s goal is to use AI to speed up iteration times, he noted.

But, at the same time, AI can be tricky in big pharma, because there often isn’t a ground truth without running large clinical experiments; it’s more about hypotheses, with scientists exploring evidence to come up with possible solutions.

“When you start to add agents, you find that most people actually haven’t even got a standard way of doing it amongst themselves,” Branson noted. “That variance isn’t bad, but sometimes it leads to another question.”

He quipped: “We don’t always have an absolute truth to work with — otherwise my job would be a lot easier.”

It’s all about coming up with the right targets, or knowing how to design what could serve as a biomarker or evidence for different hypotheses, he explained. For instance: Is this the best avenue to consider for people with ovarian cancer in this particular condition?

To get the AI to reason that way, GSK relies on ontologies and poses questions such as, “If this is true, what does X mean?” Domain-specific agents can then pull together relevant evidence from large internal datasets.
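A rough sketch of that idea: store the relations explicitly, then expand a hypothesis about one entity into follow-up questions about its neighbors, which an agent can turn into retrieval queries. Everything below (entity names, relation types and triples) is a made-up toy, not GSK’s ontology.

```python
# Toy illustration of ontology-guided question expansion (not GSK's system).
# A hypothesis about one entity is expanded into follow-up questions about
# related entities, which a domain-specific agent could turn into retrieval queries.
from collections import defaultdict

# Hypothetical mini-ontology as (subject, relation, object) triples.
TRIPLES = [
    ("GENE_A", "encodes", "PROTEIN_A"),
    ("PROTEIN_A", "participates_in", "PATHWAY_X"),
    ("PATHWAY_X", "implicated_in", "ovarian cancer"),
    ("DRUG_1", "inhibits", "PROTEIN_A"),
]

graph = defaultdict(list)
for subj, rel, obj in TRIPLES:
    graph[subj].append((rel, obj))


def expand_hypothesis(entity: str) -> list[str]:
    """Turn 'if <entity> matters, what else should we check?' into concrete questions."""
    questions = []
    for rel, obj in graph.get(entity, []):
        q = f"If {entity} is relevant, what does the evidence say about {obj} ({rel})?"
        questions.append(q)
    return questions


if __name__ == "__main__":
    for question in expand_hypothesis("PROTEIN_A"):
        print(question)
```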
GSK built epigenomic language models from scratch, powered by Cerebras, which it uses for inference and training, Branson explained. “We build very specific models for our applications where no one else has one,” he said.

Inference speed is important, he noted, whether for back-and-forth with a model or for autonomous deep research, and GSK uses different sets of tools based on the end goal. But large context windows aren’t always the answer, and filtering is critical. “You can’t just play context stuffing,” said Branson. “You can’t just throw all the data in this thing and trust the LLM to figure it out.”

Ongoing testing critical

GSK puts a lot of testing into its agentic systems, prioritizing determinism and reliability, and often runs multiple agents in parallel to cross-check results.

Branson recalled that, when his team first started building, they had an SQL agent that they ran “10,000 times,” and at one point it inexplicably “faked up” details.

“We never saw it happen again, but it happened once and we didn’t even understand why it happened with this particular LLM,” he said.

As a result, his team will often run multiple copies and models in parallel while enforcing tool calling and constraints; for instance, two LLMs will perform exactly the same sequence, and GSK scientists will cross-check the results.
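In outline, that cross-checking pattern looks something like the sketch below: the same prompt is run against two models concurrently and any disagreement is flagged for human review. The `ask_model` helper and the model names are hypothetical placeholders, not GSK’s pipeline.

```python
# Illustrative sketch: run the same task on two models in parallel and
# flag disagreements for human review. `ask_model` is a hypothetical helper
# wrapping whichever LLM client is actually in use.
from concurrent.futures import ThreadPoolExecutor


def ask_model(model_name: str, prompt: str) -> str:
    """Hypothetical wrapper around an LLM API call; returns the model's answer."""
    raise NotImplementedError("plug in your LLM client here")


def cross_check(prompt: str, models: tuple[str, str] = ("model-a", "model-b")) -> dict:
    # Run both models concurrently on exactly the same input.
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        answers = list(pool.map(lambda m: ask_model(m, prompt), models))
    agree = answers[0].strip() == answers[1].strip()
    return {
        "answers": dict(zip(models, answers)),
        "agree": agree,
        # Disagreements go to a human expert rather than being auto-resolved.
        "needs_review": not agree,
    }
```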
His team focuses on active learning loops and is assembling its own internal benchmarks, because popular, publicly available ones are often “fairly academic and not reflective of what we do.”

For instance, they will generate several biological questions, score what they think the gold-standard answers should be, then run an LLM against that set and see how it ranks.
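An internal benchmark of that kind is, at its core, just a scored question set. The sketch below grades a model’s answers against expert-written expectations with a trivial keyword check and keeps the failures for review; the `ask_model` helper and the benchmark items are made up for illustration, and a real evaluation would rely on expert judgment or a more careful scoring function.

```python
# Illustrative internal-benchmark loop (not GSK's): grade model answers against
# expert expectations with a trivial keyword check, and keep failures for review.
def ask_model(model_name: str, question: str) -> str:
    """Hypothetical wrapper around an LLM API call."""
    raise NotImplementedError("plug in your LLM client here")


# Made-up benchmark items: a question plus keywords an expert expects in a good answer.
BENCHMARK = [
    {"question": "Which pathway links GENE_A to ovarian cancer?", "expected": ["PATHWAY_X"]},
    {"question": "What protein does DRUG_1 inhibit?", "expected": ["PROTEIN_A"]},
]


def run_benchmark(model_name: str) -> dict:
    failures = []
    for item in BENCHMARK:
        answer = ask_model(model_name, item["question"])
        if not all(k.lower() in answer.lower() for k in item["expected"]):
            failures.append({"question": item["question"], "answer": answer})
    return {
        "score": (len(BENCHMARK) - len(failures)) / len(BENCHMARK),
        # Failures are the interesting part: they are kept for expert review.
        "failures": failures,
    }
```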
“We especially hunt for problematic things where it didn’t work or it did a dumb thing, because that’s when we learn some new stuff,” said Branson. “We try to have the humans use their expert judgment where it matters.”