{"id":4608,"date":"2025-11-28T21:00:08","date_gmt":"2025-11-28T21:00:08","guid":{"rendered":"https:\/\/violethoward.com\/new\/anthropic-says-it-solved-the-long-running-ai-agent-problem-with-a-new-multi-session-claude-sdk\/"},"modified":"2025-11-28T21:00:08","modified_gmt":"2025-11-28T21:00:08","slug":"anthropic-says-it-solved-the-long-running-ai-agent-problem-with-a-new-multi-session-claude-sdk","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/anthropic-says-it-solved-the-long-running-ai-agent-problem-with-a-new-multi-session-claude-sdk\/","title":{"rendered":"Anthropic says it solved the long-running AI agent problem with a new multi-session Claude SDK"},"content":{"rendered":"

\n
<\/p>\n

Agent memory remains a problem that enterprises want to fix, as agents forget some instructions or conversations the longer they run.\u00a0<\/p>\n

Anthropic<\/u> believes it has solved this issue for its Claude Agent SDK<\/u>, developing a two-fold solution that allows an agent to work across different context windows.<\/p>\n

\u201cThe core challenge of long-running agents is that they must work in discrete sessions, and each new session begins with no memory of what came before,\u201d Anthropic wrote in a blog post<\/u>. \u201cBecause context windows are limited, and because most complex projects cannot be completed within a single window, agents need a way to bridge the gap between coding sessions.\u201d<\/p>\n

Anthropic\u2019s engineers proposed a two-fold approach for the Agent SDK: an initializer agent to set up the environment, and a coding agent to make incremental progress in each session and leave artifacts for the next.\u00a0<\/p>\n

The agent memory problem<\/h2>\n
Because agents are built on foundation models, they remain constrained by limited, though steadily growing, context windows. For long-running agents, this constraint can become a larger problem, leading the agent to forget instructions and behave abnormally while performing a task. Enhancing agent memory<\/u> becomes essential for consistent, business-safe performance.\u00a0<\/p>\n
Several methods have emerged over the past year, all attempting to bridge the gap between context windows and agent memory. LangChain<\/u>\u2019s LangMem SDK, Memobase<\/u> and OpenAI<\/u>\u2019s Swarm are examples of tools offered as memory solutions. Research on agentic memory has also surged recently, with proposed frameworks like Memp<\/u> and the Nested Learning Paradigm<\/u> from Google<\/u> offering new alternatives to enhance memory.\u00a0<\/p>\n
Many of the current memory frameworks are open source and can, in principle, adapt to the different large language models (LLMs) powering agents. Anthropic\u2019s approach, by contrast, targets its own Claude Agent SDK.\u00a0<\/p>\n
How it works<\/h2>\n
Anthropic found that although the Claude Agent SDK has context management capabilities, and it \u201cshould be possible for an agent to continue to do useful work for an arbitrarily long time,\u201d those capabilities alone were not sufficient. The company said in its blog post that a model like Opus 4.5<\/u> running the Claude Agent SDK can \u201cfall short of building a production-quality web app if it\u2019s only given a high-level prompt, such as 'build a clone of claude.ai.'\u201d\u00a0<\/p>\n
The failures manifested in two patterns, Anthropic said. In the first, the agent tried to do too much at once, and the model ran out of context partway through the work. The agent in the next session then had to guess what had happened, because no clear instructions had been passed along. The second failure occurred later, after some features had already been built: the agent saw that progress had been made and simply declared the job done.\u00a0<\/p>\n
Anthropic researchers broke the solution into two parts: setting up an initial environment that lays the foundation for features, and prompting each agent to make incremental progress toward a goal while still leaving the environment in a clean state at the end of its session.\u00a0<\/p>\n
This is where Anthropic\u2019s two-part solution comes in. The initializer agent sets up the environment, including a log of what agents have done and which files have been added. The coding agent then makes incremental progress in each session and leaves structured updates for the next one.\u00a0<\/p>\n
\u201cInspiration for these practices came from knowing what effective software engineers do every day,\u201d Anthropic said.\u00a0<\/p>\n
The researchers said they added testing tools to the coding agent, improving its ability to identify and fix bugs that weren\u2019t obvious from the code alone.\u00a0<\/p>\n
Future research<\/h2>\n
Anthropic noted that its approach is \u201cone possible set of solutions in a long-running agent harness.\u201d It is an early step in what could become a wider research area for many in the AI space.\u00a0<\/p>\n
The company said its experiments to boost long-term memory for agents have not yet shown whether a single general-purpose coding agent or a multi-agent structure works best across contexts.\u00a0<\/p>\n
Its demo also focused on full-stack web app development, so future experiments should test whether the results generalize to other tasks.<\/p>\n
\u201cIt\u2019s likely that some or all of these lessons can be applied to the types of long-running agentic tasks required in, for example, scientific research or financial modeling,\u201d Anthropic said.\u00a0<\/p>\n

\n
","protected":false},"excerpt":{"rendered":"
Agent memory remains a problem that enterprises want to fix, as agents forget some instructions or conversations the longer they run.\u00a0 Anthropic believes it has solved this issue for its Claude Agent SDK, developing a two-fold solution that allows an agent to work across different context windows. \u201cThe core challenge of long-running agents is that […]<\/p>\n","protected":false},"author":1,"featured_media":4609,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[33],"tags":[],"class_list":["post-4608","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-automation"],"aioseo_notices":[],"jetpack_featured_media_url":"https:\/\/violethoward.com\/new\/wp-content\/uploads\/2025\/11\/crimedy7_illustration_of_robots_running_a_marathon_-ar_169_-_98e4a2e9-af27-4fe5-8f24-c70cc6d9dd30_3.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/4608","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/comments?post=4608"}],"version-history":[{"count":0,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/4608\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media\/4609"}],"wp:attachment":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media?parent=4608"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/categories?post=4608"},{"taxonomy":"post_tag","embeddable":tru
e,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/tags?post=4608"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}