A new framework developed by researchers at Google Cloud and DeepMind aims to address one of the key challenges of developing computer use agents (CUAs): gathering high-quality training examples at scale. The framework, dubbed Watch & Learn (W&L), addresses the problem of training data generation in a way that doesn’t require human annotation and can…
Researchers at Nvidia have developed a novel approach to train large language models (LLMs) in 4-bit quantized format while maintaining their stability and accuracy at the level of high-precision models. Their technique, NVFP4, makes it possible to train models that not only outperform other leading 4-bit formats but match the performance of the larger 8-bit…
Researchers at Meta FAIR and the University of Edinburgh have developed a new technique that can predict the correctness of a large language model's (LLM) reasoning and even intervene to fix its mistakes. Called Circuit-based Reasoning Verification (CRV), the method looks inside an LLM to monitor its internal “reasoning circuits” and detect signs of computational…
The rise of AI marks a critical shift away from decades defined by information-chasing and a push for more and more compute power. Canva co-founder and CPO Cameron Adams refers to this dawning time as the “imagination era.” Meaning: Individuals and enterprises must be able to turn creativity into action with AI. Canva hopes to…
Presented by Elastic. As organizations scramble to enact agentic AI solutions, accessing proprietary data from all the nooks and crannies will be key. By now, most organizations have heard of agentic AI: systems that “think” by autonomously gathering tools, data and other sources of information to return an answer. But here’s the rub:…
Enterprises, eager to ensure any AI models they use adhere to safety and safe-use policies, fine-tune LLMs so they do not respond to unwanted queries. However, much of the safeguarding and red teaming happens before deployment, “baking in” policies before users fully test the models’ capabilities in production. OpenAI believes it can offer a more…
When researchers at Anthropic injected the concept of "betrayal" into their Claude AI model's neural networks and asked if it noticed anything unusual, the system paused before responding: "I'm experiencing something that feels like an intrusive thought about 'betrayal'." The exchange, detailed in new research published Wednesday, marks what scientists say is the first rigorous…
The vibe coding tool Cursor, from startup Anysphere, has introduced Composer, its first in-house, proprietary coding large language model (LLM) as part of its Cursor 2.0 platform update. Composer is designed to execute coding tasks quickly and accurately in production-scale environments, representing a new step in AI-assisted programming. It's already being used by Cursor’s own…
The moment Mack McConnell knew everything about search had changed came last summer at the Paris Olympics. His parents, independently and without prompting, had both turned to ChatGPT to plan their day's activities in the French capital. The AI recommended specific tour companies, restaurants, and attractions — businesses that had won a new kind of…
Building AI for financial software requires a different playbook than consumer AI, and Intuit's latest QuickBooks release provides an example. The company has announced Intuit Intelligence, a system that orchestrates specialized AI agents across its QuickBooks platform to handle tasks including sales tax compliance and payroll processing. These new agents augment existing accounting and project…