{"id":3443,"date":"2025-08-30T09:04:00","date_gmt":"2025-08-30T09:04:00","guid":{"rendered":"https:\/\/violethoward.com\/new\/how-sakana-ais-new-evolutionary-algorithm-builds-powerful-ai-models-without-expensive-retraining\/"},"modified":"2025-08-30T09:04:00","modified_gmt":"2025-08-30T09:04:00","slug":"how-sakana-ais-new-evolutionary-algorithm-builds-powerful-ai-models-without-expensive-retraining","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/how-sakana-ais-new-evolutionary-algorithm-builds-powerful-ai-models-without-expensive-retraining\/","title":{"rendered":"How Sakana AI’s new evolutionary algorithm builds powerful AI models without expensive retraining"},"content":{"rendered":" \r\n
A new evolutionary technique from Japan-based AI lab Sakana AI enables developers to augment the capabilities of AI models without costly training and fine-tuning processes. The technique, called Model Merging of Natural Niches (M2N2), overcomes the limitations of other model merging methods and can even evolve new models entirely from scratch.

M2N2 can be applied to different types of machine learning models, including large language models (LLMs) and text-to-image generators. For enterprises looking to build custom AI solutions, the approach offers a powerful and efficient way to create specialized models by combining the strengths of existing open-source variants.

## What is model merging?

Model merging is a technique for integrating the knowledge of multiple specialized AI models into a single, more capable model. Instead of fine-tuning, which refines a single pre-trained model using new data, merging combines the parameters of several models simultaneously. This process can consolidate a wealth of knowledge into one asset without requiring expensive, gradient-based training or access to the original training data.
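To make the idea concrete, here is a minimal sketch of the simplest form of merging: linear interpolation of parameters between two models that share an architecture (for example, two fine-tunes of the same base checkpoint). The helper function, the checkpoint names, and the mixing coefficient `alpha` are illustrative assumptions for this sketch, not Sakana AI's M2N2 method itself.

```python
# A minimal sketch of linear model merging between two models that share
# an architecture. Names and checkpoints below are hypothetical.
import torch

def merge_state_dicts(sd_a, sd_b, alpha=0.5):
    """Interpolate two state dicts parameter-by-parameter.

    No gradients, no training data: just a weighted average of weights
    that already exist. Assumes both dicts have identical keys/shapes
    and floating-point tensors.
    """
    merged = {}
    for name, param_a in sd_a.items():
        param_b = sd_b[name]
        merged[name] = alpha * param_a + (1.0 - alpha) * param_b
    return merged

# Hypothetical usage with two fine-tunes of the same base model:
# model_math = MyModel(); model_math.load_state_dict(torch.load("math_expert.pt"))
# model_code = MyModel(); model_code.load_state_dict(torch.load("code_expert.pt"))
# merged = MyModel()
# merged.load_state_dict(
#     merge_state_dicts(model_math.state_dict(), model_code.state_dict(), alpha=0.6)
# )
```

Published merging methods go well beyond a single global interpolation coefficient, but the core operation, arithmetic over existing weights rather than gradient descent over new data, is the same.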

For enterprise teams, this offers several practical advantages over traditional fine-tuning. In comments to VentureBeat, the paper's authors said model merging is a gradient-free process that only requires forward passes, making it computationally cheaper than fine-tuning, which involves costly gradient updates. Merging also sidesteps the need for carefully balanced training data and mitigates the risk of "catastrophic forgetting," where a model loses its original capabilities after learning a new task. The technique is especially powerful when the training data for specialist models isn't available, as merging only requires the model weights themselves.
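The gradient-free point can be illustrated with a sketch: scoring a merged candidate requires only inference over a held-out set, never a backward pass, so no optimizer state or gradient buffers are kept in memory. `merged_model` and `eval_loader` below are hypothetical stand-ins, not part of Sakana AI's released code.

```python
# A minimal sketch of forward-pass-only evaluation of a merged candidate,
# assuming a classification model and a small held-out evaluation set.
import torch

@torch.no_grad()  # no gradients are ever computed or stored
def score_candidate(model, eval_loader, device="cpu"):
    """Return mean accuracy of a merged model on a held-out set.

    Only inference is needed, which is why evaluating merge candidates
    is far cheaper than running even a single fine-tuning step.
    """
    model.eval().to(device)
    correct, total = 0, 0
    for inputs, labels in eval_loader:
        logits = model(inputs.to(device))
        correct += (logits.argmax(dim=-1) == labels.to(device)).sum().item()
        total += labels.numel()
    return correct / total

# Hypothetical usage: score = score_candidate(merged_model, eval_loader)
```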
