{"id":3141,"date":"2025-08-14T10:38:03","date_gmt":"2025-08-14T10:38:03","guid":{"rendered":"https:\/\/violethoward.com\/new\/liquid-ais-lfm2-vl-gives-smartphones-small-ai-vision-models\/"},"modified":"2025-08-14T10:38:03","modified_gmt":"2025-08-14T10:38:03","slug":"liquid-ais-lfm2-vl-gives-smartphones-small-ai-vision-models","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/liquid-ais-lfm2-vl-gives-smartphones-small-ai-vision-models\/","title":{"rendered":"Liquid AI&#8217;s LFM2-VL gives smartphones small AI vision models"},"content":{"rendered":" \r\n<br><div>\n\t\t\t\t<p>Liquid AI has released <strong>LFM2-VL<\/strong><strong>, a new generation of vision-language foundation models <\/strong>designed for efficient deployment across a wide range of hardware \u2014 <strong>from smartphones and laptops to wearables and embedded systems. 
<\/strong><\/p>\n\n\n\n<p>The models promise low-latency performance, strong accuracy, and flexibility for real-world applications.<\/p>\n\n\n\n<p>LFM2-VL builds on the company\u2019s existing LFM2 architecture introduced just over a month ago as the \u201cfastest on-device foundation models on the market\u201d thanks to its approach of generating \u201cweights\u201d or model settings on the fly for each input (known as Linear Input-Varying (LIV) system), extending it into multimodal processing that supports both text and image inputs at variable resolutions.<\/p>\n\n\n\n<p>According to Liquid AI, the <strong>models deliver up to twice the GPU inference speed of comparable vision-language models<\/strong>, while maintaining competitive performance on common benchmarks.<\/p>\n\n\n\n<p><strong>\u201cEfficiency is our product,\u201d wrote Liquid AI co-founder and CEO Ramin Hasani <\/strong>in a post on X announcing the new model family:<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"500\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">meet LFM2-VL: an efficient Liquid vision-language model for the device class. open weights, 440M &amp; 1.6B, up to 2\u00d7 faster on GPU with competitive accuracy, Native 512\u00d7512, smart patching for big images. 
<\/p><p>efficiency is our product <a href=\"https:\/\/twitter.com\/LiquidAI_?ref_src=twsrc%5Etfw\">@LiquidAI_<\/a> <\/p><p>download them on <a href=\"https:\/\/twitter.com\/huggingface?ref_src=twsrc%5Etfw\">@huggingface<\/a>:\u2026 <a href=\"https:\/\/t.co\/3Lze6Hc6Ys\">pic.twitter.com\/3Lze6Hc6Ys<\/a><\/p>\u2014 Ramin Hasani (@ramin_m_h) <a href=\"https:\/\/twitter.com\/ramin_m_h\/status\/1955332731942174960?ref_src=twsrc%5Etfw\">August 12, 2025<\/a><\/blockquote>\n<\/div><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-two-variants-for-different-needs\">Two variants for different needs<\/h2>\n\n\n\n<p>The release includes two model sizes:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>LFM2-VL-450M<\/strong> \u2014 a hyper-efficient model with less than half a billion parameters (internal settings) aimed at highly resource-constrained environments.<\/li>\n\n\n\n<li><strong>LFM2-VL-1.6B<\/strong> \u2014 a more capable model that remains lightweight enough for single-GPU and device-based deployment.<\/li>\n<\/ul>\n\n\n\n<p>Both variants process images at native resolutions up to 512\u00d7512 pixels, avoiding distortion or unnecessary upscaling. <\/p>\n\n\n\n<p>For larger images, the system applies non-overlapping patching and adds a thumbnail for global context, enabling the model to capture both fine detail and the broader scene.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-background-on-liquid-ai\">Background on Liquid AI<\/h2>\n\n\n\n<p>Liquid AI was founded by former researchers from MIT\u2019s Computer Science and Artificial Intelligence Laboratory (CSAIL) with the goal of building AI architectures that move beyond the widely used transformer model. 
<\/p>\n\n\n\n<p>The company\u2019s flagship Liquid Foundation Models (LFMs) are based on principles from dynamical systems, signal processing, and numerical linear algebra, producing general-purpose AI models capable of handling text, video, audio, time series, and other sequential data. <\/p>\n\n\n\n<p><strong>Unlike traditional architectures, Liquid\u2019s approach aims to deliver competitive or superior performance using significantly fewer computational resources<\/strong>, allowing for real-time adaptability during inference while maintaining low memory requirements. This makes LFMs well suited for both large-scale enterprise use cases and resource-limited edge deployments.<\/p>\n\n\n\n<p>In July 2025, the company expanded its platform strategy with the launch of the Liquid Edge AI Platform (LEAP), <strong>a cross-platform SDK designed to make it easier for developers to run small language models directly on mobile and embedded devices.<\/strong> <\/p>\n\n\n\n<p>LEAP offers OS-agnostic support for iOS and Android, integration with both Liquid\u2019s own models and other open-source SLMs, and a built-in library with models as small as 300 MB, small enough for modern phones with minimal RAM. <\/p>\n\n\n\n<p>Its companion app, Apollo, enables developers to test models entirely offline, aligning with Liquid AI\u2019s emphasis on privacy-preserving, low-latency AI. Together, LEAP and Apollo reflect the company\u2019s commitment to decentralizing AI execution, reducing reliance on cloud infrastructure, and empowering developers to build optimized, task-specific models for real-world environments.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-speed-quality-trade-offs-and-technical-design\">Speed\/quality trade-offs and technical design<\/h2>\n\n\n\n<p>LFM2-VL uses a modular architecture <strong>combining a language model backbone, a SigLIP2 NaFlex vision encoder, and a multimodal projector. 
<\/strong><\/p>\n\n\n\n<p>The projector includes a two-layer MLP connector with pixel unshuffle, reducing the number of image tokens and improving throughput.<\/p>\n\n\n\n<p>Users can adjust parameters such as the maximum number of image tokens or patches, allowing them to balance speed and quality depending on the deployment scenario. The training process involved approximately 100 billion multimodal tokens, sourced from open datasets and in-house synthetic data.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-performance-and-benchmarks\">Performance and benchmarks<\/h2>\n\n\n\n<p>The models achieve competitive benchmark results across a range of vision-language evaluations. LFM2-VL-1.6B scores well in RealWorldQA (65.23), InfoVQA (58.68), and OCRBench (742), and maintains solid results in multimodal reasoning tasks. <\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img fetchpriority=\"high\" decoding=\"async\" height=\"586\" width=\"800\" src=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-12-at-5.57.30%E2%80%AFPM.png?w=800\" alt=\"\" class=\"wp-image-3015524\" style=\"width:840px;height:auto\" srcset=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-12-at-5.57.30\u202fPM.png 1266w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-12-at-5.57.30\u202fPM.png?resize=300,220 300w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-12-at-5.57.30\u202fPM.png?resize=768,563 768w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-12-at-5.57.30\u202fPM.png?resize=800,586 800w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-12-at-5.57.30\u202fPM.png?resize=400,293 400w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-12-at-5.57.30\u202fPM.png?resize=750,550 750w, 
https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-12-at-5.57.30\u202fPM.png?resize=578,424 578w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/Screenshot-2025-08-12-at-5.57.30\u202fPM.png?resize=930,682 930w\" sizes=\"(max-width: 800px) 100vw, 800px\"\/><\/figure>\n\n\n\n<p>In inference testing, LFM2-VL achieved the fastest GPU processing times in its class when tested on a standard workload of a 1024\u00d71024 image and short prompt.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" height=\"453\" width=\"800\" src=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/689b3eef2ae10f1ac5e8c338_LFM2-VL-Vision-Language-Models_-Processing-Time-Comparison-4-1-1.png?w=800\" alt=\"\" class=\"wp-image-3015527\" style=\"width:840px;height:auto\" srcset=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/689b3eef2ae10f1ac5e8c338_LFM2-VL-Vision-Language-Models_-Processing-Time-Comparison-4-1-1.png 2072w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/689b3eef2ae10f1ac5e8c338_LFM2-VL-Vision-Language-Models_-Processing-Time-Comparison-4-1-1.png?resize=300,170 300w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/689b3eef2ae10f1ac5e8c338_LFM2-VL-Vision-Language-Models_-Processing-Time-Comparison-4-1-1.png?resize=768,434 768w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/689b3eef2ae10f1ac5e8c338_LFM2-VL-Vision-Language-Models_-Processing-Time-Comparison-4-1-1.png?resize=800,453 800w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/689b3eef2ae10f1ac5e8c338_LFM2-VL-Vision-Language-Models_-Processing-Time-Comparison-4-1-1.png?resize=1536,869 1536w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/689b3eef2ae10f1ac5e8c338_LFM2-VL-Vision-Language-Models_-Processing-Time-Comparison-4-1-1.png?resize=2048,1158 2048w, 
https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/689b3eef2ae10f1ac5e8c338_LFM2-VL-Vision-Language-Models_-Processing-Time-Comparison-4-1-1.png?resize=400,226 400w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/689b3eef2ae10f1ac5e8c338_LFM2-VL-Vision-Language-Models_-Processing-Time-Comparison-4-1-1.png?resize=750,424 750w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/689b3eef2ae10f1ac5e8c338_LFM2-VL-Vision-Language-Models_-Processing-Time-Comparison-4-1-1.png?resize=578,327 578w, https:\/\/venturebeat.com\/wp-content\/uploads\/2025\/08\/689b3eef2ae10f1ac5e8c338_LFM2-VL-Vision-Language-Models_-Processing-Time-Comparison-4-1-1.png?resize=930,526 930w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\"\/><\/figure>\n\n\n\n<p>\n<h2 class=\"wp-block-heading\" id=\"h-licensing-and-availability\">Licensing and availability<\/h2>\n<\/p>\n\n\n\n<p>LFM2-VL models are available now on Hugging Face, along with example fine-tuning code in Colab. They are compatible with Hugging Face transformers and TRL. <\/p>\n\n\n\n<p>The models are released under a custom \u201cLFM1.0 license\u201d. Liquid AI has described this license as based on Apache 2.0 principles, but the full text has not yet been published.<\/p>\n\n\n\n<p> The company has indicated that commercial use will be permitted under certain conditions, with different terms for companies above and below $10 million in annual revenue.<\/p>\n\n\n\n<p>With LFM2-VL, Liquid AI aims to make high-performance multimodal AI more accessible for on-device and resource-limited deployments, without sacrificing capability.<\/p>\n\t\t\t<\/div><template id="I4TiBTLiFELwIJps4N52"></template><\/script>\r\n<br>\r\n<br><a href=\"https:\/\/venturebeat.com\/ai\/liquid-ai-wants-to-give-smartphones-small-fast-ai-that-can-see-with-new-lfm2-vl-model\/\">Source link <\/a>","protected":false},"excerpt":{"rendered":"<p>Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. 
Subscribe Now Liquid AI has released LFM2-VL, a new generation of vision-language foundation models designed for efficient deployment across a wide range of hardware \u2014 from smartphones and laptops to wearables [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":3142,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[33],"tags":[],"class_list":["post-3141","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-automation"],"aioseo_notices":[],"jetpack_featured_media_url":"https:\/\/violethoward.com\/new\/wp-content\/uploads\/2025\/08\/Collage-of-Expression-and-Vision.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/3141","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/comments?post=3141"}],"version-history":[{"count":0,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/3141\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media\/3142"}],"wp:attachment":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media?parent=3141"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/categories?post=3141"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/tags?post=3141"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}