{"id":3303,"date":"2025-08-23T19:49:48","date_gmt":"2025-08-23T19:49:48","guid":{"rendered":"https:\/\/violethoward.com\/new\/chan-zuckerberg-initiatives-rbio-uses-virtual-cells-to-train-ai-bypassing-lab-work\/"},"modified":"2025-08-23T19:49:48","modified_gmt":"2025-08-23T19:49:48","slug":"chan-zuckerberg-initiatives-rbio-uses-virtual-cells-to-train-ai-bypassing-lab-work","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/chan-zuckerberg-initiatives-rbio-uses-virtual-cells-to-train-ai-bypassing-lab-work\/","title":{"rendered":"Chan Zuckerberg Initiative&#8217;s rBio uses virtual cells to train AI, bypassing lab work"},"content":{"rendered":" \r\n<br><div>\n\t\t\t\t<p>The Chan Zuckerberg Initiative announced Thursday the launch of rBio, the first artificial intelligence model trained to reason about cellular biology using virtual simulations rather than requiring expensive laboratory experiments \u2014 a breakthrough that could dramatically accelerate biomedical research and drug discovery.<\/p>\n\n\n\n<p>The reasoning model, detailed in a research paper published on bioRxiv, demonstrates a novel approach called \u201csoft verification\u201d that uses predictions from virtual cell models as training signals instead of relying solely on experimental data. 
This paradigm shift could help researchers test biological hypotheses computationally before committing time and resources to costly laboratory work.<\/p>\n\n\n\n<p>\u201cThe idea is that you have these super powerful models of cells, and you can use them to simulate outcomes rather than testing them experimentally in the lab,\u201d said Ana-Maria Istrate, senior research scientist at CZI and lead author of the research, in an interview. \u201cThe paradigm so far has been that 90% of the work in biology is tested experimentally in a lab, while 10% is computational. With virtual cell models, we want to flip that paradigm.\u201d<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-ai-finally-learned-to-speak-the-language-of-living-cells\">How AI finally learned to speak the language of living cells<\/h2>\n\n\n\n<p>The announcement represents a significant milestone for CZI\u2019s ambitious goal to \u201ccure, prevent, and manage all disease by the end of this century.\u201d Under the leadership of pediatrician Priscilla Chan and Meta CEO Mark Zuckerberg, the $6 billion philanthropic initiative has increasingly focused its resources on the intersection of artificial intelligence and biology.<\/p>\n\n\n\n<p>rBio addresses a fundamental challenge in applying AI to biological research. While large language models like ChatGPT excel at processing text, biological foundation models typically work with complex molecular data that cannot be easily queried in natural language. Scientists have struggled to bridge this gap between powerful biological models and user-friendly interfaces.<\/p>\n\n\n\n<p>\u201cFoundation models of biology \u2014 models like GREmLN and TranscriptFormer \u2014 are built on biological data modalities, which means you cannot interact with them in natural language,\u201d Istrate explained. \u201cYou have to find complicated ways to prompt them.\u201d<\/p>\n\n\n\n<p>The new model solves this problem by distilling knowledge from CZI\u2019s TranscriptFormer \u2014 a virtual cell model trained on 112 million cells from 12 species spanning 1.5 billion years of evolution \u2014 into a conversational AI system that researchers can query in plain English.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-soft-verification-revolution-teaching-ai-to-think-in-probabilities-not-absolutes\">The \u2018soft verification\u2019 revolution: Teaching AI to think in probabilities, not absolutes<\/h2>\n\n\n\n<p>The core innovation lies in rBio\u2019s training methodology. Traditional reasoning models learn from questions with unambiguous answers, like mathematical equations. 
But biological questions involve uncertainty and probabilistic outcomes that don\u2019t fit neatly into binary categories.<\/p>\n\n\n\n<p>CZI\u2019s research team, led by Senior Director of AI Theofanis Karaletsos and Istrate, overcame this challenge by using reinforcement learning with proportional rewards. Instead of simple yes-or-no verification, the model receives rewards proportional to the likelihood that its biological predictions align with reality, as determined by virtual cell simulations.<\/p>\n\n\n\n<p>\u201cWe applied new methods to how LLMs are trained,\u201d the research paper explains. \u201cUsing an off-the-shelf language model as a scaffold, the team trained rBio with reinforcement learning, a common technique in which the model is rewarded for correct answers. But instead of asking a series of yes\/no questions, the researchers tuned the rewards in proportion to the likelihood that the model\u2019s answers were correct.\u201d<\/p>\n\n\n\n<p>This approach allows scientists to ask complex questions like \u201cWould suppressing the actions of gene A result in an increase in activity of gene B?\u201d and receive scientifically grounded responses about cellular changes, including shifts from healthy to diseased states.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-beating-the-benchmarks-how-rbio-outperformed-models-trained-on-real-lab-data\">Beating the benchmarks: How rBio outperformed models trained on real lab data<\/h2>\n\n\n\n<p>In testing against the PerturbQA benchmark \u2014 a standard dataset for evaluating gene perturbation prediction \u2014 rBio demonstrated competitive performance with models trained on experimental data. 
The system outperformed baseline large language models and matched the performance of specialized biological models on key metrics.<\/p>\n\n\n\n<p>Particularly noteworthy, rBio showed strong \u201ctransfer learning\u201d capabilities, successfully applying knowledge about gene co-expression patterns learned from TranscriptFormer to make accurate predictions about gene perturbation effects\u2014a completely different biological task.<\/p>\n\n\n\n<p>\u201cWe show that on the PerturbQA dataset, models trained using soft verifiers learn to generalize on out-of-distribution cell lines, potentially bypassing the need to train on cell-line specific experimental data,\u201d the researchers wrote.<\/p>\n\n\n\n<p>When enhanced with chain-of-thought prompting techniques that encourage step-by-step reasoning, rBio achieved state-of-the-art performance, surpassing the previous leading model SUMMER.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-from-social-justice-to-science-inside-czi-s-controversial-pivot-to-pure-research\">From social justice to science: Inside CZI\u2019s controversial pivot to pure research<\/h2>\n\n\n\n<p>The rBio announcement comes as CZI has undergone significant organizational changes, refocusing its efforts from a broad philanthropic mission that included social justice and education reform to a more targeted emphasis on scientific research. The shift has drawn criticism from some former employees and grantees who saw the organization abandon progressive causes.<\/p>\n\n\n\n<p>However, for Istrate, who has worked at CZI for six years, the focus on biological AI represents a natural evolution of long-standing priorities. \u201cMy experience and work has not changed much. I have been part of the science initiative for as long as I have been at CZI,\u201d she said.<\/p>\n\n\n\n<p>The concentration on virtual cell models builds on nearly a decade of foundational work. 
CZI has invested heavily in building cell atlases \u2014 comprehensive databases showing which genes are active in different cell types across species \u2014 and developing the computational infrastructure needed to train large biological models.<\/p>\n\n\n\n<p>\u201cI\u2019m really excited about the work that\u2019s been happening at CZI for years now, because we\u2019ve been building up to this moment,\u201d Istrate noted, referring to the organization\u2019s earlier investments in data platforms and single-cell transcriptomics.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-building-bias-free-biology-how-czi-curated-diverse-data-to-train-fairer-ai-models\">Building bias-free biology: How CZI curated diverse data to train fairer AI models<\/h2>\n\n\n\n<p>One critical advantage of CZI\u2019s approach stems from its years of careful data curation. The organization operates CZ CELLxGENE, one of the largest repositories of single-cell biological data, where information undergoes rigorous quality control processes.<\/p>\n\n\n\n<p>\u201cWe\u2019ve generated some of the flagship initial data atlases for transcriptomics, and those were generated with diversity in mind to minimize bias in terms of cell types, ancestry, tissues, and donors,\u201d Istrate explained.<\/p>\n\n\n\n<p>This attention to data quality becomes crucial when training AI models that could influence medical decisions. Unlike some commercial AI efforts that rely on publicly available but potentially biased datasets, CZI\u2019s models benefit from carefully curated biological data designed to represent diverse populations and cell types.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-open-source-vs-big-tech-why-czi-is-giving-away-billion-dollar-ai-technology-for-free\">Open source vs. 
big tech: Why CZI is giving away billion-dollar AI technology for free<\/h2>\n\n\n\n<p>CZI\u2019s commitment to open-source development distinguishes it from commercial competitors like Google DeepMind and pharmaceutical companies developing proprietary AI tools. All CZI models, including rBio, are freely available through the organization\u2019s Virtual Cell Platform, complete with tutorials that can run on free Google Colab notebooks.<\/p>\n\n\n\n<p>\u201cI do think the open source piece is very important, because that\u2019s a core value that we\u2019ve had since we\u2019ve started CZI,\u201d Istrate said. \u201cOne of the main goals for our work is to accelerate science. So everything we do is we want to make it open source for that purpose only.\u201d<\/p>\n\n\n\n<p>This strategy aims to democratize access to sophisticated biological AI tools, potentially benefiting smaller research institutions and startups that lack the resources to develop such models independently. The approach reflects CZI\u2019s philanthropic mission while creating network effects that could accelerate scientific progress.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-end-of-trial-and-error-how-ai-could-slash-drug-discovery-from-decades-to-years\">The end of trial and error: How AI could slash drug discovery from decades to years<\/h2>\n\n\n\n<p>The potential applications extend far beyond academic research. 
By enabling scientists to quickly test hypotheses about gene interactions and cellular responses, rBio could significantly accelerate the early stages of drug discovery \u2014 a process that typically takes decades and costs billions of dollars.<\/p>\n\n\n\n<p>The model\u2019s ability to predict how gene perturbations affect cellular behavior could prove particularly valuable for understanding neurodegenerative diseases like Alzheimer\u2019s, where researchers need to identify how specific genetic changes contribute to disease progression.<\/p>\n\n\n\n<p>\u201cAnswers to these questions can shape our understanding of the gene interactions contributing to neurodegenerative diseases like Alzheimer\u2019s,\u201d the research paper notes. \u201cSuch knowledge could lead to earlier intervention, perhaps halting these diseases altogether someday.\u201d<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-universal-cell-model-dream-integrating-every-type-of-biological-data-into-one-ai-brain\">The universal cell model dream: Integrating every type of biological data into one AI brain<\/h2>\n\n\n\n<p>rBio represents the first step in CZI\u2019s broader vision to create \u201cuniversal virtual cell models\u201d that integrate knowledge from multiple biological domains. Currently, researchers must work with separate models for different types of biological data\u2014transcriptomics, proteomics, imaging\u2014without easy ways to combine insights.<\/p>\n\n\n\n<p>\u201cOne of our grand challenges is building these virtual cell models and understanding cells, as I mentioned over the next couple of years, is how to integrate knowledge from all of these super powerful models of biology,\u201d Istrate said. 
\u201cThe main challenge is, how do you integrate all of this knowledge into one space?\u201d<\/p>\n\n\n\n<p>The researchers demonstrated this integration capability by training rBio models that combine multiple verification sources \u2014 TranscriptFormer for gene expression data, specialized neural networks for perturbation prediction, and knowledge databases like Gene Ontology. These combined models significantly outperformed single-source approaches.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-roadblocks-ahead-what-could-stop-ai-from-revolutionizing-biology\">The roadblocks ahead: What could stop AI from revolutionizing biology<\/h2>\n\n\n\n<p>Despite its promising performance, rBio faces several technical challenges. The model\u2019s current expertise focuses primarily on gene perturbation prediction, though the researchers indicate that any biological domain covered by TranscriptFormer could theoretically be incorporated.<\/p>\n\n\n\n<p>The team continues working on improving the user experience and implementing appropriate guardrails to prevent the model from providing answers outside its area of expertise\u2014a common challenge in deploying large language models for specialized domains.<\/p>\n\n\n\n<p>\u201cWhile rBio is ready for research, the model\u2019s engineering team is continuing to improve the user experience, because the flexible problem-solving that makes reasoning models conversational also poses a number of challenges,\u201d the research paper explains.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-trillion-dollar-question-how-open-source-biology-ai-could-reshape-the-pharmaceutical-industry\">The trillion-dollar question: How open source biology AI could reshape the pharmaceutical industry<\/h2>\n\n\n\n<p>The development of rBio occurs against the backdrop of intensifying competition in AI-driven drug discovery. 
Major pharmaceutical companies and technology firms are investing billions in biological AI capabilities, recognizing the potential to transform how medicines are discovered and developed.<\/p>\n\n\n\n<p>CZI\u2019s open-source approach could accelerate this transformation by making sophisticated tools available to the broader research community. Academic researchers, biotech startups, and even established pharmaceutical companies can now access capabilities that would otherwise require substantial internal AI development efforts.<\/p>\n\n\n\n<p>The timing proves significant as the Trump administration has proposed substantial cuts to the National Institutes of Health budget, potentially threatening public funding for biomedical research. CZI\u2019s continued investment in biological AI infrastructure could help maintain research momentum during periods of reduced government support.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-a-new-chapter-in-the-race-against-disease\">A new chapter in the race against disease<\/h2>\n\n\n\n<p>rBio\u2019s launch marks more than just another AI breakthrough\u2014it represents a fundamental shift in how biological research could be conducted. By demonstrating that virtual simulations can train models as effectively as expensive laboratory experiments, CZI has opened a path for researchers worldwide to accelerate their work without the traditional constraints of time, money, and physical resources.<\/p>\n\n\n\n<p>As CZI prepares to make rBio freely available through its Virtual Cell Platform, the organization continues expanding its biological AI capabilities with models like GREmLN for cancer detection and ongoing work on imaging technologies. 
The success of the soft verification approach could influence how other organizations train AI for scientific applications, potentially reducing dependence on experimental data while maintaining scientific rigor.<\/p>\n\n\n\n<p>For an organization that began with the audacious goal of curing all diseases by the century\u2019s end, rBio offers something that has long eluded medical researchers: a way to ask biology\u2019s hardest questions and get scientifically grounded answers in the time it takes to type a sentence. In a field where progress has traditionally been measured in decades, that kind of speed could make all the difference between diseases that define generations\u2014and diseases that become distant memories.<\/p>\n\t\t\t<\/div>\r\n<br>\r\n<br><a href=\"https:\/\/venturebeat.com\/ai\/chan-zuckerberg-initiatives-rbio-uses-virtual-cells-to-train-ai-bypassing-lab-work\/\">Source link <\/a>","protected":false},"excerpt":{"rendered":"<p>The Chan Zuckerberg Initiative announced Thursday the launch of rBio, the first artificial intelligence model trained to reason about cellular biology using virtual simulations rather than requiring expensive laboratory [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":3304,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[33],"tags":[],"class_list":["post-3303","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-automation"],"aioseo_notices":[],"jetpack_featured_media_url":"https:\/\/violethoward.com\/new\/wp-content\/uploads\/2025\/08\/nuneybits_Vector_art_of_AI_analyzing_cells_e3d03bda-a9ab-44b8-8460-cf2f257967e5.webp.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/3303","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\
/violethoward.com\/new\/wp-json\/wp\/v2\/comments?post=3303"}],"version-history":[{"count":0,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/3303\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media\/3304"}],"wp:attachment":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media?parent=3303"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/categories?post=3303"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/tags?post=3303"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}