{"id":734,"date":"2025-03-21T05:30:50","date_gmt":"2025-03-21T05:30:50","guid":{"rendered":"https:\/\/violethoward.com\/new\/small-models-as-paralegals-lexisnexis-distills-models-to-build-ai-assistant\/"},"modified":"2025-03-21T05:30:50","modified_gmt":"2025-03-21T05:30:50","slug":"small-models-as-paralegals-lexisnexis-distills-models-to-build-ai-assistant","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/small-models-as-paralegals-lexisnexis-distills-models-to-build-ai-assistant\/","title":{"rendered":"Small models as paralegals: LexisNexis distills models to build AI assistant"},"content":{"rendered":" \r\n
When legal research company LexisNexis created its AI assistant Prot\u00e9g\u00e9, it wanted to figure out the best way to leverage its expertise without deploying a large model.\u00a0<\/p>\n\n\n\n Prot\u00e9g\u00e9 aims to help lawyers, associates and paralegals write and proof legal documents and ensure that anything they cite in complaints and briefs is accurate. However, LexisNexis didn\u2019t want a general legal AI assistant; it wanted to build one that learns a firm\u2019s workflow and is more customizable.\u00a0<\/p>\n\n\n\n LexisNexis saw an opportunity to harness large language models (LLMs) from Anthropic and Mistral and identify the models that answer user questions best, Jeff Riehl, CTO of LexisNexis Legal and Professional, told VentureBeat.<\/p>\n\n\n\n \u201cWe use the best model for the specific use case as part of our multi-model approach. We use the model that provides the best result with the fastest response time,\u201d Riehl said. 
\u201cFor some use cases, that will be a small language model like Mistral or we perform distillation to improve performance and reduce cost.\u201d<\/p>\n\n\n\n While LLMs still provide value in building AI applications, some organizations instead turn to small language models (SLMs) or distill LLMs into smaller versions of the same model.\u00a0<\/p>\n\n\n\n Distillation, where an LLM \u201cteaches\u201d a smaller model, has become a popular method for many organizations.\u00a0<\/p>\n\n\n\n Small models often work best for apps like chatbots or simple code completion, which is the approach LexisNexis wanted for Prot\u00e9g\u00e9.\u00a0<\/p>\n\n\n\n This is not the first time LexisNexis has built AI applications; it was doing so even before launching its legal research hub LexisNexis + AI in July 2024.<\/p>\n\n\n\n \u201cWe have used a lot of AI in the past, which was more around natural language processing, some deep learning and machine learning,\u201d Riehl said. \u201cThat really changed in November 2022 when ChatGPT was launched, because prior to that, a lot of the AI capabilities were kind of behind the scenes. But once ChatGPT came out, the generative capabilities, the conversational capabilities of it was very, very intriguing to us.\u201d<\/p>\n\n\n\n Riehl said LexisNexis uses different models from most of the major model providers when building its AI platforms. LexisNexis + AI used Claude models from Anthropic, OpenAI\u2019s GPT models and a model from Mistral.\u00a0<\/p>\n\n\n\n This multi-model approach helped break down each task users wanted to perform on the platform. To do this, LexisNexis had to architect its platform to switch between models.\u00a0<\/p>\n\n\n\n \u201cWe would break down whatever task was being performed into individual components, and then we would identify the best large language model to support that component. 
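The task decomposition Riehl describes can be sketched as a simple router: a cheap intent-assessment step classifies the query, then the request is dispatched to the model best suited for that component. The model names and the keyword-based classifier below are illustrative assumptions, not LexisNexis's actual implementation.

```python
# Sketch of a multi-model router: triage the query's intent with a cheap
# step, then hand the work to a task-specific model.
# All model names here are hypothetical placeholders.
ROUTES = {
    "summarize": "small-finetuned-model",  # fast, cheap summarization
    "draft": "large-general-model",        # heavier generative work
    "search": "query-rewrite-model",       # generates new search queries
}

def assess_intent(query: str) -> str:
    """Stand-in for the small triage model that assesses a query's
    purpose and intent (here, a trivial keyword check)."""
    q = query.lower()
    if "summarize" in q or "summary" in q:
        return "summarize"
    if "draft" in q or "write" in q:
        return "draft"
    return "search"

def route(query: str) -> str:
    """Pick the model for this query based on its assessed intent."""
    return ROUTES[assess_intent(query)]

print(route("Summarize the holding in this opinion"))  # small-finetuned-model
print(route("Draft a motion to compel discovery"))     # large-general-model
print(route("Cases on adverse possession in Ohio"))    # query-rewrite-model
```

In practice the triage step would itself be a model call, but the shape is the same: classify first with something cheap, reserve expensive models for the components that need them.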
One example of that is we will use Mistral to assess the query that the user entered in,\u201d Riehl said.\u00a0<\/p>\n\n\n\n For Prot\u00e9g\u00e9, the company wanted faster response times and models more fine-tuned for legal use cases. So it turned to what Riehl calls \u201cfine-tuned\u201d versions of models, essentially smaller-weight or distilled versions of LLMs.\u00a0<\/p>\n\n\n\n \u201cYou don\u2019t need GPT-4o to do the assessment of a query, so we use it for more sophisticated work, and we switch models out,\u201d he said.\u00a0<\/p>\n\n\n\n When a user asks Prot\u00e9g\u00e9 a question about a specific case, the first model it pings is a fine-tuned Mistral \u201cfor assessing the query, then determining what the purpose and intent of that query is\u201d before switching to the model best suited to complete the task. Riehl said the next model could be an LLM that generates new queries for the search engine or another model that summarizes results.\u00a0<\/p>\n\n\n\n Right now, LexisNexis mostly relies on a fine-tuned Mistral model, though Riehl said it used a fine-tuned version of Claude \u201cwhen it first came out; we are not using it in the product today but in other ways.\u201d LexisNexis is also interested in using other OpenAI models, especially since OpenAI released new reinforcement fine-tuning capabilities last year. LexisNexis is in the process of evaluating OpenAI\u2019s reasoning models, including o3, for its platforms.\u00a0<\/p>\n\n\n\n Riehl added that the company may also look at using Gemini models from Google.\u00a0<\/p>\n\n\n\n LexisNexis backs all of its AI platforms with its own knowledge graph to power retrieval-augmented generation (RAG), especially as Prot\u00e9g\u00e9 could help launch agentic processes later.\u00a0<\/p>\n\n\n\n Even before the advent of generative AI, LexisNexis tested the possibility of putting chatbots to work in the legal industry. 
In 2017, the company tested an AI assistant that would compete with IBM\u2019s Watson-powered Ross.\u00a0<\/p>\n\n\n\n Prot\u00e9g\u00e9 sits in the company\u2019s LexisNexis + AI platform, which brings together the AI services of LexisNexis.\u00a0<\/p>\n\n\n\n Prot\u00e9g\u00e9 helps law firms with tasks that paralegals or associates tend to do. It helps write legal briefs and complaints that are grounded in firms\u2019 documents and data, suggest legal workflow next steps, suggest new prompts to refine searches, draft questions for depositions and discovery, link quotes in filings for accuracy, generate timelines and, of course, summarize complex legal documents.\u00a0<\/p>\n\n\n\n \u201cWe see Prot\u00e9g\u00e9 as the initial step in personalization and agentic capabilities,\u201d Riehl said. \u201cThink about the different types of lawyers: M&A, litigators, real estate. It\u2019s going to continue to get more and more personalized based on the specific task you do. Our vision is that every legal professional will have a personal assistant to help them do their job based on what they do, not what other lawyers do.\u201d<\/p>\n\n\n\n Prot\u00e9g\u00e9 now competes against other legal research and technology platforms. Thomson Reuters customized OpenAI\u2019s o1-mini model for its CoCounsel legal assistant. Harvey, which raised $300 million from investors, including LexisNexis, also has a legal AI assistant.\u00a0<\/p>\n
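The distillation Riehl mentions, in which a large "teacher" model trains a smaller student, can be shown in miniature: the student's logits are nudged toward the teacher's temperature-softened output distribution by minimizing KL divergence. Everything below (the toy logits, temperature and training loop) is an illustrative sketch, not any production pipeline.

```python
import math

def softmax(logits, temperature=1.0):
    """Softened distribution; a higher temperature exposes the teacher's
    'dark knowledge' about near-miss answers."""
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q): how far the student distribution q is from teacher p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits for a single example over three labels.
teacher_logits = [4.0, 1.5, 0.2]   # large model, confident
student_logits = [1.0, 0.8, 0.6]   # small model, not yet trained

T = 2.0                            # distillation temperature
teacher_probs = softmax(teacher_logits, temperature=T)

# Naive gradient descent on the student's logits, minimizing
# KL(teacher || student); the gradient w.r.t. the student's logits
# works out to (student_probs - teacher_probs) / T.
lr = 1.0
for _ in range(300):
    student_probs = softmax(student_logits, temperature=T)
    grads = [(q - p) / T for q, p in zip(student_probs, teacher_probs)]
    student_logits = [z - lr * g for z, g in zip(student_logits, grads)]

final_kl = kl_divergence(teacher_probs, softmax(student_logits, temperature=T))
print(f"KL after training: {final_kl:.6f}")  # approaches zero
```

Real distillation does this over millions of examples with the student's full weights, but the objective is the same: match the teacher's softened output distribution rather than just its top answer.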
\n<\/div><h2>Small, fine-tuned models and model routing\u00a0<\/h2>\n\n\n\n
<h2>The AI legal suite<\/h2>\n\n\n\n