{"id":484,"date":"2025-03-07T20:15:05","date_gmt":"2025-03-07T20:15:05","guid":{"rendered":"https:\/\/violethoward.com\/new\/how-yelp-reviewed-competing-llms-for-correctness-relevance-and-tone-to-develop-its-user-friendly-ai-assistant\/"},"modified":"2025-03-07T20:15:05","modified_gmt":"2025-03-07T20:15:05","slug":"how-yelp-reviewed-competing-llms-for-correctness-relevance-and-tone-to-develop-its-user-friendly-ai-assistant","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/how-yelp-reviewed-competing-llms-for-correctness-relevance-and-tone-to-develop-its-user-friendly-ai-assistant\/","title":{"rendered":"How Yelp reviewed competing LLMs for correctness, relevance and tone to develop its user-friendly AI assistant"},"content":{"rendered":" \r\n<br><div>\n\t\t\t\t<p>The review app Yelp has provided helpful information to diners and other consumers for decades. It had experimented with machine learning since its early years. During the recent explosion in AI technology, it was still encountering stumbling blocks as it worked to employ modern large language models to power some features.\u00a0<\/p>\n\n\n\n<p>Yelp realized that customers, especially those who only occasionally used the app, had trouble connecting with its AI features, such as its AI Assistant.\u00a0<\/p>\n\n\n\n<p>\u201cOne of the obvious lessons that we saw is that it\u2019s very easy to build something that looks cool, but very hard to build something that looks cool and is very useful,\u201d Craig Saldanha, chief product officer at Yelp, told VentureBeat in an interview.<\/p>\n\n\n\n<p>It certainly wasn\u2019t all easy. 
After it launched Yelp Assistant, its AI-powered service search assistant, in April 2024 to a broader swathe of customers, Yelp saw usage figures for its AI tools actually begin to decline.\u00a0<\/p>\n\n\n\n<p>\u201cThe one that took us by surprise was when we launched this as a beta to consumers \u2014 a few users and folks who are very familiar with the app \u2014 [and they] loved it. We got such a strong signal that this would be successful, and then we rolled it out to everyone, [and] the performance just fell off,\u201d Saldanha said. \u201cIt took us a long time to figure out why.\u201d<\/p>\n\n\n\n<p>It turned out that Yelp\u2019s more casual users, those who occasionally visited the site or app to find a new tailor or plumber, did not expect to be immediately talking with an AI representative.\u00a0<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-from-simple-to-more-involved-ai-features\">From simple to more involved AI features<\/h2>\n\n\n\n<p>Most people know Yelp as a website and app to look up restaurant reviews and menu photos. I use Yelp to find pictures of food in new eateries and to see if others share my feelings about a particularly bland dish. It\u2019s also a place that tells me if a coffee shop I plan to use as a workspace for the day has WiFi, plugs and seating, a rarity in Manhattan.<\/p>\n\n\n\n<p>Saldanha recalled that Yelp had been investing in AI \u201cfor the better part of a decade.\u201d<\/p>\n\n\n\n<p>\u201cWay back when, I\u2019d say in the 2013-2014 timeline, we were in a very different generation of AI, so our focus was on building our own models to do things like query understanding. Part of the job of making a meaningful connection is helping people refine their own search intent,\u201d he said.<\/p>\n\n\n\n<p>But as AI continued to evolve, so did Yelp\u2019s needs. 
It invested in AI to recognize food in pictures submitted by users to identify popular dishes, and then it launched new ways to connect to tradespeople and services and help guide users\u2019 searches on the platform.\u00a0<\/p>\n\n\n\n<p>AI Assistant helps Yelp users find the right \u201cPro\u201d to work with. People can tap the chatbox and either use the prompts or type out the task they need done. The assistant then asks follow-up questions to narrow down potential service providers before drafting a message to Pros who might want to bid for the job.<\/p>\n\n\n\n<p>Saldanha said Pros are encouraged to respond to users themselves, though he acknowledges that larger brands often have call centers that handle messages generated by Yelp\u2019s AI Assistant.\u00a0<\/p>\n\n\n\n<p>In addition to AI Assistant, Yelp launched Review Insights and Highlights. LLMs analyze user and reviewer sentiment, which Yelp collects into sentiment scores. Yelp uses a detailed GPT-4o prompt to generate a dataset for a list of topics. Then, it\u2019s fine-tuned with a GPT-4o-mini model.\u00a0<\/p>\n\n\n\n<p>The review highlights feature, which presents information from reviews, also uses an LLM prompt to generate a dataset. However, it is based on GPT-4, with fine-tuning from GPT-3.5 Turbo. Yelp said it will update the feature with GPT-4o and o1.\u00a0<\/p>\n\n\n\n<p>Yelp joined many other companies using LLMs to improve the usefulness of reviews by adding better search functions based on customer comments. 
For example, Amazon launched Rufus, an AI-powered assistant that helps people find recommended items.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-big-models-and-performance-needs\">Big models and performance needs<\/h2>\n\n\n\n<p>For many of its new AI features, including AI Assistant, Yelp turned to OpenAI\u2019s GPT-4o and other models, but Saldanha noted that no matter the model, Yelp\u2019s data is the secret sauce for its assistants. Yelp did not want to lock itself into one model and kept an open mind about which LLMs would provide the best service for its customers.\u00a0<\/p>\n\n\n\n<p>\u201cWe use models from OpenAI, Anthropic and other models on AWS Bedrock,\u201d Saldanha said.\u00a0<\/p>\n\n\n\n<p>Saldanha explained that Yelp created a rubric to test the performance of models in correctness, relevance, conciseness, customer safety and compliance. He said that \u201cit\u2019s really the top end models\u201d that performed best. The company runs a small pilot with each model before taking into account iteration cost and response latency.\u00a0<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-teaching-users\">Teaching users<\/h2>\n\n\n\n<p>Yelp also embarked on a concerted effort to educate both casual and power users to get comfortable with the new AI features. Saldanha said one of the first things they realized, especially with the AI Assistant, is that the tone had to feel human. It couldn\u2019t respond too fast or too slowly; it couldn\u2019t be overly encouraging or too brusque.<\/p>\n\n\n\n<p>\u201cWe put a bunch of effort into helping people feel comfortable, especially with that first response. It took us almost four months to get this second piece right. And as soon as we did, it was very obvious and you could see that hockey stick in engagement,\u201d Saldanha said.\u00a0<\/p>\n\n\n\n<p>Part of that process involved training the AI Assistant to use certain words and to sound positive. 
After all that fine-tuning, Saldanha said they\u2019re finally seeing higher usage numbers for Yelp\u2019s AI features.\u00a0<\/p>\n\t\t\t<\/div>\r\n<br>\r\n<br><a href=\"https:\/\/venturebeat.com\/ai\/how-yelp-reviewed-competing-llms-for-correctness-relevance-and-tone-to-develop-its-user-friendly-ai-assistant\/\">Source link <\/a>","protected":false},"excerpt":{"rendered":"<p>The review app Yelp has provided helpful information to diners and other consumers for decades. It had experimented with machine learning since its early years. 
During the recent explosion in AI technology, it was still encountering [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":485,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[33],"tags":[],"class_list":["post-484","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-automation"],"aioseo_notices":[],"jetpack_featured_media_url":"https:\/\/violethoward.com\/new\/wp-content\/uploads\/2025\/03\/crimedy7_illustration_of_a_robot_looking_at_a_phone_app_for_res_4008bf89-38b4-4f5e-95c1-05072a193fdc.jpeg","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/484","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/comments?post=484"}],"version-history":[{"count":0,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/484\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media\/485"}],"wp:attachment":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media?parent=484"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/categories?post=484"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/tags?post=484"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}