{"id":1033,"date":"2025-04-06T10:10:43","date_gmt":"2025-04-06T10:10:43","guid":{"rendered":"https:\/\/violethoward.com\/new\/deepseek-jolts-ai-industry-why-ais-next-leap-may-not-come-from-more-data-but-more-compute-at-inference\/"},"modified":"2025-04-06T10:10:43","modified_gmt":"2025-04-06T10:10:43","slug":"deepseek-jolts-ai-industry-why-ais-next-leap-may-not-come-from-more-data-but-more-compute-at-inference","status":"publish","type":"post","link":"https:\/\/violethoward.com\/new\/deepseek-jolts-ai-industry-why-ais-next-leap-may-not-come-from-more-data-but-more-compute-at-inference\/","title":{"rendered":"DeepSeek jolts AI industry: Why AI&#8217;s next leap may not come from more data, but more compute at inference"},"content":{"rendered":" \r\n<br><div>\n\t\t\t\t<div id=\"boilerplate_2682874\" class=\"post-boilerplate boilerplate-before\">\n<p><em>Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More<\/em><\/p>\n\n\n\n<hr class=\"wp-block-separator has-css-opacity is-style-wide\"\/>\n<\/div><p>The AI landscape continues to evolve at a rapid pace, with recent developments challenging established paradigms. Early in 2025, Chinese AI lab DeepSeek unveiled a new model that sent shockwaves through the AI industry and resulted in a 17% drop in Nvidia\u2019s stock, along with other stocks related to AI data center demand. This market reaction was widely reported to stem from DeepSeek\u2019s apparent ability to deliver high-performance models at a fraction of the cost of rivals in the U.S., sparking discussion about the implications for AI data centers.\u00a0<\/p>\n\n\n\n<p>To contextualize DeepSeek\u2019s disruption, we think it\u2019s useful to consider a broader shift in the AI landscape being driven by the scarcity of additional training data. 
Because the major AI labs have already trained their models on much of the available public data on the internet, data scarcity is slowing further improvements in pre-training. As a result, model providers are turning to \u201ctest-time compute\u201d (TTC), in which reasoning models (such as OpenAI\u2019s \u201co\u201d series of models) \u201cthink\u201d before responding to a question at inference time, as an alternative way to improve overall model performance. The current thinking is that TTC may exhibit scaling-law improvements similar to those that once propelled pre-training, potentially enabling the next wave of transformative AI advancements.<\/p>\n\n\n\n<p>These developments indicate two significant shifts: First, labs operating on smaller (reported) budgets are now capable of releasing state-of-the-art models. The second shift is the focus on TTC as the next potential driver of AI progress. Below we unpack both of these trends and the potential implications for the competitive landscape and broader AI market.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-implications-for-the-ai-industry\">Implications for the AI industry<\/h2>\n\n\n\n<p>We believe that the shift towards TTC and the increased competition among reasoning models may have a number of implications for the wider AI landscape across hardware, cloud platforms, foundation models and enterprise software.\u00a0<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-1-hardware-gpus-dedicated-chips-and-compute-infrastructure\">1. Hardware (GPUs, dedicated chips and compute infrastructure)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>From massive training clusters to on-demand \u201ctest-time\u201d spikes:<\/strong> In our view, the shift towards TTC may have implications for the type of hardware resources that AI companies require and how they are managed. 
Rather than investing in increasingly larger GPU clusters dedicated to training workloads, AI companies may instead increase their investment in inference capabilities to support growing TTC needs. While AI companies will likely still require large numbers of GPUs to handle inference workloads, the differences between training and inference workloads may affect how those chips are configured and used. Specifically, since inference workloads tend to be more dynamic (and \u201cspiky\u201d), capacity planning may become more complex than it is for batch-oriented training workloads.\u00a0<\/li>\n\n\n\n<li><strong>Rise of inference-optimized hardware:<\/strong> We believe that the shift in focus towards TTC is likely to increase opportunities for alternative AI hardware that specializes in low-latency inference-time compute. For example, we may see more demand for GPU alternatives such as application-specific integrated circuits (ASICs) for inference. As access to TTC becomes more important than training capacity, the dominance of general-purpose GPUs, which are used for both training and inference, may decline. This shift could benefit specialized inference chip providers.\u00a0<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-2-cloud-platforms-hyperscalers-aws-azure-gcp-and-cloud-compute\">2. Cloud platforms: Hyperscalers (AWS, Azure, GCP) and cloud compute<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Quality of service (QoS) becomes a key differentiator:<\/strong> One issue preventing AI adoption in the enterprise, in addition to concerns around model accuracy, is the unreliability of inference APIs. Problems associated with unreliable inference APIs include fluctuating response times, rate limiting, and difficulty handling concurrent requests or adapting to API endpoint changes. Increased TTC may further exacerbate these problems. 
In these circumstances, a cloud provider able to offer models with QoS assurances that address these challenges would, in our view, have a significant advantage.<\/li>\n\n\n\n<li><strong>Increased cloud spend despite efficiency gains:<\/strong> Rather than reducing demand for AI hardware, it is possible that more efficient approaches to large language model (LLM) training and inference may follow the Jevons Paradox, a historical observation in which improved efficiency drives higher overall consumption of a resource. In this case, efficient inference models may encourage more AI developers to leverage reasoning models, which, in turn, increases demand for compute. We believe that recent model advances may lead to increased demand for cloud AI compute for both model inference and smaller, specialized model training.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-3-foundation-model-providers-openai-anthropic-cohere-deepseek-mistral\">3. Foundation model providers (OpenAI, Anthropic, Cohere, DeepSeek, Mistral)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Impact on pre-trained models:<\/strong> If new players like DeepSeek can compete with frontier AI labs at a fraction of the reported costs, proprietary pre-trained models may become less defensible as a moat. We can also expect further innovations in TTC for transformer models and, as DeepSeek has demonstrated, those innovations can come from sources outside of the more established AI labs.\u00a0<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-4-enterprise-ai-adoption-and-saas-application-layer\">4. Enterprise AI adoption and SaaS (application layer)<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Security and privacy concerns: <\/strong>Given DeepSeek\u2019s origins in China, there is likely to be ongoing scrutiny of the firm\u2019s products from a security and privacy perspective. 
In particular, the firm\u2019s China-based API and chatbot offerings are unlikely to be widely used by enterprise AI customers in the U.S., Canada or other Western countries. Many companies are reportedly moving to block the use of DeepSeek\u2019s website and applications. We expect that DeepSeek\u2019s models will face scrutiny even when they are hosted by third parties in the U.S. and other Western data centers, which may limit enterprise adoption of the models. Researchers are already pointing to examples of security concerns around jailbreaking, bias and harmful content generation. Given consumer attention, we may see experimentation and evaluation of DeepSeek\u2019s models in the enterprise, but it is unlikely that enterprise buyers will move away from incumbents due to these concerns.<\/li>\n\n\n\n<li><strong>Vertical specialization gains traction:<\/strong> In the past, vertical applications that use foundation models mainly focused on creating workflows designed for specific business needs. Techniques such as retrieval-augmented generation (RAG), model routing, function calling and guardrails have played an important role in adapting generalized models for these specialized use cases. While these strategies have led to notable successes, there has been persistent concern that significant improvements to the underlying models could render these applications obsolete. As Sam Altman cautioned, a major breakthrough in model capabilities could \u201csteamroll\u201d application-layer innovations that are built as wrappers around foundation models.<\/li>\n<\/ul>\n\n\n\n<p>However, if advancements in train-time compute are indeed plateauing, the threat of rapid displacement diminishes. In a world where gains in model performance come from TTC optimizations, new opportunities may open up for application-layer players. 
Innovations in domain-specific post-training algorithms \u2014 such as structured prompt optimization, latency-aware reasoning strategies and efficient sampling techniques \u2014 may provide significant performance improvements within targeted verticals. <\/p>\n\n\n\n<p>Any performance improvement would be especially relevant in the context of reasoning-focused models like OpenAI\u2019s o1 and DeepSeek-R1, which often exhibit multi-second response times. In real-time applications, reducing latency and improving the quality of inference within a given domain could provide a competitive advantage. As a result, application-layer companies with domain expertise may play a pivotal role in optimizing inference efficiency and fine-tuning outputs.<\/p>\n\n\n\n<p>DeepSeek\u2019s emergence demonstrates a declining emphasis on ever-increasing amounts of pre-training as the sole driver of model quality. Instead, the development underscores the growing importance of TTC. While the direct adoption of DeepSeek models in enterprise software applications remains uncertain due to ongoing scrutiny, their impact on driving improvements in other existing models is becoming clearer. <\/p>\n\n\n\n<p>We believe that DeepSeek\u2019s advancements have prompted established AI labs to incorporate similar techniques into their engineering and research processes, supplementing their existing hardware advantages. The resulting reduction in model costs, as predicted, appears to be contributing to increased model usage, aligning with the principles of the Jevons Paradox.<\/p>\n\n\n\n<p><em>Pashootan Vaezipoor is technical lead at Georgian.<\/em><\/p>\n<div id=\"boilerplate_2660155\" class=\"post-boilerplate boilerplate-after\"><div class=\"Boilerplate__newsletter-container vb\">\n<div class=\"Boilerplate__newsletter-main\">\n<p><strong>Daily insights on business use cases with VB Daily<\/strong><\/p>\n<p class=\"copy\">If you want to impress your boss, VB Daily has you covered. 
We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.<\/p>\n<p class=\"Form__newsletter-legal\">Read our Privacy Policy<\/p>\n<p class=\"Form__success\" id=\"boilerplateNewsletterConfirmation\">\n\t\t\t\t\tThanks for subscribing. Check out more VB newsletters here.\n\t\t\t\t<\/p>\n<p class=\"Form__error\">An error occurred.<\/p>\n<\/div>\n<div class=\"image-container\">\n\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/venturebeat.com\/wp-content\/themes\/vb-news\/brand\/img\/vb-daily-phone.png\" alt=\"\"\/>\n\t\t\t\t<\/div>\n<\/div>\n<\/div>\t\t\t<\/div>\r\n<br>\r\n<br><a href=\"https:\/\/venturebeat.com\/ai\/deepseek-jolts-ai-industry-why-ais-next-leap-may-not-come-from-more-data-but-more-compute-at-inference\/\">Source link <\/a>","protected":false},"excerpt":{"rendered":"<p>Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More The AI landscape continues to evolve at a rapid pace, with recent developments challenging established paradigms. 
Early in 2025, Chinese AI lab DeepSeek unveiled a new model that sent shockwaves through the AI industry and resulted [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1034,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[33],"tags":[],"class_list":["post-1033","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-automation"],"aioseo_notices":[],"jetpack_featured_media_url":"https:\/\/violethoward.com\/new\/wp-content\/uploads\/2025\/04\/Datatransformer.webp.jpeg","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/1033","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/comments?post=1033"}],"version-history":[{"count":0,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/posts\/1033\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media\/1034"}],"wp:attachment":[{"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/media?parent=1033"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/categories?post=1033"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/violethoward.com\/new\/wp-json\/wp\/v2\/tags?post=1033"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}