UBS Analysis: Compute Demand for Generative AI Inference
UBS projects a sharp rise in compute demand for generative AI (GenAI) inference, driven by increasingly complex AI applications across sectors. According to a report dated May 7, 2025, inference (running trained AI models to generate outputs) will overtake training as the primary driver of AI compute needs. Key insights from UBS’s analysis follow:
- Exponential Growth Projections:
- For consumer applications like chatbots and copilots, inference compute demand is expected to rise from 350 exaFLOP/s in 2024 to 3,500 exaFLOP/s by 2030, a 10-fold increase.
- Enterprise applications, such as fraud detection and contract summarization, are forecast to grow even faster, from 15 exaFLOP/s to 440 exaFLOP/s over the same period, a roughly 29-fold increase.
- Agentic AI, including autonomous customer support and workflow automation, could see demand soar to 14 zettaFLOP/s (14,000 exaFLOP/s) by 2030, a leap from today’s estimated hundreds of exaFLOP/s.
- Physical AI, encompassing robotics and autonomous vehicles, may eventually require compute in the yottaFLOP/s range (1,000 zettaFLOP/s) as it evolves to mimic human cognition.
- Current Capacity and Utilization:
- Today’s installed GPU compute capacity is approximately 4,000 exaFLOP/s, or about 5,000 exaFLOP/s when Google’s Tensor Processing Units (TPUs) are included. However, much of this capacity remains underutilized, indicating room for optimization but also highlighting supply constraints.
- Drivers of Demand:
- Complex Reasoning Methods: Techniques like Chain of Thought (CoT) reasoning significantly increase computational intensity. Nvidia CEO Jensen Huang noted that agentic AI and reasoning require “easily 100x more” computation than previously anticipated.
- Sectoral Adoption: UBS Evidence Lab found nearly 500 companies across 27 sectors referencing GenAI on 2024 earnings calls, with high activity in Software & Services, Media, Commercial Services, Semiconductors, and Consumer Services. Sectors with high employee costs and automatable roles are particularly poised for GenAI-driven efficiency gains.
- Monetization Potential: The application layer of AI, embedding intelligence into use cases, is projected to generate $395 billion in revenue by 2027, underscoring the economic incentive for scaling inference capabilities.
- Challenges and Implications:
- Energy Consumption: The surge in inference demand will drive significant electricity usage, with data centers projected to consume 6.7%–12% of U.S. electricity by 2028, up from 4.4% in 2023. Globally, data centers consumed 460 terawatt-hours (TWh) in 2022, equivalent to a major nation’s electricity use.
- Hardware Constraints: The reliance on GPUs and specialized AI chips (e.g., Nvidia’s Blackwell, Meta’s MTIA) underscores supply chain concentration risks, with TSMC dominating chip fabrication. By 2030, leading AI supercomputers could require 2 million chips, cost $200 billion, and draw 9 GW of power.
- Environmental Impact: Inference’s ongoing energy demands, unlike one-time training spikes, pose sustainability challenges, with water consumption for data center cooling also straining resources.
- Investment Opportunities:
- UBS highlights opportunities in AI semiconductors, cloud platforms, and data center infrastructure. Software and internet stocks are expected to lead the next tech cycle, with a projected $170 billion in AI application revenues by 2027 (139% CAGR), compared to $130 billion for semiconductors and hardware (38% CAGR).
- Hyperscalers like Google, Microsoft, and Amazon, which control cloud infrastructure, are well-positioned to monetize AI inference, though concerns about market concentration persist.
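The growth figures quoted in the list above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of UBS's analysis: it assumes a six-year horizon (2024 to 2030) with annual compounding, and the names `PROJECTIONS`, `growth_multiple`, and `implied_cagr` are hypothetical helpers introduced only for illustration.

```python
# Figures quoted in the list above, in exaFLOP/s.
# Assumptions (not from the report): a 6-year horizon (2024 -> 2030)
# and smooth annual compounding between the two endpoints.
PROJECTIONS = {
    "consumer": (350.0, 3_500.0),
    "enterprise": (15.0, 440.0),
}
AGENTIC_2030 = 14_000.0        # 14 zettaFLOP/s expressed in exaFLOP/s
INSTALLED_GPU = 4_000.0        # installed GPU capacity today
INSTALLED_WITH_TPU = 5_000.0   # including Google's TPUs
YEARS = 2030 - 2024


def growth_multiple(start, end):
    """Return how many times larger `end` is than `start`."""
    return end / start


def implied_cagr(start, end, years):
    """Return the compound annual growth rate implied by two endpoints."""
    return (end / start) ** (1.0 / years) - 1.0


def summarize():
    """Compute the growth multiple and implied CAGR for each segment."""
    return {
        name: {
            "multiple": growth_multiple(start, end),
            "cagr": implied_cagr(start, end, YEARS),
        }
        for name, (start, end) in PROJECTIONS.items()
    }


if __name__ == "__main__":
    for name, row in summarize().items():
        print(f"{name}: {row['multiple']:.1f}x, implied CAGR {row['cagr']:.1%}")
    ratio = AGENTIC_2030 / INSTALLED_WITH_TPU
    print(f"agentic 2030 demand vs. installed capacity incl. TPUs: {ratio:.1f}x")
```

Under these assumptions, the consumer and enterprise projections imply compound growth of roughly 47% and 76% per year, and the 2030 agentic AI figure alone would be about 2.8 times today's installed capacity including TPUs.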
UBS emphasizes that despite tariff uncertainties and fears of AI infrastructure overspending, GenAI’s compute demand remains resilient, with supply still lagging behind need. The bank’s framework suggests that industries with low margins or high capital intensity, such as financial services and healthcare, stand to gain the most from GenAI efficiencies, while agentic and physical AI could redefine computational scales by 2030.
The report arrives amid broader global developments, such as U.S.-China trade talks in Geneva and Putin’s proposal for Russia-Ukraine negotiations, that reflect a world grappling with technological and geopolitical shifts. For deeper insights, see UBS’s full report on its website.