
Investing.com -- Despite investor concerns about tariffs and AI infrastructure overspending, UBS sees generative AI (genAI) inference compute demand poised to expand dramatically across sectors.

According to the bank, AI remains resilient to macroeconomic uncertainty, with major U.S. tech companies reaffirming capital expenditure (capex) plans and highlighting that compute demand continues to exceed supply.

UBS argues that inference—the process of running AI models to generate answers—will become the primary driver of future AI compute needs, overtaking training.

“The amount of computation we need as a result of agentic AI and reasoning is easily 100x more than we thought we needed this time last year,” said Nvidia (NASDAQ:NVDA) CEO Jensen Huang, quoted in a UBS note.

The bank echoes this sentiment, pointing to the emergence of more complex methods like Chain of Thought (CoT) reasoning as a key source of growing computational intensity.

In its projections, UBS lays out four categories of genAI use cases: chatbots, enterprise AI, agentic AI, and physical AI.

Chatbots like ChatGPT are expected to see compute demand rise from 10 exaFLOP/s in 2024 to 200 exaFLOP/s by 2030.

For enterprise applications, such as fraud detection and contract summarization, inference needs are forecast to grow even faster—from 15 to 440 exaFLOP/s over the same period.

The most dramatic growth is expected from agentic AI, which includes autonomous customer support and workflow automation. UBS estimates demand from this segment could climb to 14 zettaFLOP/s by 2030, which would mark an “enormous leap from today’s needs, which we estimate to be in the hundreds of exaFLOP/s,” the firm said in the note.

Physical AI, which includes robotics and autonomous vehicles, could eventually require compute in the yottaFLOP/s range as it evolves to replicate aspects of human cognition.
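Taken together, the projections imply steep compound growth. A minimal back-of-envelope sketch of the implied multiples and annual growth rates, assuming UBS's 2024 and 2030 endpoints; the agentic baseline of "hundreds of exaFLOP/s" is approximated here as 300 exaFLOP/s, which is an illustrative assumption rather than a figure from the note:

```python
# Back-of-envelope growth implied by UBS's 2024 -> 2030 inference projections.
# All figures in exaFLOP/s; 1 zettaFLOP/s = 1,000 exaFLOP/s,
# so the 14 zettaFLOP/s agentic estimate is 14,000 exaFLOP/s.

YEARS = 2030 - 2024  # six-year horizon

# (segment, 2024 demand, 2030 demand)
projections = [
    ("Chatbots",      10,     200),
    ("Enterprise AI", 15,     440),
    ("Agentic AI",   300,  14_000),  # 2024 baseline is an assumption
]

for segment, start, end in projections:
    multiple = end / start
    cagr = multiple ** (1 / YEARS) - 1  # compound annual growth rate
    print(f"{segment:<14} {multiple:>4.0f}x overall, ~{cagr:.0%}/yr")
```

On these inputs, chatbot demand grows roughly 20x (about 65% a year), enterprise AI about 29x (76% a year), and agentic AI nearly 47x (around 90% a year), which is why UBS singles out the agentic segment as the "enormous leap."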

Today’s installed GPU compute capacity is estimated at around 4,000 exaFLOP/s (rising to 5,000 with Google (NASDAQ:GOOGL)’s Tensor Processing Units (TPUs)), but UBS notes much of it remains underutilized.

Limitations like GPU memory bottlenecks mean actual usage often falls short of nominal potential, making it unlikely that the current base can meet future demand, especially for agentic and physical AI.

“Inference is often constrained by GPU memory, meaning the actual FLOP/s a chip can deliver is well below its theoretical maximum—with memory limitations resulting in chips operating at as little as 25% of their nominal FLOP/s,” the note explains.

“Even with these limitations the available capacity might be enough for current chatbot needs, but far below what will be required for agentic and physical AI, which will demand computing power of a different order of magnitude,” it adds.
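A minimal sketch putting the note's figures side by side, assuming the 25% effective-utilization ceiling applies uniformly across the whole ~5,000 exaFLOP/s installed base (a simplifying assumption; actual utilization varies by chip and workload):

```python
# Rough supply-vs-demand check using the figures cited in the note.
# Assumption: the 25% memory-bound utilization ceiling applies uniformly.

nominal_capacity = 5_000                       # exaFLOP/s, GPUs plus Google TPUs
effective_capacity = nominal_capacity * 0.25   # memory-limited ceiling per the note

chatbot_2030 = 200     # exaFLOP/s, UBS chatbot projection
agentic_2030 = 14_000  # exaFLOP/s (14 zettaFLOP/s), UBS agentic projection

print(f"Effective capacity today: ~{effective_capacity:,.0f} exaFLOP/s")
print(f"2030 chatbot demand uses {chatbot_2030 / effective_capacity:.0%} of that")
print(f"2030 agentic demand is {agentic_2030 / effective_capacity:.0f}x that")
```

On these assumptions the effective base of roughly 1,250 exaFLOP/s comfortably covers projected chatbot demand but falls about an order of magnitude short of agentic demand, consistent with the note's conclusion.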

All in all, UBS concludes that the expanding role of inference in AI adoption, combined with rising hardware requirements, supports continued investment in AI infrastructure.

For investors, the bank sees “any pullbacks in stocks linked to our ‘AI’ and ‘Power and resources’ selections as attractive entry points.”

This content was originally published on Investing.com

