Cloudy with a chance of GPU bills: AI's energy appetite has CIOs sweating

Public cloud expenses have businesses scrambling for alternatives that won't melt the budget


Canalys Forums EMEA: Organizations are being forced to rethink where they host workloads in response to ballooning AI demands combined with rising energy bills, and shoving them into the public cloud may not be the answer.

CIOs face a quandary: the huge compute demands of training and deploying advanced AI models are driving up power consumption at the same time as energy costs are rising. Finding some way to square this circle is becoming a big concern for large corporates, according to Canalys.

Speaking at the recent Canalys Forums EMEA in Berlin, chief analyst Alastair Edwards said that every company is trying to figure out what model or IT architecture it needs to deploy to take best advantage of the business transformation that "AI promises".

The public cloud vendors position themselves as the destination of choice for training AI workloads, and they certainly have the infrastructure resources, with capital expenditure on AI-capable servers up by about 30 percent this year, by some estimates.

But as an organization starts to look beyond training to putting those models to work – fine-tuning and inferencing with them – the question arises of how and where to deploy them in a scalable way, according to Edwards.

"The public cloud, as you start to deploy these use cases we're all focused on and start to scale that, if you're doing that in the public cloud, it becomes unsustainable from a cost perspective," he claimed.

But what is the alternative to cloud? Businesses have been migrating workloads to the cloud for a decade or so now because of the hassle of managing complex infrastructure, among other reasons, and many have downsized their own bit barns in response.

"Almost no organization these days wants to build their own on-prem datacenter," Edwards said. "They want to have the control, the sovereignty, the security, and compliance, but they want to locate it where they don't have to deal with an increased power requirement, increased need for liquid cooling, which you can't just repurpose an existing datacenter for."

This means some companies are now turning to colocation and specialized hosting providers rather than your typical public cloud operator, according to the Canalys view.

"We're seeing new business models emerging, companies which have invested in GPU capacity and are now developing GPU-as-a-service models to help customers access this. Whether that's a sustainable model or not is debatable, but essentially, every customer needs help to actually define what that looks like," Edwards explained.

Such GPU-as-a-service operators include CoreWeave and Foundry, while even Rackspace (remember them?) has announced a GPU-as-a-service product.

But whichever way you access it, infrastructure to drive AI looks like being a growth area for investment in the near future, according to the latest forecast from market watcher IDC. It estimates that corporates increased spending on compute and storage hardware for AI deployments by 37 percent in the first half of 2024, and forecasts that this will expand to top $100 billion by 2028.

Given that Microsoft alone has recently announced plans to raise $100 billion to invest in datacenters and AWS plans to spend $10.4 billion just on facilities in the UK, this figure seems like something of a conservative estimate.

IDC says that AI-enabled systems deployed in cloud and shared environments accounted for 65 percent of the entire server spend on AI during the first half of 2024 as hyperscalers, cloud service providers, and digital service providers built out their capabilities. In contrast, enterprises have largely lagged behind in adopting on-prem AI kit.

This matches what analyst firm Omdia has been telling us: that server demand for AI training is largely driven by a relatively small number of hyperscalers, and expenditure on servers is growing rapidly because AI calls for high-performance systems rammed with costly GPU accelerators.

Not surprisingly, the US leads the way in global AI infrastructure, according to IDC, accounting for almost half of the total spending in 1H24. It was followed by China with 23 percent, then the Asia-Pacific region on 16 percent, and EMEA trailing on just 10 percent.

Over the next five years, however, IDC expects Asia-Pacific to grow the fastest with a compound annual growth rate (CAGR) of 20 percent, followed by the US. Accelerated servers are forecast to make up 56 percent of the total market spending by 2028.
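To put that growth rate in perspective, here is a minimal back-of-the-envelope sketch of what a 20 percent CAGR compounds to over five years. It assumes the rate holds steady every year, which a forecast does not guarantee, and the function name `compound_multiplier` is purely illustrative rather than anything IDC publishes.

```python
def compound_multiplier(cagr: float, years: int) -> float:
    """Return the total growth factor after compounding `cagr` for `years` years."""
    return (1.0 + cagr) ** years


# Assumption: a constant 20 percent CAGR over the five-year forecast window.
multiplier = compound_multiplier(0.20, 5)
print(f"{multiplier:.2f}x")  # 2.49x
```

In other words, at that pace Asia-Pacific spending would roughly two-and-a-half-fold over the period.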

IDC expects AI adoption to continue growing at a "remarkable pace" as hyperscalers, CSPs, private companies, and governments around the world increasingly prioritize AI, but there is a cloud on the horizon (no pun intended).

"Growing concerns around energy consumption for AI infrastructure will become a factor in datacenters looking for alternatives to optimize their architectures and minimize energy use," said the company's Group VP of Worldwide Enterprise Infrastructure Trackers, Lidice Fernandez.

Recent reports have included warnings that AI-driven datacenter energy demands could swell by 160 percent just over the next two years, faster than utility providers are able to add extra generation capacity.
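For a sense of scale, the sketch below converts that two-year figure into an implied annual growth rate. It rests on two assumptions not spelled out in the reports: that "swell by 160 percent" means demand ends up at 2.6 times today's level, and that growth compounds evenly across both years. The helper name `implied_annual_rate` is hypothetical.

```python
def implied_annual_rate(total_increase: float, years: float) -> float:
    """Return the constant yearly growth rate that yields `total_increase` over `years`."""
    return (1.0 + total_increase) ** (1.0 / years) - 1.0


# Assumption: a 160 percent increase means demand reaches 2.6x the baseline.
rate = implied_annual_rate(1.60, 2)
print(f"{rate:.1%}")  # 61.2%
```

On those assumptions, demand would need to grow by roughly three-fifths each year.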

This could put a damper on datacenter expansion plans, but doesn't seem to be curbing the enthusiasm of investors just yet, despite a survey finding that 98 percent of them are worried about energy availability. Or perhaps the AI bubble will burst, in which case we can all stop worrying about its energy consumption and plowing billions into datacenters to support it. ®
