The new global survey by ClearML, FuriosaAI and the AI Infrastructure Alliance (AIIA) unveils the state of AI infrastructure at scale, exposing GPU utilisation challenges.

ClearML recently announced new research findings from a global AI survey conducted with FuriosaAI and the AI Infrastructure Alliance (AIIA). The new report, called “The State of AI Infrastructure at Scale 2024: Unveiling Future Landscapes, Key Insights, and Business Benchmarks”, includes responses from AI/ML and technology leaders at 1,000 companies of various sizes across North America, Europe, and Asia Pacific.

The report focuses on how executives are building their AI infrastructure, the critical benchmarks and key challenges they face, and how they rank priorities when evaluating AI infrastructure solutions against their business use cases. It also examines respondents’ current scheduling, compute, and AI/ML needs for training and deploying models, as well as their AI framework plans for 2024-2025.

One of the primary drivers of hypergrowth in the AI infrastructure market is organisations’ realisation that AI can drive operational efficiency and workforce productivity, which respondents rank as the leading business use case. Companies are recognising the need for Gen AI solutions to extract actionable insights from their internal knowledge bases and plan to deploy Gen AI to boost their competitive edge, enhance knowledge worker productivity, and impact their bottom line.

As companies navigate the AI infrastructure market, they are seeking clarity, peer insights and reviews, as well as industry benchmarks on AI/ML platforms and compute. To understand executives’ biggest pain points in moving AI/ML to production, this survey examined not only model training, but also model serving and inference.

“Our research shows that while most organizations are planning to expand their AI infrastructure, they can’t afford to move too fast in deploying Generative AI at scale at the cost of not prioritizing the right use cases,” said Noam Harel, ClearML’s CMO and GM, North America. “We also explore the myriad challenges organizations face in their current AI workloads and how their ambitious plans for the future signal a need for highly performant, cost-effective ways to optimize GPU utilization (or find alternatives to GPUs), and harness seamless, end-to-end AI/ML platforms to drive effective, self-serve compute orchestration and scheduling with maximum utilization.”

“There are lots of claims about how businesses are addressing their rapidly evolving AI infrastructure resource needs and incorporating Generative AI into their products. This report provides hard data to answer these questions,” said June Paik, FuriosaAI’s CEO. “In particular, it shows how businesses are actively looking for new, cost-effective options for inference compute. We’re excited to see that our second-gen product, launching later this year, directly addresses one of the top concerns cited in the report.”

Key findings of the report

The report reveals several key insights shaping the landscape of AI infrastructure and its utilisation across organisations. Notably, 96% of respondents intend to expand their AI compute infrastructure, with availability, cost, and infrastructure complexity weighing heavily on their plans.

Cloud adoption remains a dominant trend, with 60% of respondents considering increased cloud usage, while 40% contemplate bolstering on-premise infrastructure. Both groups emphasise flexibility and speed, although concerns linger regarding wastage and idle costs in cloud environments.

Additionally, the significance of Open Source technology is underscored, as 95% of executives stress its importance, with a strong focus on customising Open Source models, particularly favouring PyTorch as their framework of choice. 

However, dissatisfaction looms over current job scheduling and orchestration tools, with 74% of companies expressing discontent, citing constraints in compute resource allocation and team productivity. 

Furthermore, concerns regarding GPU utilisation optimisation and partitioning persist, with only a fraction of companies equipped to manage these capabilities effectively. Cost emerges as a pivotal factor, notably in inference compute, as organisations actively seek cost-effective alternatives to GPUs, indicating a growing demand for economically viable inference solutions. Despite these challenges, the report indicates a keen interest in deploying language and embedding models commercially, underscoring the imperative of mitigating compute challenges to facilitate smooth implementation.

About the Survey Research Authors

The AI Infrastructure Alliance is dedicated to bringing together the essential building blocks for the Artificial Intelligence applications of today and tomorrow.

FuriosaAI is a semiconductor company designing high-performance data center AI accelerators with vastly improved power efficiency.  

As the leading open source, end-to-end solution for unleashing AI in the enterprise, ClearML is used by more than 1,600 enterprise customers to develop highly repeatable processes for their entire AI model lifecycles, from product feature exploration to model deployment and monitoring in production.