Across data centers worldwide, GPUs are waiting—idling for data that hasn’t yet arrived. The real bottleneck in enterprise AI isn’t compute power; it’s the storage systems struggling to feed data-hungry models fast enough. As organizations deploy increasingly larger AI models and datasets, the limitations of traditional architectures are being exposed in real time. Without a storage foundation capable of sustained throughput and low latency, even the most advanced compute clusters can underperform.
The past year has seen an unprecedented acceleration in AI adoption, triggering massive changes in enterprise infrastructure. While much of the focus has been on compute, including GPUs, accelerators, and parallel processing, the underlying data layer determines how efficiently those resources can be utilized. AI workloads are distinct from anything that has come before: they are iterative, distributed, and extremely bandwidth-intensive.
Storage, once viewed as a passive repository, is now emerging as a critical enabler of AI success. As data volumes surge and model complexity grows, organizations are re-evaluating how data is stored, accessed, and moved across their networks. The rise of “AI networks” means storage can no longer be separated from compute or connectivity—it must evolve into an intelligent, high-performance fabric that supports the entire AI lifecycle.
To understand why this shift is so profound, it’s worth examining the data more closely—and how its behavior has evolved.
AI workloads rely on massive datasets—what we often refer to as “data lakes”—that supply machine learning and deep learning models. These datasets are characterized by three key attributes: volume, velocity, and variety, each presenting distinct challenges for traditional storage systems.
In short, AI doesn’t just generate more data—it changes the behavior of data. Storage infrastructure that once served predictable, batch-oriented workloads must now operate as an intelligent, adaptive layer, capable of feeding GPUs as efficiently as it preserves petabytes of training inputs.
As AI workloads evolve, the supporting storage stack is undergoing its own transformation. What used to be a static layer of disks and arrays is becoming a distributed, high-speed data fabric—closely integrated with compute, edge, and network layers. Several key shifts are driving this change.
The first is the rise of data lakes at scale. AI models thrive on vast, centralized repositories that can store raw and unstructured data in any format. These environments depend on object-based storage systems that scale horizontally and support massive parallel reads and writes. The goal isn’t just capacity—it’s throughput, keeping GPUs continuously fed with data to avoid idle cycles.
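To make that concrete, here is a minimal Python sketch of the kind of parallel prefetching such an object store has to sustain: shards are fetched concurrently and buffered ahead of the consumer so the GPU never waits on a single read. The fetch_object and train_step functions are hypothetical placeholders for a real object-store client and training loop, not any particular product's API.

```python
# Minimal sketch: overlap object-store reads with GPU work by prefetching
# shards in parallel. fetch_object() is a hypothetical stand-in for an
# object-store GET; train_step() stands in for the GPU-side consumer.
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

def fetch_object(key: str) -> bytes:
    """Placeholder for an object-store read (e.g. a GET against a data-lake bucket)."""
    return b"..."  # bytes of one training shard

def train_step(shard: bytes) -> None:
    """Placeholder for the GPU-side consumption of one shard."""
    pass

def prefetching_loader(keys, parallel_reads=8, depth=16):
    """Yield shards in order while up to `parallel_reads` reads run ahead."""
    buf = queue.Queue(maxsize=depth)

    def producer():
        with ThreadPoolExecutor(max_workers=parallel_reads) as pool:
            for fut in [pool.submit(fetch_object, k) for k in keys]:
                buf.put(fut.result())   # blocks if the consumer falls behind
        buf.put(None)                   # sentinel: no more shards

    threading.Thread(target=producer, daemon=True).start()
    while (shard := buf.get()) is not None:
        yield shard

for shard in prefetching_loader([f"shard-{i:05d}" for i in range(100)]):
    train_step(shard)   # storage throughput, not compute, sets the pace here
```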
The second shift is that specialized storage media are becoming mainstream. Solid-state drives and high-bandwidth memory (HBM) deliver the performance that AI training and inference demand, while emerging interconnects such as NVMe over Fabrics (NVMe-oF) provide the low-latency links needed to move data efficiently between compute nodes, storage devices, and edge systems.
A third shift is dynamic resource management, enabled by AI itself. As AI networks become more adaptive, storage must reallocate capacity and performance resources in real time—adjusting to traffic patterns, changing workloads, or model retraining cycles. This kind of elasticity was rare in traditional enterprise storage but is essential when AI clusters span multiple sites and thousands of devices.
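As a rough illustration of that elasticity, the sketch below rebalances a shared IOPS budget across workloads in proportion to their observed demand. The read_demand and set_iops_limit functions are assumed hooks into whatever telemetry and QoS controls a given storage platform exposes; the budget and workload names are invented for the example.

```python
# Minimal sketch of elastic QoS: periodically rebalance an IOPS budget across
# workloads in proportion to their observed demand. read_demand() and
# set_iops_limit() are hypothetical hooks into a storage controller's API.
import time

TOTAL_IOPS = 1_000_000          # assumed cluster-wide performance budget
WORKLOADS = ["ingest", "training", "inference"]

def read_demand(workload: str) -> float:
    """Placeholder: return recent I/O demand (e.g. queue depth or IOPS consumed)."""
    return {"ingest": 2.0, "training": 5.0, "inference": 1.0}[workload]

def set_iops_limit(workload: str, iops: int) -> None:
    """Placeholder: push a new per-workload QoS limit to the storage layer."""
    print(f"{workload}: limit set to {iops} IOPS")

def rebalance() -> None:
    demand = {w: read_demand(w) for w in WORKLOADS}
    total = sum(demand.values()) or 1.0
    for w in WORKLOADS:
        set_iops_limit(w, int(TOTAL_IOPS * demand[w] / total))

# Simple control loop: re-evaluate as traffic patterns and retraining cycles shift.
while True:
    rebalance()
    time.sleep(30)
```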
Finally, the expansion of AI to the edge is redefining where storage lives. Many inference workloads now run closer to where data is generated, whether in factories, hospitals, or vehicles, and require local, high-speed storage that supports low-latency decision-making. Processing data at the edge not only reduces backhaul costs and latency but also improves privacy and resiliency when connectivity is intermittent.
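A common pattern here is store-and-forward: write locally first, then drain to the core when the link allows. The following sketch illustrates the idea with a local SQLite buffer; link_is_up and upload are placeholders for a real connectivity check and transfer path.

```python
# Minimal sketch of edge store-and-forward: samples (or inference results) are
# written to fast local storage first, then drained to the core when a link is
# available. upload() and link_is_up() are hypothetical placeholders.
import json
import sqlite3

db = sqlite3.connect("edge_buffer.db")
db.execute("CREATE TABLE IF NOT EXISTS pending (id INTEGER PRIMARY KEY, payload TEXT)")

def record_locally(sample: dict) -> None:
    """Local write path: low latency, survives an intermittent backhaul link."""
    db.execute("INSERT INTO pending (payload) VALUES (?)", (json.dumps(sample),))
    db.commit()

def link_is_up() -> bool:
    """Placeholder for a connectivity check to the core data lake."""
    return True

def upload(payload: str) -> None:
    """Placeholder for the actual transfer to central storage."""
    pass

def drain() -> None:
    """Forward buffered records when connectivity returns, then delete them."""
    if not link_is_up():
        return
    for row_id, payload in db.execute("SELECT id, payload FROM pending").fetchall():
        upload(payload)
        db.execute("DELETE FROM pending WHERE id = ?", (row_id,))
    db.commit()

record_locally({"sensor": "line-3", "anomaly_score": 0.97})
drain()
```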
Underpinning all of this is the shift toward high-speed Ethernet fabrics that link compute, storage, and edge nodes into a cohesive system. These networks utilize advanced transport protocols to minimize latency and congestion, enabling data to flow as freely as compute cycles themselves. In an AI-driven world, storage performance is no longer about disks or arrays—it’s about the efficiency of the entire data path from sensor to GPU.
AI doesn’t just consume data; it cycles through it. Every workload moves through distinct stages—from data collection and preprocessing to model training, validation, deployment, and real-time inference. Each stage imposes different stresses on the storage layer, and no single architecture can optimize for all of them.
During data ingestion, storage must handle massive parallel writes as raw data floods in from sensors, logs, and external sources. At the training stage, the emphasis shifts to high-throughput reads and sustained bandwidth as GPUs iterate over the same datasets thousands of times. Once models are trained and move into inference, latency becomes the critical metric—storage must deliver small, frequent reads in milliseconds to keep prediction pipelines responsive.
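The difference between the training and inference read patterns is easy to see with a simple microbenchmark. The sketch below assumes a POSIX system and uses a locally generated test file, so the numbers reflect the page cache rather than a production array, but it shows why one stage is measured in aggregate throughput and the other in per-read latency.

```python
# Minimal sketch contrasting two of the access patterns described above:
# large sequential reads (training-style throughput) versus small random
# reads (inference-style latency). Uses a locally generated test file.
import os
import random
import time

PATH = "testfile.bin"
SIZE = 256 * 1024 * 1024            # 256 MiB test file

if not os.path.exists(PATH):
    with open(PATH, "wb") as f:
        f.write(os.urandom(SIZE))

fd = os.open(PATH, os.O_RDONLY)

# Training-style: stream the file in 8 MiB chunks, report aggregate throughput.
t0 = time.perf_counter()
offset, chunk = 0, 8 * 1024 * 1024
while offset < SIZE:
    os.pread(fd, chunk, offset)
    offset += chunk
seq_secs = time.perf_counter() - t0
print(f"sequential: {SIZE / seq_secs / 1e6:.0f} MB/s")

# Inference-style: 4 KiB reads at random offsets, report average latency.
t0 = time.perf_counter()
n = 10_000
for _ in range(n):
    os.pread(fd, 4096, random.randrange(0, SIZE - 4096))
rand_secs = time.perf_counter() - t0
print(f"random 4 KiB: {rand_secs / n * 1e6:.1f} µs per read")

os.close(fd)
```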
Meanwhile, archival and retraining cycles introduce yet another challenge: vast stores of versioned datasets that need to be preserved, indexed, and made available for future model updates. This constant movement of data—from hot to cold tiers, from edge to core—demands an architecture that can adapt dynamically to changing access patterns.
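In its simplest form, that movement can be driven by access recency. The sketch below demotes dataset versions that have gone untouched beyond a policy window; the catalog, the 30-day window, and the move_to_cold function are illustrative assumptions rather than any specific product's behavior.

```python
# Minimal sketch of recency-based tiering: dataset versions untouched for a set
# period are moved from the hot tier to a cold/archive tier while staying
# indexed for future retraining. The catalog and move_to_cold() are placeholders.
import time

HOT_WINDOW = 30 * 24 * 3600      # assumed policy: 30 days without access

catalog = {
    # dataset version -> tier and last access time (epoch seconds)
    "images-v12": {"tier": "hot", "last_access": time.time() - 40 * 24 * 3600},
    "images-v13": {"tier": "hot", "last_access": time.time() - 2 * 24 * 3600},
}

def move_to_cold(name: str) -> None:
    """Placeholder for the actual data movement (e.g. NVMe tier to object archive)."""
    print(f"demoting {name} to cold tier")

def apply_tiering_policy() -> None:
    now = time.time()
    for name, meta in catalog.items():
        if meta["tier"] == "hot" and now - meta["last_access"] > HOT_WINDOW:
            move_to_cold(name)
            meta["tier"] = "cold"   # stays indexed, ready for the next retrain

apply_tiering_policy()
```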
Meeting these requirements has pushed enterprises toward modern storage fabrics built on NVMe and RDMA technologies. Together, they provide the high-speed, low-latency backbone that AI workloads require, enabling compute and storage resources to function as a unified system rather than isolated silos.
The next evolution in this transformation lies in how data moves.
At the heart of modern AI infrastructure is the need for extreme data movement. Model training depends on continuously streaming massive datasets between compute nodes and storage arrays. Traditional TCP/IP networking introduces too much overhead for this scale—data must travel through the operating system kernel, consuming CPU cycles and adding latency at every hop. Technologies such as NVMe-oF and Remote Direct Memory Access (RDMA) were created to eliminate those inefficiencies.
NVMe-oF extends the high-performance NVMe storage protocol across network fabrics, allowing remote drives to behave almost as if they were local. This delivers the parallelism and low latency that GPUs require for continuous data feeding, without tying capacity to a single host. In large AI clusters, NVMe-oF enables storage to scale out horizontally, pooling capacity across racks or even sites, while maintaining microsecond-level responsiveness.
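In practice, attaching a remote NVMe-oF namespace is a short operation with the standard nvme-cli tool (root privileges and nvme-cli installed are assumed). The sketch below wraps the discover and connect steps in Python; the transport, target address, service ID, and subsystem NQN are placeholders for a real fabric, not values from any specific deployment.

```python
# Minimal sketch of attaching a remote NVMe-oF namespace so it appears as a
# local block device. It shells out to the standard nvme-cli tool; the target
# address, service ID, and NQN below are placeholders for your own fabric.
import subprocess

TRANSPORT = "rdma"                               # RoCE/InfiniBand; "tcp" for NVMe/TCP
TARGET_ADDR = "192.0.2.10"                       # placeholder target portal address
TRSVCID = "4420"                                 # conventional NVMe-oF port
SUBSYS_NQN = "nqn.2024-01.example:ai-datalake"   # placeholder subsystem NQN

# Discover subsystems exposed by the target portal.
subprocess.run(
    ["nvme", "discover", "-t", TRANSPORT, "-a", TARGET_ADDR, "-s", TRSVCID],
    check=True,
)

# Connect: the remote namespace then shows up as /dev/nvmeXnY on this host.
subprocess.run(
    ["nvme", "connect", "-t", TRANSPORT, "-n", SUBSYS_NQN, "-a", TARGET_ADDR, "-s", TRSVCID],
    check=True,
)
```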
RDMA goes a step further. It enables one system to read or write directly to another system’s memory without involving the CPU or operating system kernel. By bypassing kernel mediation, RDMA drastically reduces latency and CPU overhead—two factors that directly limit GPU utilization at scale. In AI environments where GPU efficiency drives both performance and cost, that advantage is critical.
Two major hardware implementations support RDMA today: InfiniBand and RoCE (RDMA over Converged Ethernet). InfiniBand has long been the gold standard for high-performance computing, but it requires specialized hardware and comes at a higher cost. RoCE, by contrast, brings RDMA capabilities to Ethernet fabrics, allowing organizations to extend existing data center investments. With proper congestion control and load-balancing mechanisms, RoCE can achieve performance levels approaching those of InfiniBand, making it an increasingly popular choice for AI workloads that require both speed and scalability.
Together, NVMe-oF and RDMA form the data path that connects compute, storage, and edge resources into a single, high-performance ecosystem. They transform the network from a potential bottleneck into a performance multiplier—precisely what AI at scale demands.
Fabric technologies such as NVMe-oF and RDMA solve half of the problem: they make data movement faster. The other half is about making storage intelligent, capable of adapting automatically to shifting AI workloads. That’s where Software-Defined Storage (SDS) plays a pivotal role.
SDS decouples storage management from the underlying hardware, creating a flexible, software-driven control plane for capacity, performance, and data placement. In an AI environment where datasets grow and move unpredictably, this abstraction becomes essential. SDS allows enterprises to pool heterogeneous resources—NVMe drives, SSD tiers, object stores, and even cloud capacity—into a single logical fabric that can be orchestrated dynamically.
This approach brings three key advantages for AI networks: management that is independent of the underlying hardware, pooling of heterogeneous resources into a single logical fabric, and dynamic, policy-driven data placement.
The result is a storage infrastructure that behaves less like a static repository and more like an active participant in the AI lifecycle—continuously tuning itself for optimal throughput, latency, and efficiency. In practice, SDS becomes the foundation of what some refer to as the Enterprise AI Factory—a converged environment where compute, storage, and networking operate as a single, adaptive system.
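To give a feel for what policy-driven placement looks like, here is a small sketch that selects a backend from a heterogeneous pool based on a workload's latency and capacity requirements. The pool definition, tier names, and thresholds are assumptions for illustration, not a real product API.

```python
# Minimal sketch of the kind of policy-driven placement an SDS control plane
# performs: pick a backend from a heterogeneous pool based on the workload's
# latency SLO and capacity needs. Pool contents and numbers are illustrative.
pool = [
    {"name": "nvme-tier",   "latency_us": 100,   "free_tb": 40},
    {"name": "ssd-tier",    "latency_us": 500,   "free_tb": 200},
    {"name": "object-tier", "latency_us": 20000, "free_tb": 5000},
]

def place(workload: str, max_latency_us: int, size_tb: float) -> str:
    """Return a backend that meets the latency SLO and has enough free capacity."""
    candidates = [
        b for b in pool
        if b["latency_us"] <= max_latency_us and b["free_tb"] >= size_tb
    ]
    if not candidates:
        raise RuntimeError(f"no backend satisfies {workload}'s requirements")
    # Prefer the slowest (and typically cheapest) tier that still meets the SLO.
    chosen = max(candidates, key=lambda b: b["latency_us"])
    chosen["free_tb"] -= size_tb
    return chosen["name"]

print(place("training-shards", max_latency_us=1000, size_tb=30))    # -> ssd-tier
print(place("feature-store",  max_latency_us=200,   size_tb=5))     # -> nvme-tier
print(place("raw-archive",    max_latency_us=60000, size_tb=800))   # -> object-tier
```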
AI is already changing how we think about storage. What used to be a static layer of disks and arrays is now a dynamic part of the overall data pipeline. As AI workloads grow in size and complexity, storage systems must do more than just store data; they must also manage and optimize it in real time.
AI will play an increasing role in how storage itself operates. Intelligent data management, predictive health monitoring, and automated tiering will improve performance and resource utilization while reducing human overhead. Storage will become self-optimizing, adjusting to changing workloads and data movement across the network.
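Predictive health monitoring can start from something as simple as watching a device drift away from its own baseline. The sketch below flags a drive whose read latency exceeds its recent average by several standard deviations; read_latency_us is a hypothetical telemetry hook, and the window and threshold are assumed values.

```python
# Minimal sketch of predictive health monitoring: flag a device whose read
# latency drifts well above its own recent baseline, a common early symptom
# of media wear. read_latency_us() is a hypothetical telemetry hook.
from collections import deque
from statistics import mean, stdev

WINDOW = 60          # samples kept per device
THRESHOLD = 4.0      # flag when latency exceeds baseline by 4 standard deviations

history: dict[str, deque] = {}

def read_latency_us(device: str) -> float:
    """Placeholder for a SMART/telemetry query against the drive."""
    return 85.0

def check(device: str) -> None:
    samples = history.setdefault(device, deque(maxlen=WINDOW))
    latest = read_latency_us(device)
    if len(samples) >= 10:
        baseline, spread = mean(samples), stdev(samples) or 1.0
        if latest > baseline + THRESHOLD * spread:
            print(f"{device}: latency {latest:.0f} µs vs baseline {baseline:.0f} µs "
                  "- schedule proactive replacement or rebuild")
    samples.append(latest)

for _ in range(100):
    check("nvme0n1")
```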
This evolution also changes the economics. The efficiency of the storage layer now determines how well expensive compute resources are used. Reducing latency and bottlenecks directly translates into faster training cycles and higher return on GPU investment. In this sense, storage becomes a multiplier for AI performance rather than a background component.
Over time, the boundaries between compute, network, and storage will continue to blur. High-speed fabrics, software-defined control planes, and AI-driven orchestration will integrate these layers into a single, adaptive system. The result will be storage architectures that are not only faster and more scalable, but also smarter—built to keep pace with the demands of AI at scale.
In the AI era, data is only as powerful as the storage that delivers it.