WekaIO (WEKA), the AI-native data platform company, announced that the WEKA Data Platform has been certified as a high-performance data store for NVIDIA Partner Network Cloud Partners.
With this certification, NVIDIA Cloud Partners can now leverage the WEKA Data Platform’s performance, scalability, operational efficiency, and ease of use through the jointly validated WEKA Reference Architecture for NVIDIA Cloud Partners using NVIDIA HGX H100 systems.
The NVIDIA Cloud Partner reference architecture provides a comprehensive, full-stack hardware and software solution for cloud providers to offer AI services and workflows for different use cases. WEKA’s storage certification ensures that WEKApod appliances and hardware from WEKA-qualified server partners meet NVIDIA Cloud Partner high-performance storage (HPS) specifications for AI cloud environments.
The certification highlights the WEKA Data Platform’s ability to provide powerful performance at scale and accelerate AI workloads. The platform delivers up to 48 GBps of read throughput and over 46 GBps of write throughput on a single HGX H100 system and supports up to 32,000 NVIDIA GPUs in a single NVIDIA Spectrum-X Ethernet networked cluster.
NVIDIA Cloud Partners can now confidently pair the WEKA Data Platform with large-scale AI infrastructure deployments powered by NVIDIA GPUs to help their customers rapidly deploy and scale AI projects.
“AI innovators are increasingly turning to hyperscale and specialty cloud providers to fuel model training and inference and build their advanced computing projects,” said Nilesh Patel, chief product officer at WEKA. “WEKA’s certified reference architecture enables NVIDIA Cloud Partners and their customers to now deploy a fully validated, AI-native data management solution that can help to improve time-to-outcome metrics while significantly reducing power and data center infrastructure costs.”
Global demand for next-generation GPU access has surged as organizations move to rapidly adopt generative AI and gain a competitive edge across a wide spectrum of use cases. This has spurred the rise of a new breed of specialty AI cloud service providers that offer broad GPU access by providing accelerated computing and AI infrastructure solutions to organizations of every size and in every industry.
As enterprise AI projects converge training, inference, and retrieval-augmented generation (RAG) workflows on larger GPU environments, these cloud providers often face significant data management challenges, such as integrating and moving data, minimizing latency, and controlling costs through efficient GPU utilization.
WEKA’s AI-native data platform optimizes and accelerates data pipelines, helping ensure GPUs are continuously saturated with data to achieve maximum utilization, streamline AI model training and inference, and accelerate performance-intensive workloads. It provides a simplified, zero-tuning storage experience that optimizes performance across all I/O profiles, helping cloud providers simplify AI workflows to reduce data management complexity and staff overhead, according to the company.
Many NVIDIA Cloud Partners are also building their service offerings with sustainability in mind, employing energy-efficient technologies and sustainable AI practices to reduce their environmental impact. The WEKA Data Platform dramatically improves GPU efficiency and the efficacy of AI model training and inference, which can help cloud service providers avoid 260 tons of CO2e per petabyte of data stored. This can further reduce their data centers’ energy and carbon footprints and the environmental impact of customers’ AI and HPC initiatives.
Key benefits of WEKA’s Reference Architecture for NVIDIA Cloud Partners include:
- Exceptional performance: Validated high throughput and low latency help to reduce AI model training and inference wall-clock time from days to hours, providing up to 48 GBps of read throughput and over 46 GBps of write throughput for a single HGX H100 system.
- Maximum GPU utilization: WEKA delivers consistent performance and linear scalability across all HGX H100 systems, optimizing data pipelines to improve GPU utilization by up to 20x, resulting in fewer GPUs needed for high-traffic workloads while maximizing performance.
- Service provider-level multi-tenancy: Secure access controls and virtual composable clusters offer resource separation and independent encryption to preserve customer privacy and performance.
- Elimination of checkpoint stalls: Scalable, low-latency checkpointing, which is crucial for large-scale model training, mitigates risk and provides operational predictability.
- Massive scale: Supports up to 32,000 NVIDIA H100 GPUs and an exabyte of capacity within a single namespace across an NVIDIA Spectrum-X Ethernet backbone, scaling to meet the needs of deployments of any size.
- Simplified operations: Zero-tuning architecture provides linear scaling of metadata and data services and streamlines the design, deployment, and management of diverse, multi-workload cloud environments.
- Reduced complexity and enhanced efficiency: WEKA delivers class-leading performance with one-tenth the data center footprint and cabling of competing solutions, reducing infrastructure complexity, storage and energy costs, and the associated environmental impact to promote more sustainable use of AI.
For more information about this news, visit www.weka.io.