
WekaIO Data Platform Achieves Nvidia Cloud Network Partner Certification

Delivers up to 48GB/s of read throughput and over 46GB/s of write throughput on a single HGX H100 system, and supports up to 32,000 NVIDIA GPUs in a single NVIDIA Spectrum-X Ethernet networked cluster.

WekaIO announced that its Data Platform has been certified as a high-performance data store for NVIDIA Cloud Partners.

Weka Nvidia Platform Certification

With this certification, NVIDIA Cloud Partners can now leverage the Weka Data Platform’s performance, scalability, operational efficiency, and ease of use through the jointly validated Weka Reference Architecture for NVIDIA Cloud Partners using NVIDIA HGX H100 systems.


Wekapod Data Platform Appliance 2

The NVIDIA Cloud Partner reference architecture provides a full-stack hardware and software solution for cloud providers to offer AI services and workflows for different use cases. The firm's storage certification ensures that Wekapod appliances and hardware from Weka-qualified server partners meet NVIDIA Cloud Partner high-performance storage (HPS) specifications for AI cloud environments.

The certification highlights the company’s Data Platform’s ability to provide performance at scale and accelerate AI workloads. It delivers up to 48GB/s of read throughput and over 46GB/s of write throughput on a single HGX H100 system and supports up to 32,000 NVIDIA GPUs in a single NVIDIA Spectrum-X Ethernet networked cluster. NVIDIA Cloud Partners can confidently pair the Weka Data Platform with large-scale AI infrastructure deployments powered by NVIDIA GPUs to help their customers rapidly deploy and scale AI projects. 

“AI innovators are increasingly turning to hyperscale and specialty cloud providers to fuel model training and inference and build their advanced computing projects,” said Nilesh Patel, CPO, Weka. “Weka’s certified reference architecture enables NVIDIA Cloud Partners and their customers to now deploy a fully validated, AI-native data management solution that can help to improve time-to-outcome metrics while significantly reducing power and data center infrastructure costs.”

The AI revolution is driving surging demand for specialty cloud solutions
Global demand for next-gen GPU access has surged as organizations move to rapidly adopt GenAI and gain a competitive edge across a wide spectrum of use cases. This has spurred the rise of a new breed of specialty AI cloud service providers that offer wide GPU access by providing accelerated computing and AI infrastructure solutions to organizations of every size and in every industry. As enterprise AI projects converge training, inference, and retrieval-augmented generation (RAG) workflows on larger GPU environments, these cloud providers often face significant data management challenges, such as data integration and portability, minimizing latency, and controlling costs through efficient GPU utilization.

Weka’s AI-native data platform optimizes and accelerates data pipelines, helping ensure GPUs are continuously saturated with data to achieve maximum utilization, streamline AI model training and inference, and accelerate performance-intensive workloads. It provides a simplified, zero-tuning storage experience that optimizes performance across all I/O profiles, helping cloud providers simplify AI workflows to reduce data management complexity and staff overhead.

Many NVIDIA Cloud Partners are also building their service offerings with sustainability in mind, employing energy-efficient technologies and sustainable AI practices to reduce their environmental impact. The firm’s Data Platform improves GPU efficiency and the efficacy of AI model training and inference, which can help cloud service providers avoid 260 tons of CO2e/PB of data stored. This can further reduce their data centers’ energy and carbon footprints and the environmental impact of customers’ AI and HPC initiatives.

“The Weka Data Platform is crucial in optimizing the performance of Yotta’s Shakti Cloud, India’s fastest AI supercomputing infrastructure. Shakti Cloud allows us to provide scalable GPU services to enterprises of all sizes, democratizing access to high-performance computing resources and enabling businesses to fully harness AI through our extensive NVIDIA H100 GPU fleet. With this enhancement, our customers can efficiently run real-time generative AI on trillion-parameter language models,” said Sunil Gupta, co-founder, managing director and CEO, Yotta Data Services Ltd., NVIDIA Cloud Partner. “At Yotta, we are deeply committed to balancing data center growth with sustainability and energy efficiency. We are dedicated to deploying energy-efficient AI technologies to minimize the environmental impact of our data centers while continuing to scale our infrastructure to meet the growing demand. Weka is instrumental in helping us achieve this objective.”

Key benefits of Weka’s reference architecture for NVIDIA Cloud Partners include:

  • Performance: Validated high throughput and low latency help to reduce AI model training and inference wall clock time from days to hours, providing up to 48GB/s of read throughput and over 46GB/s of write throughput for a single HGX H100 system.
  • Maximum GPU utilization: Weka delivers consistent performance and linear scalability across all HGX H100 systems, optimizing data pipelines to improve GPU utilization by up to 20x, resulting in fewer GPUs needed for high-traffic workloads while maximizing performance.
  • Service provider-level multi-tenancy: Secure access controls and virtual composable clusters offer resource separation and independent encryption to preserve customer privacy and performance.
  • Eliminated checkpoint stalls: Scalable, low-latency checkpointing is crucial for large-scale model training, mitigating risks and providing operational predictability.
  • Massive scale: Supports up to 32,000 NVIDIA H100 GPUs and an exabyte of capacity within a single namespace across an NVIDIA Spectrum-X Ethernet backbone to scale to meet the needs of any deployment size.
  • Simplified operations: Zero-tuning architecture provides linear scaling of metadata and data services and streamlines the design, deployment, and management of diverse, multi-workload cloud environments.
  • Reduced complexity and enhanced efficiency: The company delivers class-leading performance in one-tenth the data center footprint and cabling compared to competing solutions, reducing infrastructure complexity, storage and energy costs, and the associated environmental impact to promote more sustainable use of AI.


Wekapod Data Platform Appliance 4

Resources:
Weka for GPU Acceleration    
Weka reference architecture for NVIDIA Cloud Partners    
Blog: Wekapod Data Platform Appliance Lets Storage Keep Up with Modern Compute and Networking.
