Blog post from Uli Plechschmidt, WW product marketing HPC storage at Cray, Hewlett Packard company
Learn about the expansion of the parallel storage portfolio that fuses two iconic enterprise brands – IBM Spectrum Scale (FKA GPFS) and HPE ProLiant DL rack servers – in the HPE factory for HPC and AI clusters running modeling and simulation, AI, and high performance data analytics.
An HPE storage system with embedded IBM Spectrum Scale. Really? Yes. Really!
With the new HPE Parallel File System Storage product, HPE is now introducing unique parallel storage for customers that use clusters of HPE Apollo Systems or HPE ProLiant DL rack servers with HPE InfiniBand HDR/Ethernet 200GbE adapters for modeling and simulation, AI, ML, deep learning, and high-performance analytics workloads.
Parallel File System Storage is the first and only storage product in the market that offers a combination of:
- The enterprise parallel file system1 (IBM Spectrum Scale) in Erasure Code Edition (ECE)
- Running on the leading, cost-effective x86 enterprise servers (ProLiant DL rack servers)
- Shipping fully integrated from the HPE factory with HPE Operational Support Services for the full HPE product (both hardware and software)
- Without the need to license storage capacity separately by terabyte or by storage drive (SSD and/or HDD)
Here’s why this really is a four-way winning scenario for all stakeholders – customers, channel partners, IBM, and HPE.
What’s in it for customers who are using clusters of GPU- and/or CPU-powered compute nodes like HPE Apollo systems or HPE ProLiant DL rack servers today?
Many of those customers are struggling with the architectural (performance and scalability) or economical ($/TB) limitations of their Scale-Out NAS storage (for example Dell EMC PowerScale or NetApp AFF). They now have access to a new more performant, more scalable, and more cost-effective parallel storage option – ith “one hand to shake” from procurement to support for their HPC and AI infrastructure stack from compute to storage.
If they already are using Spectrum Scale today for the shared parallel storage that feeds their clusters of HPE Apollo or ProLiant compute nodes with data, they get a more cost-effective option to do so – with “one hand to shake” from procurement to support for their full HPC and AI infrastructure stack from compute to storage.
What’s in it for HPE channel partners who are delivering the HPC and AI compute nodes today but the fast storage for the HPC and AI environment is provided by somebody else?
Increased revenue and expanded end customer value creation by providing the full HPC and AI infrastructure stack – while earning more benefits from the HPE Partner Ready program.
What’s in it for IBM?
This partnership creates a completely new route to market for Spectrum Scale, an enterprise parallel file system, through HPE, the number one HPC compute vendor.2
What’s in it for HPE?
The firm is expanding its parallel storage portfolio for HPC and AI beyond our Lustre-based Cray ClusterStor E1000 high-end storage system. Unlike others who only have Lustre-based storage systems, we now always can attach the ideal parallel storage solution to our leading HPC and AI compute systems for all use cases across all industries. Find more details in this table.
The E1000 has been generally available since August 2020 and has achieved the 2EB milestone shortly after that. HPE Parallel File System Storage will be generally available in May 2021.
While both HPE storage systems are embedding the leading parallel file system of their respective target market and have different implementations, they share the same fundamental design philosophy we passionately believe in: Limitless performance delivered in the most cost-effective way with single point of contact operational support.
Here’s a look at each of the 3 components of this HPC and AI storage “manifesto” and the benefits they bring.
Both storage systems are parallel storage systems embedding parallel file systems that enable tens or even thousands of high performance CPU/GPU compute nodes to read and write their data in parallel at the same time.
Unlike NFS-based NAS or scale-out NAS, there are (almost) no limitations regarding the scale of storage performance or storage capacity in the same file system. NFS-based storage is great for classic enterprise file serving (e.g. home folders of employees on a shared file server). But when it comes to feeding modern CPU/GPU compute nodes with data at sufficient speeds to ensure a high utilization of this expensive resource – then NFS no longer stands for Network File System but instead, it’s Not For Speed.
Are you in one of the organizations that has started your HPC and AI journey with NFS-based file storage attached to your Apollo 2000, Apollo 6500 or ProLiant DL clusters? And now – after experiencing massive data growth, you are struggling to cope with the architectural (performance/scalability) or economic ($/TB) limitations of scale-out NAS systems like Dell EMC PowerEdge or NetApp AFF, then look no further. Our new HPE Parallel File System Storage is the solution for you.
Delivered in the most cost-effective way
Our parallel storage systems are unlike others who:
- Embed Spectrum Scale running on cost-effective x86 rack servers in their products – HPE Parallel File System Storage does not require additional licenses for the file system that are based on the number and type of drives in the system.
- Embed the Lustre file system in their products – Cray ClusterStor E1000 is based on “zero bottleneck” end-to-end PCIe 4.0 storage controllers that get more file system performance out of each drive and that support 25 Watt NVMe SSD writes.
- Only offer all-flash systems that do not support cost-effective HDDs in the file system – Both HPE Parallel File System Storage and Cray ClusterStor E1000 support fast NVMe flash pools and cost-effective HDD pools in the same file system.
Seymour Cray once said: “Anyone can build a fast CPU. The trick is to build a fast system.” Adapted to our HPC and AI storage topic: “Anyone can build fast storage. The trick is to build a fast, but also cost-effective system.”
To achieve the latter, HPE leverages “medianomics” in both our parallel storage systems. What is medianomics? It is a portmanteau word that combines the words “media” and “economics” into a single word. It means: Leveraging the strengths of the different storage media: for NVMe SSD performance and for HDD cost-effective capacity. It also means to do this while minimizing the weaknesses: For NVMe SSD, it’s high cost per terabyte. For HDD, it’s the inability to efficiently serve small, random IO.
IDC forecasts3 that even in the year 2024 the price per terabyte of SSD will be 7x higher than the price for terabyte of HDDs. For this very economic reason, we believe that hybrid file systems will become the norm for the foreseeable future.
Both our parallel storage systems enable customers to have 2 different media pools in the same file system:
- NVMe Flash pool(s) to drive the required performance (throughput in gigabyte per second/IO/s) and
- HDD pool(s) to provide most of the cost-effective storage capacity
It is important to state that we do this within the same file system and do not tier different file systems to leverage medianomics. In our space the saying goes: “More file system tiers, more customer tears.”
With single point of contact customer support
We believe that you should focus on business and your primary mission – and not on managing finger-pointing vendors during the problem identification and problem resolution process.
This is why with our 2 HPC/AI storage systems you are only speaking to Operational Support Services from HPE for both hardware and software support issues – and never to IBM (in case of HPE Parallel File System Storage) or the Lustre community (in case of Cray ClusterStor E1000). That is true no matter which operational support service level you chose: HPE Pointnext Tech Care or HPE Datacenter Care.
If you are using non-HPE file storage for your HPE Apollo and HPE ProLiant DL clusters today but would like to unify the operational support for your end-to-end cluster including storage, look no further. Our portfolio of parallel HPC and AI storage systems fit the HPC and AI needs of organizations of all sizes, in all industries or mission areas.
Can’t wait to get to the new desired state of the future in which you have unified accountability of one operational support provider for your whole stack? Then accelerate your transition to an end-to-end HPC and AI infrastructure stack from HPE with HPE Accelerated Migration from HPE Financial Services. Create investment capacity by turning your ageing non-HPE storage into cash!
Another choice: consume the whole stack “as-a-service” as fully managed private HPC cloud in your premises or in a colocation facility. HPE GreenLake for HPC enables you to do that.
Gartner has noted that “there will be no way to put the storage beast on a diet.”4
HPC storage is forecasted to grow with a 57% higher CAGR than HPC compute.5
Current course and speed, HPC storage is projected to consume an ever-increasing portion of the overall budget – at the expense of CPU and GPU compute nodes. Unlike storage-only companies we think that this is unacceptable and have expanded our portfolio of HPC and AI storage systems with high performance, but cost-effective options for both leading parallel file systems.
With our parallel HPC and AI storage systems, you can put the storage beast on an effective diet – independent of the size of your organization or the mission or business your organization is pursuing. In addition, you get the benefits of one hand to shake from procurement to support for your full HPC and AI technology stack (software, compute, network, storage) from HPE – with the option to consume the whole stack as the “cloud that comes to you” with HPE GreenLake.
If you want to learn more
Read the first and only white paper in the history of the IT industry that was published by a storage business unit and that has the call to action to spend less on storage and more on CPU/GPU compute nodes. And check out the infographic for a history of the leading parallel file systems.
Get more CPU/GPU mileage from your existing budget by putting the HPC and AI storage beast on an effective diet. Accelerate your innovation and insight within the same budget.
More about HPC storage for a new HPC era.
1 Hyperion Research, Special Study: Shifts Are Occurring in the File System Landscape, June 2020
2 Hyperion Research, SC20 Virtual Market Update, November 2020
3 IDC, WW 2019-2023 Enterprise SSD and HDD Market Overview, June 2019
4 Gartner Research, Market Trends: Evolving HDD and SSD Storage Landscapes, October 2013
5 Hyperion Research, SC20 Virtual Market Update, November 2020
We didn't find any press release but just the post above from Uli Plechschmidt. So is it serious from HPE as it doesn't deserve a press release? The news he shared is amazing and as he wrote himself, it's a a big surprise for the market. We tried to check the calendar but April 1 is already passed and we're not yet in December.
This announcement illustrates once again the lack of clear storage strategy from HPE or the reverse, the strategy to promote and sell HPE hardware with any combination of market software possible even from direct competitors. As soon as a piece of software has significant market share, sell it with HPE hardware seems to be the driver. Wow. We read announcements everyday but this one is a rare piece of extravaganza.
Let put things in perspective. Almost one year ago, in May 2020, we published an article titled HPE Storage, Big Bazar or Real Strategy? and it made things clear and obvious for everyone. You can read that HPE, and HP in the past, has acquired Ibrix and PolyServe, distributes Qumulo, Ctera in addition of their 3Par File Persona or StorServe File Controller, all these products exist for different needs addressed by HPE. And if you remember the company has promoted FibreNet many many years ago coming from the Transoft Networks acquisition in 2002. The company has clearly a long history in file storage but almost always with partners and acquisitions with very little R&D effort behind. This is the just a fact as they speak for themselves. The recent MapR acquisition generated a new marketing effort and exercise at HPE around the Ezmeral brand and as it is written on the HPE web site that Ezmeral Data Fabric is MapR Data Platform. We're also a bit surprise as HPE justified the acquisition of BlueData to drive AI and analytics.
The image below summarizes the HPE storage portfolio without Ezmeral.
A few points are interesting in this post that needs a clarification.
Yes, it confirms parallel file system delivers high I/O performance essentially throughput, no doubt about that, it is known and used for decades by research centers, universities or high demanding scientific/technical environments. At scale, the design of metadata servers and data servers role and the parallel I/O aspect at the client level make this approach a good one when performance is a must.
The surprise comes from the fact that HPE distributes already WekaIO, one of the most modern parallel file system, but also Lustre, thanks of the Cray acquisition and tactically we saw in the past DDN with ExaScaler and GRIDScaler and Panasas. HPE has also received in the wedding basket SGI CXFS promoted now as HPE Clustered Extents File System. This product is a SAN file sharing system. Honestly, they need to check one more box with ThinkParQ and BeeGFS and the picture will be fully covered. The post mentions a white paper and an infographic and I anticipate some smiles on readers faces. WekaIO, a key partner for HPE, is completely ignored. On the WekaIO site, HPE is listed as channel partner since their joint announcement dedicated to AI March 21, 2018.
What is sure is it exists a few similarities and differences between HPC and AI storage. HPC is pretty well understood with large files, sometimes small, and essentially sequential and write intensive profile. I/O sizes can vary of course. On the AI side, files sizes are a mixed of small and large, mixed in term of I/O operations and of course is dominated by read operations for training and learning phases. The fact that WekaIO, very active in the AI segment, is not chosen by HPE is key question, both for HPE and for WekaIO. The later has already demonstrated a key market traction in that domain.
The other surprise in that post is the comparison with Dell PowerScale and NetApp AFF claiming they're not aligned with HPC and AI workloads. At least 2 scalable NAS vendors, Vast Data and Qumulo are listed in the I/O 500 list confirming fast NAS exist and they're even chosen for some HPC workloads. For AI, NVidia has a GPUDirect Storage developer program and 3 vendors are listed on this page: DDN, Vast Data and WekaIO. DDN with ExaScaler based on Lustre is the only player supporting NVidia SuperPod and WekaIO, both are key players in parallel file system. Vast Data promotes high speed NFS especially over RDMA. Pavilion Data has also published some interesting numbers. NVidia GTC is coming next week and several companies plan to announce new stuff around GPUDirect Storage, DGX... We'll see.
We're pretty sure current parallel file system HPE partners will react to this bizarre announcement that should create more doubt on the market and prospects heads about HPE. Of course firm's resellers and customers can see this as a new one concentrated point of sale for more solutions acting like a distributor finally... But many of us are nostalgic of HP R&D...