From Elbencho, Storage Benchmark Tool for AI, Available on GitHub
For evaluating performance of storage systems, optionally including GPUs in storage access
This is a Press Release edited by StorageNewsletter.com on February 12, 2021 at 2:07 pm
Elbencho, an open-source storage benchmark tool, is available to help organizations that demand performance evaluate modern storage systems, optionally including GPUs in the storage access.
Traditionally, storage system vendors published numbers based primarily on simple large-block streaming bandwidth. This was because the mechanics of spinning disks made them perform poorly for almost all other workloads. The rapid drop in flash prices in recent years has given rise to all-flash arrays (AFAs). With that, data scientists started developing new algorithms that rely on fast flash storage devices for far more challenging access patterns. Deep learning, with its demand for reading lots of small files or performing random reads with high concurrency, is one of the major drivers in this area. But with NVMe or Optane SSDs, the storage software (i.e. the file system or block service) becomes a much more critical part of the data path, making it hard to predict storage system performance for these new access patterns.
Realizing the above, Sven Breuner, creator of the BeeGFS parallel file system, published elbencho, a benchmark tool. It covers a range of test cases, among them tests for lots of small files of varying size, access latency, IO/s for random access to shared files and others. For all test cases, it shows live statistics to reveal how a system behaves under load. Elbencho is also the first independent tool to officially support Nvidia’s GPUDirect Storage (GDS) API to check how much storage performance is available to the GPUs. GDS support was added during Nvidia’s current GDS public beta phase after official permission from Nvidia.
“Since its initial release in 2020, elbencho has received very positive feedback and is already in use by a number of major vendors to gain new insights for their customers. However, feedback or contributions to make it even more useful are always welcome,” says Sven Breuner. “A major contribution came from Zettar Inc, a leader in moving data at scale and speed, based on their work with the US Department of Energy and others.”
Zettar, Inc. contributed a set of storage sweep tools, which automatically test the attainable storage performance for a range of file sizes and create a chart of the results at the end.
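To illustrate the idea of a file-size sweep, here is a minimal Python sketch. It is not the Zettar storage sweep tools and does not invoke elbencho; the function name `sweep_file_sizes` and its parameters are hypothetical, chosen only for this illustration. It writes and reads back a few files for each size and reports throughput per size. Because the reads are likely served from the operating system's page cache, the numbers it prints should not be taken as real storage performance.

```python
import os
import tempfile
import time


def sweep_file_sizes(directory, file_sizes, files_per_size=4):
    """Write and read back files of each size; return one result dict per size.

    Hypothetical helper for illustration only (not part of elbencho or the
    Zettar sweep tools).
    """
    results = []
    for size in file_sizes:
        payload = os.urandom(size)
        paths = [
            os.path.join(directory, f"sweep_{size}_{i}.bin")
            for i in range(files_per_size)
        ]

        # Write phase: fsync each file so the data is pushed to the device.
        start = time.perf_counter()
        for path in paths:
            with open(path, "wb") as f:
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())
        write_secs = time.perf_counter() - start

        # Read phase: likely served from the page cache in this simple sketch.
        start = time.perf_counter()
        bytes_read = 0
        for path in paths:
            with open(path, "rb") as f:
                bytes_read += len(f.read())
        read_secs = time.perf_counter() - start

        # Clean up the test files before moving to the next size.
        for path in paths:
            os.remove(path)

        total_mib = size * files_per_size / (1024 * 1024)
        results.append({
            "file_size": size,
            "bytes_read": bytes_read,
            "write_mib_s": total_mib / max(write_secs, 1e-9),
            "read_mib_s": total_mib / max(read_secs, 1e-9),
        })
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for row in sweep_file_sizes(tmp, [4 * 1024, 64 * 1024, 1024 * 1024]):
            print(
                f"{row['file_size']:>8} B  "
                f"write {row['write_mib_s']:8.1f} MiB/s  "
                f"read {row['read_mib_s']:8.1f} MiB/s"
            )
```

A real sweep would also need to bypass or flush caches, use many concurrent threads and clients, and plot the results, which is the kind of work the tools described above automate.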
“Elbencho and the storage sweep tools finally give storage users world-wide the ability to quickly understand their storage systems, rather than depending on published numbers that are meaningless for their actual workloads,” comments Chin Fang, CEO, Zettar. “In contrast to benchmark suites like IO500 or SPEC SFS, elbencho does not try to predefine a certain workload and instead enables users to test what actually matters to them – be it on a single host or coordinated across multiple storage clients.”
Elbencho is available on GitHub for everyone to use and contribute to.