R&D: Leveraging Keys in Key-Value SSD for Production Workloads
Demonstrates a naive prefix-based index partitioning mechanism inside a KV-SSD that can reduce on-flash index accesses for multiple production workloads, and discusses the shortcomings of this approach.
This is a Press Release edited by StorageNewsletter.com on December 14, 2023 at 2:00 pm
ACM Digital Library has published, in HPDC ’23: Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, an article written by Manoj P. Saha, Florida International University, Miami, FL, USA, Omkar Desai, Bryan S. Kim, Syracuse University, Syracuse, NY, USA, and Janki Bhimani, Florida International University, Miami, FL, USA.
Abstract: “Key-Value SSDs reduce host-side resource utilization for unstructured data management by streamlining the I/O stack. However, designing a robust Key-Value SSD with resource constrained flash controllers has always been a challenge. The key-to-page (K2P) mapping inside KV-SSD, which consolidates multiple layers of indirection in the traditional block I/O storage, has its own shortcomings. The sparsely populated NVMe KV namespace leads to very large index, which cannot be optimized similar to hybrid- or block-FTL in block-SSDs. In addition, the background index management tasks (e.g. compaction on LSM-tree index) also lead to performance degradation. Moreover, existing KV index design is not equipped to tackle fast changing workload patterns. These shortcomings have stalled the adoption of KV-SSDs in production environments. In this work, we take the position that these shortcomings can be addressed by leveraging the information embedded inside keys about application keyspaces and groups as prefixes. The prefixes can be used to partition the monolithic large index into smaller ones. We demonstrate a naive prefix-based index partitioning mechanism inside KV-SSD that can reduce on-flash index accesses for multiple production workloads and discuss the shortcomings of this approach. Lastly, we discuss our proposed design of a society of indices that initialize, interact and evolve based on workload characteristics over time.”
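To illustrate the idea of splitting one large index by key prefix, the following is a minimal host-side sketch in Python. It is not the authors' implementation and does not model flash, controller memory, or an LSM-tree. The class name `PrefixPartitionedIndex`, the `:` delimiter used to extract a prefix, and the sample keys and page numbers are all assumptions made for illustration.

```python
from collections import defaultdict


class PrefixPartitionedIndex:
    """Toy key-to-page (K2P) index split into one small map per key prefix.

    Hypothetical illustration only: a real KV-SSD index lives in controller
    memory and on flash, and its prefix extraction rule may differ.
    """

    def __init__(self, delimiter=":"):
        # Assumption: the application prefix is the text before the delimiter.
        self.delimiter = delimiter
        self.partitions = defaultdict(dict)

    def _prefix(self, key):
        head, sep, _ = key.partition(self.delimiter)
        # Keys without a delimiter fall into a shared "" partition.
        return head if sep else ""

    def put(self, key, page):
        # Only the partition that owns this prefix is touched.
        self.partitions[self._prefix(key)][key] = page

    def get(self, key):
        # Look up the outer map with .get so misses do not create partitions.
        return self.partitions.get(self._prefix(key), {}).get(key)

    def partition_sizes(self):
        return {prefix: len(idx) for prefix, idx in self.partitions.items()}


index = PrefixPartitionedIndex()
index.put("user:1001", 17)
index.put("user:1002", 18)
index.put("session:abc", 42)
index.put("orphan", 7)

print(index.get("user:1002"))
print(index.partition_sizes())
```

In this sketch a lookup consults only the partition that matches the key's prefix, which is the intuition behind reducing accesses to a monolithic index. It also hints at one weakness of a naive scheme: partition sizes depend entirely on how the workload distributes keys across prefixes.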