R&D: ZoomDB, Building Cost-effective Key–value Store Engine on ZNS SSD and SMR HDD

Authors propose ZoomDB, an LSM-tree KV store engine designed around KV separation and tailored for hybrid zoned storage devices.

Journal of Systems Architecture has published an article written by Shiqiang Nie, School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, 710049, Shaanxi, China; Chi Zhang, Air Defense and Antimissile School, Air Force Engineering University, Xi’an, 710051, Shaanxi, China; Menghan Li and Fangxing Yu, School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, 710049, Shaanxi, China; Yaming Li, Air Defense and Antimissile School, Air Force Engineering University, Xi’an, 710051, Shaanxi, China; and Weiguo Wu, School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, 710049, Shaanxi, China.

Abstract: Log-Structured Merge tree (LSM-tree) based key–value (KV) stores have become critical components in managing data for write-intensive cloud applications. With the explosive growth of unstructured data, emerging host-managed zoned storage solutions, such as high-performance Zoned NameSpace Solid State Drive (ZNS SSD) and large-capacity Shingled Magnetic Recording Hard Disk Drive (SMR HDD), present an ideal opportunity for efficient data storage. However, the state-of-the-art scheme partitions the LSM-tree on hybrid storage, placing lower levels on high-performance devices and higher levels on large-capacity devices, but it fails to address challenges in data layout and garbage collection on a hybrid storage system equipped with ZNS SSD and SMR HDD.

In this paper, we propose ZoomDB, an LSM-tree KV store engine designed around KV separation and tailored for hybrid zoned storage devices. First, we integrate KV separation with zone management in LSM-tree-based hybrid storage. Specifically, keys and low-level values are placed in high-performance zones on ZNS SSDs, while high-level values are stored in large-capacity zones on SMR HDDs, optimizing both performance and storage efficiency. To further enhance data management, we introduce a hotness identification mechanism that classifies values based on access frequency, storing hot and cold values in separate zones. Finally, we propose diversity GC tailored to zones with varying access frequencies, effectively reducing data migration overhead. We implement and evaluate ZoomDB on a real ZNS SSD and SMR HDD. The evaluation results demonstrate that ZoomDB reduces the number of GC-triggered writes by 77.5% on average compared to WiscKey. It achieves throughput gains of 1.79×, 3.13×, 4.01×, 4.25×, and 4.32× over WiscKey+, WiscKey, GearDB, ZoneKV, and LevelDB, respectively.
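
The placement rule described in the abstract can be illustrated with a minimal Python sketch. This is not ZoomDB's implementation: the level cutoff, the access-count threshold, and every name below (LEVEL_CUTOFF, HOT_THRESHOLD, place_key, place_value) are hypothetical and chosen only to show how keys, low-level values, and high-level values might be routed to hot or cold zones on the two devices.

```python
from dataclasses import dataclass
from enum import Enum


class Device(Enum):
    ZNS_SSD = "zns_ssd"   # high-performance zones
    SMR_HDD = "smr_hdd"   # large-capacity zones


class Temperature(Enum):
    HOT = "hot"
    COLD = "cold"


# Hypothetical tuning knobs; the abstract does not give actual values.
LEVEL_CUTOFF = 2      # LSM-tree levels below this count as "low-level"
HOT_THRESHOLD = 4     # access count at or above this counts as "hot"


@dataclass(frozen=True)
class ZoneClass:
    """Which device, and which hot/cold zone group, a value is written to."""
    device: Device
    temperature: Temperature


def classify_hotness(access_count: int) -> Temperature:
    """Classify a value by access frequency (a simple counter is assumed)."""
    return Temperature.HOT if access_count >= HOT_THRESHOLD else Temperature.COLD


def place_key() -> Device:
    """With KV separation, keys stay in the LSM-tree on the ZNS SSD."""
    return Device.ZNS_SSD


def place_value(level: int, access_count: int) -> ZoneClass:
    """Route a value by the LSM-tree level of its key and by its hotness."""
    device = Device.ZNS_SSD if level < LEVEL_CUTOFF else Device.SMR_HDD
    return ZoneClass(device, classify_hotness(access_count))


if __name__ == "__main__":
    print(place_key())
    print(place_value(level=0, access_count=10))
    print(place_value(level=5, access_count=1))
```

Keeping hot and cold values in separate zones means that a zone tends to hold data with similar lifetimes, which is the property the abstract credits for reducing data migration during garbage collection.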
