R&D: Building GC-Free Key-Value Store on HM-SMR Drives with ZoneFS
Extensive experiments confirm that GearDB achieves both high performance and space efficiency: on average 1.7× and 1.5× better than LevelDB in random write and random read, respectively, with up to 86.9% space efficiency.
This is a Press Release edited by StorageNewsletter.com on November 23, 2022 at 2:00 pm
ACM Transactions on Storage has published an article written by Yiwen Zhang, Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, Hubei, China; Ting Yao, Cloud Storage Service Product Dept, Huawei Technologies Co., Ltd., Shenzhen, Guangdong, China; and Jiguang Wan and Changsheng Xie, Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, Hubei, China.
Abstract: “Host-managed shingled magnetic recording drives (HM-SMR) are advantageous in capacity to harness the explosive growth of data. For key-value stores based on log-structured merge trees (LSM-trees), the HM-SMR drive is an ideal solution owing to its capacity, predictable performance, and economical cost. However, building an LSM-tree based key-value (KV) store on HM-SMR drives presents severe challenges in maintaining the performance and space utilization efficiency due to the redundant cleaning processes for applications and storage devices (i.e., compaction and garbage collection). To eliminate the overhead of on-disk garbage collection (GC) and improve compaction efficiency, this paper presents GearDB, a GC-free KV store tailored for HM-SMR drives. GearDB improves the write performance and space efficiency through three new techniques: a new on-disk data layout, compaction windows, and a novel gear compaction algorithm. We further augment the read performance of GearDB with a new SSTable layout and read-ahead mechanism. We implement GearDB with LevelDB, and use zonefs to access a real HM-SMR drive. Our extensive experiments confirm that GearDB achieves both high performance and space efficiency, i.e., on average 1.7× and 1.5× better than LevelDB in random write and read, respectively, with up to 86.9% space efficiency.”