R&D: ANVIL, In-Storage Accelerator for Name–Value Data Stores

The ACM Digital Library has published, in ISCA ’25: Proceedings of the 52nd Annual International Symposium on Computer Architecture, an article written by Ryan Wong (Univ. of Illinois Urbana-Champaign, Urbana, IL, USA, and Sandia National Laboratories, Albuquerque, NM, USA); Nikita Kim (Carnegie Mellon Univ., Pittsburgh, PA, USA); Aniket Das and Kevin Higgs (Univ. of Illinois Urbana-Champaign, Urbana, IL, USA); Engin Ipek (Micron, San Diego, CA, USA); Sapan Agarwal (Sandia National Laboratories, Livermore, CA, USA); Saugata Ghose (Univ. of Illinois Urbana-Champaign, Urbana, IL, USA); and Ben Feinberg (Sandia National Laboratories, Albuquerque, NM, USA).

Abstract: “Name–value pairs (NVPs) are a widely-used abstraction to organize data in millions of applications. At a high level, an NVP associates a name (e.g., array index, key, hash) with each value in a collection of data. Specific NVP data store formats can vary widely, ranging from simple arrays/dictionaries and lookup tables to key–value stores and data mining workloads. Despite their importance, existing optimizations for NVPs are limited to only a single data store format, as the broad definition of NVPs allows for significant heterogeneity in encoding and implementation.”

“We propose ANVIL, the first end-to-end system that allows programmers to broadly accelerate most formats of NVPs. With a conventional solid-state drive (SSD), large-scale NVP lookups can saturate both external and internal SSD bandwidth, as every NVP in the data store needs to be sent back to the host CPU to check for a matching name. ANVIL makes use of in-storage processing to avoid reading out any data for names that do not match, by performing name match checks directly inside the SSD’s NAND flash chips. We demonstrate that ANVIL can substantially reduce disk I/O, reduce metadata overheads, and provide speedups of 4.0×, 25×, and 14.6% over a conventional SSD, for three different NVP workloads (database transactions, analytics, and graph processing).”
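
The data-movement argument in the abstract can be illustrated with a toy model. The Python sketch below is not ANVIL's implementation or API: the names NVP, host_side_lookup and in_storage_lookup are hypothetical, the "SSD" is an in-memory list, and flash page granularity, metadata and command overheads are all ignored. It only counts how many bytes would cross the storage interface when the name check runs on the host versus next to the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NVP:
    """A hypothetical name-value pair as it might sit on storage."""
    name: bytes
    value: bytes

    def size(self) -> int:
        return len(self.name) + len(self.value)


def host_side_lookup(store, target):
    """Conventional path (assumed model): every pair is shipped to the
    host, which then checks each name for a match."""
    bytes_moved = sum(p.size() for p in store)
    matches = [p.value for p in store if p.name == target]
    return matches, bytes_moved


def in_storage_lookup(store, target):
    """In-storage path (assumed model): the name check runs where the
    data lives, so only matching pairs are read out."""
    hits = [p for p in store if p.name == target]
    bytes_moved = sum(p.size() for p in hits)
    return [p.value for p in hits], bytes_moved


# 1000 pairs, each with a 7-byte name and a 64-byte value.
store = [NVP(f"key{i:04d}".encode(), b"x" * 64) for i in range(1000)]
target = b"key0042"

host_vals, host_bytes = host_side_lookup(store, target)
dev_vals, dev_bytes = in_storage_lookup(store, target)

print(f"host-side filter:  {host_bytes} bytes moved")
print(f"in-storage filter: {dev_bytes} bytes moved")
```

Both paths return the same value; in this toy setup only the amount of data moved differs, and the gap grows with the number of non-matching pairs. The figures it prints describe the model only and say nothing about the speedups reported in the paper.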