
R&D: STAIR High Reliable STT-MRAM Aware Multi-Level I/O Cache Architecture by Adaptive ECC Allocation

Decreases data loss probability by five orders of magnitude on average, with negligible performance overhead (0.12% hit ratio reduction in worst case) and 1.56% memory overhead for cache controller.

IEEE Xplore has published, in 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) proceedings, an article written by Mostafa Hadizadeh, Elham Cheshmikhani, and Hossein Asadi, Sharif University of Technology, Department of Computer Engineering, Tehran, Iran.

Abstract: Hybrid Multi-Level Cache Architectures (HCAs) are promising solutions for the growing need for high-performance and cost-efficient data storage systems. HCAs employ a high-endurance memory as the first-level cache and a Solid-State Drive (SSD) as the second-level cache. Spin-Transfer Torque Magnetic RAM (STT-MRAM) is one of the most promising candidates for the first-level cache of HCAs because of its high endurance and DRAM-comparable performance along with non-volatility. However, STT-MRAM faces three major reliability challenges: Read Disturbance, Write Failure, and Retention Failure. To provide a reliable HCA, the reliability challenges of STT-MRAM should be carefully addressed. To this end, this paper first makes a careful distinction between clean and dirty pages to classify and prioritize their different vulnerabilities. Then, we investigate the distribution of more vulnerable pages in the first-level cache of HCAs over 17 storage workloads. Our observations show that the protection overhead can be significantly reduced by adjusting the protection level of data pages based on their vulnerability. To this aim, we propose an STT-MRAM Aware Multi-Level I/O Cache Architecture (STAIR) to improve HCA reliability by dynamically generating extra strong Error-Correction Codes (ECCs) for the dirty data pages. STAIR adaptively allocates under-utilized parts of the first-level cache to store these extra ECCs. Our evaluations show that STAIR decreases the data loss probability by five orders of magnitude, on average, with negligible performance overhead (0.12% hit ratio reduction in the worst case) and 1.56% memory overhead for the cache controller.
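
The allocation idea in the abstract (stronger ECC for dirty pages only, stored in under-utilized cache space) can be illustrated with a toy model. The Python sketch below is not the authors' implementation: the class names, the `ecc_pages_per_slot` ratio, and the first-come selection of dirty pages are all assumptions made for illustration, and eviction, the ECC codes themselves, and the SSD level are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    page_id: int
    dirty: bool = False       # a dirty page has no up-to-date copy in a lower level
    strong_ecc: bool = False  # True when extra ECC space is reserved for this page


@dataclass
class ToyCache:
    """Hypothetical first-level cache model; not taken from the paper."""
    capacity: int                # total page slots in the cache
    ecc_pages_per_slot: int = 4  # assumed: one spare slot stores extra ECC for 4 pages
    pages: dict = field(default_factory=dict)

    def insert(self, page_id, dirty=False):
        if len(self.pages) >= self.capacity:
            raise RuntimeError("cache full; eviction is out of scope for this sketch")
        self.pages[page_id] = Page(page_id, dirty)

    def spare_slots(self):
        # Slots not holding data stand in for the "under-utilized parts" of the cache.
        return self.capacity - len(self.pages)

    def allocate_extra_ecc(self):
        """Give strong ECC to as many dirty pages as the spare space can cover."""
        budget = self.spare_slots() * self.ecc_pages_per_slot
        protected = 0
        for page in self.pages.values():
            # Clean pages keep only the baseline protection in this model.
            page.strong_ecc = page.dirty and protected < budget
            if page.strong_ecc:
                protected += 1
        return protected


cache = ToyCache(capacity=8)
for pid in range(6):
    cache.insert(pid, dirty=(pid % 2 == 1))  # pages 1, 3, 5 are dirty
print(cache.allocate_extra_ecc(), "dirty pages received extra ECC")
```

In this model, clean pages are treated as recoverable from a lower level, so only dirty pages compete for the spare space; when the space runs out, the remaining dirty pages fall back to baseline protection.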
