R&D: Proactive Drive Failure Prediction for Cloud Storage System through Semi-Supervised Learning
Proposes two semi-supervised drive failure prediction models, DFP-VL and DFP-FL, from the perspective of reconstructing SMART data and learning SMART data's probability density, respectively.
This is a Press Release edited by StorageNewsletter.com on August 23, 2023 at 2:01 pm
IEEE Transactions on Dependable and Secure Computing has published an article written by Hao Zhou, Zhiheng Niu, Gang Wang, and Xiaoguang Liu, College of Computer Science, Nankai University, Tianjin, China; and Dongshi Liu, Bingnan Kang, Zheng Hu, and Yong Zhang, Huawei Technologies Co., Ltd, Shenzhen, China.
Abstract: “Proactive drive failure prediction can help operators handle the failing drives in advance, enhancing the storage system dependability. SSD and HDD failure prediction techniques are currently evolving towards a semi-supervised approach. In this paper, we are dedicated to enhancing the methodology for semi-supervised drive failure prediction from these aspects: design more powerful, robust yet generic models, mine drive failure modes and make the prediction model interpretable. Specifically, we propose two semi-supervised drive failure prediction models, DFP-VL and DFP-FL, from the perspective of reconstructing SMART data and learning SMART data’s probability density, respectively. They capture the pattern of healthy drives, and failing drives can be detected when they deviate from the normal pattern. We mine two failure modes, slow deterioration and dramatic deterioration. Based on that, we design two failure detectors, ‘ThresholdDetector’ and ‘HybridDetector’, to determine whether a drive deviates from the normal pattern. We evaluate the proposed methods on both SSD and HDD SMART data. DFP-VL is generic yet effective and can interpret the predicted results. DFP-FL has better performance than DFP-VL, but it cannot interpret the results. ThresholdDetector has low complexity and can detect most failing drives. HybridDetector has high complexity and can further improve the detection performance.”
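The general idea in the abstract (learn what healthy drives look like, then flag a drive whose SMART readings deviate from that pattern) can be illustrated with a minimal sketch. This is not DFP-VL, DFP-FL, or the paper's ThresholdDetector: it swaps in a simple independent-Gaussian density as a stand-in model, and every function name, the two SMART-like attributes, the toy values, and the 0.99 quantile are assumptions made purely for illustration.

```python
import math
from statistics import mean, pstdev


def fit_healthy_profile(healthy_rows):
    """Estimate a per-attribute mean and standard deviation from healthy drives only."""
    columns = list(zip(*healthy_rows))
    means = [mean(col) for col in columns]
    # Guard against zero variance so the score below never divides by zero.
    stds = [max(pstdev(col), 1e-9) for col in columns]
    return means, stds


def anomaly_score(row, profile):
    """Negative log-density under independent Gaussians, up to an additive constant."""
    means, stds = profile
    return sum(
        0.5 * ((x - m) / s) ** 2 + math.log(s)
        for x, m, s in zip(row, means, stds)
    )


def pick_threshold(healthy_rows, profile, quantile=0.99):
    """Use a high quantile of the healthy drives' own scores as the alarm threshold."""
    scores = sorted(anomaly_score(row, profile) for row in healthy_rows)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


def is_failing(row, profile, threshold):
    """Flag a drive whose score exceeds the threshold learned from healthy drives."""
    return anomaly_score(row, profile) > threshold


# Hypothetical toy data: (reallocated sector count, temperature in Celsius).
healthy = [(0, 35), (1, 36), (0, 34), (2, 37), (1, 35), (0, 36)]
profile = fit_healthy_profile(healthy)
threshold = pick_threshold(healthy, profile)

suspect = (40, 52)  # hypothetical drive with many reallocated sectors
print(is_failing(suspect, profile, threshold))
```

Only healthy drives are used for fitting and for choosing the threshold, which is the sense in which such an approach is semi-supervised: no failed-drive labels are needed at training time. A fixed threshold of this kind reacts to large deviations at a single point in time; how the paper's detectors handle the slow deterioration mode is not described in the abstract and is not modeled here.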