R&D: Four Articles on SSD Technologies and Applications
Published by Space Weather, arXiv, Results in Engineering, and SSRN
This is a Press Release edited by StorageNewsletter.com on September 16, 2025 at 2:00 pm
R&D: Comparative Study of the Single-Event Functional Interrupt (SEFI) Rate in SSD Through Ground and On-Orbit Testing
Research highlights the critical influence of the dynamic space environment on single-event effects and emphasizes the importance of accounting for secondary neutron radiation in spacecraft design.
Space Weather has published an article written by Yingqi Ma, Tian Yu, State Key Laboratory of Solar Activity and Space Weather, National Space Science Center, Chinese Academy of Sciences, Beijing, China, and University of Chinese Academy of Sciences, Beijing, China, Xiaoheng Xu, State Key Laboratory of Solar Activity and Space Weather, National Space Science Center, Chinese Academy of Sciences, Beijing, China, Peng Guo, University of Chinese Academy of Sciences, Beijing, China, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing, China, Chunqin Wang, Guohong Shen, State Key Laboratory of Solar Activity and Space Weather, National Space Science Center, Chinese Academy of Sciences, Beijing, China, University of Chinese Academy of Sciences, Beijing, China, Zhikang Fu, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing, China, Longlong Zhang, Jieyi Wang, State Key Laboratory of Solar Activity and Space Weather, National Space Science Center, Chinese Academy of Sciences, Beijing, China, and University of Chinese Academy of Sciences, Beijing, China, Bo Yang, Yunziwei Deng, Qiang Li, and Long Zhang, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing, China.
Abstract: “Space radiation poses a significant threat to the reliability of spacecraft equipment, while traditional static space environment models exhibit substantial inaccuracies in calculating on-orbit error rates. This study investigates single-event functional interrupts (SEFIs) in commercial off-the-shelf (COTS) solid-state drives (SSDs) operating in low Earth orbit utilizing real space environment data and AP9 model data. Using PHITS software, the internal radiation environment of the spacecraft was simulated based on external detector measurements and the AP9 model. Findings show that SEFI rates derived from measured spectra and the AP9 model at the 95% percentile closely match on-orbit observations, whereas results based on spectra deviating from the real environment may differ by more than an order of magnitude. This research highlights the critical influence of the dynamic space environment on single-event effects and emphasizes the importance of accounting for secondary neutron radiation in spacecraft design.”
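As an illustration of how an on-orbit event rate is commonly estimated, the sketch below folds a device cross-section curve with a particle flux spectrum by trapezoidal integration. It is not the authors' PHITS workflow: the function name `fold_rate` and every number are hypothetical placeholders, chosen only to show how the choice of input spectrum alone can shift the computed rate by more than an order of magnitude.

```python
from typing import Sequence


def fold_rate(energies_mev: Sequence[float],
              flux: Sequence[float],
              cross_section_cm2: Sequence[float]) -> float:
    """Trapezoidal estimate of the integral of sigma(E) * phi(E) dE.

    flux is a differential flux in particles / (cm^2 s MeV) and
    cross_section_cm2 is in cm^2, so the result is in events per second.
    """
    if not (len(energies_mev) == len(flux) == len(cross_section_cm2)):
        raise ValueError("all three sequences must have the same length")
    total = 0.0
    for i in range(len(energies_mev) - 1):
        width = energies_mev[i + 1] - energies_mev[i]
        left = flux[i] * cross_section_cm2[i]
        right = flux[i + 1] * cross_section_cm2[i + 1]
        total += 0.5 * (left + right) * width
    return total


# Placeholder inputs (NOT data from the paper).
energies = [10.0, 50.0, 100.0, 200.0]        # MeV
sigma = [1e-12, 5e-12, 1e-11, 1e-11]         # cm^2, hypothetical SEFI cross-section
flux_a = [100.0, 40.0, 10.0, 2.0]            # hypothetical spectrum A
flux_b = [20.0 * f for f in flux_a]          # hypothetical spectrum B, 20x harsher

rate_a = fold_rate(energies, flux_a, sigma)
rate_b = fold_rate(energies, flux_b, sigma)
print(f"rate A: {rate_a:.3e} /s, rate B: {rate_b:.3e} /s")
```

Because the integral is linear in the flux, any mismatch between the assumed spectrum and the real environment carries straight through to the predicted rate.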
R&D: The Unwritten Contract of Cloud-based Elastic Solid-State Drives
Authors present an unwritten contract of cloud-based ESSDs, encapsulating 4 observations and 5 implications for cloud storage users.
arXiv has published an article written by Yingjia Wang and Ming-Chang Yang, The Chinese University of Hong Kong.
Abstract: “Elastic block storage (EBS) with the storage-compute disaggregated architecture stands as a pivotal piece in today’s cloud. EBS furnishes users with storage capabilities through the elastic solid-state drive (ESSD). Nevertheless, despite the widespread integration into cloud services, the absence of a thorough ESSD performance characterization raises critical doubt: when more and more services are shifted onto the cloud, can ESSD satisfactorily substitute the storage responsibilities of the local SSD and offer comparable performance?”
“In this paper, we for the first time target this question by characterizing two ESSDs from Amazon AWS and Alibaba Cloud. We present an unwritten contract of cloud-based ESSDs, encapsulating four observations and five implications for cloud storage users. Specifically, the observations are counter-intuitive and contrary to the conventional perceptions of what one would expect from the local SSD. The implications we hope could guide users in revisiting the designs of their deployed cloud software, i.e., harnessing the distinct characteristics of ESSDs for better system performance.”
R&D: Numerical Analysis of Thermal Performance in Semi-enclosed Electronics, Investigating Active and Passive Cooling Techniques for SSD-driven Single-board Devices
Study investigates and compares 2 SSD cooling strategies: conduction combined with natural convection, and forced convection using a heat sink.
Results in Engineering has published an article written by Zheng Zhang, School of Mechanical Engineering, Universiti Sains Malaysia, Engineering Campus, Nibong Tebal, 14300 Penang, Malaysia, and School of Applied Engineering, Zhejiang Business College, Hangzhou 310000, China, Aizat Abas, School of Mechanical Engineering, Universiti Sains Malaysia, Engineering Campus, Nibong Tebal, 14300 Penang, Malaysia, Jiao Da, School of Applied Engineering, Zhejiang Business College, Hangzhou 310000, China, and Fakhrozi Che Ani, Western Digital Corp, Plot 301A Persiaran Cassia Selatan 1, 14100 Simpang Ampat, Pulau Penang, Malaysia.
Abstract: “Effective thermal management is critical for the stable operation of high-power single-board computers (SBCs), particularly in semi-enclosed environments with limited airflow. This study investigates and compares two Solid-State Drive (SSD) cooling strategies: conduction combined with natural convection, and forced convection using a heat sink. Results demonstrate that the conduction-based method reduces chip junction and package temperatures by approximately 4°C compared to the forced convection setup, while the chassis shell temperature experiences only a marginal increase (∼0.2°C), indicating efficient localized heat transfer. Airflow analysis reveals that forced convection achieves higher maximum velocity (0.55 m/s), but the flow is concentrated near the fan outlet, reducing its overall effectiveness. In contrast, the conduction model sustains a more stable pressure distribution, with a significantly higher average internal air pressure (8.73 Pa vs. 0.018 Pa), enhancing heat dissipation. Although the 7-fin heat sink design achieves a high fin efficiency of 0.9825, it is less effective overall due to limited airflow distribution. The conduction model also exhibits a substantially higher heat transfer coefficient, reinforcing its thermal superiority. These findings highlight the limitations of forced convection in constrained environments and demonstrate the effectiveness of conduction-driven cooling for SSDs in compact, low-airflow systems, offering valuable insights for future thermal design strategies in embedded computing applications.”
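For context on the fin efficiency figure, the textbook idealization for a straight fin with an adiabatic tip is η = tanh(mL)/(mL), where the fin parameter m grows with the convection coefficient and shrinks with fin conductivity and thickness. The abstract does not say which definition the authors use, so the sketch below simply assumes that model and inverts it by bisection to find the mL value that would correspond to the reported 0.9825. The function names are illustrative.

```python
import math


def fin_efficiency(m_l: float) -> float:
    """Efficiency of an idealized straight fin with an adiabatic tip."""
    if m_l <= 0:
        raise ValueError("m_l must be positive")
    return math.tanh(m_l) / m_l


def solve_m_l(target: float, lo: float = 1e-6, hi: float = 10.0,
              tol: float = 1e-12) -> float:
    """Find m*L such that fin_efficiency(m*L) == target.

    Efficiency falls monotonically as m*L grows, so bisection applies.
    """
    if not fin_efficiency(hi) < target < fin_efficiency(lo):
        raise ValueError("target efficiency is outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fin_efficiency(mid) > target:
            lo = mid   # efficiency still too high: m*L must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


m_l = solve_m_l(0.9825)   # 0.9825 is the fin efficiency quoted in the abstract
print(f"m*L under the adiabatic-tip assumption: {m_l:.4f}")
```

Under this assumption mL comes out well below 1, meaning the fins are nearly isothermal along their length. That is consistent with the abstract's point that a high fin efficiency alone does not make the heat sink effective when the airflow reaching it is poorly distributed.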
R&D: Design and Implementation of a Fast and Predictable SSD Liveness Watchdog for Storage Systems
By analyzing local and distributed storage systems, authors identify 3 primary causes of delayed SSD failure detection: 1) loose-deterministic failure check, 2) fixed command timeout, and 3) delayed failure notification.
SSRN has published an article written by Jin Yong Ha, Samsung Electronics, Hwasung, Republic of Korea, and Yongseok Son, Chung-Ang University, Seoul, Republic of Korea.
Abstract: “Solid-state drives (SSDs) offer higher performance, greater reliability, and energy efficiency over hard disk drives (HDDs), and have been widely adopted in various storage systems such as data centers, cloud infrastructure, and enterprise storage servers. Accordingly, the reliability of SSDs has become critical, and timely failure detection is essential to maintain data availability and system resilience. However, existing storage systems do not detect and notify of SSD failures promptly, resulting in prolonged service downtime and increased risk of data loss. By analyzing local and distributed storage systems, we identify three primary causes of delayed SSD failure detection: 1) loose-deterministic failure check, 2) fixed command timeout, and 3) delayed failure notification. First, the Linux kernel employs a loose-deterministic failure check based on timeouts, which is a passive failure detection mechanism. As a result, SSD failure checks can be delayed, potentially resulting in untimely or inaccurate detection. Second, the kernel uses a fixed timeout that fails to accommodate differences across SSD models and command types. Third, intermediate layers between the device driver and applications, such as RAID, file systems, and network layers, introduce further delays in failure notification due to their own failure-handling policies. To address these issues, in this article, we propose RL-Watchdog (RLW), which accelerates SSD failure detection in both local and distributed storage systems through four key components: light-weighted watchdog (LWW), reinforcement learning-based timeout predictor (RLTP), fast failure notification (FFN), and extended FFN (eFFN). First, LWW periodically checks SSD liveness using a light-weight special command to ensure deterministic failure detection. Second, RLTP dynamically predicts command latency based on SSD states, adapting to different SSD models and command types. 
Third, FFN bypasses intermediate layers and directly notifies the application layer of SSD failure, ensuring rapid failure notification. Finally, for the distributed storage system, eFFN promptly delivers SSD failure notifications from storage nodes across the network, reducing failure detection time and minimizing data loss at the application level on client nodes. We implement RLW in the Linux kernel and evaluate it in both single-node and multi-node environments. The results imply that RLW reduces data loss by up to 96.7% in single-node environments and 44.6% in multi-node environments.”
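The three ideas in the abstract (a periodic liveness probe, a timeout that adapts to observed latency, and direct notification of the application) can be sketched in user space as follows. This is a minimal illustration, not the authors' kernel implementation: the class names are hypothetical, the probe is a simulated callable rather than a real device command, and the timeout predictor is a simple moving-average estimator swapped in for the paper's reinforcement learning model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class AdaptiveTimeout:
    """Moving-average timeout estimate (a stand-in, NOT the paper's RL predictor)."""
    initial_ms: float = 100.0
    alpha: float = 0.125            # weight of the newest latency sample
    beta: float = 0.25              # weight of the newest deviation sample
    k: float = 4.0                  # safety margin, in units of deviation
    mean_ms: Optional[float] = None
    dev_ms: float = 0.0

    def observe(self, latency_ms: float) -> None:
        if self.mean_ms is None:
            self.mean_ms = latency_ms
            self.dev_ms = latency_ms / 2
            return
        self.dev_ms = ((1 - self.beta) * self.dev_ms
                       + self.beta * abs(latency_ms - self.mean_ms))
        self.mean_ms = (1 - self.alpha) * self.mean_ms + self.alpha * latency_ms

    def timeout_ms(self) -> float:
        if self.mean_ms is None:
            return self.initial_ms
        return self.mean_ms + self.k * self.dev_ms


@dataclass
class Watchdog:
    probe: Callable[[], Optional[float]]   # latency in ms, or None if no reply
    on_failure: Callable[[str], None]      # called directly, no intermediate layers
    timeout: AdaptiveTimeout = field(default_factory=AdaptiveTimeout)
    failed: bool = False

    def tick(self) -> bool:
        """Run one liveness check; return True if the device is considered alive."""
        if self.failed:
            return False
        latency = self.probe()
        limit = self.timeout.timeout_ms()
        if latency is None or latency > limit:
            self.failed = True
            self.on_failure(f"no reply within {limit:.1f} ms")
            return False
        self.timeout.observe(latency)
        return True


# Simulated probe replies: three healthy responses, then the device stops answering.
samples = iter([1.0, 1.2, 0.9, None])
events: List[str] = []
wd = Watchdog(probe=lambda: next(samples), on_failure=events.append)
results = [wd.tick() for _ in range(4)]
print(results, events)
```

In a real system `tick` would be driven by a timer and the probe would issue a lightweight command to the device. The callback is invoked from the watchdog itself, mirroring the abstract's idea of notifying the application without waiting for RAID or file system layers to apply their own failure-handling policies.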