R&D: Five Articles on DNA Data Storage Technologies
Predict degree of secondary structures of encoding sequences in DNA storage by deep learning model; effective IDS error correction algorithms for DNA storage channels with multiple output sequences; ultrafast and accurate DNA storage and reading integrated system via microfluidic magnetic beads polymerase chain reaction; advances and challenges in random access techniques for in vitro DNA data storage; deniable encryption method for modulation-based DNA storage
This is a Press Release edited by StorageNewsletter.com on July 17, 2025 at 2:00 pm
R&D: Predict Degree of Secondary Structures of Encoding Sequences in DNA Storage by Deep Learning Model
Proposed screening method for top high-risk sequences can be proactive step to prevent the occurrence of severe secondary structures, providing a solution for reliable information retrieval.
Scientific Reports has published an article written by Wanmin Lin, Ling Chu, Xiangyu Yao, Zhihua Chen, Institute of Computing Science and Technology, Guangzhou University, Guangzhou, Guangdong, China, Peng Xu, Institute of Computing Science and Technology, Guangzhou University, Guangzhou, Guangdong, China, School of Computer Science of Information Technology, Qiannan Normal University for Nationalities, Duyun, Guizhou, China, and Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou, Guangdong, China, and Wenbin Liu, Institute of Computing Science and Technology, Guangzhou University, Guangzhou, Guangdong, China, and Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou, Guangdong, China.
Abstract: “DNA storage has been widely considered as a promising alternative for exponentially growing data. However, the inherent complex secondary structures severely compromise the processes of synthesis, PCR amplification, and sequencing, interfering with reliable information recovery. In large-scale storage applications, how to effectively circumvent the negative effects is a critical problem. As secondary structures are formed by contiguous bases with reversal complementary relations and accompanied by the released free energy, we construct a BiLSTM-Transformer model with k-mer embedding to predict the free energy of sequences and further screen out these sequences with high values. K-mer embedding can capture the characteristics of contiguous base pairings through overlapping short subsequences, further facilitating free-energy prediction. Compared with other deep learning models, our simulation results demonstrate that BiLSTM-Transformer model with k-mer embedding has a better prediction performance. Application on a real dataset demonstrates that the proposed model can screen out those top high-risk sequences which are prone to more read errors and fewer retrieved copy numbers in real DNA storage. The proposed screening method for top high-risk sequences can be a proactive step to prevent the occurrence of severe secondary structures, providing a solution for reliable information retrieval.“
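The abstract's two building blocks, overlapping k-mers and reverse-complementary pairing, can be illustrated with a minimal Python sketch. It is not the authors' BiLSTM-Transformer model and does not compute free energy; the function names, the base-4 token IDs, and the pair-counting heuristic are all assumptions made for illustration only.

```python
# Illustrative sketch only: tokenises a DNA sequence into overlapping k-mers
# and counts reverse-complementary k-mer pairs as a crude stand-in for
# secondary-structure risk. All names here are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmers(seq, k=3):
    """Return the overlapping k-mers of seq (stride 1)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_ids(seq, k=3):
    """Map each k-mer to an integer ID (base-4 encoding), as an embedding
    layer would expect. The ID scheme is an assumption."""
    ids = []
    for token in kmers(seq, k):
        value = 0
        for base in token:
            value = value * 4 + BASE_INDEX[base]
        ids.append(value)
    return ids


def complementary_kmer_pairs(seq, k=4):
    """Count pairs of non-overlapping k-mers where the later one is the
    reverse complement of the earlier one (a possible stem)."""
    tokens = kmers(seq, k)
    count = 0
    for i, token in enumerate(tokens):
        target = reverse_complement(token)
        for j in range(i + k, len(tokens)):
            if tokens[j] == target:
                count += 1
    return count
```

A sequence such as "GGGGAAAACCCC" contains one such pair (GGGG and CCCC), whereas a homopolymer of A contains none. A real screening pipeline would replace the pair count with a predicted free energy.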
R&D: Effective IDS Error Correction Algorithms for DNA Storage Channels with Multiple Output Sequences
Compared with existing studies, simulation results show that the authors' proposed decoding algorithm reduces the BER by 21.72% to 99.75%.
IEEE Transactions on NanoBioscience has published an article written by Caiyun Deng, Guojun Han, Pengchao Han, and Yi Fang, School of Information Engineering, Guangdong University of Technology, Guangzhou, China.
Abstract: “DNA data storage is a cutting-edge storage technique due to its high density, replicability, and long-term capability. It involves encoding, insertion, deletion, and substitution (IDS) channels for data synthesis and sequencing, and decoding processes. The IDS channels that feature multiple output sequences are prone to IDS errors, complicating the decoding process and degrading the performance of DNA data storage. To address this issue, we investigate effective IDS error correction algorithms considering two encoding schemes in DNA data storage. Specifically in the encoding process, we use marker codes (MC) and embedded marker codes (EMC) as inner codes, respectively, both connected to low-density parity-check (LDPC) codes as outer codes. First, we propose the segmented progressive matching (SPM) algorithm to infer the consensus sequence from multiple output sequences, thereby facilitating the decoding processes. Moreover, when using MC as the inner code, we propose a synchronous decoding algorithm based on the Hidden Markov Model (SDH) to infer the a posteriori probability (APP) of base symbols, which supports the external decoding algorithm. Furthermore, when the inner code is EMC, we propose the iterative external decoding (IED) algorithm. IED integrates synchronous decoding with embedded normalized min-sum decoding (ENMS) to achieve an enhanced APP for external decoding, enabling lower bit-error rate (BER) transmission. Meanwhile, we reduce the complexity of the external decoder by minimizing checksum node computations. Comparing the two schemes reveals that the SDH algorithm with MC as the inner code offers a lightweight solution for DNA data storage. In contrast, the IED with EMC demonstrates superior decoding performance with a linear complexity scale by the number of iterations. Compared with existing studies, simulation results show that our proposed decoding algorithm reduces the BER by 21.72% ~ 99.75%.“
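The setting described in the abstract, a marker-coded strand passed through an IDS channel that returns several noisy reads, can be sketched as follows. This is a toy model, not the authors' SPM, SDH, or IED algorithms; the marker pattern, the period, the error probabilities, and all function names are assumptions chosen for illustration.

```python
# Toy model of the channel only: periodic marker insertion followed by
# independent insertion/deletion/substitution errors. No decoding is
# attempted. All names and parameter values are hypothetical.
import random

BASES = "ACGT"


def add_markers(payload, period=4, marker="AC"):
    """Append a known marker after every `period` payload bases."""
    parts = []
    for i in range(0, len(payload), period):
        parts.append(payload[i:i + period])
        parts.append(marker)
    return "".join(parts)


def ids_channel(seq, p_ins, p_del, p_sub, rng):
    """Pass seq through a simple IDS channel, base by base."""
    out = []
    for base in seq:
        if rng.random() < p_ins:
            out.append(rng.choice(BASES))      # insertion before this base
        r = rng.random()
        if r < p_del:
            continue                           # deletion of this base
        if r < p_del + p_sub:
            others = [b for b in BASES if b != base]
            out.append(rng.choice(others))     # substitution
        else:
            out.append(base)                   # base transmitted intact
    return "".join(out)


def noisy_reads(seq, n_reads, p_ins, p_del, p_sub, seed=0):
    """Generate several independent channel outputs for one strand."""
    rng = random.Random(seed)
    return [ids_channel(seq, p_ins, p_del, p_sub, rng) for _ in range(n_reads)]
```

For example, `noisy_reads(add_markers("GGGGTTTT"), 5, 0.01, 0.01, 0.01)` yields five reads of possibly different lengths. A decoder would use the known marker positions to resynchronise the reads before outer-code (for instance LDPC) decoding.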
R&D: Ultrafast and Accurate DNA Storage and Reading Integrated System Via Microfluidic Magnetic Beads Polymerase Chain Reaction
In the study, an integrated system was constructed using homemade microfluidic PCR and DNA magnetic beads for fast, accurate, and reproducible DNA storage and reading.
ACS Nano has published an article written by Ying Zhou, Kun Bi, Qi Xu, Quanjun Liu, Xiangwei Zhao, Qinyu Ge, and Zuhong Lu, State Key Laboratory of Digital Medical Engineering, School of Biological Science & Medical Engineering, Southeast University, Nanjing 210096, China.
Abstract: “DNA storage is expected to tackle the dilemma faced by electronic information technology for the effective storage and management of massive amounts of data in the era of big data. Efficient and reliable data retrieval is crucial for DNA storage. However, it is still challenging to actualize DNA storage with fast and accurate readout capabilities, which play a key role in the practicality and reliability of DNA storage. In this study, an integrated system was constructed using homemade microfluidic PCR and DNA magnetic beads for fast and accurate DNA storage and reading with reproducibility. The homemade microfluidic PCR and DNA magnetic beads constructed for the random access of DNA storage have the advantages of short time and low bias named MMBP. The homemade DNA magnetic beads are low cost, stable, and reproducible. The integrated DNA storage and reading system integrated by MMBP can read information not only more accurately and quickly but also at a lower sequencing depth than traditional PCR. Overall, the MMBP-based DNA information storage system (MMBP-DIS) has the advantages of reducing the cost, decreasing the random access time to 10 min, and improving the reading accuracy and sensitivity. In the future, it can be integrated with DNA electrochemical synthesis to develop a fast and accurate portable microfluidic device for DNA synthesis-preservation-reading integration.“
R&D: Advances and Challenges in Random Access Techniques for In Vitro DNA Data Storage
Authors summarize recent advances in DNA storage technology that enable random access functionality, as well as challenges that need to be overcome and current solutions.
ACS Applied Materials & Interfaces has published an article written by Ying Zhou, Kun Bi, Qinyu Ge, and Zuhong Lu, State Key Laboratory of Bioelectronics, School of Biological Science & Medical Engineering, Southeast University, Nanjing 210096, China.
Abstract: “With digital transformation and the general application of new technologies, data storage is facing new challenges with the demand for high-density loading of massive information. In response, DNA storage technology has emerged as a promising research direction. Efficient and reliable data retrieval is critical for DNA storage, and the development of random access technology plays a key role in its practicality and reliability. However, achieving fast and accurate random access functions has proven difficult for existing DNA storage efforts, which limits its practical applications in industry. In this review, we summarize the recent advances in DNA storage technology that enable random access functionality, as well as the challenges that need to be overcome and the current solutions. This review aims to help researchers in the field of DNA storage better understand the importance of the random access step and its impact on the overall development of DNA storage. Furthermore, the remaining challenges and future research trends in random access technology of DNA storage are discussed, with the goal of providing a solid foundation for achieving random access in DNA storage under large-scale data conditions.“
R&D: Deniable Encryption Method for Modulation-Based DNA Storage
In the paper, the authors propose a deniable encryption method that uniquely leverages DNA noise channels.
Interdisciplinary Sciences: Computational Life Sciences has published an article written by Ling Chu, Yanqing Su, Xiangzhen Zan, Wanmin Lin, Xiangyu Yao, Institute of Computing Science and Technology, Guangzhou University, Guangzhou, 510006, China, Peng Xu, Institute of Computing Science and Technology, Guangzhou University, Guangzhou, 510006, China, School of Computer Science of Information Technology, Qiannan Normal University for Nationalities, Duyun, 558000, China, and Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou, 510000, China, and Wenbin Liu, Institute of Computing Science and Technology, Guangzhou University, Guangzhou, 510006, China, and Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangzhou, 510000, China.
Abstract: “Recent advancements in synthesis and sequencing techniques have made deoxyribonucleic acid (DNA) a promising alternative for next-generation digital storage. As it approaches practical application, ensuring the security of DNA-stored information has become a critical problem. Deniable encryption allows the decryption of different information from the same ciphertext, ensuring that the “plausible” fake information can be provided when users are coerced to reveal the real information. In this paper, we propose a deniable encryption method that uniquely leverages DNA noise channels. Specifically, true and fake messages are encrypted by two similar modulation carriers and subsequently obfuscated by inherent errors. Experiment results demonstrate that our method not only can conceal true information among fake ones indistinguishably, but also allow both the coercive adversary and the legitimate receiver to decrypt the intended information accurately. Further security analysis validates the resistance of our method against various typical attacks. Compared with conventional DNA cryptography methods based on complex biological operations, our method offers superior practicality and reliability, positioning it as an ideal solution for data encryption in future large-scale DNA storage applications.“