R&D: High Net Information Density DNA Storage by MOPE Encoding Algorithm
The proposed algorithm, which has high net information density and satisfies the required constraints, can not only effectively reduce the cost of DNA synthesis and sequencing but also reduce the occurrence of errors during DNA storage.
This is a Press Release edited by StorageNewsletter.com on July 12, 2023 at 2:00 pm
IEEE/ACM Transactions on Computational Biology and Bioinformatics has published an article written by Yanfen Zheng, Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian, China, Ben Cao, School of Computer Science and Technology, Dalian University of Technology, Dalian, China, Jieqiong Wu, Bin Wang, and Qiang Zhang, Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education, School of Software Engineering, Dalian University, Dalian, China.
Abstract: “DNA has recently been recognized as an attractive storage medium due to its high reliability, capacity, and durability. However, encoding algorithms that simply map binary data to DNA sequences have the disadvantages of low net information density and high synthesis cost. Therefore, this paper proposes an efficient, feasible, and highly robust encoding algorithm called MOPE (Modified Barnacles Mating Optimizer and Payload Encoding). The Modified Barnacles Mating Optimizer (MBMO) algorithm is used to construct the non-payload coding set, and the Payload Encoding (PE) algorithm is used to encode the payload. The results show that the lower bound of the non-payload coding set constructed by the MBMO algorithm is 3%–18% higher than the optimal result of previous work, and theoretical analysis shows that the designed PE algorithm has a net information density of 1.90 bits/nt, which is close to the ideal information capacity of 2 bits per nucleotide. The proposed MOPE encoding algorithm with high net information density and satisfying constraints can not only effectively reduce the cost of DNA synthesis and sequencing but also reduce the occurrence of errors during DNA storage.”
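For readers unfamiliar with the bits-per-nucleotide figures quoted in the abstract, the following Python sketch illustrates where the 2 bits/nt ceiling comes from and how non-payload nucleotides pull the net figure below it. It is not the MOPE, MBMO, or PE algorithm. The bit-to-base table, the function names, and the 20 nt overhead are all hypothetical, and the sketch assumes that net information density means payload bits divided by all nucleotides counted, which may differ from the exact definition used in the paper.

```python
# Illustrative only: a naive direct mapping, NOT the MOPE/PE algorithm.
# The table below is one arbitrary (hypothetical) assignment of bit pairs to bases.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def encode_naive(bits: str) -> str:
    """Map each pair of bits to one nucleotide (requires an even-length bit string)."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def net_information_density(payload_bits: int, total_nt: int) -> float:
    """Payload bits per nucleotide, counting every nucleotide passed in.

    Assumption: total_nt includes both payload and non-payload nucleotides.
    """
    if total_nt <= 0:
        raise ValueError("total_nt must be positive")
    return payload_bits / total_nt


if __name__ == "__main__":
    payload = "0110110001"
    strand = encode_naive(payload)
    # With no overhead, four bases carry at most log2(4) = 2 bits each.
    print(strand, net_information_density(len(payload), len(strand)))
    # Hypothetical strand: 200 payload bits in 100 nt plus 20 nt of non-payload sequence.
    print(net_information_density(200, 100 + 20))
```

A direct mapping like this reaches 2 bits/nt only when no non-payload nucleotides are counted, and it does nothing to enforce sequence constraints, which is the gap the abstract says MOPE is designed to address.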