R&D: Study of Error Correction Capability of Multiple Sequence Alignment Algorithm (MAFFT) in DNA Storage
Simulation study on error correction capability of typical MSA algorithm, MAFFT
This is a Press Release edited by StorageNewsletter.com on July 11, 2023 at 2:01 pm
BMC Bioinformatics has published an article written by Ranze Xie, Xiangzhen Zan, Ling Chu, Yanqing Su, Peng Xu, and Wenbin Liu, Institution of Computational Science and Technology, Guangzhou University, Guangzhou 510006, China.
Abstract: “Synchronization (insertions–deletions) errors are still a major challenge for reliable information retrieval in DNA storage. Unlike traditional error correction codes (ECC) that add redundancy in the stored information, multiple sequence alignment (MSA) solves this problem by searching the conserved subsequences. In this paper, we conduct a comprehensive simulation study on the error correction capability of a typical MSA algorithm, MAFFT. Our results reveal that its capability exhibits a phase transition when there are around 20% errors. Below this critical value, increasing sequencing depth can eventually allow it to approach complete recovery. Otherwise, its performance plateaus at some poor levels. Given a reasonable sequencing depth (≤ 70), MSA could achieve complete recovery in the low error regime, and effectively correct 90% of the errors in the medium error regime. In addition, MSA is robust to imperfect clustering. It could also be combined with other means such as ECC, repeated markers, or any other code constraints. Furthermore, by selecting an appropriate sequencing depth, this strategy could achieve an optimal trade-off between cost and reading speed. MSA could be a competitive alternative for future DNA storage.”
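As a rough illustration of why deeper sequencing helps MSA-based recovery, the sketch below takes reads that have already been aligned (gaps written as `-`, as in typical MSA output) and rebuilds the stored strand by a column-wise majority vote. This is an assumption about how a consensus might be derived from an alignment, not the authors' published pipeline; the function name `consensus` and the toy reads are hypothetical, and the alignment step itself (for example, running MAFFT) is not shown.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol in the aligned reads


def consensus(aligned_reads):
    """Return the column-wise majority-vote consensus of aligned reads.

    All reads must have the same length (i.e. they are already aligned).
    Columns whose most common symbol is a gap are dropped, which discards
    insertions that appear in only a minority of reads.
    Ties are broken in favor of the symbol seen first in the column.
    """
    if not aligned_reads:
        return ""
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("aligned reads must all have the same length")

    result = []
    for column in zip(*aligned_reads):
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != GAP:
            result.append(symbol)
    return "".join(result)


# Toy alignment of five noisy copies of the strand "ACGTACGT".
reads = [
    "ACGT-ACGT",  # error-free copy
    "ACGTTACGT",  # one inserted T
    "ACG--ACGT",  # one deleted T
    "ACGA-ACGT",  # one substitution (T -> A)
    "ACGT-ACGT",  # error-free copy
]
print(consensus(reads))
```

In this toy case each error occurs in a single read, so every column's majority symbol is the correct one. With higher error rates or fewer reads, the majority in a column can itself be wrong, which is consistent with the abstract's observation that recovery depends on both the error rate and the sequencing depth.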