R&D: Dual-Plasmid Editing System Improves DNA Digital Storage Potential
Researchers established a dual-plasmid system in vivo using a rationally designed coding algorithm and an information editing tool.
This is a Press Release edited by StorageNewsletter.com on August 15, 2022 at 2:01 pm
From Chen Na, Chinese Academy of Sciences
DNA-based information technology is a new interdisciplinary field linking information technology and biotechnology.
The field hopes to meet the enormous need for long-term data storage by using DNA as an information storage medium. DNA promises strong stability, high storage density and low maintenance cost, but researchers still face problems accurately rewriting digital information encoded in DNA sequences.
Generally, DNA storage technology has 2 modes: the ‘in vitro hard disk mode’ and the ‘in vivo CD mode.’ The primary advantage of the in vivo mode is the low-cost, reliable replication of chromosomal DNA through cell replication, which makes it suitable for rapid, low-cost copying and dissemination of data. However, because the encoded DNA sequences for some information contain a large number of repeats and homopolymers, such information can only be ‘written’ and ‘read,’ but cannot be accurately ‘rewritten.’
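To make the repeat and homopolymer problem concrete, the short Python sketch below measures the longest single-base run and the guanine-cytosine fraction of a sequence. The function names and the sample sequence are illustrative assumptions, not taken from the research.

```python
from itertools import groupby


def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest run of one repeated base."""
    return max((len(list(run)) for _, run in groupby(seq)), default=0)


def gc_content(seq: str) -> float:
    """Return the fraction of bases that are G or C (0.0 if empty)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


# Hypothetical sample: repetitive binary data (for example, a long run of
# zero bits) can turn into a long homopolymer under a naive bit-to-base map.
sequence = "AAAAAAAACGT"
print(longest_homopolymer(sequence))  # run of eight A's
print(round(gc_content(sequence), 2))
```

Sequences with long runs or skewed GC content are the kind the article describes as hard to rewrite accurately.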
To solve the rewriting problem, Prof. LIU Kai, Department of Chemistry, Tsinghua University, Prof. LI Jingjing, Changchun Institute of Applied Chemistry (CIAC), Chinese Academy of Sciences, and Prof. CHEN Dong, Zhejiang University, led a research team that recently developed a dual-plasmid editing system for accurately processing digital information in a microbial vector. Their findings were published in Science Advances.
The researchers established a dual-plasmid system in vivo using a rationally designed coding algorithm and an information editing tool. This dual-plasmid system is suitable for storing, reading and rewriting various types of information, including text, codebooks and images. It fully explores the coding capability of DNA sequences without requiring any addressing indices or backup sequences. It is also compatible with various kinds of coding algorithms, thus enabling high coding efficiency. For example, the coding efficiency of the current system reaches 4.0 bits per nucleotide.
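As a rough illustration of what encoding digital data into a DNA sequence means, the Python sketch below uses a plain fixed mapping of two bits per base. This is a generic textbook scheme chosen for illustration, not the researchers' coding algorithm, and it does not reproduce the coding efficiency reported above; the mapping table and function names are assumptions.

```python
# Hypothetical fixed mapping: each pair of bits becomes one nucleotide.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a DNA string, two bits per nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(seq: str) -> bytes:
    """Turn a DNA string produced by encode() back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in seq)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


dna = encode(b"Hi")
print(dna)          # CAGACGGC
print(decode(dna))  # b'Hi'
```

Because every base carries payload bits, this sketch also uses no addressing indices or backup sequences, which is the property the article highlights.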
To rewrite complex information stored in exogenous DNA sequences in vivo with high efficiency and reliability, the researchers used a variety of CRISPR-associated proteins (Cas) and recombinase. These tools were guided by their corresponding CRISPR RNA (crRNA) to cleave a target locus in a DNA sequence so that the specific information could be addressed and rewritten. Because of the high specificity between complementary pairs of nucleic acid molecules, the information-encoded DNA sequences were accurately reconstructed by recombinase to encode new information. Optimizing the crRNA sequence made the information rewriting tool highly adaptable to complex information, resulting in rewriting reliability of up to 94%, which is comparable to existing gene-editing systems.
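The addressing idea can be mimicked in software as a find-and-replace that only proceeds when the target site is unique. The Python sketch below is an in-silico analogy only: it does not model Cas cleavage or recombination chemistry, and the function names and sequences are hypothetical.

```python
def find_sites(info: str, target: str) -> list[int]:
    """Return every start position of target in info, including overlaps."""
    sites = []
    start = info.find(target)
    while start != -1:
        sites.append(start)
        start = info.find(target, start + 1)
    return sites


def rewrite(info: str, target: str, donor: str) -> str:
    """Replace the single occurrence of target with donor.

    Raises ValueError when the target is empty, missing or ambiguous,
    mirroring the need for a target-specific address.
    """
    if not target:
        raise ValueError("target must not be empty")
    sites = find_sites(info, target)
    if len(sites) != 1:
        raise ValueError(f"expected exactly one target site, found {len(sites)}")
    pos = sites[0]
    return info[:pos] + donor + info[pos + len(target):]


info_plasmid = "ACGTTGCAGGAT"  # hypothetical stored sequence
print(rewrite(info_plasmid, "TGCA", "CCGG"))  # ACGTCCGGGGAT
```

The uniqueness check shows why highly repetitive sequences are difficult: a target that appears more than once cannot be addressed unambiguously.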
The dual-plasmid system can serve as a universal platform for DNA-based information rewriting in vivo, thus offering a new strategy for information processing and target-specific rewriting of large and complicated data on a molecular level.
“We believe this strategy can also be applied in a living host with a larger genome, such as yeast, which would further pave the way for practical applications regarding big data storage,” said Prof. Liu.
Article: In vivo processing of digital information molecularly with targeted specificity and robust reliability
Science Advances has published an article written by Yangyi Liu, Yubin Ren, Department of Chemistry, Tsinghua University, Beijing, China; Jingjing Li, State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, China; Fan Wang, State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, China; Fei Wang, Frontiers Science Center for Transformative Molecules, School of Chemistry and Chemical Engineering, and Institute of Molecular Medicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China; Chao Ma, Department of Chemistry, Tsinghua University, Beijing, China; Dong Chen, College of Energy Engineering and State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou, China; Xingyu Jiang, Department of Biomedical Engineering, Southern University of Science and Technology, No. 1088 Xueyuan Road, Nanshan District, Shenzhen, Guangdong, China; Chunhai Fan, Hongjie Zhang, Frontiers Science Center for Transformative Molecules, School of Chemistry and Chemical Engineering, and Institute of Molecular Medicine, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China; and Kai Liu, Department of Chemistry, Tsinghua University, Beijing, China, and State Key Laboratory of Rare Earth Resource Utilization, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, China.
Schematic illustration of DNA-based information storage and rewriting within living cells.
The digital information stored in binary codes was encoded into DNA sequences using a high-density encoding algorithm. The sequences were then cloned into the plasmid (info, info plasmid) of living cells for long-term storage, data amplification, and information rewriting. Information rewriting was achieved using the dual-plasmid system based on CRISPR-Cas12a-λRed, whose encoding template was cloned in the plasmid (help, help plasmid). The original information was target-specifically rewritten by replacing the target DNA fragment within the info plasmid with a donor DNA fragment. Following precise revision, the rewritten information was recovered by decoding the DNA sequence within the new info plasmid (info*).
(Copyright Science Advances)
Abstract: “DNA has attracted increasing interest as an appealing medium for information storage. However, target-specific rewriting of the digital data stored in intracellular DNA remains a grand challenge because the highly repetitive nature and uneven guanine-cytosine content render the encoded DNA sequences poorly compatible with endogenous ones. In this study, a dual-plasmid system based on gene editing tools was introduced into Escherichia coli to process information accurately. Digital data containing large repeat units in binary codes, such as text, codebook, or image, were involved in the realization of target-specific rewriting in vivo, yielding up to 94% rewriting reliability. An optical reporter was introduced as an advanced tool for presenting data processing at the molecular level. Rewritten information was stored stably and amplified over hundreds of generations. Our work demonstrates a digital-to-biological information processing approach for highly efficient data storage, amplification, and rewriting, thus robustly promoting the application of DNA-based information technology.”