Update 7/7/2016: Researchers have improved on their original methods substantially in the last year, announcing they have been able to store far more than the original four images in a DNA sequence. The new batch includes a high-definition OK Go music video for “This Too Shall Pass,” as well as 100 books and a seed database. The new 200-megabyte batch is a long way from the previous DNA storage record of 22 megabytes, according to Mashable.
Researchers from the University of Washington and Microsoft have collaborated on a project that could change data storage as we know it. The team has developed a way of storing data in DNA, which promises to dramatically reduce the physical space needed to save our files, images, and more. The team claims the technology could take a data center as large as a Walmart Supercenter and shrink it down to the size of the sugar cube you drop into your morning coffee. The research behind this technology was presented this month at the ACM International Conference on Architectural Support for Programming Languages and Operating Systems.
During their experiments, the team, which is made up of engineers from Microsoft and scientists from UW, was able to store digital data from four image files on strands of synthetic DNA. “Life has produced this fantastic molecule called DNA that efficiently stores all kinds of information about your genes and how a living system works — it’s very, very compact and very durable,” said UW associate professor Luis Ceze. “We’re essentially repurposing it to store digital data — pictures, videos, documents — in a manageable way for hundreds or thousands of years.”
The team encoded the digital data into a nucleotide sequence, converting the ones and zeros of each file into the four DNA building blocks: adenine, guanine, cytosine, and thymine. Once the ones and zeros had been turned into As, Gs, Cs, and Ts, a massive amount of data could be packed into tiny DNA molecules and preserved for long-term storage. The researchers were later able to retrieve the data without any loss of information.
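To make the idea of turning bits into bases concrete, here is a minimal sketch in Python. It assumes a simple mapping of two bits per base, which is an illustration only and not the encoding the researchers used; the mapping table and the function names encode and decode are hypothetical. Practical schemes also have to avoid long runs of the same base and add redundancy for error correction, which this sketch leaves out.

```python
# Hypothetical two-bits-per-base mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A, C, G, T (four bases per byte)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Invert encode(): turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Because each base carries two bits in this toy mapping, decoding is a straight table lookup in the other direction, and the round trip returns exactly the bytes that went in.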
To retrieve the data, the team used short signature sequences, much like a zip code, to identify the strands of data they needed. Using the polymerase chain reaction and DNA sequencing techniques, the researchers were able to pull out the encoded DNA, read its sequence, and reconstruct the data files.
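The zip-code idea can be sketched in a few lines of Python. This is a toy model under stated assumptions: the pool is just a list of strings, each strand starts with a made-up six-base address tag, and select_strands is a hypothetical helper that stands in for the selection step. Real PCR works chemically with primers and amplification, and a real system also needs to record the order of strands within a file, neither of which is modeled here.

```python
def select_strands(pool, tag):
    """Return the payloads of all strands that begin with the given address tag."""
    return [strand[len(tag):] for strand in pool if strand.startswith(tag)]


# Hypothetical pool: two strands tagged for one file, one strand for another.
pool = [
    "ACGTAC" + "CAGACGGC",
    "TGCATG" + "GGGGAAAA",
    "ACGTAC" + "TTTTCCCC",
]

print(select_strands(pool, "ACGTAC"))  # ['CAGACGGC', 'TTTTCCCC']
```

Only the strands carrying the requested tag come back, so the rest of the pool never has to be read, which is the point of addressing strands this way.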
The team hopes to further improve this method, eventually looking at ways of scaling it so they can store data center-sized amounts of information in small spaces. The biggest hindrance to this expansion is cost: DNA is still expensive both to synthesize and to sequence. If there is enough interest in this DNA-based storage technology, however, these hurdles could be overcome in time.