

DNA Cassette Tape Can Store Every Song Ever Recorded (newscientist.com) 34
Researchers in China have developed a "DNA cassette," a retro-styled plastic tape embedded with synthetic DNA strands that can store up to 36 petabytes of digital data -- enough to hold every song ever recorded. New Scientist reports: Xingyu Jiang at the Southern University of Science and Technology in Guangdong, China, and his colleagues created the cassette by printing synthetic DNA molecules on to a plastic tape. "We can design its sequence so that the order of the DNA bases (A, T, C, G) represents digital information, just like 0s and 1s in a computer," he says. This means it can store any type of digital file, whether text, image, audio or video.
One problem with previous DNA storage techniques is the difficulty in accessing data, so the team then overlaid a series of barcodes on the tape to assist with retrieval. "This process is like finding a book in the library," says Jiang. "We first need to find the shelf corresponding to the book, then find the book on the corresponding shelf."
The tape is also coated in what the researchers describe as "crystal armor" made of zeolitic imidazolate, which prevents the DNA bonds from breaking down. That means the cassette could store data for centuries without deteriorating. While a traditional cassette tape could boast around 12 songs on each side, 100 meters of the new DNA cassette tape can hold more than 3 billion pieces of music, at 10 megabytes a song. The total data storage capacity is 36 petabytes of data -- equivalent to 36,000 terabyte hard drives. The research has been published in the journal Science Advances.
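Jiang's point that the order of the bases can stand in for 0s and 1s can be illustrated with a minimal sketch. The two-bits-per-base table below (A=00, C=01, G=10, T=11) is an assumption chosen for illustration, not the coding scheme from the paper; practical DNA storage schemes typically add error correction and sequence constraints on top.

```python
# Illustrative mapping only: each base carries two bits (A=00, C=01, G=10, T=11).
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Reverse encode(): every four bases become one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


strand = encode(b"Hi")
restored = decode(strand)
```

Under this mapping one byte costs four bases, so any file (text, image, audio or video) round-trips through the same two functions.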
Nice improvement (Score:5, Informative)
One problem with previous DNA storage techniques is the difficulty in accessing data, so the team then overlaid a series of barcodes on the tape to assist with retrieval. "This process is like finding a book in the library," says Jiang. "We first need to find the shelf corresponding to the book, then find the book on the corresponding shelf."
What a terrible summary. Indexing was never the problem; reading the DNA back nondestructively and keeping it somewhere warmer than deep-cold storage have been. Here's a quote from the actual paper:
Last, we developed a compact DNA cassette tape drive for DNA tape (Fig. 1B), which can perform file addressing, decapsulation, encapsulation, recovery, removal, and redeposition operations on DNA files quickly and automatically.
The actual paper is quite dense but almost accessible even for non-molecular biologists, and it sounds like they have solved quite a few problems with DNA-based storage including both reading AND writing AND doing this at room temperature.
Re: (Score:3)
I hope they can commercialize it. An alternative to LTO would be nice. Huawei has one that is a tape/SSD hybrid, but not available widely yet.
Re: (Score:2)
But how long does it take to search to the particular bits you were looking for? How many times can you search through the tape before it breaks? There are reasons random access cassettes were never popular.
Re: (Score:2)
Re:Nice improvement (Score:5, Informative)
Nobody's talking about it as random access media. LTO, which is what AmiMojo referred to, is a common standard for tape backup systems. You'd use this kind of media to back up data.
At those kinds of capacities, if priced cheaply enough, it'd be possible to create a sealed, permanently installed box that periodically snapshots your PC, allowing you to go back in history to any point and retrieve files from that date. 36 petabytes could snapshot 1 TB of uncompressed hard disk space once a day for roughly 100 years. Yes, eventually larger capacity random access storage (e.g. SSDs/HDDs) might become common in home PCs, but even a 20-fold increase would mean it'd last more than the lifetime of a regular PC, and SSDs/HDDs installed into new PCs aren't really growing in size that quickly.
(Cue people who'll miss the point and say "Well this'll be useless for me as I have a 100 TB NAS!" - you're not the typical user I'm talking about, and a 100 TB NAS isn't the storage in your PC anyway...)
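The 100-year figure can be checked with a quick sketch, assuming decimal units (1 PB = 10^15 bytes, 1 TB = 10^12 bytes) and a full, uncompressed 1 TB copy every day with no deduplication. The function name is only illustrative.

```python
PB = 10**15  # decimal petabyte, in bytes
TB = 10**12  # decimal terabyte, in bytes


def snapshot_days(capacity_bytes: int, snapshot_bytes: int) -> int:
    """Number of whole snapshots that fit on the medium."""
    return capacity_bytes // snapshot_bytes


days = snapshot_days(36 * PB, 1 * TB)  # one snapshot per day
years = days / 365.25

# The 20-fold growth case from the comment: 20 TB snapshotted daily.
years_at_20x = snapshot_days(36 * PB, 20 * TB) / 365.25
```

That comes to 36,000 daily snapshots, a little under 99 years, and just under 5 years if the disk being snapshotted were 20 times larger.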
Re: (Score:2)
Going to a particular date on a tape is a seek operation. A better reply would be that there's more than one kind of cassette. (1/2" tape has been in cassettes before, just not the kind you usually think of. And that was durable enough to allow a reasonable number of seeks. But I'd sure hate to have to patch a tape with that density.)
Re: (Score:2)
Well the point was more "We're not talking random access media here where seek times are really really important."
With a tape, even right now with some of the faster seek time devices, you wouldn't use it as random access media. That's not what it's good for. The fastest tapes with non-trivial storage capacities (ie not talking about stringy floppy or Sinclair microdrive type systems) still have a seek time poorer than the slowest floppy drives.
That narrows the scope of what concerns we should have. If it t
Re: (Score:2)
When you're talking petabytes and "reading DNA", I don't think 60 minutes is the right order of magnitude.
Re: Nice improvement (Score:2)
An SSD has a max lifetime write volume ... search for TBW ... you could store all writes, not just daily copies, in 1000Tb.
Re: Nice improvement (Score:2)
... then that 1000 TB is your primary storage, not your backup. You can do point-in-time backups to primary storage, but what do you do when that's unavailable... it's all gone. That's mouse nuts for an enterprise tape library.
Re: (Score:2)
They do, but I'm pretty sure you can write more than a thousand times to an SSD, so if it's based upon the media lifetime, 1000 TB alas isn't going to be enough storage for a lifetime write log for a 1 TB SSD. What I've read is that modern SSDs tend to be rated for 100,000 writes per sector. That's a little more than 36 PB, but it's not unreasonable.
OTOH the point you raise suggests a combination approach might work pretty well, just write changed sectors but perhaps delay the write to deal with the inevitable "U
Re: Nice improvement (Score:2)
I think the limit really is something like 1000 writes of the total capacity on a consumer drive.
Apparently this is also expressed as the number of full writes you can do per day within the warranty period; it's less than one on a consumer drive.
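The two ways of stating endurance mentioned here, total terabytes written (TBW) and full drive writes per day (DWPD), are related by simple arithmetic. A rough sketch, where the 1 TB capacity and the 5-year warranty are assumed example values rather than figures from any particular drive:

```python
def endurance(capacity_tb: float, full_drive_writes: float, warranty_years: float):
    """Return (TBW, DWPD) for a drive rated for a number of full-capacity writes."""
    tbw = capacity_tb * full_drive_writes            # total terabytes written
    dwpd = full_drive_writes / (warranty_years * 365)  # drive writes per day
    return tbw, dwpd


# Assumed example: 1 TB consumer drive, ~1000 full writes, 5-year warranty.
tbw, dwpd = endurance(capacity_tb=1, full_drive_writes=1000, warranty_years=5)
```

With those assumptions the drive is good for 1000 TB written, which works out to roughly 0.55 full drive writes per day, consistent with "less than one on a consumer drive".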
Re: (Score:2)
In this case, this is less of an issue. With tape, you generally don't search through media, other than to fast forward to a file set to yank it off. With tape, you need capacity, and archival grade capacity, where stuff will be readable years to decades later. Speed is important as well, and even though LTO is king, we need something that can keep up, even if it means going back to helical scanning.
With cloud storage being less and less effective, we need something. Optical is promising, but because peopl
Re: Nice improvement (Score:4, Interesting)
All tape backup systems are indexed, tape and seek position, by client, file path, and time.
Yea there's some cost to doing that, there's a little database involved usually, but it's ultimately no different than filesystem metadata, it's just tracked off the media because they're removable. Like any filesystem there's still metadata backups, maybe written to the end of the tape or at a fixed position to aid rebuilding the index instead of scanning entire tapes in case the backup server went tits up with the things that need restoring.
When you do a restore you're just browsing that index, and when you hit go, the system knows exactly which tapes to tell the robot to load and what all the seek positions are. Then the tapes can be run at full speed before reading.
It's not as optimal as backup IO, like constant optimal request depth, optimal block sizes, full wire speeds etc. But restore IO, aside from tape load times and seeking, is still really fast: linear reads with optimal block size and request depth.
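The index-then-seek restore described above can be sketched roughly as follows. The `CatalogEntry` fields and the `plan_restore` helper are hypothetical names made up for illustration, not the schema of any real backup product; timestamps are assumed to be ISO 8601 strings in one consistent format so they compare correctly as text.

```python
from collections import defaultdict
from typing import NamedTuple


class CatalogEntry(NamedTuple):
    client: str
    path: str
    backup_time: str  # ISO 8601 timestamp, e.g. "2024-01-01T00:00:00"
    tape_id: str
    position: int     # block offset on the tape


def plan_restore(catalog, client, path_prefix, as_of):
    """Pick the newest copy of each matching file at or before `as_of`,
    then group the seek positions by tape, sorted for one linear pass."""
    newest = {}
    for entry in catalog:
        if entry.client != client or not entry.path.startswith(path_prefix):
            continue
        if entry.backup_time > as_of:
            continue
        current = newest.get(entry.path)
        if current is None or entry.backup_time > current.backup_time:
            newest[entry.path] = entry

    plan = defaultdict(list)
    for entry in newest.values():
        plan[entry.tape_id].append(entry.position)
    return {tape: sorted(positions) for tape, positions in plan.items()}
```

The result tells the robot which tapes to load and gives the drive an ascending list of positions, so each tape is read in a single forward pass instead of being scanned.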
Re: (Score:2)
Hmm, well your comment made me go and read the paper. It is interesting because they focused on an important part of the DNA storage process, which is deposition and recovery. And yes, there are a few neat innovations, like the tape system, the partitions, and the zeolite encapsulation.
However, the limitation remains DNA reading and writing, which is A) much longer than the deposition process, making gains in the deposition process almost insignificant, and B) requires careful temperature control such that
Will mutate into new species (Score:2)
...Cassettus Mixtapius and feed us to robots in rhythm
Plot twist: It escapes. (Score:1)
So can my 80s cassette recorder (Score:1)
equivalent to 36,000 terabyte hard drives (Score:4, Funny)
Does a DNA cassette tape player even exist ??? (Score:2)
Every Song Ever Recorded (Score:3)
DNA Cassette Tape Can Store Every Song Ever Recorded
...and the copyright infringement penalties will be more than every dollar ever printed.
Re:Every Song Ever Recorded (Score:4, Insightful)
DNA Cassette Tape Can Store Every Song Ever Recorded
...and the copyright infringement penalties will be more than every dollar ever printed.
But it's China, so they don't care.
Re:Every Song Ever Recorded (Score:4, Funny)
Yep. This is why you insert this DNA into several emus, release them in Australia, and tip off the MPAA, then watch as they try their best to eradicate these creatures that are effectively pirating every song every time a cell divides.
Not so Impressive (Score:2)
Don't tell Monsanto! (Score:2)
The litigious DNA waters are already murky enough.
bit rot (Score:3)
And it would probably (Score:2)
take a long long time to read that data off and write.
I don't know why, but... (Score:2)
Enough tape to go around the earth would hold: (Score:2)
Earth circumference ~ 40,075,000 m.
Scale it up:
Per meter: 36 PB / 100 m = 0.36 PB/m (= 360 TB/m).
All the way around Earth:
0.36 PB/m x 40,075,000 m = 14,427,000 PB.
That's:
14,427 EB (decimal) = 14.4 ZB (decimal, 1 ZB = 1,000 EB)
13.8 ZB (binary-ish comparison using 1024s)
Songs at 10 MB each (as in the article):
36 PB/100 m -> 3.6 billion songs/100 m -> 36 million songs per meter
40,075,000 m -> 1.44 quadrillion songs.
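The same scaling as a short sketch, using integer arithmetic and decimal units throughout (the 40,075,000 m circumference and 10 MB per song are the figures used above):

```python
PB = 10**15  # decimal petabyte, in bytes
MB = 10**6   # decimal megabyte, in bytes

TAPE_CAPACITY_BYTES = 36 * PB
TAPE_LENGTH_M = 100
EARTH_CIRCUMFERENCE_M = 40_075_000
SONG_BYTES = 10 * MB

bytes_per_meter = TAPE_CAPACITY_BYTES // TAPE_LENGTH_M  # 0.36 PB per meter
total_bytes = bytes_per_meter * EARTH_CIRCUMFERENCE_M
total_pb = total_bytes // PB
total_zb = total_bytes / 10**21

songs_per_meter = bytes_per_meter // SONG_BYTES
total_songs = songs_per_meter * EARTH_CIRCUMFERENCE_M

print(f"{total_pb:,} PB = {total_zb:.1f} ZB, {total_songs:,} songs")
```

This reproduces 14,427,000 PB (about 14.4 ZB) and roughly 1.44 quadrillion songs.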
Born to be wild (Score:3)
I love the wide open information superhighway with my bandwidth cranked up and the Windows down.