When film critic and disability advocate Jeff Shannon passed away unexpectedly last December, I was confronted by the loss of a long-time friend, but also a data problem. Years ago, Jeff had mentioned that copies of his personal and professional writing spanning almost a decade existed only on ever-aging floppy disks. Jeff knew I kept some older systems running for just those sorts of occasions, and we were able to make fresh copies of almost all those old files before the floppies turned to dust.
A few weeks before he died, Jeff asked me about those same files. “I can’t open them now – they’re just gibberish. Is there any way to make them readable again?”
We never got a chance to work out the details.
Today, most of us trust our lives’ most important work, events, and memories to digital storage. Jeff’s situation highlights two common problems: storage media doesn’t last forever, and modern software might not be able to read old files. Social media and trusting our data to the cloud only make things more complicated. Our digital lives aren’t just on our devices; they’re spread across the entire Internet. How can we preserve our digital legacies, and make sure they’ll be accessible in the future? Here’s an in-depth look at how to ensure your digital life lasts forever.
Media matters
Before you start worrying about how to save, you need to decide what to save. Email, apps, documents, and the photos and videos we shoot and edit all add up, so make sure to keep tabs on the things you want to hold onto for the foreseeable future.
In the past, saving data for posterity has meant making a copy and stashing it somewhere – a drawer, a shoebox, maybe even a safety deposit box. But just as a safety deposit box is safer than the shoe variety, not all digital storage media is the same, and there’s no clear winner for saving our data in the long term. Here are the options:
Optical: Writable CDs, DVDs, and Blu-ray discs are widely available – but they’re not very big: CD-Rs only hold 700MB, DVD-R is usually 4.7GB, and consumer-level Blu-ray discs are typically 25GB. I’d need 28 DVDs just to make one copy of my photos and documents – that’s a lot of disc-flipping. And besides, DVD drives are going the way of the floppy. Most notebooks don’t even have DVD drive options, and they’re vanishing from desktops too. Will you be able to read a DVD in ten years? Probably not easily.
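The disc-count arithmetic is easy to rerun for your own archive. Here is a minimal Python sketch using the capacities quoted above; the 130GB archive size is my assumption, picked only because it is consistent with the 28-DVD figure, and `discs_needed` is a hypothetical helper name.

```python
import math

# Capacities quoted in the text, in gigabytes.
CAPACITY_GB = {"CD-R": 0.7, "DVD-R": 4.7, "Blu-ray": 25.0}


def discs_needed(archive_gb: float, capacity_gb: float) -> int:
    """Round up: a partly filled disc is still a whole disc."""
    return math.ceil(archive_gb / capacity_gb)


# Assumption: a 130GB archive, chosen only because it is consistent
# with the 28-DVD figure mentioned in the article.
ARCHIVE_GB = 130

for name, capacity in CAPACITY_GB.items():
    print(f"{name}: {discs_needed(ARCHIVE_GB, capacity)} discs")
```

Swap in your own archive size to see how quickly the stack of discs grows.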
Still, optical media is cheap if you need to store a lot of data – so cheap, Facebook has developed a robotic system to use Blu-ray media as “cold storage” for data it may rarely (if ever) need to access. And Sony and Panasonic just announced a new “archival” Blu-ray format that can handle up to 1 TB per disc. Sony claims archive-quality Blu-ray discs should last over 50 years – but these products are aimed at corporations and professionals, not consumers. Plus, any disc media can crap out.
“We see DVD and Blu-ray media failure all the time,” said Bin Iwata, a senior technician at a video editing studio in Vancouver, British Columbia. “Sometimes a writer goes bad, but sometimes discs burned to bulk blank media fail within a few months. When blank media says it’s got a five-year average lifespan, that means half that blank media fails in less than five years. You’re rolling dice.”
Quality of writable optical media seems to vary widely. Some aficionados swear blank media from Japan’s Taiyo Yuden is highly reliable – but you can’t pick it up at Costco or Best Buy.
Hard drives: Traditional drives offer cheap capacity (1 TB external drives are under $70 right now) and they’re faster to read and write than optical media. But hard drives are mechanical devices spinning at high speeds, and rely on complex components and circuitry. What’s more, their interfaces become obsolete quickly. A USB 3.0 or eSATA drive is easily accessible today – but can you still access a FireWire 400 drive? How about IDE? Or SCSI? Have you even heard of SCSI? Exactly.
Hard drives have typical lifespans of two to eight years, depending on their environment and how they’re treated. But, eventually, they all fail.
Flash: Solid State Drives (SSDs) are currently more expensive than optical media or traditional hard drives, but they have no moving parts – so they should be ideal for long-term storage, right? Well, nobody truly knows yet, partly because the technology is still changing. Flash manufacturers usually claim their media will retain data for ten years; however, memory cells in flash media eventually lose their charge when left idle and unpowered. Reports and estimates vary widely on how long that might take – I’ve lost data on a carefully-stored thumb drive in just over a year. Flash drives also face the same interface problems as traditional hard drives: even if the data is intact, will you be able to connect to the device to retrieve it?
No matter which you choose, there’s always the possibility that your data will become corrupted. “If you have words cut off of a physical letter or a few lines that have faded, you may be able to put them back together or, with a photo, easily understand the content of the image based on surrounding undamaged areas,” says Evan Fay Earle, a collections specialist and archival technical services coordinator at Cornell University. “If you have a disc or digital file that is suffering from bit rot or gets slightly altered during a transfer it can be much harder to fix.”
Technologies like (expensive) ECC memory and checksums can help detect when something has gone wrong, but they’re little to no help (even to experts) for correcting data once it’s been corrupted.
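To make the checksum idea concrete, here is a small Python sketch. It records a SHA-256 checksum for a file, flips a single bit to simulate bit rot, and checksums the file again. The file name and contents are made up for illustration. Notice that the mismatch tells you something changed, but not what or where, which is why a checksum alone can’t repair the file.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo() -> tuple[str, str]:
    """Checksum a file, flip one bit in it, and checksum it again."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "review.txt")  # hypothetical file name
        with open(path, "wb") as handle:
            handle.write(b"A film review worth keeping.")
        before = sha256_of(path)

        # Simulate bit rot by flipping the lowest bit of the first byte.
        with open(path, "r+b") as handle:
            first = handle.read(1)
            handle.seek(0)
            handle.write(bytes([first[0] ^ 1]))
        after = sha256_of(path)
    return before, after


before, after = demo()
print("intact:   ", before)
print("corrupted:", after)
print("match:", before == after)
```

In practice you would save the checksum next to the archive when you create it, then recompute and compare it each time you test the copy.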
What about the cloud?
Cloud storage offered by services like Dropbox, Google Drive, Microsoft OneDrive, iCloud, and countless others may seem to solve the media problem. Cloud services assume the responsibility of keeping your data safe and online – and they’re also geographically distant, so a fire or flood that might destroy your personal archives wouldn’t impact a cloud-based archive.
The good news is, that’s all true. However, cloud storage solutions can be expensive in the long run (for instance, 50 GB of iCloud storage costs $100 per year) and, obviously, they require broadband Internet: if your Internet goes down (or perhaps you’re traveling), not only can you not access your data, you can’t save anything either. You also have to decide how much you trust cloud service operators to secure your data. After all, we’re getting to the point where major security breaches are becoming everyday news.
Despite these downsides, using the cloud for at least one of your storage methods is a wise move.
Data formats
It’s important to save data on media that will be readable in the future, but also to use file formats that’ll be accessible in the future. This is exactly the problem my friend Jeff Shannon was facing: he did much of his early writing using a proprietary word processor, and the format is virtually unsupported today.
To preserve documents long-term, it’s best to move them out of proprietary formats (like Photoshop’s or Microsoft Office’s .doc and .xls) and into Open Document Format (ODF), Open XML, or perhaps PDF. Photographers should consider saving images in original raw format (if available), uncompressed TIFF format, or alternatives like PNG or JPEG. For audio, save the highest-quality uncompressed WAV or AIFF if possible. Video is tougher, but high-bitrate MPEG-2 seems like a good choice. If you want to geek out, the Library of Congress maintains an extensive reference on the pros and cons of archival file formats.
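A first step is simply finding out what you have. This Python sketch walks a folder and flags files whose extensions suggest a proprietary format. The extension list is my own assumption and far from complete (.doc and .xls come from the text above; .psd is Photoshop’s native extension), and the demo files are invented.

```python
import tempfile
from collections import Counter
from pathlib import Path

# Assumption: an illustrative, incomplete list of proprietary extensions.
PROPRIETARY = {".doc", ".xls", ".psd"}


def flag_proprietary(root: Path) -> list[Path]:
    """Return files under root whose extension is on the PROPRIETARY list."""
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in PROPRIETARY
    )


# Demo on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("essay.DOC", "budget.xls", "notes.txt", "poster.psd"):
        (root / name).write_bytes(b"placeholder")
    found = [path.name for path in flag_proprietary(root)]
    counts = Counter(Path(name).suffix.lower() for name in found)

print(found)
print(dict(counts))
```

The script only reports; converting each flagged file is still a job for the application that created it, or for a converter you trust.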
Social media
Don’t forget about your social media history – your Facebook timeline or Twitter feed might be very important to you in the years ahead. Both Facebook and Twitter let users request an archive via email, which arrives as a compressed, HTML-formatted snapshot of your timeline. They’re browseable on their own, and both services have done a good job of creating snapshots that should be accessible for many years.
On Facebook, go to Account > Account Settings > Download Your Information. On Twitter, go to Account > Request your archive. Google also lets users back up a wide variety of data, and many (but not all) other social media services have similar features.
So what’s the best plan?
If you think you might want to save your digital life for posterity, here are the main things to consider:
- Make regular backups. Back up your devices on a regular schedule. Ideally, you should make more than one, keeping one at an offsite location.
- Make archives. Store photos, video, and audio to whatever media seems most sensible, whether optical, flash, or traditional hard drives. Test those copies at least every two years, and migrate those archives to fresh (or perhaps different) media every three to five years.
- Make copies. Make more than one copy of your archive – if one has a problem, it’s unlikely the other will have the same problem. Consider using different storage media for different copies.
- Store your archives in a cool, dry place. It doesn’t have to be a climate controlled room, but big changes in temperature and humidity reduce media lifespans.
- Request regular backups of your social media activity. Store them with your archived files.
- Convert documents and media out of proprietary formats. Open formats are more likely to be supported in the distant future.
- Consider encrypting your archive. This has a potentially huge downside: if you lose your password – or decryption software isn’t available in the future – you lose everything. But, if done right, you aren’t vulnerable if your archive is lost or stolen.
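Testing copies doesn’t have to mean opening every file by hand. As a rough sketch of the “make copies” and “test those copies” steps, the Python below builds a checksum manifest for two copies of an archive and reports files that are missing or different in the second one. The folder names, file names, and helper functions are all hypothetical, and it reads each file fully into memory, so large video files would call for chunked reading instead.

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            manifest[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def compare(original: dict[str, str], copy: dict[str, str]) -> dict[str, list[str]]:
    """List files missing from the copy and files whose contents differ."""
    return {
        "missing": sorted(set(original) - set(copy)),
        "changed": sorted(
            name for name in original
            if name in copy and original[name] != copy[name]
        ),
    }


# Demo: two copies of a tiny archive, then damage the second one.
with tempfile.TemporaryDirectory() as tmp:
    primary, backup = Path(tmp, "primary"), Path(tmp, "backup")
    for folder in (primary, backup):
        folder.mkdir()
        (folder / "essay.txt").write_bytes(b"draft one")
        (folder / "photo.raw").write_bytes(b"\x00\x01\x02")

    (backup / "essay.txt").write_bytes(b"draft 0ne")  # simulated corruption
    (backup / "photo.raw").unlink()                   # simulated lost file

    report = compare(build_manifest(primary), build_manifest(backup))

print(report)
```

An empty report means the second copy still matches the first; anything listed is a cue to restore that file from the good copy before migrating to fresh media.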
Some people don’t care about their data – it’s all ephemeral and transitory. But if your data matters, start protecting it now. It’s not going to archive itself.