Humanity uploads around two billion photos every single day. A staggering 300 hours of video is uploaded to YouTube every minute. Most of us create digital files at work and for our personal projects. Our digital worlds are expanding all the time.
But does that digital data have a shelf-life? How do we ensure that our most precious digital files are preserved and accessible when we need them? What happens to them when we can no longer access them ourselves through illness or death?
In the physical world there isn’t a distinction between storage and preservation. You may have a box of old photographs or papers in your attic that belonged to your great-grandfather, and, provided they’ve been stored in good conditions, they’ll still be readable today. The same can’t be said of digital data.
“Preservation is really about long term access,” explains Dr. Micah Altman, Director of Research and Head/Scientist, Program on Information Science for the MIT Libraries, “It’s about communicating with the future at some point.”
The threat of loss
“One threat is that the media fails. The hard drive fails, the DVD fails, or the disc can’t be read,” says Dr. Altman, “Another threat is that you can see the bits, but you can no longer tell what they mean because there’s no software available that will render that document. You might have a Microsoft Word file you wrote 15 years ago and it looks fine, but when you open it up you might not be able to understand what it says because that format is not supported anymore.”
We assume that digital files will last forever, because they don’t degrade, but the media we store them on can and does.
“Different media have tremendously varying shelf lives,” explains Dr. Altman, “It’s possible to get an archival optical media and write it in a professional way, so that you would expect it to last 100 years, if it’s stored in the right place. But recording on other things like a random hard drive, a CD, or a flash drive and sticking it on the shelf, and you could come back in three years to find significant degradation of data.”
You can greatly extend the likely shelf life of your media by storing it properly. If you put a CD in the back of your cupboard where it’s relatively cool and dry, it will last a lot longer than if you leave it in the sun or somewhere humid. In the wrong environmental conditions it could be ruined within a year, or even a month.
Not all storage is created equal
“One of the challenges is that the media that’s usually sold to consumers, like hard drives in computers, are not really built for long term storage,” Dr. Altman says, “That’s not really how they’re designed. They’re designed for an operating lifetime of maybe three, four, or five years.”
Anyone who has experienced a hard drive failure knows that only too well. There’s also a lot of variation in hard drive quality: drives are manufactured in batches, and some brands and batches last longer than others.
It’s possible to buy archival-quality optical discs that are designed to last 100 years. Sony and Panasonic have been working on an Archival Disc standard. There are also options like M-Disc (Millennial Disc), which its maker claims will last for up to 1,000 years, but, as you might expect, these discs are significantly more expensive than ordinary ones. You’ll also need the right hardware to read them in the future, and there’s no telling how difficult it might be to get your hands on a DVD player a century from now.
Every cloud has a silver lining
“Making a backup on a separate disc, or flash drive, or a backup tape — all of these are just fine, but you’re going to need to do it every month and update it,” suggests Dr. Altman, “Even better would be to use a cloud storage backup system and to automatically take stuff and upload it to external servers. Typically, these are not on offline storage systems, they’re online storage, so they’re replicated in multiple spinning discs, and a good cloud storage service will check their copies regularly.”
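If you keep your own backups, you can imitate that regular checking yourself. The sketch below is one common approach and not necessarily what any particular cloud service does: it records a SHA-256 checksum for every file in a folder, then later reports any file that has gone missing or whose contents have changed. The function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to the folder) to its checksum."""
    root = Path(folder)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_problems(folder, manifest):
    """List files that are missing or changed since the manifest was built."""
    root = Path(folder)
    problems = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file():
            problems.append((name, "missing"))
        elif sha256_of(target) != expected:
            problems.append((name, "changed"))
    return problems
```

In practice you would build the manifest when you make the backup, keep it somewhere separate from the backup itself, and run the check every few months. Any file reported as "changed" that you didn't edit yourself is a sign the copy is degrading and should be restored from another one.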
If you go the cloud route, then you’re shifting the burden of figuring out how to preserve your data onto your cloud provider. Pick a good one, and they’ll maintain multiple copies of your files on technology that’s frequently updated. You can also set up your backup schedule and forget about it. Your files will be safe in the event of a break-in or a fire in your home that might wipe out locally stored data.
You’re going to have to pay a subscription fee, but if you want peace of mind about your data, then you may feel it’s worthwhile. A couple of services that Dr. Altman recommends are Backblaze and CrashPlan, because they implement best practices and they’re transparent about what they do.
Failings of cloud storage
Unfortunately, cloud storage isn’t a foolproof method of preservation, and you are placing a great deal of trust in the provider you select.
“They could go bankrupt,” says Dr. Altman, “Or they could accidentally send your expired credit card message to the wrong email and eventually cancel your account, or somebody could claim that you’ve got copyrighted material and they could get an injunction.”
You can mitigate those risks by doubling up on services and using two cloud storage solutions, because the chance of simultaneous problems at two independent companies is much lower.
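To see why doubling up helps, here is a back-of-the-envelope calculation. The 1% yearly failure chance is an invented figure for illustration, not a statistic for any real provider, and the sums only hold if the two services really do fail independently of each other.

```python
def chance_all_fail(yearly_failure_chances):
    """Chance that every provider fails in the same year, assuming independence."""
    result = 1.0
    for chance in yearly_failure_chances:
        result *= chance
    return result


# Hypothetical figure: each provider has a 1% chance of losing your data in a year.
one_provider = chance_all_fail([0.01])
two_providers = chance_all_fail([0.01, 0.01])

print(f"One provider:  {one_provider:.4%}")
print(f"Two providers: {two_providers:.4%}")
```

With those made-up numbers, the risk drops from one in a hundred to one in ten thousand. The independence assumption matters, though: if both accounts are billed to the same credit card, a single expired card could cancel both at once.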
There’s also the privacy question to consider. Many cloud storage providers have privacy policies that are difficult to understand.
“Privacy is another reason that you might want to manage things locally,” Dr. Altman explains, “If you were really worried you might choose to encrypt things before you store them anywhere. That’s probably one of the best ways of protecting that data from access and use by others, but from a preservation point of view — if you lose the encryption key then you’ve lost the data.”
Another problem with cloud accounts is access. If you have an accident, a serious illness, or shuffle off this mortal coil, how is your family going to gain access to your cloud accounts? How long before your account is wiped? Legally, it can prove tricky for family members to handle this at what is already a very difficult time.
“Having some plan in place for what happens to your accounts when you’re not in a position to access them is very important in this digital world we live in now,” Dr. Altman suggests.
You might want to share your password manager with someone and make provision for your digital files in your will.
Formats matter too
“Even if you put your stuff in Google and Amazon, and you give a copy of your password keeper to your children or your lawyer, some of those files are going to be difficult to read and understand later,” explains Dr. Altman, “You might think, for things that you really want to make sure you pass down, about making some standardized copies. Things that are viewable directly in a web browser without plug-ins, like a JPEG file. They may not be the best quality of an image, but they’ll probably be easy to render based on open standards.”
Two more formats that he recommends are TIFF and PDF/A. You can find a really detailed list of recommended format specifications for preserving data at the Library of Congress website.
“A rule of thumb would be something you display in a web browser without connecting to the Internet or installing an extra plug-in, is likely to be readable for a long time,” says Dr. Altman.
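As a starting point for that kind of tidy-up, the sketch below walks through a folder and lists every file that isn't in one of the formats named above (JPEG, TIFF, PDF), so you know which ones might deserve a standardized copy. It is only a rough guide: it goes by file extension alone, the list of extensions is an assumption you should adjust, and a ".pdf" file is not necessarily PDF/A.

```python
from pathlib import Path

# Extensions for the formats mentioned in the article. This set is an
# assumption for illustration; an extension is only a hint, and a ".pdf"
# file may or may not be archival PDF/A.
DURABLE_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff", ".pdf"}


def files_needing_a_standard_copy(folder):
    """Return relative paths of files whose extension is not on the list."""
    root = Path(folder)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() not in DURABLE_EXTENSIONS
    )
```

Pointing this at the folder of things you most want to pass down gives you a to-do list of files to export or convert, using whatever software created them.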
The final word
There’s no definitive answer on how to ensure your digital files last forever, but you can hedge your bets and come up with a multi-pronged strategy to give your data the greatest chance of survival.
“I would recommend spending some time researching cloud storage services. Choose two that are independent. Have a password manager or an account manager. Have a plan for who is going to handle your digital assets should something happen to you, whether it’s your son or daughter, or your lawyer, but have an explicit succession plan,” says Dr. Altman, “The few things that are very important, that you know you want to keep, put them in a digital format like a TIF or GIF or PDF-A, so they’re more durable than the average file. Make multiple copies and share them with people you trust.”