Digital Projects Toolkit: Digital Preservation

Digital Preservation

What is Digital Preservation?

Digital preservation is:

  • The "process of (actively) maintaining a digital object for as long as required, in a form which is authentic, and accessible to users."
    • Brown, Adrian. Practical Digital Preservation: A How-to Guide for Organizations of Any Size. Chicago: Neal-Schuman, 2013.
  • A "formal endeavor to ensure that digital information of continuing value remains accessible and usable."

Key Concepts

  • Digitization does NOT equal preservation (do not discard the original items after scanning)
    • Digitization is only the first step in a complex process
  • Digital files are just as fragile as physical items (if not more so)
    • Digital items have a shorter lifespan than their physical counterparts.
    • They can easily become corrupted by malware, viruses, or misuse. They can also easily become lost (especially if stored on things like flash drives and memory cards) or deleted.
  • Digital formats change constantly, and as technology changes, the hardware and software needed to open older files can become obsolete (like floppy disks).

Digital Preservation Process

  • Digitize
    • Digitize items that have not yet been converted into a digital format (skip this step for born-digital items, such as emails or Word documents).
    • Digitization does not equal preservation. It's only the first step, so do not discard the originals.
    • Do it right the first time: high quality, proper format, etc. (follow the Digitization Standards of the ND State Library).
  • Identify & Import
    • Determine all the locations where you have digital items (cameras, computers, flash drives, phones, CDs, external hard drives, email, cloud storage, social media, etc.).
    • Gather everything into one location (importing them if necessary).
  • Select
    • Determine what you want to save (select files that have long-term value).
    • Remove duplicates, near duplicates, poor quality items, etc.
    • If there are multiple versions of the same item, choose the one with the highest quality.
  • Organize
    • Give descriptive but brief file names, avoid spaces and most symbols (underscores or hyphens are okay), and include the date when applicable.
    • Add, tag, or embed as much information in your files as you can (identify the who, what, when, and where).
    • Create a consistent organizational structure that works for you (by year, by subject, etc.).
    • Create an inventory.
  • Storage & Backup
    • Make copies and store them in different places (don't put all of your digital "eggs" in one basket).
    • Follow the 3-2-1 rule (3 copies, stored on 2 different media, and 1 copy located off-site).
  • Manage & Preserve
    • Check your files annually to make sure they are still accessible. Software can do this for you via fixity checks (checksums). HashMyFiles is free and works well, but other software options are also available.
    • Plan to migrate your digital archive to new storage media every few years, and create new media copies to avoid data loss.
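
The "Select" and "Organize" steps above can be partly automated. The following is a minimal Python sketch, not a prescribed tool: the function names `safe_file_name` and `find_exact_duplicates` are hypothetical, and the naming pattern (date first, underscores between words) is only one way to follow the advice above. It finds only exact, byte-for-byte duplicates. Near duplicates and poor-quality items still need a human eye.

```python
import hashlib
import re
from collections import defaultdict
from pathlib import Path


def safe_file_name(description: str, date: str = "", extension: str = "") -> str:
    """Build a brief file name with no spaces or symbols other than _ and -."""
    # Replace each run of characters that is not an ASCII letter, digit,
    # or hyphen with one underscore, then trim underscores from the ends.
    stem = re.sub(r"[^A-Za-z0-9-]+", "_", description).strip("_")
    if date:
        stem = f"{date}_{stem}"
    return stem + extension.lower()


def find_exact_duplicates(folder: Path) -> list[list[Path]]:
    """Group files under `folder` whose contents are byte-for-byte identical."""
    by_hash = defaultdict(list)
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            # Reads the whole file into memory; fine for a sketch,
            # but very large files would be better read in chunks.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]


print(safe_file_name("Smith family reunion!", "2019-06-14", ".JPG"))
```

Review each group returned by `find_exact_duplicates` before deleting anything, and keep the copy with the most useful name and location.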

More information about the digital preservation process is available in the PDF below and on the "Resources" tab on this page.

Resources from the ND State Library

Resources from the Library of Congress

Resources from Other Sources

Digital Storage

Storage is an important aspect of digital preservation. The saying "don’t put all of your eggs in one basket" also applies to digital collections. It is important to make copies of your digital files and store them in different places.

Golden Rules/ Best Practices

There are a few rules when it comes to your master files, which are the high-quality items in stable formats (like TIFF and PDF/A).

  • Keep your master files separate from your access files.
    • So for photos, keep the TIFFs (master files) separate from the JPEGs (access copies).
  • Restrict who has access to your master files.
    • Remember, accidents can happen.
    • Avoid opening the master files unless absolutely necessary. Think of the master files as being in "cold storage."
  • If you need to do any editing or processing, always work with copies.
    • Again, accidents can happen. You don’t want to accidentally overwrite a master file.
    • Edit a copy of the master file, or edit the access copy.
  • Plan to regularly back up and monitor your files.
    • Checksum software can be used to monitor the integrity of the files.
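
To illustrate what checksum software does when it monitors file integrity, here is a minimal Python sketch. It is not the method used by any particular tool: the function names (`sha256_of`, `build_manifest`, `check_fixity`) are hypothetical, and SHA-256 is an assumed choice of checksum algorithm. The idea is to record a checksum for every file once, then recompute and compare later. Any difference means a file was changed, corrupted, or lost.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to `folder`) to its checksum."""
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def check_fixity(folder: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files in `folder` against a previously recorded manifest."""
    current = build_manifest(folder)
    return {
        # Files listed in the manifest that can no longer be found.
        "missing": sorted(name for name in manifest if name not in current),
        # Files whose contents no longer match the recorded checksum.
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        # Files added since the manifest was recorded.
        "new": sorted(name for name in current if name not in manifest),
    }
```

In practice the manifest would be saved (for example as a JSON or CSV file) and kept apart from the master files, so that a problem affecting the files does not also affect the record used to check them.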

3-2-1 Rule

Follow the 3-2-1 rule:

  • Have 3 separate copies of your digital files
  • 2 copies should be stored on different storage media
    • Don’t have both copies live locally on the same computer. If that computer crashes, you may lose both copies.
    • For example, one copy could live locally on a computer and another copy could live in cloud storage.
  • 1 of the copies should be located off-site
    • The off-site criterion helps protect your digital collection in case something happens to the main location (the "on-site" one). It safeguards against water damage, tornado, fire, etc.
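
The rule can be expressed as three simple checks. The Python sketch below is only an illustration: the `StoredCopy` record and the `check_3_2_1` function are hypothetical names, and you describe your own copies by hand rather than having them detected automatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCopy:
    label: str      # a name you choose for this copy
    medium: str     # e.g. "internal drive", "external drive", "cloud"
    off_site: bool  # True if kept away from the main location


def check_3_2_1(copies: list[StoredCopy]) -> list[str]:
    """Return a list of 3-2-1 problems; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} of 3 copies")
    media = {copy.medium for copy in copies}
    if len(media) < 2:
        problems.append(f"only {len(media)} of 2 storage media")
    if not any(copy.off_site for copy in copies):
        problems.append("no off-site copy")
    return problems


plan = [
    StoredCopy("office computer", "internal drive", off_site=False),
    StoredCopy("backup drive", "external drive", off_site=False),
    StoredCopy("cloud account", "cloud", off_site=True),
]
print(check_3_2_1(plan))  # an empty list: this plan satisfies the rule
```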

Why is it so important to have backups and an off-site backup? Here is an example. Back in the 1990s, when Toy Story 2 was in production, much of the film was accidentally deleted and almost lost forever. What saved the movie? An off-site backup. Learn more here: "How Toy Story 2 Almost Got Deleted: Stories From Pixar Animation: ENTV" (YouTube)

The Cloud

The cloud can be a beneficial addition to your storage plan.

  • Cloud storage can include things like Google Drive, OneDrive, Dropbox, etc. Many other options are also available.
  • Cloud storage is typically accessible from anywhere.
  • Some options are free (up to a certain number of gigabytes). Larger storage plans are relatively inexpensive, but the more storage you need, the more it will cost.
  • Cloud storage is generally a good backup, as it is typically a secure, off-site option.

However, cloud storage should NOT be your only backup method.

The cloud is useful but it is NOT the solution to everything.

  • Using cloud storage, your files will be stored on a server that is managed by a third-party company.
  • There are no guarantees that this company will last forever.
  • Make sure to read the company’s user agreements, so there are no hidden privacy issues.
  • The cloud is essentially someone else's computer, so it is still at risk from the same threats a normal computer would face.

Additional Resources

Website Preservation

What is the average lifespan of a webpage?

  • In 1997, a report published in Scientific American claimed 44 days.
  • In 2001, a study published in IEEE Computer claimed 75 days.
  • In 2003, a Washington Post article claimed 100 days.

The estimates vary, but they are all short. [Ironically, 2 of the 3 reports/studies listed above had broken links and were no longer available online. They had to be retrieved from a web archive.]

Now, if webpages last 100 days or less before they are changed, deleted, or moved, what about all of the hyperlinks on webpages? In 2013, a study by Harvard University found that about 50% of all the hyperlinks included in U.S. Supreme Court decisions were broken (meaning the links no longer worked, or no longer directed you to the location they were supposed to).

This phenomenon is known as link rot ("...hyperlinks tending over time to cease to point to their originally targeted file, web page, or server due to that resource being relocated to a new address or becoming permanently unavailable." - Wikipedia).

One of the best ways to prevent link rot or webpage loss is by using a web archive. Arguably, two of the most common tools to preserve webpages and websites are products of the Internet Archive:

  • Wayback Machine
    • A "service that allows people to visit archived versions of Web sites. Visitors to the Wayback Machine can type in a URL, select a date range, and then begin surfing on an archived version of the Web" (Wayback Machine General Information).
    • The Wayback Machine can also be used for free to manually capture a webpage as it appears today so it can be used in the future.
  • Archive-It
    • A subscription web archiving service that "enables you to capture, manage and search collections of digital content without any technical expertise or hosting facilities."
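
For readers who want to look up or capture pages with the Wayback Machine from a script, the Python sketch below builds the relevant web addresses. The URL patterns are assumptions based on how the Internet Archive's public addresses are commonly written and are not stated in this guide, so confirm them against the Internet Archive's own documentation. The function names are hypothetical. The code only builds addresses and does not contact any server.

```python
from urllib.parse import urlencode


def archived_url(url: str, timestamp: str = "") -> str:
    """Address of an archived copy of `url`.

    `timestamp` is assumed to use the YYYYMMDDhhmmss form (it may be
    shortened, e.g. "2015"); the capture closest to it is shown.
    With no timestamp, the most recent capture is requested.
    """
    if timestamp:
        return f"https://web.archive.org/web/{timestamp}/{url}"
    return f"https://web.archive.org/web/{url}"


def save_url(url: str) -> str:
    """Address that asks the Wayback Machine to capture `url` as it is today."""
    return f"https://web.archive.org/save/{url}"


def availability_query(url: str, timestamp: str = "") -> str:
    """Address of the availability lookup, which reports the closest capture."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


print(archived_url("https://example.com/page", "20150101"))
print(save_url("https://example.com/page"))
print(availability_query("https://example.com/page", "20150101"))
```

Opening the address from `save_url` in a browser is equivalent to using the manual capture feature described above.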

Other web archiving tools and services include:

  • Heritrix
  • WebCite
  • Webrecorder

Further Reading

Email & Social Media Preservation

For information on preserving your emails and your content on social media, visit the Personal Digital Archiving page.

Many of these resources and programs are funded under the provisions of the Library Services and Technology Act from the Institute of Museum and Library Services.