CIARD Virtual Fair - Pathway

Preservation of digital documents and data

Intended audience: 
Information professionals
Managers
Technical developers

Version 0.1, October 2009

Today’s information landscape is moving rapidly from a predominantly printed environment to an electronic one. This poses new challenges for effective information delivery, and we must also consider the measures needed to preserve the world’s knowledge heritage for future generations.

What do you need to know?

Until very recently, human knowledge was recorded predominantly as print on paper. Paper can be stored for centuries if it is kept under the right conditions of temperature and humidity, but it decays rapidly otherwise. Paper produced since the 1850s may also need to be deacidified because acid speeds decay, although acid-free paper has become more common recently. Several processes for mass deacidification have now been developed, and if they prove viable this paper too can be stored for centuries. Print on paper can, of course, be read without any extra reading devices. Nevertheless, digitization can be seen as a preferable option for preservation.

Little is known about the longevity of storage media for digital information. It is common experience that magnetic storage devices, such as diskettes, have a relatively short life. Optical media (CD-ROMs, DVDs) last longer, but the first reports of their deterioration have already appeared. What we know for certain is that specific hardware and software are needed to read the information on digital media. Technology in this area is changing rapidly and will certainly continue to do so. The world has already lost a lot of digital material from old websites. Nevertheless, and contradictory as it may seem, digitization is often rightly used as a method of preserving printed material: measures to preserve the printed original are often too late or too costly, and in such cases digitization is a good solution. The same preservation considerations apply to the resulting digital documents as to any other digital data.

What do you need to do?

Because digital technologies develop and change so rapidly, it is important to copy digital files to new formats before the current format (whether hardware or software) becomes outdated. Even if the carrier has been preserved and the information is still intact, there is no guarantee that you will be able to decode and understand the data once the software and media have been superseded.
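One way to see which formats in a collection may need attention is to count the files held in each format. The following Python sketch is an illustration only and is not part of the original guidance: the function name `inventory_formats` is hypothetical, and the file extension is assumed to be a usable (though rough) indicator of the format.

```python
from collections import Counter
from pathlib import Path


def inventory_formats(root):
    """Count the files under `root` by lower-cased file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # Assumption: the collection lives in the current directory.
    for extension, number in inventory_formats(".").most_common():
        print(extension, number)
```

An extension can be wrong or missing, so a count like this is only a starting point for deciding which files to examine more closely.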

To preserve your material for the long term you should: 

1)  Preferably use data formats that are independent of specific hardware and software. SGML and especially XML, its simplified derivative for the web, were developed for that purpose. If those formats are not practical, use the most widely used data formats for storage, as migration solutions are more likely to be developed for them.

2)  Monitor the quality of the stored materials over time:

-   measure media quality;

-   store in appropriate conditions;

-   store digital objects in multiple copies and locations.

3)  Refresh storage media.

-   If your digital media risk losing their integrity, you should refresh them, that is, copy them onto fresh media of the same type or of a more modern type;

-   Refreshment must also be undertaken if the medium containing the digital document becomes obsolete: in some cases the hardware used to read a certain format ceases to be supported, and the medium disappears from the marketplace and is no longer in frequent use (e.g. diskettes). In this case, refresh to a newer, well-supported medium, preferably a type that can be easily monitored and measured from the preservation standpoint.

4)  Reformat the digital objects themselves.

-   Digital objects will require migration to newer supported formats when the software to read them becomes obsolete. Migration often requires additional manual corrections to the document or data;

-   In some cases reformatting is impossible, for example for highly interactive information products that use routines tied to specific hardware and operating system environments. For such products (or if one is simply too late to reformat the information objects) the only hope is that new technologies will one day be able to emulate the old environment. Although research in this area has been under way for some years, emulation is unlikely to become available on a large scale, so do not rely on it.
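Steps 2 and 3 above can be partly automated. The following Python sketch shows one possible approach and is not part of the original guidance: it records a checksum for every file in a collection, re-checks those checksums later to detect damaged or missing files, and verifies a copy made onto new storage. The function names (`sha256_of`, `build_manifest`, `find_damage`, `refresh`) are hypothetical, and the choice of SHA-256 is an assumption; any reliable checksum would serve.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to `root`) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_damage(root, manifest):
    """List files that are missing or whose checksum has changed."""
    root = Path(root)
    damaged = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(name)
    return damaged


def refresh(source, destination):
    """Copy a collection to new storage and verify the copy.

    `destination` must not exist yet (shutil.copytree creates it).
    Returns the list of files that failed verification.
    """
    manifest = build_manifest(source)
    shutil.copytree(source, destination)
    return find_damage(destination, manifest)
```

In this sketch the manifest would be stored alongside each copy of the collection, and `find_damage` would be run against every copy at regular intervals. A file reported as damaged in one location can then be restored from another copy that still matches the manifest.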

The IMARK module “Digitization and Digital Libraries” treats digital preservation in the wider context of preserving scientific and cultural heritage, including the handling of delicate printed materials. See References below.

References

Detailed advice and information on digital preservation can be found in the following: 

·      Information Management Resource Kit (IMARK). See Module 4, Digitization and Digital Libraries, Lesson 4.7. In addition, Lesson 4.6 covers in detail issues related to delicate and heritage documents. (http://www.imarkgroup.org) This module is currently (July 2009) being updated in combination with the module “Management of Electronic Documents”. The new module, “Digital libraries and repositories”, is expected to be released at the end of 2009.

·      Moving Theory into Practice - Digital Imaging Tutorial (http://www.library.cornell.edu/preservation/tutorial/) Copyright Cornell University Library/Research Department, 2000-2003. This tutorial gives information on the use of digital imaging to convert and make accessible cultural heritage materials. It also introduces some concepts advocated by Cornell University Library, in particular the value of benchmarking requirements before undertaking a digital initiative. You will find here up-to-date technical information, formulas, and reality checks, designed to test your level of understanding.

·      Guidelines for the Preservation of Digital Heritage. Prepared by the National Library of Australia. Paris, UNESCO, 2003. 170 pp. (http://unesdoc.unesco.org/images/0013/001300/130071e.pdf)

·      Preservation management of digital materials: The Handbook - Digital Preservation Coalition, London (United Kingdom), 2003. (http://www.dpconline.org/graphics/handbook/)