Abstract
The Multivalent Document Model offers a practical, proven, no-compromises architecture for preserving digital documents of potentially any data format. We have implemented from scratch such complex and currently important formats as PDF and HTML, as well as older formats including scanned paper, UNIX manual pages, TeX DVI, and Apple II AppleWorks word processing. The architecture, stable since its definition in 1997, extends easily to additional document formats, defines a cross-format document tree data structure that fully captures semantics and layout, supports full expression of a format’s often idiosyncratic concepts and behavior, enables sharing of functionality across formats, thus reducing implementation effort, can introduce new functionality such as hyperlinks and annotation to older formats that cannot express them, and provides a single interface (API) across all formats. Multivalent contrasts sharply with emulation and conversion, and advances Lorie’s Universal Virtual Computer with high-level architecture and extensive implementation.
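As a rough illustration of the idea of a cross-format document tree behind a single interface, the following minimal Python sketch may help. It is not Multivalent's actual API: the names `Node`, `ADAPTERS`, `load`, and `annotate`, and the two toy parsers, are hypothetical and chosen only for this example. Each format adapter builds the same tree type, so a behavior such as annotation is written once and applies to every format, including ones that cannot express annotations natively.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List


@dataclass
class Node:
    """One node of a format-independent document tree (hypothetical shape)."""
    name: str                                            # structural role, e.g. "document", "para"
    text: str = ""                                       # leaf content; empty for internal nodes
    attrs: Dict[str, str] = field(default_factory=dict)  # format-specific or added metadata
    children: List["Node"] = field(default_factory=list)

    def walk(self) -> Iterator["Node"]:
        """Yield this node and all descendants in document order."""
        yield self
        for child in self.children:
            yield from child.walk()


def parse_plain_text(data: str) -> Node:
    """Toy adapter: each non-empty line becomes a 'para' leaf."""
    root = Node("document", attrs={"format": "text"})
    for line in data.splitlines():
        if line.strip():
            root.children.append(Node("para", text=line.strip()))
    return root


def parse_man_page(data: str) -> Node:
    """Toy adapter for roff-like input: a '.SH title' line opens a section."""
    root = Node("document", attrs={"format": "man"})
    section = root
    for line in data.splitlines():
        if line.startswith(".SH "):
            section = Node("section", attrs={"title": line[4:].strip()})
            root.children.append(section)
        elif line.strip():
            section.children.append(Node("para", text=line.strip()))
    return root


# Registry that gives callers one entry point for every supported format.
ADAPTERS: Dict[str, Callable[[str], Node]] = {
    "text": parse_plain_text,
    "man": parse_man_page,
}


def load(fmt: str, data: str) -> Node:
    """Build a document tree from raw data using the adapter registered for fmt."""
    return ADAPTERS[fmt](data)


def annotate(root: Node, phrase: str, note: str) -> int:
    """Format-independent behavior: attach a note to every leaf whose text contains phrase."""
    hits = 0
    for node in root.walk():
        if node.text and phrase in node.text:
            node.attrs["note"] = note
            hits += 1
    return hits
```

Under these assumptions, supporting a new format means registering one more adapter in `ADAPTERS`; `annotate` and any other tree-level behavior then work on it unchanged, which is the kind of sharing across formats the abstract describes.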
References
Adobe Systems, Inc. PDF as a Standard for Archiving, Adobe white paper, http://www.adobe.com/products/acrobat/pdfs/pdfarchiving.pdf
Birrell, A., McJones, P.: Virtual Paper Web site (1995–1997), http://www.research.compaq.com/SRC/virtualpaper/
Brown, A.: Preserving the Digital Heritage: Building a Digital Archive for UK Government Records. In: Brown, A. (ed.) Proceedings of Online Information (2003), http://www.nationalarchives.gov.uk/preservation/digitalarchive/
IBM. Digital Asset Preservation Tool Web site, http://www.alphaworks.ibm.com/tech/uvc?Open&ca=daw-flHts-120204
International Standards Organization. ISO/CD 19005-1, Document management - Electronic document file format for long-term preservation - Part 1: Use of PDF (PDF/A) (November 31, 2003), http://www.aiim.org/documents/standards/ISO_19005-1_E.doc
Janssen, W.C., Popat, K.: UpLib: a universal personal digital library system. In: Proceedings of the ACM Symposium on Document Engineering, pp. 234–242 (2003)
Lindholm, T., Yellin, F.: The Java Virtual Machine Specification, 2nd edn. Addison-Wesley Longman Publishing Co., Inc. (1999)
Lorie, R.: Long term preservation of digital information. In: Proceedings of the First ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 346–352 (2001)
Marshall, C.C., Golovchinsky, G.: Saving Private Hypertext: Requirements and pragmatic dimensions for preservation. In: Proceedings of ACM Hypertext, pp. 130–138 (2004)
Mellor, P., Wheatley, P., Sergeant, D.: Migration on Request: A Practical Technique for Digital Preservation. In: Agosti, M., Thanos, C. (eds.) ECDL 2002. LNCS, vol. 2458, p. 516. Springer, Heidelberg (2002)
Meehan, J., Taft, E., Chernicoff, S., Rose, C., Karr, R.: PDF Reference, 5th edn. (2004)
Messerschmitt, D.G.: Opportunities for Research Libraries in the NSF Cyberinfrastructure Program. In: ARL Bimonthly Report, p. 229 (2003), http://www.arl.org/newsltr/229/cyber.html
Multivalent Web site, http://multivalent.sourceforge.net
The National Archives. PRONOM Web site, http://www.nationalarchives.gov.uk/pronom/
Phelps, T.A.: Multivalent Documents: Anytime, Anywhere, Any Type, Every Way User-Improvable Digital Documents and Systems. Ph.D. Dissertation, University of California, Berkeley (1998)
Phelps, T.A., Wilensky, R.: The Multivalent Browser: a platform for new ideas. In: Proceedings of Document Engineering, pp. 58–67 (2001)
Rodig, P., Borghoff, U.M., Scheffczyk, J., Schmitz, L.: Preservation of digital publications: An OAIS extension and implementation. In: Proceedings of the ACM Symposium on Document Engineering, pp. 131–139 (2003)
San Diego Supercomputer Center. Persistent Archive Testbed (PAT), http://www.sdsc.edu/PAT/
Wheatley, P.: Survey and Assessment of Sources of Information on File Formats and Software Documentation (2003), http://www.jisc.ac.uk/uploaded_documents/FileFormatsreport.pdf
Wotsit's Format Web site, http://www.wotsit.org/
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Phelps, T.A., Watry, P.B. (2005). A No-Compromises Architecture for Digital Document Preservation. In: Rauber, A., Christodoulakis, S., Tjoa, A.M. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2005. Lecture Notes in Computer Science, vol 3652. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551362_24
DOI: https://doi.org/10.1007/11551362_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28767-4
Online ISBN: 978-3-540-31931-3
eBook Packages: Computer Science (R0)