DOI: 10.1145/3197026.3197037
research-article

Evaluation of Conformance Checkers for Long-Term Preservation of Multimedia Documents

Published: 23 May 2018

Abstract

We develop an evaluation framework for validating conformance checkers for long-term preservation. The framework assesses the correctness, usability, and usefulness of the tools for three media types: PDF/A (text), TIFF (image), and Matroska (audio/video). We then report the results of validating these conformance checkers with the proposed framework. In general, the framework is a high-level tool that can readily be employed in other preservation-related tasks.

Cited By

  • (2023) On PDF/A Conformance and Font Usage in PDF Documents Provided by Public Sector Organizations. International Journal of Standardization Research, 20:1 (1-19). DOI: 10.4018/IJSR.329605. Online publication date: 6-Sep-2023
  • (2021) Achieving Conformance to Document Standards. International Journal of Standardization Research, 19:1 (1-32). DOI: 10.4018/IJSR.288523. Online publication date: 1-Jan-2021
  • (2021) Improving data quality in large-scale repositories through conflict resolution. International Journal on Digital Libraries. DOI: 10.1007/s00799-021-00311-0. Online publication date: 21-Oct-2021
  • (2019) Digital curation at work. Proceedings of the 18th Joint Conference on Digital Libraries (39-48). DOI: 10.1109/JCDL.2019.00016. Online publication date: 2-Jun-2019

    Published In

    JCDL '18: Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries
    May 2018
    453 pages
    ISBN:9781450351782
    DOI:10.1145/3197026

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. conformance checking
    2. evaluation
    3. long-term preservation

    Qualifiers

    • Research-article

    Conference

    JCDL '18

    Acceptance Rates

    JCDL '18 Paper Acceptance Rate 26 of 71 submissions, 37%;
    Overall Acceptance Rate 415 of 1,482 submissions, 28%
