DOI: 10.1145/3322905.3322929
research-article

Validating 126 million MARC records

Published: 08 May 2019

Abstract

This paper describes the method and results of validating 14 library catalogues. The catalogue records are in the Machine-Readable Cataloging format (MARC21), the most popular metadata standard for describing books. The research investigates the structural features of the records and, as a result, finds and classifies commonly occurring issues. The most frequent issue type is the use of undocumented schema elements, followed by improper values in places where the value should be taken from a dictionary or must meet other strict requirements.
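
The two most frequent issue types named in the abstract can be illustrated with a small sketch. This is not the paper's tool: the `TOY_SCHEMA` dictionary, the `DataField` class, and the `validate` function are hypothetical names introduced here, and the schema is a deliberately partial fragment covering only a few elements of MARC21 field 245 (Title Statement). The sketch flags fields and subfields that the toy schema does not define, and indicator values that fall outside the allowed code list.

```python
from dataclasses import dataclass

# ASSUMPTION: a hypothetical, deliberately partial schema fragment.
# The real MARC21 field 245 defines more subfields than listed here.
TOY_SCHEMA = {
    "245": {
        "ind1": {"0", "1"},            # allowed codes for the first indicator
        "subfields": {"a", "b", "c"},  # subset of the defined subfield codes
    },
}


@dataclass
class DataField:
    tag: str
    ind1: str
    subfields: list  # list of (code, value) pairs


def validate(fields, schema=TOY_SCHEMA):
    """Return (location, issue type) pairs for one record."""
    issues = []
    for f in fields:
        spec = schema.get(f.tag)
        if spec is None:
            # The field itself is not in the schema.
            issues.append((f.tag, "undefined field"))
            continue
        if f.ind1 not in spec["ind1"]:
            # The value should come from a fixed code list but does not.
            issues.append((f.tag, f"invalid ind1 value: {f.ind1!r}"))
        for code, _value in f.subfields:
            if code not in spec["subfields"]:
                issues.append((f"{f.tag}${code}", "undefined subfield"))
    return issues


record = [
    DataField("245", "7", [("a", "Example title"), ("x", "stray value")]),
    DataField("999", " ", [("a", "local data")]),
]
issues = validate(record)
for location, problem in issues:
    print(location, "-", problem)
```

Counting the second element of each pair across all records would give a frequency table of issue types, which is the kind of classification the abstract reports.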



Published In

DATeCH2019: Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage
May 2019
163 pages
ISBN:9781450371940
DOI:10.1145/3322905
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. Big Data
  2. Data Science
  3. Java
  4. MARC21
  5. metadata quality measurement

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

DATeCH2019

Acceptance Rates

Overall Acceptance Rate 60 of 86 submissions, 70%


Cited By

  • (2023) A checklist to publish collections as data in GLAM institutions. Global Knowledge, Memory and Communication. DOI: 10.1108/GKMC-06-2023-0195. Online publication date: 9-Nov-2023
  • (2022) A Shape Expression approach for assessing the quality of Linked Open Data in libraries. Semantic Web 14:2 (159-179). DOI: 10.3233/SW-210441. Online publication date: 15-Dec-2022
  • (2021) Analysis of Clustering Algorithms to Clean and Normalize Early Modern European Book Titles. Proceedings of the 2021 4th International Conference on Software Engineering and Information Management (106-112). DOI: 10.1145/3451471.3451489. Online publication date: 16-Jan-2021
  • (2021) Patterns of Subject Metadata Change in MARC 21 Bibliographic Records for Video Recordings. Proceedings of the Association for Information Science and Technology 58:1 (543-547). DOI: 10.1002/pra2.494. Online publication date: 13-Oct-2021
