
Progress of Self-Archiving Within the DML Corpus, with a View Toward Community Dynamics

  • Conference paper
Intelligent Computer Mathematics (CICM 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9791)


Abstract

Self-archiving has developed into a key component in realizing Open Access within the DML framework, with the arXiv being by far the most widely used platform. Important features like full-text formula search are facilitated by the openly available sources. However, despite the obvious growth of the arXiv corpus, it is not clear what share of the published mathematical literature is already openly accessible in this way, and whether it might eventually converge to full coverage. We present the methodology of the matching procedure of the zbMATH corpus (comprising most of the published math literature since 1868) to the arXiv, and derive from the granular zbMATH data a detailed analysis of the progress of self-archiving within the different mathematical communities, taking into account subject specifics, publication delays, peer review policies, and author networks, among other things. On this basis we give some projections of future developments.


Notes

  1. The generic approach of [9] has the immense methodological drawback of testing just accessibility, therefore mixing self-archiving, academic, gold and predatory open access, and relying on Google Scholar and Microsoft Academic Search related estimates, with their inherent high imprecision due to possibly inflated data and largely unsolved questions of scope and quality.

  2. There are of course prominent exceptions like the famous [18], currently the arXiv:math document with the earliest publication year.

  3. The most extreme case so far seems to be [17] with a delay of no less than 21 years.

  4. The publication type plays a role as well: so far, only 331 math books are on the arXiv, a considerable part of which are derived from PhD theses.

References

  1. arXiv e-Print archive. http://arxiv.org/

  2. arXiv Mathematics article statistics (2015). http://arxiv.org/year/math/15

  3. arXiv bulk data. https://arxiv.org/help/bulk_data

  4. The Global Digital Mathematics Library Working Group. https://blog.wias-berlin.de/imu-icm-panel-wdml/2014/08/28/the-global-digital-mathematical-library-working-group-gdml-wg/

  5. dblp database. http://dblp.uni-trier.de/

  6. Elasticsearch open source software. https://www.elastic.co/products/elasticsearch

  7. Electronic Library of Mathematics. http://www.emis.de/elibm

  8. European Digital Mathematics Library. https://eudml.org/

  9. Giles, C.L., Khabsa, M.: The number of scholarly documents on the public web. PLoS ONE 9(5), e93949 (2014)


  10. Grobid Machine Learning Library. https://github.com/kermitt2/grobid

  11. Hyper Articles en Ligne. https://hal.archives-ouvertes.fr/

  12. Harnad, S.: The self-archiving initiative. Nature 410(6832), 1024–1025 (2001)


  13. Ingoldsby, T.: Physics journals and the arXiv: what is myth and what is reality? American Institute of Physics, Technical report (2009)


  14. Kohlhase, M., Mihaljević-Brandt, H., Sperber, W., Teschke, O.: Mathematical formula search. Eur. Math. Soc. Newsl. 89, 56–58 (2013)


  15. Mathematics Subject Classification (2010). http://msc2010.org/

  16. Müller, F., Teschke, O.: Will all mathematics be on the arXiv (soon)? Eur. Math. Soc. Newsl. 99, 55–57 (2016)


  17. Poirier, A.: Hubbard forests. Ergodic Theory Dyn. Syst. 33(1), 303–317 (2013). arXiv:math/9208204


  18. Grothendieck, A., Raynaud, M.: Séminaire de géométrie algébrique du Bois Marie 1960/1961 (SGA 1). Lect. Notes Math. 224, xxii+447 pp. (1971). arXiv:math/0206203

  19. Saharon Shelah author profile. https://zbmath.org/authors/shelah.saharon

  20. TEI XML format. http://www.tei-c.org/index.xml

  21. zbMATcH interface. https://zbmath.org/citationmatching/

  22. zbMATH database (1755–). https://zbmath.org/


Author information


Corresponding author

Correspondence to Olaf Teschke.


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Müller, F., Teschke, O. (2016). Progress of Self-Archiving Within the DML Corpus, with a View Toward Community Dynamics. In: Kohlhase, M., Johansson, M., Miller, B., de Moura, L., Tompa, F. (eds) Intelligent Computer Mathematics. CICM 2016. Lecture Notes in Computer Science, vol 9791. Springer, Cham. https://doi.org/10.1007/978-3-319-42547-4_5


  • DOI: https://doi.org/10.1007/978-3-319-42547-4_5


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42546-7

  • Online ISBN: 978-3-319-42547-4

  • eBook Packages: Computer Science, Computer Science (R0)
