DOI: 10.1145/1378889.1379005
poster

Harvesting needed to maintain scientific literature online

Published: 16 June 2008

Abstract

Millions of scientific articles are freely accessible on the web. While some of them are stored in institutional repositories, many are made available on personal pages, which are exposed to the net's transience. We found that nearly 11% of URLs of PDF documents containing references to life science publications were no longer accessible within 5 months of being harvested using a search engine's (SE) API. For most of them (8.4%), no SE cache backup could be found. Although we have yet to estimate the exact rate at which the scientific literature disappears and the duration of its disappearance, the results so far are a clear indicator that web harvesting is needed to preserve the online scientific literature.

Cited By

  • (2011) e-Research applications for tracking online socio-political capital in the Asia-Pacific region. Asian Journal of Communication 21:5 (450-466). DOI: 10.1080/01292986.2011.594897. Online publication date: Oct-2011
  • (2008) Integrating Biomedical Publications with Existing Metadata. Proceedings of the 2008 21st IEEE International Symposium on Computer-Based Medical Systems (653-655). DOI: 10.1109/CBMS.2008.127. Online publication date: 17-Jun-2008

Published In

JCDL '08: Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries
June 2008
490 pages
ISBN:9781595939982
DOI:10.1145/1378889

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. availability
  2. preservation
  3. scientific literature

Qualifiers

  • Poster

Conference

JCDL08: Joint Conference on Digital Libraries
June 16 - 20, 2008
Pittsburgh, PA, USA

Acceptance Rates

JCDL '08 Paper Acceptance Rate 33 of 117 submissions, 28%;
Overall Acceptance Rate 415 of 1,482 submissions, 28%
