The lifespan of “informetrics” on the Web: An eight year study (1998–2006)

Scientometrics

Abstract

The World Wide Web is growing at an enormous speed and has become an indispensable source for information and research. New pages are constantly added, but other processes are at work as well: pages are moved or removed, and their content changes. We report here the results of an eight-year-long project started in 1998, when multiple search engines were used to identify a set of pages containing the term informetrics. Data collection was repeated once a year over the eight years (with the exception of 2000 and 2001), both by querying search engines and by revisiting previously identified pages. The results show that the number of pages grew from 866 in 1998 to 28,914 in 2006, a 33-fold increase. Besides the obvious growth of the topic on the Web, we observed both decay (pages disappearing from the Web) and modification. Even though most of the pages from 1998 either disappeared or ceased to contain the term informetrics, 165 pages (19.1%) still existed in 2006 and contained the search term. We followed the “fate” of these 165 pages: we characterized the publishers, the contents and the changes that occurred over the whole period. In recent years e-print servers and publishers’ sites have become sources of a large number of pages related to informetrics. Longitudinal studies following the evolution of a topic on the Web are very important, since they provide insights about content and the underlying Web processes.
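The two headline figures in the abstract can be checked with simple arithmetic. The sketch below is only an illustration: the page counts are taken from the abstract, while the function names `growth_factor` and `survival_rate` are hypothetical helpers introduced here, not part of the authors' method.

```python
def growth_factor(initial: int, final: int) -> float:
    """Return how many times larger `final` is than `initial`."""
    return final / initial


def survival_rate(survivors: int, cohort: int) -> float:
    """Return the percentage of the original cohort still present."""
    return 100 * survivors / cohort


# Counts reported in the abstract.
pages_1998 = 866        # pages containing "informetrics" in 1998
pages_2006 = 28_914     # pages containing "informetrics" in 2006
surviving_2006 = 165    # 1998 pages still online and containing the term in 2006

print(f"growth: {growth_factor(pages_1998, pages_2006):.1f}-fold")
print(f"survival: {survival_rate(surviving_2006, pages_1998):.1f}%")
```

The unrounded growth factor is about 33.4, which the abstract reports as 33-fold, and the survival rate rounds to the 19.1% stated there.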



Author information

Corresponding author

Correspondence to Judit Bar-Ilan.

About this article

Cite this article

Bar-Ilan, J., Peritz, B.C. The lifespan of “informetrics” on the Web: An eight year study (1998–2006). Scientometrics 79, 7–25 (2009). https://doi.org/10.1007/s11192-009-0401-7

  • DOI: https://doi.org/10.1007/s11192-009-0401-7
