Informetric studies using databases: Opportunities and challenges

Scientometrics

Abstract

Since their arrival in the 1960s, electronic databases have been an invaluable tool for informetricians. Databases and their delivery mechanisms have provided both the raw data and the analytical tools for many informetric studies. In particular, the citation databases produced by the Institute for Scientific Information have been the key source of data for a wide range of citation-based research. However, the use of online databases also poses many problems and challenges. Most of these arise because databases are designed primarily for information retrieval, and informetric studies represent only a secondary use of the systems. The problems encountered by informetricians include errors and inconsistencies in the data; problems with the coverage, overlap and changeability of the databases; and limitations in the tools provided by database hosts such as DIALOG. For some informetric studies, the only viable solution is to download the data and perform correction and analysis offline.

Author information

Corresponding author

Correspondence to William W. Hood.

About this article

Cite this article

Hood, W.W., Wilson, C.S. Informetric studies using databases: Opportunities and challenges. Scientometrics 58, 587–608 (2003). https://doi.org/10.1023/B:SCIE.0000006882.47115.c6

  • DOI: https://doi.org/10.1023/B:SCIE.0000006882.47115.c6
