Science models for search: a study on combining scholarly information retrieval and scientometrics

Mutschke, Peter; Mayr, Philipp

doi:10.1007/s11192-014-1485-2

Published: 27 November 2014

Scientometrics, Volume 102, pages 2323–2345 (2015)

Peter Mutschke¹ &
Philipp Mayr¹

Abstract

Models of science address statistical properties and mechanisms of science. From the perspective of scholarly information retrieval (IR), science models may improve retrieval quality when operationalized as specific search strategies or used for ranking. From the science modeling perspective, in turn, scholarly IR can serve as a validation model for science models. The paper studies the applicability and usefulness of two particular science models for re-ranking search results: Bradfordizing and author centrality. It provides a preliminary evaluation study that demonstrates the benefits of science-model-driven ranking techniques, and also shows how much the quality of search results can differ when different conceptualizations of science are used for ranking.
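
To make the first of the two re-ranking models concrete, the following is a minimal sketch of Bradfordizing in its commonly described form: documents from the journals that contribute the most papers to the result set (the core journals) are moved to the top. It is an illustration under stated assumptions, not the system evaluated in the paper. The function name `bradfordize`, the field names `id` and `issn`, and the sample data are hypothetical.

```python
from collections import Counter


def bradfordize(results):
    """Re-rank a result list so that papers from the most productive journals come first.

    `results` is a list of dicts in the original ranking order. The keys
    'id' and 'issn' are assumptions made for this sketch.
    """
    # Count how many documents in the result set each journal contributes.
    journal_counts = Counter(doc["issn"] for doc in results if doc.get("issn"))

    # Sort by descending journal frequency. Python's sort is stable, so documents
    # from equally productive journals keep their original relative order.
    # Documents without an ISSN get frequency 0 and end up at the bottom.
    return sorted(results, key=lambda doc: -journal_counts.get(doc.get("issn"), 0))


# Hypothetical result set: journal B contributes 3 papers, journal A contributes 2.
results = [
    {"id": "d1", "issn": "A"},
    {"id": "d2", "issn": "B"},
    {"id": "d3", "issn": "B"},
    {"id": "d4", "issn": None},
    {"id": "d5", "issn": "B"},
    {"id": "d6", "issn": "A"},
]

print([doc["id"] for doc in bradfordize(results)])
# ['d2', 'd3', 'd5', 'd1', 'd6', 'd4']
```

Using the stable sort means the original (for example, text-based) ranking still decides the order within a journal, which is one plausible tie-breaking choice among several.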

Notes

http://www.gesis.org/en/events/events-archive/conferences/issiworkshop2013/.
http://www.gesis.org/en/events/events-archive/conferences/ecirworkshop2014/.
Instead of addressing just local features such as citation counts of papers or author productivity.
http://www.gesis.org/en/research/external-funding-projects/archive/irm/.
ISSNs are stable identifiers for journals.
In practice, co-authorships are precomputed at indexing time and retrieved by the system via particular facets added to the user’s query.
To reduce computational effort, pure single authors are not added to the graph.
http://sowiport.gesis.org/. sowiport also provides simple re-ranking models such as citation counts of papers and author productivity.
Three extreme cases of very low precision need some explanation. The low precision of SOLR for topic 166 is most likely due to the low selectivity of the search term ‘Deutschland’ (Germany) in SOLIS, which might negatively affect the precision of TF-IDF. A possible explanation for the low precision of BRAD for topic 83 is the lower coverage of media-related journals in SOLIS. Likewise, the low precision of AUTH for topic 88 can be explained by the rather fragmentary representation of historical science topics in SOLIS, which leads to less representative networks. However, much more detailed research is needed here.
This issue is addressed by the COST action KNOWeSCAPE (http://knowescape.org/).
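
The notes above on the co-authorship graph (co-authorships derived from the indexed documents, single authors left out) can be illustrated with a small sketch of author-centrality re-ranking. Two points are assumptions of this sketch and are not stated in the text above: betweenness is used as the centrality measure, and a document is scored by its most central author. The function names, the field name `authors`, and the sample data are hypothetical as well. Betweenness is computed with Brandes' algorithm for unweighted, undirected graphs.

```python
from collections import defaultdict, deque
from itertools import combinations


def coauthor_graph(docs):
    """Build an undirected co-authorship graph from the result set.

    Authors without any co-author produce no edges and therefore never
    enter the graph, mirroring the note about single authors.
    """
    graph = defaultdict(set)
    for doc in docs:
        for a, b in combinations(sorted(set(doc["authors"])), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted graph)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from source
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:  # accumulate dependencies in order of decreasing distance
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every unordered pair was visited from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in score.items()}


def rerank_by_author_centrality(docs):
    """Order documents by the highest betweenness among their authors (assumed scoring)."""
    centrality = betweenness(coauthor_graph(docs))

    def doc_score(doc):
        return max((centrality.get(a, 0.0) for a in doc["authors"]), default=0.0)

    # The sort is stable, so ties keep the original ranking order.
    return sorted(docs, key=doc_score, reverse=True)


# Hypothetical result set; author "D" has no co-authors and stays outside the graph.
docs = [
    {"id": "d1", "authors": ["E", "F"]},
    {"id": "d2", "authors": ["D"]},
    {"id": "d3", "authors": ["A", "B"]},
    {"id": "d4", "authors": ["B", "C"]},
    {"id": "d5", "authors": ["B", "E"]},
]

print([doc["id"] for doc in rerank_by_author_centrality(docs)])
```

In this toy network author B connects A, C, and E, so the documents B co-authored move to the top, while the single-authored document falls to the end because its author has no centrality score.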

Acknowledgments

We thank Philipp Schaer and Thomas Lüke, who were the main developers and our co-investigators in the IRM projects. This work was funded by DFG, Grant Nos. INST 658/6-1 and SU 647/5-2.

Author information

Authors and Affiliations

GESIS–Leibniz-Institute for the Social Sciences, Unter Sachsenhausen 6-8, 50667, Cologne, Germany
Peter Mutschke & Philipp Mayr

Corresponding author

Correspondence to Peter Mutschke.

About this article

Cite this article

Mutschke, P., Mayr, P. Science models for search: a study on combining scholarly information retrieval and scientometrics. Scientometrics 102, 2323–2345 (2015). https://doi.org/10.1007/s11192-014-1485-2

Received: 31 October 2014
Published: 27 November 2014
Issue Date: March 2015
DOI: https://doi.org/10.1007/s11192-014-1485-2
