DOI: 10.1145/2938503.2938523
short-paper

A Scientist's Impact over Time: The Predictive Power of Clustering with Peers

Published: 11 July 2016

Abstract

Identifying latent patterns in big scholarly data that concern the performance of researchers is a significant task, because such patterns can affect scientific careers, which depend on funding and promotion. This article investigates the temporal evolution of a scientist's impact. Instead of taking a detailed, microscopic view that examines the citation curve of each of a scientist's articles, the article develops a scalable, macroscopic methodology that uses the articles' citation profiles to build a more abstract, high-level profile that characterizes the scientist. This profile is used to group scientists into a set of 'performance' clusters. To this end, established techniques such as Principal Component Analysis and Self-Organizing Map clustering are employed, together with a set of proposed heuristics. The effectiveness of the proposed methodology is examined by comparing the resulting rankings with the outcomes of the peer-review procedures that resulted in the E. F. Codd and the Turing awards. The good match between the outcomes of the computerized and peer-review procedures provides solid evidence that the proposed techniques constitute a promising analysis method for big scholarly data.
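
The pipeline outlined in the abstract (per-scientist profile, Principal Component Analysis, Self-Organizing Map clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the profile features, the toy data, the function names, the power-iteration PCA, the one-dimensional two-unit map, and all parameter values are assumptions chosen only to keep the example small and dependency-free, and the proposed heuristics are not modeled.

```python
import math
import random

# Hypothetical scientist profiles (rows) over three made-up features,
# e.g. the share of a scientist's articles in each citation-profile category.
profiles = [
    [0.90, 0.10, 0.00], [0.80, 0.20, 0.00], [0.85, 0.10, 0.05],
    [0.10, 0.20, 0.70], [0.00, 0.30, 0.70], [0.05, 0.15, 0.80],
]

def center(rows):
    """Subtract each column's mean so PCA sees zero-mean data."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def first_component(rows, iters=100):
    """Leading principal axis of centered rows, found by power iteration."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Arbitrary start vector; assumed not orthogonal to the leading axis.
    v = [float(j + 1) for j in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def nearest_unit(x, weights):
    """Index of the map unit whose weight is closest to x (best matching unit)."""
    return min(range(len(weights)), key=lambda k: abs(weights[k] - x))

def train_som_1d(values, units=2, epochs=50, lr=0.5, seed=0):
    """Train a one-dimensional SOM on scalar inputs and return the unit weights."""
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    # Spread the initial weights evenly over the data range.
    weights = [lo + (hi - lo) * (k + 0.5) / units for k in range(units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        rate = lr * decay                      # learning rate shrinks over time
        sigma = max(0.1, (units / 2) * decay)  # neighborhood width shrinks too
        order = list(values)
        rng.shuffle(order)
        for x in order:
            bmu = nearest_unit(x, weights)
            for k in range(units):
                # Gaussian neighborhood: units near the BMU move more.
                pull = math.exp(-((k - bmu) ** 2) / (2 * sigma ** 2))
                weights[k] += rate * pull * (x - weights[k])
    return weights

centered = center(profiles)
axis = first_component(centered)
# Project every profile onto the leading principal axis.
scores = [sum(a * b for a, b in zip(row, axis)) for row in centered]
weights = train_som_1d(scores)
labels = [nearest_unit(s, weights) for s in scores]
print(labels)
```

In this toy data the first three and the last three profiles are well separated, so they should land in different map units. Real profiles would likely need more principal components and a larger, two-dimensional map.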

Cited By

  • (2020) A Data-Driven Unified Framework for Predicting Citation Dynamics. IEEE Transactions on Big Data 6:4 (727-740). DOI: 10.1109/TBDATA.2018.2884505. Online publication date: 1-Dec-2020
  • (2019) Prediction methods and applications in the science of science: A survey. Computer Science Review 34 (100197). DOI: 10.1016/j.cosrev.2019.100197. Online publication date: Nov-2019

    Published In

    IDEAS '16: Proceedings of the 20th International Database Engineering & Applications Symposium
    July 2016
    420 pages
    ISBN:9781450341189
    DOI:10.1145/2938503
    In-Cooperation

    • Keio University

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. career path
    2. clustering
    3. h-index
    4. perfectionism index
    5. principal component analysis
    6. self-organizing map

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Conference

    IDEAS '16

    Acceptance Rates

    Overall Acceptance Rate 74 of 210 submissions, 35%
