ABSTRACT
Classifying the citation trajectories of scientific publications is crucial. However, these trajectories diffuse anomalously owing to non-linear, non-stationary, and long-range correlations. Previous studies rely on hard thresholds, arbitrary parameters, and subjective rules to classify trajectories by their rise and fall patterns, which leads to substantial variance and, in turn, ambiguous classification. This paper proposes CiteDEK, a hybrid classification framework that combines empirical mode decomposition (EMD), k-nearest neighbours (kNN), and dynamic time warping (DTW). It predicts the class of 5,039 trajectories, each 30 years long, using only the raw time series. The model achieves a classification accuracy of ≈ 76% and a Cohen’s kappa of 0.63, which is significant.
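As a rough illustration of two of the named ingredients, the sketch below classifies a toy trajectory with a kNN vote under a DTW distance, and computes Cohen's kappa from predicted and true labels. It is a minimal sketch under stated assumptions, not the CiteDEK implementation: the abstract does not say how EMD, kNN, and DTW are combined, so the EMD step is omitted entirely, and the function names, class labels, and toy trajectories are hypothetical.

```python
from collections import Counter


def dtw_distance(a, b):
    """Dynamic time warping distance with an absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def knn_predict(train, query, k=1):
    """Majority vote among the k training series closest to query under DTW."""
    ranked = sorted(train, key=lambda item: dtw_distance(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    total = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / total
    true_freq = Counter(y_true)
    pred_freq = Counter(y_pred)
    expected = sum(
        (true_freq[label] / total) * (pred_freq[label] / total)
        for label in set(y_true) | set(y_pred)
    )
    if expected == 1.0:
        return 1.0  # both labelings use a single identical class
    return (observed - expected) / (1.0 - expected)


# Hypothetical yearly citation counts; the class names are illustrative only.
TRAIN = [
    ([0, 5, 9, 6, 3, 1, 0, 0], "early_peak"),
    ([1, 6, 8, 5, 2, 1, 0, 0], "early_peak"),
    ([0, 0, 1, 1, 2, 5, 8, 9], "late_rise"),
    ([0, 0, 0, 1, 3, 6, 9, 8], "late_rise"),
]

# A peak that arrives one year later than in the first training series.
QUERY = [0, 0, 4, 9, 7, 3, 1, 0]
predicted = knn_predict(TRAIN, QUERY, k=1)
```

DTW is used here because it tolerates shifts along the time axis, so two trajectories with the same shape but a delayed peak can still be close. Kappa is reported alongside accuracy because it discounts the agreement expected by chance when classes are imbalanced.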
Index Terms
- CiteDEK: A hybrid EMD-KNN-DTW model for classification of paper citation trajectories