A data science-based framework to categorize academic journals

Halim, Zahid; Khan, Shafaq

doi:10.1007/s11192-019-03035-w

Published: 18 February 2019

Scientometrics, Volume 119, pages 393–423 (2019)

Zahid Halim¹ &
Shafaq Khan²


Abstract

Academic journals play a significant role in disseminating new research insights and knowledge among scientists. The number of such journals has increased significantly in recent years. Scientists prefer to publish their scholarly work at reputed venues, and speed of publication is also an important factor that many consider when selecting a venue. Key indicators of a journal’s quality include the impact factor, Source Normalized Impact per Paper (SNIP), and the Hirsch index (h-index). A journal’s ranking indicates its impact and quality relative to other venues in a specific discipline. Various measures can be used for ranking, such as field-specific statistics, intra-discipline ranking, or a combination of both. Earlier, journals were ranked manually, through institutional lists created by academic leaders. Politicization, biases, and personal interests were the key issues with such categorization. Later, the process evolved into database systems based on impact factor, SNIP, h-index, or a combination of these. All of this created a demand for an external source for categorizing academic journals. This work presents a data science-based framework that evaluates journals on their key bibliometric indicators and categorizes them automatically. The current proposal is restricted to journals published in the computer science domain. The journal features considered in the proposed framework are: publisher, impact factor, website, CiteScore, SJR (SCImago Journal & Country Rank), SNIP, h-index, country, age, cited half-life, immediacy factor/index, Eigenfactor score, article influence score, open access, percentile, citations, acceptance rate, peer review, and the number of articles published yearly. A dataset covering these 19 features is collected for 660 journals.
The dataset is preprocessed to fill in missing values and perform scaling. Three feature selection techniques, namely Mutual Information (MI), minimum Redundancy Maximum Relevance (mRMR), and Statistical Dependency (SD), are used to rank the aforementioned features. The dataset is then vertically divided into three sets: all features, the top nine features, and the bottom ten features. Next, two clustering techniques, k-means and k-medoids, are employed to find the optimum number of coherent groups in the dataset. Based on a rigorous evaluation, four groups of journals are identified. Two classifiers, k-Nearest Neighbor (k-NN) and an Artificial Neural Network (ANN), are then trained to predict the category of an unknown journal; the ANN achieves an average accuracy of 82.85%. A descriptive analysis of the resulting clusters is also presented to gain insight into the four journal categories. The proposed framework provides an opportunity to categorize academic journals independently, using data science methods and multiple significant bibliometric indicators.
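The preprocessing, clustering, and classification steps described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the six toy journals, the two features (a hypothetical impact factor and h-index), mean imputation, min-max scaling, the farthest-first initialization, and the choice of two clusters are all assumptions made for illustration, whereas the paper works with 19 features and reports four groups. The function names (`impute_and_scale`, `kmeans`, `knn_predict`) are likewise hypothetical, and the feature-ranking, k-medoids, and ANN steps are omitted.

```python
import math

def impute_and_scale(rows):
    """Fill missing values (None) with the column mean, then min-max scale to [0, 1]."""
    out = [list(r) for r in rows]
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        mean = sum(present) / len(present)
        col = [mean if r[j] is None else r[j] for r in rows]
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # guard against constant columns
        for i, v in enumerate(col):
            out[i][j] = (v - lo) / span
    return out

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first_centers(points, k):
    """Deterministic initialization (an assumption): start at the first point,
    then repeatedly add the point farthest from the centers chosen so far."""
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers

def kmeans(points, k, iters=100):
    """Lloyd's k-means: alternate assignment and centroid update until stable."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Toy data (made up): [impact factor, h-index]; None marks a missing value.
journals = [
    [1.0, 10.0], [1.2, None], [0.8, 12.0],
    [5.0, 90.0], [5.5, 100.0], [4.8, 95.0],
]
scaled = impute_and_scale(journals)
labels, centers = kmeans(scaled, k=2)

# Cluster labels act as categories; k-NN then assigns a category to an unseen,
# already-scaled journal.
predicted = knn_predict(scaled, labels, [0.05, 0.05])
```

In this sketch the cluster labels produced by k-means serve as the target classes for the k-NN classifier, mirroring the order of steps in the abstract. A query must be scaled with the same minimum and maximum values as the training data before it is classified.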

Author information

Authors and Affiliations

The Machine Intelligence Research Group (MInG), Faculty of Computer Science and Engineering, Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi, 23460, Pakistan
Zahid Halim
School of Systems and Technology, Department of Computer Science, University of Management and Technology, Lahore, 54000, Pakistan
Shafaq Khan

Corresponding author

Correspondence to Zahid Halim.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (TXT 30 kb)

About this article

Cite this article

Halim, Z., Khan, S. A data science-based framework to categorize academic journals. Scientometrics 119, 393–423 (2019). https://doi.org/10.1007/s11192-019-03035-w

Received: 09 September 2018
Published: 18 February 2019
Issue Date: 15 April 2019
DOI: https://doi.org/10.1007/s11192-019-03035-w
