DOI: 10.1145/3575882.3575940
research-article

A scientific expertise classification model based on experts’ self-claims using the semantic and the TF-IDF approach

Published: 27 February 2023

ABSTRACT

Understanding the structure of a scientific domain and extracting specific information from it is difficult and requires considerable human effort. In previous studies, most of the datasets used to identify the scientific expertise of academics were obtained from the metadata and contents of the papers they wrote. Machine learning tools should therefore be used to represent accurately how knowledge has been arranged and presented to date. In this research, we compare semantic analysis approaches (Latent Dirichlet Allocation, LDA, and knowledge graphs, KG) with an approach based on non-explainable variables (TF-IDF) for identifying categories of scientific expertise. The dataset consists of scientific expertise self-claims written organically by academics, a data source that has not been widely studied. The TF-IDF approach yields higher classification accuracy because it considers only the level of word importance (word relevance). However, it does not give meaning to the independent variables. Its advantage is also supported by the dataset, in which the text is limited to a single part of speech. The semantic analysis approaches, in contrast, provide meaning and relations that form topics or cluster graphs, albeit with lower accuracy.
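To illustrate what "word importance" means for TF-IDF, the following is a minimal sketch, not the authors' pipeline. It assumes a simple variant of the weighting (relative term frequency multiplied by the natural log of inverse document frequency), whitespace tokenization, and three hypothetical self-claim strings invented for this example. The function name `tf_idf` is likewise an assumption.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: weight = (count / doc length) * ln(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical expertise self-claims, for illustration only.
claims = [
    "machine learning text mining",
    "marine biology coral reef ecology",
    "text mining information retrieval",
]
weights = tf_idf(claims)

# Print the terms of the first claim, most distinctive first.
for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

Terms that occur in only one claim ("machine", "learning") receive a higher weight than terms shared across claims ("text", "mining"). The weights say how distinctive a word is but nothing about what it means, which matches the trade-off described in the abstract.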

  • Published in

    ACM Other conferences
    IC3INA '22: Proceedings of the 2022 International Conference on Computer, Control, Informatics and Its Applications
    November 2022
    415 pages
    ISBN: 9781450397902
    DOI: 10.1145/3575882

    Copyright © 2022 ACM

    © 2022 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited