ABSTRACT
Understanding the structure of a scientific domain and extracting specific information from it is difficult and requires substantial manual effort. In previous studies, most of the datasets used to identify the scientific expertise of academics were derived from the metadata and contents of the papers they wrote. Machine learning tools should therefore be used to represent accurately how knowledge has been organized and presented to date. In this research, we compare semantic analysis approaches (Latent Dirichlet Allocation, LDA, and knowledge graphs, KG) with non-explainable variables (TF-IDF) for identifying categories of scientific expertise. The dataset consists of scientific expertise self-claims written organically by the academics themselves, a data source that has not been widely studied. The TF-IDF approach yields higher classification accuracy because it considers only the level of word importance (word relevance); this result is also supported by the dataset, whose entries consist of a single part of speech. However, TF-IDF does not give meaning to the independent variables. The semantic analysis approaches, in contrast, provide meaning and relations that form topics or cluster graphs, although with lower accuracy.
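To make the notion of word importance concrete, the following is a minimal sketch of TF-IDF weighting in plain Python. It assumes one common variant (relative term frequency multiplied by the unsmoothed log inverse document frequency), which is not necessarily the weighting used in the study, and the three expertise claims are hypothetical examples rather than items from the dataset.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length,
    idf = log(N / document frequency), with no smoothing.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical expertise self-claims (illustrative only).
claims = [
    "machine learning text mining",
    "marine biology coral ecology",
    "text mining information retrieval",
]
weights = tf_idf(claims)
```

In this sketch, a term that appears in only one claim ("machine") receives a higher weight than a term shared across claims ("text"). The weights rank terms by relevance but carry no semantic relation between them, which is the trade-off the abstract describes.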