A scientific expertise classification model based on experts’ self-claims using the semantic and the TF-IDF approach

Authors:
Andre Sihombing, National Research and Innovation Agency (BRIN), Indonesia (ORCID 0000-0002-5607-8160)
Ariani Indrawati, National Research and Innovation Agency (BRIN), Indonesia (ORCID 0000-0002-1387-9419)
Aris Yaman, National Research and Innovation Agency (BRIN), Indonesia (ORCID 0000-0002-0305-9054)
Cahyo Trianggoro, National Research and Innovation Agency (BRIN), Indonesia (ORCID 0000-0001-6950-7780)
Lindung Parningotan Manik, National Research and Innovation Agency (BRIN), Indonesia (ORCID 0000-0001-8637-2881)
Zaenal Akbar, National Research and Innovation Agency (BRIN), Indonesia (ORCID 0000-0003-3563-0021)

IC3INA '22: Proceedings of the 2022 International Conference on Computer, Control, Informatics and Its Applications, November 2022, Pages 301–305. https://doi.org/10.1145/3575882.3575940

Published: 27 February 2023

ABSTRACT

Understanding the structure of a scientific domain and extracting specific information from it is difficult and requires considerable human effort. In previous studies, most of the data sets used to identify the scientific expertise of academics were obtained from the metadata and the contents of the papers those academics wrote. Machine learning tools should therefore be used to represent accurately how knowledge has been arranged and presented up to this point. In this research, we compare semantic analysis approaches (Latent Dirichlet Allocation, LDA, and knowledge graphs, KG) with non-explainable variables (TF-IDF) for identifying categories of scientific expertise. The dataset consists of scientific expertise self-claims written organically by academics, a source that has not been widely examined in previous studies. The TF-IDF approach yields better classification accuracy because it considers only the level of word importance (word relevance), a result that is also supported by the dataset, whose entries consist of a single part of speech. However, this approach does not give meaning to the independent variables. The semantic analysis approach, in contrast, provides meaning and relations that form topics or cluster graphs, albeit with lower accuracy.
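To make the notion of "word importance" concrete, the following is a minimal sketch of TF-IDF weighting in plain Python. It is not the authors' implementation. The self-claim strings are hypothetical, the helper name `tf_idf` is made up for illustration, and the weighting uses one common variant (relative term frequency multiplied by log(N / df)); the paper may use a different variant or a library implementation.

```python
import math
from collections import Counter

# Hypothetical self-claimed expertise strings (illustrative only,
# not taken from the paper's dataset).
claims = [
    "marine biology coral reef ecology",
    "machine learning text classification",
    "coral reef conservation policy",
]

def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

weights = tf_idf(claims)
for claim, w in zip(claims, weights):
    # Show the two highest-weighted terms of each claim.
    top = sorted(w.items(), key=lambda kv: kv[1], reverse=True)[:2]
    print(claim, "->", top)
```

Terms that occur in few claims receive higher weights than terms shared across many claims, and a term present in every claim receives a weight of zero. The resulting vectors carry no semantic interpretation on their own, which is the trade-off the abstract describes.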

Published in

IC3INA '22: Proceedings of the 2022 International Conference on Computer, Control, Informatics and Its Applications
November 2022
415 pages
ISBN: 9781450397902
DOI: 10.1145/3575882

Copyright © 2022 ACM
© 2022 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Knowledge Graphs
Latent Dirichlet Allocation
Scientific Expertise Classification
Semantic Analysis
TF-IDF
Qualifiers
- research-article
- Research
- Refereed limited
