Translations Diversification for Expert Finding: A Novel Clustering-based Approach

Authors:

Ahmad Ali Abin

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 13, Issue 3

Article No.: 32, Pages 1 - 20

https://doi.org/10.1145/3320489

Published: 29 May 2019

Abstract

Expert finding is the task of retrieving and ranking people who are knowledgeable about the subject of a user’s query. It is a well-studied problem that has attracted the attention of many researchers. The most important challenge in expert finding is determining the similarity between query words and documents authored by candidate experts. One of the most important challenges in the Information Retrieval (IR) community is the vocabulary gap between queries and documents. In this study, a translation model based on clustering words in two spaces, a query space and a co-occurrence space, is proposed to overcome this problem. First, words that are semantically close are clustered in the query space, and then each cluster in this space is clustered again in the co-occurrence space. The representatives of each cluster in the co-occurrence space are taken as a diverse subset of the parent cluster. In this way, the query translations are expected to be diversified in the query space. Next, a probabilistic model, based on the degree to which a word belongs to a cluster and the similarity of that cluster to the query in the query space, is used to address the vocabulary gap. Finally, the translations corresponding to each query are used in conjunction with a combination model for expert finding. Experiments on a Stack Overflow dataset show the effectiveness of the proposed method for expert finding.
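
The abstract names the steps of the method but not the concrete algorithms, so the following is only a minimal sketch of the two-stage idea under stated assumptions, not the author's implementation. It assumes a toy cosine k-means as the clustering step, hand-made 2-D vectors for both spaces, one representative per co-occurrence sub-cluster, and a translation score equal to (word-to-cluster membership) × (cluster-to-query similarity), normalized over all candidates. All function names, parameters (`k`, `sub_k`), and data values are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cluster(words, vecs, k, iters=10):
    """Toy cosine k-means with farthest-first seeding (an assumed stand-in)."""
    k = min(k, len(words))
    seeds = [words[0]]
    while len(seeds) < k:
        rest = [w for w in words if w not in seeds]
        # Next seed: the word least similar to every seed chosen so far.
        seeds.append(min(rest, key=lambda w: max(cosine(vecs[w], vecs[s]) for s in seeds)))
    centers = [vecs[s] for s in seeds]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in centers]
        for w in words:
            best = max(range(k), key=lambda i: cosine(vecs[w], centers[i]))
            groups[best].append(w)
        centers = [centroid([vecs[w] for w in g]) if g else c
                   for g, c in zip(groups, centers)]
    return [g for g in groups if g]

def diverse_representatives(members, co_vecs, sub_k):
    """Re-cluster one query-space cluster in co-occurrence space and
    keep the word closest to the centroid of each sub-cluster."""
    reps = []
    for sub in cluster(members, co_vecs, sub_k):
        center = centroid([co_vecs[w] for w in sub])
        reps.append(max(sub, key=lambda w: cosine(co_vecs[w], center)))
    return reps

def translations(query_vec, words, q_vecs, co_vecs, k, sub_k):
    """Score each representative by membership * cluster-query similarity,
    then normalize so the scores sum to 1 (assumed scoring rule)."""
    scored = []
    for members in cluster(words, q_vecs, k):
        center = centroid([q_vecs[w] for w in members])
        sim_cq = max(cosine(center, query_vec), 0.0)
        for w in diverse_representatives(members, co_vecs, sub_k):
            membership = max(cosine(q_vecs[w], center), 0.0)
            scored.append((w, membership * sim_cq))
    total = sum(s for _, s in scored) or 1.0
    return sorted(((w, s / total) for w, s in scored), key=lambda pair: -pair[1])

# Hypothetical toy vectors: query space groups words by topic,
# co-occurrence space separates words that appear in different contexts.
Q_VECS = {"java": [1.0, 0.1], "jvm": [0.9, 0.2], "spring": [0.8, 0.1],
          "python": [0.1, 1.0], "django": [0.2, 0.9], "pandas": [0.1, 0.8]}
CO_VECS = {"java": [1.0, 0.0], "jvm": [0.9, 0.1], "spring": [0.1, 1.0],
           "python": [1.0, 0.0], "django": [0.2, 1.0], "pandas": [0.9, 0.2]}
WORDS = list(Q_VECS)

# Query vector close to the "java" cluster in the query space.
RESULT = translations([1.0, 0.0], WORDS, Q_VECS, CO_VECS, k=2, sub_k=2)
for word, prob in RESULT:
    print(f"{word}: {prob:.3f}")
```

In this toy setup, near-duplicate words such as "java" and "jvm" fall into the same co-occurrence sub-cluster, so only one of them survives as a translation candidate, which is the diversification effect the abstract describes. The final combination model for ranking experts is not sketched here because the abstract gives no detail about it.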

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Information retrieval diversity

Published In

ACM Transactions on Knowledge Discovery from Data Volume 13, Issue 3

June 2019

261 pages

ISSN: 1556-4681

EISSN: 1556-472X

DOI: 10.1145/3331063

Editors: Charu Aggarwal (IBM T. J. Watson Research, USA), Xindong Wu (University of Louisiana at Lafayette, USA)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 May 2019

Accepted: 01 March 2019

Revised: 01 November 2018

Received: 01 January 2018

Published in TKDD Volume 13, Issue 3

Qualifiers

Research-article
Research
Refereed
