Abstract:
Expert finding is an important technique for ranking user authority in community question answering (CQA) websites. ZhihuRank is a topic-sensitive expert finding algorithm based on both LDA and PageRank. As the number of participants and documents in CQA websites increases rapidly, how to parallelize expert finding algorithms for big data analysis has received significant attention. In this paper, we find that the Spark framework, a memory-based parallel computing model that supports complicated iterative algorithms, is more suitable than the MapReduce framework for parallelizing expert finding algorithms. As an example, we parallelize ZhihuRank using MLlib's LDA and GraphX's PageRank in Spark. Experiments were conducted on large-scale real data from Zhihu (the most popular CQA website in China), and the results confirm the effectiveness and scalability of the proposed approach.
Date of Conference: 05-08 December 2016
Date Added to IEEE Xplore: 06 February 2017