Elsevier

Knowledge-Based Systems

Volume 28, April 2012, Pages 1-12
Interest-based real-time content recommendation in online social communities

https://doi.org/10.1016/j.knosys.2011.09.019

Abstract

The fast-growing popularity of online social communities and the massive amounts of user-generated content pose a critical need for, and new challenges to, content recommender systems. Such a system needs to identify the unique and diverse interests of individual users and deliver content to interested users on a real-time basis. In this work, we propose Farseer, a system for personalized real-time content recommendation and delivery in online social communities. The proposed solution consists of a set of integrated offline and online algorithms that identify and utilize unique item-based interest clusters and cluster-based item rating in order to recommend newly generated content items to individual users in real time. Our main contributions are (1) a detailed analysis of content popularity distribution and user interest distribution in online social communities; (2) a novel interest-based clustering and cluster-based content recommendation solution; and (3) a complete implementation and deployment in an online social community. Evaluation results gathered from real-world user studies demonstrate that the proposed system outperforms three widely used collaborative filtering algorithms (kNN, PLSA, SVD) employed in existing recommender systems. It can effectively identify personal interests and improve the quality and efficiency of real-time personalized content recommendation in online social communities.

Introduction

Online social communities, also referred to as online social networks or member communities, have enjoyed explosive growth in recent years and are now among the most visited websites on the Internet. Existing online social communities, such as Facebook, Livejournal, and Twitter, provide rich functionalities for online users to create, explore, and share content of interest within various social forums and groups. Every day, a large number of user-generated content items are posted online on a real-time basis. These data items are highly dynamic and diverse, directly reflecting the unique interests of individual users. The fast-growing popularity of online social communities and the massive volume of user-generated content pose a critical need for, and a great challenge in, identifying the unique interests of individual users and recommending content in real time.

A number of recommendation algorithms and systems have been developed, targeting diverse recommendation scenarios ranging from movies and products to dynamic news articles. Collaborative filtering (CF) is a class of information filtering techniques that identify and leverage the common knowledge or patterns shared among a group of users or agents [12]. Recent studies have demonstrated the potential of using CF to address the content recommendation challenge in online social communities [21]. Over the years, exemplary systems, such as Google News [6] and Amazon e-commerce [7], have gradually adopted CF techniques into their content recommender systems to help identify content of interest to users. However, most existing content recommender systems suffer from a common flaw: they tend to yield biased decisions favoring highly popular content, and have difficulty judging content with low popularity. This is mainly due to the challenge of accurately characterizing the highly diverse yet unique interests of online users. Using existing CF techniques, users sharing highly popular content (i.e., common interests) tend to be classified as similar to each other. The unique interests of each individual are thus difficult to capture. For instance, in an online basketball forum, Alice is interested in the top NBA teams, but also supports her own home team. Since most of the online posts are devoted to the top, hence more popular, NBA teams, CF considers Alice to be very similar to other people in the forum. As a result, posts about Alice’s home team may not be recommended to Alice, because the other users are not interested in them and the posts are therefore less popular. A related problem, focusing on content diversification, was recently studied by Yu et al. [22], who proposed techniques that trade accuracy for diversity.
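The popularity bias in the Alice example can be illustrated with a minimal sketch. It is not the Farseer algorithm or any of the baselines evaluated in the paper. It uses plain user-based CF with cosine similarity as a stand-in, and all user names, post names, and view counts are hypothetical. Treating an unviewed post as a count of 0 is also a simplifying assumption.

```python
import math

# Hypothetical view counts (illustrative only, not from the paper's data set).
# top1..top4 are posts about popular NBA teams; home1/home2 are posts about
# Alice's less popular home team.
views = {
    "alice": {"top1": 5, "top2": 4, "top3": 5, "home1": 5},
    "bob":   {"top1": 5, "top2": 5, "top3": 4, "top4": 5},
    "carol": {"top1": 4, "top2": 5, "top3": 5, "top4": 4},
    "dave":  {"home1": 4, "home2": 5},
}


def cosine(u, v):
    """Cosine similarity between two sparse view-count vectors."""
    dot = sum(u[item] * v[item] for item in u if item in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict(user, item):
    """Similarity-weighted average of the other users' counts for `item`.

    Users who have not viewed the item contribute a count of 0
    (a simplifying assumption of this sketch).
    """
    weighted_sum = 0.0
    total_similarity = 0.0
    for other, vector in views.items():
        if other == user:
            continue
        similarity = cosine(views[user], vector)
        weighted_sum += similarity * vector.get(item, 0)
        total_similarity += similarity
    return weighted_sum / total_similarity if total_similarity else 0.0


# Alice looks most similar to the fans of popular teams, so an unseen
# popular post outranks an unseen post about her own home team.
print("sim(alice, bob)  =", round(cosine(views["alice"], views["bob"]), 3))
print("sim(alice, dave) =", round(cosine(views["alice"], views["dave"]), 3))
print("score(top4)  =", round(predict("alice", "top4"), 3))
print("score(home2) =", round(predict("alice", "home2"), 3))
```

In this toy setting, the many overlapping views of popular posts dominate the similarity computation, so the one user who shares Alice's home-team interest carries little weight in the prediction.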

In this work, we propose and develop Farseer, a system for personalized real-time content recommendation and delivery in online social communities. Through interest-based content–user clustering and cluster-based content recommendation, plus the temporal context of online user activities, Farseer can accurately characterize the interests of individual users and deliver content to interested users in real time. As shown in Fig. 1, Farseer consists of the following key components: (1) a real-time content recommendation server that supports data management, data analysis, interest-based content–user clustering, and cluster-based content recommendation; (2) a social community crawler that collects content and user activity updates in online social communities; and (3) a user-friendly browser plug-in that supports system–user interactions. Our work makes the following contributions:

  • A detailed data analysis to understand the discrepancy between content popularity distribution and user interest distribution in online social communities. This study reveals the limitation of the biased decisions of existing CF methods and their inability to accurately assess and deliver less popular content to interested users.

  • A new recommendation solution consisting of a novel interest-based content–user clustering algorithm and a cluster-based content recommendation algorithm, capable of accurately characterizing the diverse yet unique interests of individual online users. The proposed clustering algorithm can efficiently and accurately determine the optimal number of clusters, thus overcoming a key limitation of many other clustering algorithms. The proposed algorithms can be applied to other CF algorithms to improve their recommendation quality and efficiency.

  • The proposed recommender system further leverages the temporal context of online user activities, thereby identifying user groups with similar interests and online access patterns. The user context information can be used to overcome the “false negative” problem suffered by many existing recommender systems, and further improve the recommendation quality.

  • The proposed system has been fully implemented and deployed in an online social community with over 63,000 users, 2 million posts, and 18 million views. Evaluation and measurement results gathered from real-world user studies demonstrate that the proposed system can effectively identify personal interests and improve the quality and efficiency of real-time personalized content recommendation and delivery in online social communities. It outperforms three widely-used CF algorithms in existing recommender systems, as demonstrated in the experiments.

The rest of this article is organized as follows. Section 2 analyzes content popularity and user interests, and motivates the problem. Section 3 presents in detail interest-based content–user clustering, clustering-based content recommendation, and real-time recommendation strategies. A comprehensive evaluation of the proposed solution is presented in Section 4. Section 5 discusses the related work, and Section 6 concludes the article and discusses future work.

Section snippets

Data analysis

This section analyzes the distribution of content popularity and the diversity of user interests in online social communities, and highlights the challenge and importance of accurately identifying users’ specific interests in content items with diverse popularities. We have collected a data set from Fudan BBS (http://www.bbs.fudan.edu.cn), one of the most popular online social communities among Chinese universities. It has over 63,000 users, 20,000 daily posts, 180,000 daily views, and in total

The Farseer content recommender system design

In this section, we present the detailed design of Farseer for personalized real-time content recommendation and delivery. Specifically, we focus on two key components in the recommendation process (Fig. 1).

  • Interest-based content–user clustering. Our first step is to identify inherent structures, e.g., interest groups, within subcommunities, such that content and users can be quickly mapped to diverse yet specific interest clusters. The key challenges of this step are (1) how to measure the

Evaluations

In this section, we evaluate Farseer using both offline studies and real-time measurements in Fudan BBS. We consider the eight most active subcommunities, which account for 12.2% of the post activity and 27.8% of the view activity of Fudan BBS, and cover a variety of social interests. More importantly, these eight subcommunities have diverse internal structures. Implicit interest groups may or may not exist in each subcommunity. Together, these eight subcommunities allow for a comprehensive evaluation of

Related work

In this work, we propose a personalized real-time recommender system for online social communities. Our work builds upon existing collaborative filtering (CF) techniques. Since Goldberg et al. [12] first introduced CF as an alternative to content-based filtering [14], [15], CF has been widely used in recommender systems because of its high prediction quality.

Existing CF techniques can be classified as memory-based [9], [19], [11] or model-based [6], [20], [13], depending on whether a model is

Conclusions and future work

This article addresses the challenge of personalized real-time content recommendation in online social communities. It overcomes a key limitation of existing collaborative filtering techniques: favoring highly popular content over less popular content. The proposed solution unifies item-based and user-based collaborative filtering techniques, which improves the accuracy of user interest characterization and hence the quality of content recommendation. In addition, the proposed system aims for real-time

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 60736020 and 60803118, the Shanghai Leading Academic Discipline Project under Grant No. B114, and the National Science Foundation of USA under Grant No. CNS – C0910995.

References (34)

  • Abhinandan S. Das, Mayur Datar, Ashutosh Garg, Shyam Rajaram, Google news personalization: scalable online...
  • G. Linden et al., Amazon.com recommendations: item-to-item collaborative filtering, IEEE Internet Computing (2003)
  • Badrul M. Sarwar, George Karypis, Joseph A. Konstan, John Reidl, Item-based collaborative filtering recommendation...
  • Jonathan L. Herlocker, Joseph A. Konstan, Al Borchers, John Riedl, An algorithmic framework for performing...
  • Paul Resnick, Neophytos Iacovou, Mitesh Suchak, Peter Bergstrom, John Riedl, GroupLens: an open architecture for...
  • Joseph A. Konstan et al., Grouplens: Applying collaborative filtering to Usenet news, Communications of the ACM (1997)
  • David Goldberg et al., Using collaborative filtering to weave an information tapestry, Communications of the ACM (1992)