Elsevier

Knowledge-Based Systems

Volume 254, 27 October 2022, 109551

A dual learning-based recommendation approach

https://doi.org/10.1016/j.knosys.2022.109551

Abstract

Data sparsity and cold start are two critical issues that need to be addressed in recommender systems (RSs). Currently, most methods address these issues by using user history files or side information to improve the user model and complete the rating matrix. However, such methods perform poorly when labeled data is scarce or unavailable. In this paper, we propose a dual learning-based recommendation approach (DLRA). DLRA can trigger initial recommendations and improve recommendation quality by exploiting the duality characteristics of RSs, even when the available labeled information is scarce. Specifically, DLRA regards the recommendation task as two independent subtasks, a primal task and a dual task, which show strong duality. The primal task is item-centered and aims to find users who are likely to rate items highly, while the dual task is user-centered and aims to recommend to users the items they are most likely to prefer. These two tasks show strong duality in terms of recommendation space, selection probability and recommendation basis. Based on these dualities, we design three dual learning strategies that couple the whole recommendation process, realize the self-tuning and self-improvement of each task model, and finally optimize the whole recommendation model. Using the MovieLens and BookCrossing datasets, we simulate data sparsity and cold start recommendation scenarios. The experimental results show that DLRA achieves substantial improvement when labeled data is scarce, and that it outperforms other hybrid recommendation approaches and deep learning strategies, with a smaller predictive error and better recommendation accuracy.

Introduction

With the explosive growth of Internet resources, particularly commodities and entertainment resources, more and more people turn to the Internet to search for the items they need. An efficient and accurate recommender system (RS) has become an important requirement for satisfying users’ personalized needs and experience. For example, 80% of movies watched on Netflix came from recommendations [1], and 60% of video clicks on YouTube came from home page recommendations [2]. An RS makes it possible to return items that satisfy users’ personalized needs [3]. Most recommender systems (RSs) depend heavily on the history of active users or on other users’ evaluations to generate recommendations. Content-based filtering (CBF), collaborative filtering (CF) and hybrid filtering (HF) are the common recommendation methods for filtering items. CBF recommends items that are similar to the items users liked in the past [4]. One major issue of the CBF approach is that it relies on a substantial number of item features and on users’ history files. CF recommends items according to other users who are similar to the active user. The user–item rating matrix is the basis for calculating user similarity or item similarity in both CBF and CF methods [5]. Consequently, both approaches suffer from cold start and data sparsity problems. HF utilizes users’ history information or context information to complete the rating matrix and reduce the risk of data sparsity [6], [7], [8]. However, some fields suffer from an extreme lack of available labeled data due to the shortage of prior data, difficulties in data collection or low data reliability [9]. Hence, side information and user history files are not always available in RSs.

Deep learning-based approaches have proved promising in extracting content features of items and users’ social relationships to optimize recommendation strategies [10], [11], [12], [13]. However, deep learning-based RSs often face some common difficulties. Firstly, training a deep learning-based method is largely a black-box process; hence, it lacks interpretability and modifiability, and it limits the use of inherent features of recommendation scenarios. Secondly, deep learning depends heavily on big data and labeled data, which limits its application to some RSs. Thirdly, deep learning has high hardware requirements and usually takes a long time to train. Fourthly, deploying and tuning a deep learning model can be time-consuming, and the effectiveness of the resulting model cannot be guaranteed. Therefore, how to implement an efficient and effective RS with little labeled data remains a hot topic, which motivates the development of non-deep learning recommendation techniques.

In this study, we propose a dual learning-based recommendation approach (DLRA) which is expected to show less dependence on labeled data. Dual learning is a new learning framework that leverages the symmetric structure of tasks to obtain effective feedback or regularization signals to enhance the learning process. The premise of applying dual learning theory is that the task itself shows duality, that is, the input of one task is (or can be converted to) the output of another task, and vice versa. Two dual tasks can construct a natural closed loop and form a learning mechanism through effective feedback. Therefore, with this closed-loop structure, the learning task can be realized with less labeled data.
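
The closed loop described above can be illustrated with a deliberately tiny sketch. It is not DLRA itself: the scalar models `primal` and `dual`, the parameters `a` and `b`, and the squared reconstruction loss are all assumptions chosen for illustration. The point is only that unlabeled inputs pass through the primal task and then the dual task, and the gap between the reconstruction and the original input serves as a feedback signal for both models.

```python
def primal(x, a):
    """Primal task f: maps an input x to an output y (toy scalar model)."""
    return a * x


def dual(y, b):
    """Dual task g: maps the primal output y back towards the input space."""
    return b * y


def reconstruction_gap(x, a, b):
    """Feedback signal of the closed loop: g(f(x)) - x, which needs no label."""
    return dual(primal(x, a), b) - x


def train_closed_loop(samples, a, b, lr=0.01, epochs=200):
    """Update both toy models from unlabeled samples using only the loop gap."""
    for _ in range(epochs):
        for x in samples:
            gap = reconstruction_gap(x, a, b)
            # Gradients of 0.5 * gap**2 with respect to a and b.
            grad_a = gap * b * x
            grad_b = gap * a * x
            a -= lr * grad_a
            b -= lr * grad_b
    return a, b


if __name__ == "__main__":
    unlabeled = [0.5, 1.0, 1.5, 2.0]
    a, b = train_closed_loop(unlabeled, a=2.0, b=0.2)
    print("a * b =", a * b)
    print("gap on a new input:", reconstruction_gap(3.0, a, b))
```

In this toy setting the loop only constrains the product `a * b` to approach 1 and does not determine `a` and `b` individually, so the feedback signal reduces, rather than removes, the need for other information about each task.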

The importance of duality has been demonstrated in many fields, such as translation from one language to another versus translation in the opposite direction, speech recognition versus speech synthesis, and image classification versus image generation [14], [15], as well as vehicle re-identification and vessel-bridge collisions [16], [17].

Dual learning can be applied to RSs for the following reasons:

(1) RSs show a symmetric structure. Since RSs mainly focus on matching users and items, an RS can be divided into two tasks. The primal task is a CBF-based process that matches items with users who are likely to rate them highly. The dual task is a CF-based process that matches users with their favorite items. The output of the primal task makes it possible to calculate user similarity and item similarity, and these similarities can be used as the input of the dual task. The output of the dual task provides users’ access records and preferences for items, and these records and preferences can be used as the input of the primal task.

(2) The structural duality of RSs implies strong duality connections between the primal and dual tasks. Specifically, three kinds of duality are considered in this study.

  • The duality of recommendation space. The primal and dual tasks of RSs generate an item-based recommendation space and a user clustering-based recommendation space, respectively. These two spaces should be consistent, and each can be verified against the other. The gap between the recommendation spaces is an important feedback signal for realizing the closed loop.

  • The duality of selection probability. The primal task and the dual task in RSs share the same items and users. Therefore, the probability of mutual selection between items and users should be consistent. We take the selection probability as one of the duality characteristics of RSs. This probabilistic property can strengthen the dual learning process through structural regularization and improve the accuracy of the recommendation model.

  • The duality of recommendation basis completion. The user history access data in the primal task and the user rating matrix in the dual task are the recommendation bases for RSs. These two bases also show duality: the potential rating of users on items can be obtained from users’ history access to items, and the possibility of users’ access to items can be predicted from the user rating matrix. Therefore, even with zero or very little labeled data, the two tasks learn from each other to realize data completion and then complete the recommendation.
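
The first two dualities can be made concrete with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `SCORES` table, the top-k selection used for both tasks, the Jaccard distance used as the space gap, and the squared log-probability gap, which follows the general identity P(u)P(i|u) = P(i)P(u|i) (both sides equal the joint probability of user u and item i) rather than any formula confirmed for DLRA.

```python
import math

# Hypothetical toy scores: SCORES[user][item], higher means a stronger match.
SCORES = {
    "u1": {"i1": 0.9, "i2": 0.2, "i3": 0.7},
    "u2": {"i1": 0.1, "i2": 0.8, "i3": 0.6},
    "u3": {"i1": 0.5, "i2": 0.4, "i3": 0.3},
}


def primal_pairs(scores, k):
    """Item-centered view: for every item, keep the k best-matching users."""
    items = sorted({item for row in scores.values() for item in row})
    pairs = set()
    for item in items:
        ranked = sorted(scores, key=lambda u: scores[u][item], reverse=True)
        pairs.update((user, item) for user in ranked[:k])
    return pairs


def dual_pairs(scores, k):
    """User-centered view: for every user, keep the k best-matching items."""
    pairs = set()
    for user, row in scores.items():
        ranked = sorted(row, key=row.get, reverse=True)
        pairs.update((user, item) for item in ranked[:k])
    return pairs


def space_gap(space_a, space_b):
    """Jaccard distance between two sets of (user, item) pairs; 0 means equal."""
    union = space_a | space_b
    if not union:
        return 0.0
    return 1.0 - len(space_a & space_b) / len(union)


def probability_gap(p_user, p_item_given_user, p_item, p_user_given_item):
    """Squared difference between the two log factorizations of P(user, item)."""
    left = math.log(p_user) + math.log(p_item_given_user)
    right = math.log(p_item) + math.log(p_user_given_item)
    return (left - right) ** 2


if __name__ == "__main__":
    primal_space = primal_pairs(SCORES, k=1)
    dual_space = dual_pairs(SCORES, k=1)
    print("space gap:", space_gap(primal_space, dual_space))
    print("consistent estimates:", probability_gap(0.5, 0.4, 0.25, 0.8))
    print("inconsistent estimates:", probability_gap(0.5, 0.4, 0.25, 0.4))
```

With k = 1 on these toy scores, the two spaces share two of four distinct pairs, so `space_gap` returns 0.5. A nonzero gap of either kind is the sort of signal that each task could use to adjust the other.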

The aforementioned duality strategies are implemented through mutual interaction and continuous trial-and-error feedback between the primal and dual tasks. Consequently, the whole recommendation model gains the potential for self-improvement and self-tuning. Thus, RSs can work effectively even when there is little or no history data. Accordingly, the adaptability and effectiveness of RSs are expected to improve.

The main contributions of the paper are as follows. (1) Introducing a new recommendation model, DLRA. DLRA reduces the dependence on labeled data in RSs by forming a closed loop between the primal and dual tasks. Hence, DLRA shows a promising ability to leverage unlabeled data and structural duality to address the big-data challenge faced by CBF and CF approaches. (2) Modeling users and items in RSs as entities that can actively initiate search tasks. The task of items searching for users and the task of users searching for items are presented in dual forms. This provides a solid foundation for applying dual learning theory to such bi-directional and simultaneous dual tasks. (3) Designing the duality strategies of recommendation space, selection probability and recommendation basis. These strategies are intended to ensure the validity of the recommendation results and to allow the dual learning-based recommendation approach to reduce the dependence on labeled data effectively.

The remainder of the paper is organized as follows: Section 2 describes studies related to hybrid recommendation approaches and dual learning theory. Section 3 specifies the proposed dual learning-based recommendation approach. Section 4 presents the experiment datasets, comparison approaches and evaluation methods. Section 5 gives the experimental results for verifying the performance of the proposed approach. Finally, Section 6 concludes the study and introduces future work.

Section snippets

Related works

This section reviews the existing literature on alleviating the effects of data sparsity and cold start problems. Then, the advantages of dual learning theory are detailed through comparison with HF and deep learning-based approaches.

Dual learning-based recommendation strategy

By analyzing the duality characteristics of two independent tasks in RSs, we propose a dual learning-based recommendation approach, DLRA. The basic idea, framework and implementation details of DLRA are introduced in this section.

Experiment setup

In this section, we first introduce the computing environment, which includes the hardware and software tools, in Section 4.1. Then, we list the benchmark datasets and our design of the dataset in Section 4.2. Several comparison algorithms are described in Section 4.3. In Section 4.4, we design evaluation metrics to test the performance of these algorithms.

Results and discussions

This section mainly introduces the experimental process, experimental results and discussions for addressing data sparsity and cold start problems of recommender systems.

Conclusions and future work

A recommender system takes users’ history information or the user–item rating matrix as the input data, and outputs the items that users may like best. To provide users with items that best meet their needs and preferences, the quality of both the user history files and the existing rating matrix plays a crucial role. However, cold start and data sparsity are usually unavoidable, which hinders the application of recommender systems. Even though some hybrid recommendation approaches use side information to

CRediT authorship contribution statement

Shanshan Wan: Conceptualization, Methodology, Formal analysis, Writing – original draft. Ying Liu: Data curation, Funding acquisition, Software. Dongwei Qiu: Methodology, Mathematical model modification, Revised submission proofreading. James Chambua: Writing – original draft, Writing – review & editing, Validation. Zhendong Niu: Supervision, Methodology.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors wish to acknowledge the support of the National Natural Science Foundation of China (No. 61902016, No. 61871020), the high-level innovation team construction project of Beijing Municipal Colleges and Universities (No. IDHT20190506), the National Key R&D Program of China (No. 2019YFB1406302, No. 2018YFC0807806-2), and the postgraduate education and teaching quality improvement project of Beijing University of Civil Engineering and Architecture, China (No. J2022005).

References (62)

  • Wei J. et al. Collaborative filtering and deep learning based recommendation system for cold start items. Expert Syst. Appl. (2017)
  • Gomez-Uribe C.A. et al. The Netflix recommender system: Algorithms, business value, and innovation. ACM Trans. Manag. Inform. Syst. (TMIS) (2016)
  • Davidson J. et al. The YouTube video recommendation system
  • Adomavicius G. et al. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. (2005)
  • Chen W. et al. A hybrid recommendation algorithm adapted in e-learning environments. World Wide Web (2014)
  • Tarus J.K. et al. Knowledge-based recommendation: a review of ontology-based recommender systems for e-learning. Artif. Intell. Rev. (2018)
  • Burke R. Hybrid recommender systems: Survey and experiments. User Model. User-Adapt. Interact. (2002)
  • Zhang D. et al. Cold-start recommendation using Bi-clustering and fusion for large-scale social recommender systems. IEEE Trans. Emerg. Top. Comput. (2014)
  • Betru B.T. et al. Deep learning methods on recommender system: a survey of state-of-the-art. Int. J. Comput. Appl. (2017)
  • Strub F. et al. Hybrid recommender system based on autoencoders
  • Zhang S. et al. Deep learning based recommender system: A survey and new perspectives. ACM Comput. Surv. (2019)
  • Mu R. A survey of recommender systems based on deep learning. IEEE Access (2018)
  • Zhao Z. et al. Dual learning: Theoretical study and algorithmic extensions (2018)
  • Tang D. et al. Question answering and question generation as dual tasks (2017)
  • Huang Y. et al. Dual domain multi-task model for vehicle re-identification. IEEE Trans. Intell. Transp. Syst. (2020)
  • Zhang B. et al. A warning framework for avoiding vessel-bridge and vessel-vessel collisions based on generative adversarial and dual-task networks. Comput.-Aided Civ. Infrastruct. Eng. (2021)
  • D. Mukherjee, S. Banerjee, S. Bhattacharya, P. Misra, Method and System for Context-Aware Recommendation, Google...
  • Pazzani M.J. et al. Content-based recommendation systems
  • Wang X. et al. Dynamic attention deep model for article recommendation by learning human editors’ demonstration
  • Shanshan W. et al. An E-learning recommendation approach based on the self-organization of learning resource. Knowl.-Based Syst. (2018)
  • Schafer J.B. et al. Collaborative filtering recommender systems
