Multi-modal Knowledge-aware Reinforcement Learning Network for Explainable Recommendation
Introduction
With the explosive growth of online content and services, recommendation systems play an increasingly important role in matching user needs with online resources. Knowledge graphs (KGs) are used as an auxiliary resource in recommendation systems: they help exploit various types of structured information to improve recommendation performance and enhance the interpretability of the recommendation model [1]. In general, KG-aware recommendation is two-fold. First, some approaches focus on using knowledge-graph embeddings to make personalized recommendations, such as TransE [2], node2vec [3], and Metapath2Vec [4]. These approaches use the embeddings to calculate the similarity between items and users for Top-N item recommendation [5], [6]. However, because they only match a user against an item by similarity, they do not produce an interpretable reasoning process. Second, since knowledge-graph embedding methods do not produce explanations, other recommendation systems derive explanations from paths in the KG. For example, Wang et al. [7] proposed the knowledge-aware path recurrent network (KPRN) to perform effective reasoning on paths and thereby infer the underlying rationale of a user-item interaction. Xian et al. [8] further extended explainable recommendation with policy-guided path reasoning, which formally defines and interprets the reasoning process.
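To make the embedding-based family concrete, the following toy sketch (our illustration, not any cited model's implementation) scores (user, relation, item) triples TransE-style, where a triple (h, r, t) is plausible when h + r lands near t, and ranks items for Top-N recommendation. All vectors here are random stand-ins for learned embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy TransE-style embeddings (illustrative only): entities and relations
# share a d-dimensional space, and a triple (h, r, t) scores well when
# h + r is close to t.
d, n_items = 8, 5
user_vec = rng.normal(size=d)
interact_rel = rng.normal(size=d)          # e.g. a hypothetical "likes" relation
item_vecs = rng.normal(size=(n_items, d))  # candidate item embeddings

def transe_score(h, r, t):
    """Negative L2 distance: higher means the triple is more plausible."""
    return -np.linalg.norm(h + r - t)

# Top-N recommendation: rank items by the score of (user, likes, item).
scores = np.array([transe_score(user_vec, interact_rel, v) for v in item_vecs])
top_n = np.argsort(scores)[::-1][:3]
print(top_n)
```

Note that the ranking is a single similarity number per item; nothing in it records *why* an item scored highly, which is exactly the interpretability gap discussed above.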
There is thus great value in exploiting KGs for explainable recommendation. However, the methods above inject KGs to enrich the representation of the recommendation problem but ignore the visual traits of items and other multi-modal information. Moreover, with the growing volume of online images, content-based visual cues have come into play. For example, as illustrated in Fig. 1, when browsing a movie on a website, users typically look at the movie's poster first; they then read the text content and learn that the film is a story between a white driver, Tony, and a black musician, Don, and only then decide whether to watch the movie. This indicates that images carry a great deal of latent knowledge-level information, which can benefit recommendation systems.
It is therefore natural to combine the semantic and image modalities. Indeed, many efforts linking images and text, such as visual-semantic embedding and multi-modal correlation learning, have shown promising results and can be applied to recommendation systems. Recently, researchers have explored the potential of multi-modal recommendation in greater depth. For example, Yu et al. [9] proposed a vision-language recommendation model that enables users to provide natural language feedback on visual products. Zhang et al. [10] proposed Joint Representation Learning (JRL) for heterogeneous recommendation. However, these models learn representations of text and images but still cannot produce an interpretable reasoning process.
In contrast to existing single-modal recommendation methods, such as those based on pure KGs [5], [11], we propose a multi-modal method for explainable recommendation. The agent starts from a user and conducts a multi-hop logical path over a multi-modal knowledge graph (MKG) to discover suitable items to recommend to the target user. Because the agent recommends items via a logical path, the reasoning process over the MKG that leads to each recommendation is easy to interpret. The system can thus provide two aspects of causal evidence, visual and knowledge-based, in support of the recommended items. Accordingly, our system aims not only to recommend candidate items to the user but also to provide the corresponding explanatory logic paths in the MKG. These paths contain visual and knowledge elements that serve as interpretable evidence for why a given recommendation is made.
Considering the shortcomings of previous work, and inspired by the wide application of images, KGs, and deep reinforcement learning (RL), we propose the multi-modal knowledge-aware reinforcement learning network (MKRLN), a deep RL model that incorporates multiple modalities for multi-dimensional explanation and reasoning. We also design a novel hierarchical attention-path over KGs, which largely shrinks the action space and filters noise. The approach has three advantages. First, it provides explanations from both the visual and the knowledge aspect, which are complementary: the agent starts from a user (linked to an entity) and searches for suitable items along paths over the KG. These multi-step paths provide logical reasons and deep explanations for how an item comes to be recommended; the image explains the recommendation from a visual perspective, and the knowledge explains it from an external-knowledge perspective. Second, in a typical KG one entity can be linked to a large number of neighbors with the same attributes, so we propose attention neighbors and attention-paths to greatly reduce the size of the action space and the number of candidate entities. Third, faced with a large number of items, users have difficulty focusing on the items they care most about; the attention-path mechanism can uncover the user's real preferences and filter out redundant information.
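A minimal sketch of the attention-neighbor idea (hypothetical names and scoring; the paper's actual policy network is more involved): at each path-extension step, attention between the user representation and the candidate neighbor actions prunes the neighborhood to the top-k most relevant entries before the RL policy chooses a move, so the policy never has to consider the full fan-out:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

# One hypothetical path-extension step: the agent sits at an entity with many
# neighbors. Attention between the user embedding and each candidate
# (relation, entity) action scores the neighbors; only the top-k survive,
# shrinking the action space before the policy samples an action.
d, n_neighbors, k = 8, 50, 5
user_vec = rng.normal(size=d)
neighbor_vecs = rng.normal(size=(n_neighbors, d))  # candidate action embeddings

attn = softmax(neighbor_vecs @ user_vec)      # relevance of each neighbor to the user
pruned_actions = np.argsort(attn)[::-1][:k]   # keep only the k most relevant

print(len(pruned_actions))
```

The same scores double as an explanation signal: the surviving neighbors are, by construction, the ones most aligned with the user's representation.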
Our contributions are summarized as follows:
- We proposed a multi-modal KG combined with deep RL for personalized recommendation. The model can explain its logical reasoning from both the visual and the knowledge aspect, making the explanation multi-dimensional.
- We designed a novel hierarchical attention-path over multi-modal KGs, which greatly reduces the size of the action space and the number of entities and filters out noise, allowing the user to focus on the items they care about most.
- We highlighted the significance of multi-modal KGs for recommendation systems, enabling recommendation at a higher knowledge level and more explicit reasoning about image content using external information.
Recommendation with knowledge graph
In recent years, researchers have explored the potential of knowledge graph reasoning in recommendation systems. A series of studies focused on using knowledge-graph embedding models to make recommendations [4], [5]. Another research direction is to make interpretable recommendations based on the entity and path information in a knowledge graph. For example, Ai et al. [12] proposed a collaborative filtering (CF) method over knowledge-graph embeddings to improve personalized recommendation.
Framework
A multi-modal knowledge graph is defined as G = {(h, r, t) | h, t ∈ E, r ∈ R}, where E is the entity set, R is the relation set, and each triple (h, r, t) ∈ G represents a facet of the relation r from the head entity h to the tail entity t. Let I be the set of images, with an individual image denoted i. Each entity e ∈ E is associated with a corresponding image i_e ∈ I. The images describe the appearances of the entities, enriching their representations with hidden semantics.
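The structure above can be sketched as plain data (entity, relation, and file names here are illustrative, not from the paper's datasets): a set of (head, relation, tail) triples indexed for multi-hop traversal, plus a map associating each entity with its image i_e:

```python
from collections import defaultdict

# Minimal sketch of a multi-modal KG: triples over entities/relations,
# plus an entity -> image association (names are made up for illustration).
triples = [
    ("user_1", "watched", "Green Book"),
    ("Green Book", "directed_by", "Peter Farrelly"),
    ("Green Book", "starred_by", "Viggo Mortensen"),
]
entity_image = {"Green Book": "green_book_poster.jpg"}  # i_e for entity e

# Index the graph for multi-hop path traversal:
# entity -> [(relation, tail), ...]
adjacency = defaultdict(list)
for h, r, t in triples:
    adjacency[h].append((r, t))

print(adjacency["Green Book"])
```

A path-reasoning agent walks this adjacency index hop by hop, and the entity-image map is what lets each visited entity contribute visual evidence alongside its relational context.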
Experiments
We extensively evaluated the performance of our model on real-world datasets. We first introduce the benchmarks used in our experiments and the corresponding experimental settings. We then quantitatively compare the effectiveness of our model with other state-of-the-art methods and conduct ablation studies to show how parameter variations influence the model.
Conclusion
We believe that future intelligent agents should have the ability to perform explicit reasoning over knowledge and images for decision-making. In this paper, we proposed an end-to-end framework based on the interaction of deep reinforcement learning and a multi-modal knowledge graph to automatically model recommendation with interpretation. To achieve this, we built a multi-modal knowledge graph and learned the representations of the entities and images within
CRediT authorship contribution statement
Shaohua Tao: Conceptualization, Methodology, Software. Runhe Qiu: Supervision, Writing - original draft. Yuan Ping: Writing - review & editing. Hui Ma: Investigation.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the National Key R&D Program of China under Grant 2017YFB0802000, the National Natural Science Foundation of China under Grant U1736111, the Plan for Scientific Innovation Talent of Henan Province, China, under Grant 184100510012, the Key Technologies R&D Program of Henan Province, China, under Grant 212102210084, and the Innovation Scientists and Technicians Troop Construction Projects of Henan Province, China, under Grant NO.
References (28)
- et al., Explainable recommendation: A survey and new perspectives, 2018.
- A. Bordes, N. Usunier, A.G. Duran, J. Weston, et al., Translating embeddings for modeling multi-relational data, in: ...
- A. Grover, J. Leskovec, node2vec: Scalable feature learning for networks, in: Proceedings of the 22nd ACM SIGKDD ...
- Y.-X. Dong, N.-V. Chawla, A. Swami, metapath2vec: Scalable representation learning for heterogeneous networks, in: ...
- F.-Z. Zhang, N.-J. Yuan, D. Lian, X. Xie, et al., Collaborative knowledge base embedding for recommender systems, in: ...
- E. Palumbo, G. Rizzo, R. Troncy, Entity2rec: Learning user-item relatedness from knowledge graphs for top-N item ...
- et al., Explainable reasoning over knowledge graphs for recommendation, 2018.
- Y.-K. Xian, Z.-H. Fu, S. Muthukrishnan, G.-D. Melo, et al., Reinforcement knowledge graph reasoning for explainable ...
- Y. Tong, Y.-L. Lin, R.-Y. Zhang, X.-Y. Zeng, et al., Vision-language recommendation via attribute augmented multimodal ...
- Y.-F. Zhang, Q.-Y. Ai, X. Chen, W.-B. Croft, Joint representation learning for top-N recommendation with heterogeneous ...
- et al., Learning heterogeneous knowledge base embeddings for explainable recommendation, Algorithms.
The code (and data) in this article has been certified as Reproducible by Code Ocean: (https://codeocean.com/). More information on the Reproducibility Badge Initiative is available at https://www.elsevier.com/physical-sciences-and-engineering/computer-science/journals.