Restricted Boltzmann Machine-driven Interactive Estimation of Distribution Algorithm for personalized search

doi:10.1016/j.knosys.2020.106030

Knowledge-Based Systems

Volume 200, 20 July 2020, 106030

https://doi.org/10.1016/j.knosys.2020.106030

Abstract

Effective and efficient personalized search is one of the most pursued objectives in the era of big data. The challenge lies in evaluations that are hard to quantify and in dynamic user preferences. A user-involved interactive evolutionary algorithm is a good choice, provided it has a reliable preference surrogate and powerful evolutionary strategies. A Restricted Boltzmann Machine (RBM)-assisted Interactive Estimation of Distribution Algorithm (IEDA) is presented to enhance the IEDA in solving personalized search. Specifically, a dual-RBM module is developed to simultaneously provide a preference surrogate and a probability model for individual selection and generation in the IEDA. First, the positive and negative preferences of the current user are distinguished and combined to build the dual-RBM; the weighted energy functions of the RBM models, together with social group information from users with similar preferences, then serve as the preference surrogate. The probability of the trained positive RBM over the visible units is taken as the reproduction model of the EDA, since it reflects the attribute distributions of the more preferred items. Benchmarks from the MovieLens and Amazon datasets are used to demonstrate experimentally that the proposed algorithm improves the efficiency and effectiveness of interactive evolutionary computation for personalized search.

Introduction

The task of personalized search is to find items that meet a user’s specific (and possibly changing) preferences or requirements; it is therefore, by nature, an optimization problem. Evolutionary algorithms (EAs) would be effective for this problem if the user’s preferences or intentions could be expressed explicitly with accurate mathematical models. Unfortunately, this assumption is hard to satisfy even when the user’s preference is certain and clear, let alone when it changes.

The optimization objective of personalized search is based on users’ qualitative evaluations, comparisons and decisions, informed by their experiential knowledge and preferences; that is, it is subjective, variable and fuzzy compared with traditional mathematically defined objectives. Accordingly, traditional optimization methods, as well as the many successful nature-inspired EAs designed for explicitly defined mathematical functions, are no longer applicable. It is therefore of practical significance to develop EAs suited to personalized search problems.

In the family of EAs, interactive evolutionary computations (IECs) are powerful for optimizing problems with qualitative objectives and are expected to be effective for personalized search [1], [2], [3]. In the past decades, many studies on IECs have been devoted to alleviating users’ evaluation burden in the evolutionary process, especially for complicated optimization tasks. This work can be classified into three groups: (1) designing friendly interfaces, e.g., changing a continuous evaluation mode to a discrete or fuzzy-number one [2], [4]; (2) enhancing evolutionary operators to accelerate the evolution process, e.g., Chen et al. [5] presented a Bayesian model-based IEC that effectively reduces the initial decision space according to the historical search; (3) developing surrogate- or learning-assisted IECs that quantitatively approximate the preference or evaluation of a given user on a candidate, i.e., the fitness function of the qualitative objective is estimated so that the evolutionary operations can be driven as in traditional EAs [4], [5], [6]. Here we try to solve personalized search with surrogate-assisted IECs, since they have been successfully applied to some complex design and multi-objective decision problems.

Surrogate-assisted IECs proceed much like surrogate-assisted EAs. A user is first required to evaluate some individuals during the evolutionary search, and these individuals, together with their scores, are used to train or build a model that approximates the user’s preferences. The model is then applied as a fitness surrogate in the subsequent evolution process, and the user only needs to revise the few estimations that the surrogate gets wrong. The model is managed or updated when the user finds that its estimations are far from his/her preferences. Clearly, surrogate building, including data collection, model selection and training, is critical for developing a reliable surrogate-assisted IEC [7], [8], [9]. Model selection and training have attracted great attention in various applications. Sun et al. [1] presented a semi-supervised learning-based surrogate for cases in which sufficient training data for interactive genetic algorithms are difficult to collect when handling complicated design problems. Pan et al. [8] proposed a classification-based surrogate to improve interactive decisions when using a many-objective EA for numerically defined expensive optimization problems. Integrating parallel computing with surrogate-based EAs, Akinsolu et al. [10] proposed a parallel surrogate-assisted algorithm to enhance the mutation operators of differential evolution for electromagnetic design. We also used a probabilistic conditional preference network as a surrogate for personalized book search [11]. As for collecting the training data, only a few studies have been carried out. Chen et al. [5] presented a Bayesian-induced interactive Estimation of Distribution Algorithm (IEDA) for personalized laptop search, in which users’ interaction time is used to construct an RBF-based surrogate model. Tian et al. [12] incorporated granularity into surrogate building to collect the training data effectively at a relatively small computational cost when solving high-dimensional expensive optimization problems.
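The generic surrogate-assisted interactive loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any cited work: the nearest-neighbour surrogate, the `simulated_user` function standing in for a human, the `tolerance` threshold and the random candidate generation (a real IEC would apply evolutionary operators) are all hypothetical choices made for brevity.

```python
import random

def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

class NearestNeighbourSurrogate:
    """Toy preference surrogate: returns the score of the closest evaluated item."""
    def __init__(self):
        self.archive = {}  # item (tuple of bits) -> score given by the user

    def update(self, item, score):
        self.archive[item] = score

    def predict(self, item):
        nearest = min(self.archive, key=lambda seen: hamming(seen, item))
        return self.archive[nearest]

def simulated_user(item):
    """Hypothetical stand-in for a human: prefers items with many 1-attributes."""
    return sum(item) / len(item)

def surrogate_assisted_iec(n_attrs=8, pop_size=10, generations=5,
                           tolerance=0.25, seed=0):
    rng = random.Random(seed)
    surrogate = NearestNeighbourSurrogate()

    def new_item():
        return tuple(rng.randint(0, 1) for _ in range(n_attrs))

    # Step 1: the user evaluates an initial population to seed the surrogate.
    for item in [new_item() for _ in range(pop_size)]:
        surrogate.update(item, simulated_user(item))
    user_evaluations = pop_size

    for _ in range(generations):
        # Step 2: the surrogate, not the user, scores the new candidates.
        candidates = [new_item() for _ in range(pop_size)]
        estimates = {c: surrogate.predict(c) for c in candidates}
        # Step 3: the user checks only the top-ranked candidate and the
        # surrogate is updated when its estimate is too far off.
        best = max(candidates, key=estimates.get)
        true_score = simulated_user(best)
        user_evaluations += 1
        if abs(true_score - estimates[best]) > tolerance:
            surrogate.update(best, true_score)
    return surrogate, user_evaluations

surrogate, n_evals = surrogate_assisted_iec()
print(n_evals)  # user evaluations: initial population plus one check per generation
```

With these settings the simulated user scores 15 items in total, whereas evaluating every candidate directly would require 60 evaluations; this gap is the burden reduction that surrogate-assisted IECs aim for.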

These surrogate-assisted EAs are effective for solving quantitatively or qualitatively defined complex problems. They construct and manage the surrogate model from evaluated individuals with supervised or semi-supervised learning methods, and then use the model to approximate individual fitness when performing evolutionary operators. Three deficiencies of the existing algorithms can be identified. (1) A given user must provide initial interactions for constructing a surrogate, whether explicitly or implicitly, which inevitably conflicts with the motivation of alleviating user fatigue. Unsupervised learning-based surrogates are more promising for addressing this problem. (2) The relationship between the evolutionary operators and the surrogate has rarely been considered, although the information obtained during surrogate construction may be valuable for strengthening the operators. (3) The intrinsic preference features hidden in a user’s historical interactions can greatly help to reveal the user’s preferences accurately; however, they have not been exploited, even though such techniques are well developed and widely used in personalized recommendation. Therefore, integrating user interest models from personalized recommendation into surrogate-assisted IECs is expected to greatly improve the performance of personalized search.

To construct a surrogate with an unsupervised learning model and to capture the relationship between the evolutionary operators and the surrogate, we previously presented a Restricted Boltzmann Machine (RBM)-based Estimation of Distribution Algorithm (EDA) for complex numerical problems [13]. In this algorithm, the EDA is first run for some generations on the real problem to obtain training data, and the RBM is then trained with the better individuals (without using specific fitness values). Both the probability model of the EDA and the fitness function are obtained from the trained RBM: the joint probability of the visible layer in the RBM serves as the probability model of the EDA for population reproduction, and the energy function of the RBM is used to estimate the individual fitness for the complex problem being optimized. Experimental results demonstrate its superiority in reducing computational complexity and improving the accuracy of fitness estimation. Inspired by these results, we here further study an unsupervised RBM surrogate-assisted IEC for personalized search, since it addresses the shortcomings of existing surrogate-based ECs.
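To make the two roles of the RBM concrete, the sketch below shows how a binary RBM can supply both a fitness proxy and a reproduction step. It is an illustration under assumptions rather than the formulation of [13]: the class `TinyRBM`, the hand-set weights and the use of the standard free energy $F(v) = -a^{\top}v - \sum_{j}\log(1 + e^{b_{j} + \sum_{i} v_{i} W_{ij}})$ as the energy-based score are choices made here, and no training is performed.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyRBM:
    """Binary RBM with visible bias a, hidden bias b and weights W[i][j]."""
    def __init__(self, W, a, b):
        self.W, self.a, self.b = W, a, b

    def _hidden_input(self, v, j):
        return self.b[j] + sum(v[i] * self.W[i][j] for i in range(len(v)))

    def hidden_probs(self, v):
        """p(h_j = 1 | v) for every hidden unit."""
        return [sigmoid(self._hidden_input(v, j)) for j in range(len(self.b))]

    def visible_probs(self, h):
        """p(v_i = 1 | h) for every visible unit."""
        return [sigmoid(self.a[i] + sum(self.W[i][j] * h[j] for j in range(len(h))))
                for i in range(len(self.a))]

    def free_energy(self, v):
        """F(v); p(v) is proportional to exp(-F(v)), so lower means more probable."""
        linear = sum(ai * vi for ai, vi in zip(self.a, v))
        hidden = sum(math.log1p(math.exp(self._hidden_input(v, j)))
                     for j in range(len(self.b)))
        return -linear - hidden

    def sample_offspring(self, parent, rng):
        """One Gibbs step v -> h -> v', used here as a reproduction operator."""
        h = [1 if rng.random() < p else 0 for p in self.hidden_probs(parent)]
        return [1 if rng.random() < p else 0 for p in self.visible_probs(h)]

# Hand-set weights (an assumption): the single hidden unit favours attributes
# 0 and 1 and penalizes attribute 2.
rbm = TinyRBM(W=[[2.0], [2.0], [-2.0]], a=[0.0, 0.0, 0.0], b=[0.0])
print(rbm.free_energy([1, 1, 0]) < rbm.free_energy([0, 0, 1]))  # True
print(rbm.sample_offspring([1, 1, 0], random.Random(0)))
```

Because lower free energy corresponds to higher model probability, an individual can be ranked without any explicit fitness value, which is what allows a model trained only on "better" individuals to act as a surrogate.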

With regard to extracting the intrinsic features of a user’s preference, many interest models used in personalized recommendation provide valuable references [14], [15], e.g., the Bayesian model [16], [17], Factorization Machine [18], [19], Multilayer Perceptron [20], [21], RBM [22], [23], Autoencoder [24], [25] and Convolutional Neural Network (CNN) [26], [27]. Rendle et al. [16] presented a Bayesian learning method for personalized ranking that maximizes the posterior estimator. In this method, the training data are grouped into evaluated and unrated items, which are treated as positive and negative information, respectively. Cheng et al. [20] proposed a Wide & Deep learning-based interest model that jointly trains wide linear models and deep neural networks (DNNs) to combine the benefits of memorization and generalization for recommendation. Kim et al. [26] presented a context-aware recommendation model, called Convolutional Matrix Factorization (ConvMF), which makes full use of the positive and negative preferences and combines a CNN with probabilistic matrix factorization to improve prediction accuracy. Zhou et al. [28] proposed an attention-based user behavior modeling framework that effectively integrates all of a user’s historical interactive behaviors. However, these models have not been well combined with the IEC process to further improve personalized evolutionary search.

Motivated by our previous work, we here extend it to interactive personalized search and present a dual-RBM-assisted IEDA that couples interest model construction with historical interactive behaviors. The RBM-based surrogate in [13] is first enhanced by turning it into a dual-module one, built from the grouped historical information, to extract the user’s preference features precisely. After the dual-module RBM is trained, the probability model of the EDA is constructed using the critical features of the positive RBM. The surrogate for estimating the fitness of the searched items is then obtained from the energy functions of the RBMs. The probability and surrogate models are applied in the IEDA to find satisfactory Top-$N$ items for the current user. Experiments on typical real-world datasets demonstrate that the proposed algorithm not only enhances the performance of personalized search but also alleviates users’ evaluation burden, improving the user experience during the search.

Accordingly, the main contributions of our work are as follows: (1) A dual-RBM module is presented that consists of two related RBM models, a positive one and a negative one. The two models are trained with the dominant and inferior items evaluated by the current user, respectively, to track the user’s preference features accurately. The module is then used to define the probability model and the fitness surrogate for the EDA; (2) the reproduction probability model of the EDA, used to generate more preferred individuals, is defined from the probability of the visible layer in the positive RBM model, making full use of the positive preference features while weakening the impact of the negative ones; (3) the fitness surrogate is obtained by weighting the energy functions of the positive and negative RBM models and by incorporating social group knowledge.
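As a rough illustration of contribution (3), the sketch below combines the energy-based scores of a positive and a negative RBM with a social group term and ranks candidate items. The exact weighting formula is not given in this excerpt, so the linear combination, the weights `w_pos`, `w_neg` and `w_soc`, the `group_score` callable and the hand-set RBM parameters are all hypothetical.

```python
import math

def free_energy(v, W, a, b):
    """Free energy of a binary RBM; lower values mean v is more probable."""
    linear = sum(ai * vi for ai, vi in zip(a, v))
    hidden = sum(
        math.log1p(math.exp(b[j] + sum(v[i] * W[i][j] for i in range(len(v)))))
        for j in range(len(b))
    )
    return -linear - hidden

def dual_rbm_fitness(v, pos_rbm, neg_rbm, group_score,
                     w_pos=0.5, w_neg=0.3, w_soc=0.2):
    """Assumed surrogate: reward fit to the positive RBM, penalize fit to the
    negative RBM, and add a social group term."""
    like = -free_energy(v, *pos_rbm)     # high when v resembles preferred items
    dislike = -free_energy(v, *neg_rbm)  # high when v resembles disliked items
    return w_pos * like - w_neg * dislike + w_soc * group_score(v)

def top_n(items, fitness, n):
    """Return the n items with the highest surrogate fitness."""
    return sorted(items, key=fitness, reverse=True)[:n]

# Hand-set (W, a, b) parameters, for illustration only.
pos_rbm = ([[2.0], [2.0], [-2.0]], [0.0, 0.0, 0.0], [0.0])
neg_rbm = ([[-2.0], [-2.0], [2.0]], [0.0, 0.0, 0.0], [0.0])

def group_score(v):
    """Hypothetical signal: users with similar preferences like attribute 0."""
    return float(v[0])

items = [(1, 1, 0), (0, 0, 1), (1, 0, 1), (0, 0, 0)]
ranked = top_n(items,
               lambda v: dual_rbm_fitness(v, pos_rbm, neg_rbm, group_score), 2)
print(ranked)  # [(1, 1, 0), (1, 0, 1)]
```

Subtracting the negative-model term is one simple way to weaken the influence of disliked attribute patterns; other combinations would fit the description equally well.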

The remainder of the paper is organized as follows. Section 2 introduces the notation of our study and the related preliminary work. The proposed dual RBM-driven IEDA is described in detail in Section 3. Section 4 presents the comparative experiments and the corresponding analysis. Conclusions are drawn in the final section.

Section snippets

Notation of personalized search

Personalized search is a process in which a user finds satisfactory items according to his/her interests and preferences. It can be described as a combinatorial optimization problem with a qualitative index. An item (solution) with $n$ attributes (decision variables) is expressed as $x = \{x_{1}, x_{2}, \dots, x_{n}\}$, and the objective function $f_{u}(x)$ of a user $u$ in the personalized search can be formally expressed as: $\left\{ \begin{matrix} f_{u}(x) \\ \text{s.t. } x \in G \end{matrix} \right.$ where $f_{u}(x)$ represents the preference of user $u$ on the item $x$ and often

Framework

As mentioned above, RBMs have been successfully used to approximate users’ preferences for recommendation, but have not been applied in IECs for personalized search. Here, we propose a dual RBM-driven IEDA with social knowledge (abbreviated as SC_DRBMIEDA) for personalized search. The framework of the proposed algorithm is presented in Fig. 2.

The SC_DRBMIEDA algorithm consists of three main parts:

(1) Construction of RBM

For getting a more reliable RBM to enhance the performance of IEDA, we here utilize

Experimental settings

Two kinds of typical datasets used in personalized recommendation are employed here to objectively demonstrate the performance of the proposed algorithms. The MovieLens dataset [33], i.e., MovieLens-latest-small (ML-l-s), and the Amazon datasets [34], i.e., Digital_Music (Music), Apps_for_Android (Apps) and Movies_and_TV (Movies), are selected as the benchmark tasks. The statistical information of the datasets is shown in Table 1.

In the experiments, we run Python 3.6 on a computer with an AMD Ryzen

Conclusions

Personalized search is an optimization problem from the viewpoint of finding items that satisfy users, and few ECs have been developed to solve such problems. Motivated by research on user interest models in recommender systems and on surrogate models in IECs, we present an enhanced RBM-assisted IEDA that integrates social knowledge with a dual-RBM user preference model, in which the energy functions of the RBMs are designed as a user’s preference surrogate to approximate the individual

CRediT authorship contribution statement

Lin Bao: Methodology, Software, Validation, Writing-original draft preparation. Xiaoyan Sun: Conceptualization, Funding acquisition, Resources, Writing-reviewing and editing. Yang Chen: Investigation. Dunwei Gong: Formal analysis, Project administration, Supervision. Yongwei Zhang: Data curation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was jointly supported by the National Natural Science Foundation of China under grants No. 61876184 and No. 61473298. We also thank the anonymous reviewers for their valuable suggestions for helping improve the quality of this manuscript.

References (38)

Sun X. et al.
Interactive genetic algorithms with large population and semi-supervised learning
Appl. Soft Comput.
(2012)

Gong D. et al.
Interactive evolutionary algorithms with decision-maker’s preferences for solving interval multi-objective optimization problems
Neurocomputing
(2014)

Sun X. et al.
Interactive genetic algorithm with implicit uncertainty evaluation for application in personalized search

Tian J. et al.
Granularity-based surrogate-assisted particle swarm optimization for high-dimensional expensive optimization
Knowl.-Based Syst.
(2020)

Lu J. et al.
Recommender system application developments: a survey
Decis. Support Syst.
(2015)

Mao M. et al.
Multiobjective e-commerce recommendations based on hypergraph ranking
Inform. Sci.
(2019)

Probst M. et al.
Scalability of using restricted Boltzmann machines for combinatorial optimization
European J. Oper. Res.
(2017)

Takagi H.
Interactive evolutionary computation for analyzing human characteristics

Sun X. et al.
Directed fuzzy graph-based surrogate model-assisted interactive genetic algorithms with uncertain individual’s fitness

Chen Y. et al.
Personalized search inspired fast interactive estimation of distribution algorithm and its application
IEEE Trans. Evol. Comput.
(2017)

Sun X. et al.
A new surrogate-assisted interactive genetic algorithm with weighted semisupervised learning
IEEE Trans. Cybern.
(2013)

Tian J. et al.
Multiobjective infill criterion driven Gaussian process-assisted particle swarm optimization of high-dimensional expensive problems
IEEE Trans. Evol. Comput.
(2018)

Pan L. et al.
A classification-based surrogate-assisted evolutionary algorithm for expensive many-objective optimization
IEEE Trans. Evol. Comput.
(2018)

Tian Y. et al.
A surrogate-assisted multiobjective evolutionary algorithm for large-scale task-oriented pattern mining
IEEE Trans. Emerg. Top. Comput. Intell.
(2018)

Akinsolu M.O. et al.
A parallel surrogate model assisted evolutionary algorithm for electromagnetic design optimization
IEEE Trans. Emerg. Top. Comput. Intell.
(2019)

Bao L. et al.
Restricted Boltzmann machine-assisted estimation of distribution algorithm for complex problems
Complexity
(2018)

Rendle S. et al.
BPR: Bayesian personalized ranking from implicit feedback

Wang X. et al.
CMBPR: category-aided multi-channel Bayesian personalized ranking for short video recommendation
IEEE Access
(2019)

Rendle S.
Factorization machines with libFM
ACM Trans. Intell. Syst. Technol. (TIST)
(2012)

