Multi-objective multi-view based search result clustering using differential evolution framework

https://doi.org/10.1016/j.eswa.2020.114299

Highlights

  • Proposed a framework for Search Results Clustering using multi-objective optimization.

  • Problem is posed as a multi-view clustering problem.

  • Considers different views, syntactic and semantic, while performing clustering.

  • Able to determine the number of clusters automatically.

  • Makes use of textual-entailment, word mover distance and universal sentence encoder.

Abstract

Search Results Clustering (SRC) is a well-known problem in information retrieval: clustering the web-snippets returned for a given query according to some similarity/dissimilarity measure. In this study, we pose SRC as a multi-view clustering problem and solve it from an optimization point of view. Various views based on syntactic and semantic similarity measures are considered while performing the clustering. In contrast to existing algorithms, our framework incorporates three new views that measure semantics, based on word mover distance, textual entailment, and the universal sentence encoder. Different quality measures computed on the clusters generated by different views are optimized simultaneously using a multi-objective binary differential evolution (MBDE) framework. MBDE maintains a set of solutions, and each solution is composed of two parts corresponding to different views. An agreement index that checks the accordance between the partitionings of different views is also optimized to obtain a consensus partitioning. The proposed approach is automatic in that it detects the number of clusters for any query automatically. Experiments are performed on three benchmark multi-view datasets corresponding to web search results and evaluated using the well-known F-measure metric. The results illustrate that our approach outperforms state-of-the-art techniques.

Introduction

Over the past few decades, Web Search Result Clustering (SRC) (Carpineto et al., 2009, Carpineto and Romano, 2010, Crabtree et al., 2005, Mitra et al., 2019) has gathered much attention for making web browsing easier for users. Today, nearly everyone uses the internet to fetch the information required for a given query through a web search engine (WSE) such as Google or Bing. After submitting a query to a WSE, the user receives a very large number of web pages, but not all of them are relevant, and the user may tire of searching for the relevant ones. To resolve this issue, web companies are working on optimizing search results by performing some form of text mining. The main objective of an SRC system is likewise to return a set of clusters of the web documents/snippets (WS) retrieved from the WSE after a query is given. In other words, SRC aims to categorize the search results into a set of clusters so that the user needs to traverse only the relevant cluster to get the desired information.

In the current paper, the SRC problem is posed as a multi-view clustering (MVC) problem (Wang & Chen, 2017) that takes advantage of two views of the web-snippets. Here, a view refers to a set of features that together represent an object; multiple views have proven useful in solving many real-life problems (Bickel and Scheffer, 2004, Cai et al., 2013). In our case, a view is either (a) the syntactic view of the web-snippets, which represents their structure in terms of unique words and uses the well-known tf-idf (Ramos et al., 2003) representation to obtain the feature vector; or (b) the semantic view, which is measured in one of three ways: (i) word mover distance (Kusner, Sun, Kolkin, & Weinberger, 2015); (ii) textual entailment (TE), i.e., whether a snippet is inferred by another snippet or not (Saini, Saha, Bhattacharyya, & Tuteja, 2020); (iii) universal sentence encoder (Cer et al., 2018). Any combination of views can be used to perform MVC (in our case, two or three views). The web-snippets for a given query are clustered (or partitioned) into different categories by considering the feature vectors of the different views; this is called multi-view clustering. Finally, to obtain a single, final partitioning that satisfies all the views, a consensus approach motivated by Saha, Mitra, & Kramer (2018) is used. Note that TE is itself a challenging problem in natural language processing and an active research area, as recent papers show (Liu et al., 2019, Yang et al., 2019). Therefore, in this paper, we explore TE as part of multi-view clustering.
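As an illustration of the syntactic view, the following minimal sketch builds tf-idf vectors for a few snippets. The tokenization (lowercase, whitespace split), the specific tf and idf formulas, and the example snippets are assumptions made for this sketch; the paper's exact preprocessing is not described in this section.

```python
import math
from collections import Counter


def tfidf_vectors(snippets):
    """Return (vocabulary, vectors) with one tf-idf vector per snippet.

    Assumes every snippet contains at least one token.
    """
    tokenized = [s.lower().split() for s in snippets]
    vocab = sorted({w for doc in tokenized for w in doc})
    n_docs = len(tokenized)
    # Document frequency: number of snippets that contain each word.
    df = Counter(w for doc in tokenized for w in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        # tf = relative frequency in the snippet, idf = log(N / df).
        vec = [(counts[w] / len(doc)) * math.log(n_docs / df[w]) for w in vocab]
        vectors.append(vec)
    return vocab, vectors


# Hypothetical snippets for an ambiguous query.
snippets = [
    "jaguar car dealer",
    "jaguar big cat habitat",
    "jaguar car review",
]
vocab, vectors = tfidf_vectors(snippets)
```

With this weighting, a word that occurs in every snippet (here the query term itself) receives zero weight, so the vectors are driven by the words that distinguish one snippet from another.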

If the data is very complex and has multiple views, it may be very difficult to obtain an optimal single, final partitioning. Therefore, to obtain an optimal partitioning satisfying all the views, we formulate multi-view clustering as a multi-objective optimization (MOO) problem. Note that multi-objective search results clustering (SRC) is naturally an application of multi-view clustering. In MOO, the task is to optimize more than one objective function. In our case, the partitioning corresponding to each view is quantified using an internal cluster validity index, namely the PBM index (Pakhira, Bandyopadhyay, & Maulik, 2004), which evaluates the quality of a partitioning in terms of compactness and the maximum separation between clusters. Compactness should be minimized and separation should be maximized for good-quality clusters. The PBM index values of the partitionings obtained using different views are optimized simultaneously so that the partitioning in each individual view attains its optimal structure. Moreover, the agreement index (AI), which measures the similarity among the partitionings obtained using different views, is also optimized simultaneously.

For optimization, a binary version of differential evolution (Wang, Fu, Menhas, & Fei, 2010) is used within a multi-objective optimization framework (MBDE). MBDE is based on an evolutionary procedure (Deb, 2001). It maintains a set of solutions, or chromosomes, called a population. Each solution is composed of two parts corresponding to two different views. Note that the advantages of MOO over single-objective optimization are already established in the literature (Acharya et al., 2014, Saha et al., 2018). MBDE generally uses the rand/1/bin scheme (Storn & Price, 1997) to explore the search space efficiently. In this paper, however, we use the current-to-best/1/bin DE variant, which provided a promising direction in the search for a globally optimal solution. Moreover, the crossover rate CR and the scaling factor F are two crucial parameters of the MBDE framework that help in finding globally optimal solutions. Therefore, instead of fixing these parameters, a pool of values is used, motivated by Wang, Cai, & Zhang (2011), and any value from this pool can be selected in a particular iteration. Because our approach is based on MOO and uses DE as the optimization strategy, we call it MMOO-DEclus (multi-view multi-objective differential evolution based clustering).
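The current-to-best/1/bin step with a parameter pool can be sketched as below. This is a real-valued illustration only: the mapping to binary solutions used by MBDE is omitted, the (F, CR) pairs in the pool are placeholders rather than the values used in the paper, and taking the first individual as "best" is an assumption, since a multi-objective setting needs its own rule for choosing it.

```python
import random


def current_to_best_1_bin(population, i, best, f, cr, rng):
    """Build one trial vector for individual i (real-valued sketch)."""
    target = population[i]
    others = [idx for idx in range(len(population)) if idx != i]
    r1, r2 = rng.sample(others, 2)
    # Mutation: move towards the best solution, plus one scaled difference.
    mutant = [
        t + f * (b - t) + f * (x1 - x2)
        for t, b, x1, x2 in zip(target, best, population[r1], population[r2])
    ]
    # Binomial crossover: position j_rand always comes from the mutant.
    j_rand = rng.randrange(len(target))
    return [
        m if (rng.random() < cr or j == j_rand) else t
        for j, (m, t) in enumerate(zip(mutant, target))
    ]


# Placeholder pool of (F, CR) pairs; one pair is drawn per trial vector.
PARAM_POOL = [(1.0, 0.1), (1.0, 0.9), (0.8, 0.2)]

rng = random.Random(0)
population = [[rng.random() for _ in range(5)] for _ in range(6)]
best = population[0]  # assumed choice of "best" for illustration only
f, cr = rng.choice(PARAM_POOL)
trial = current_to_best_1_bin(population, 3, best, f, cr, rng)
```

Drawing F and CR from a pool at each step avoids committing to a single setting, which matches the motivation stated above for not fixing the two parameters.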

With the advent of deep-learning based models, it has become possible to better capture the semantic similarity between two words. We use such models to capture the semantic similarity between two snippets, using either word mover distance (WMD) (Kusner et al., 2015), textual entailment (TE), or the universal sentence encoder (USE) (Cer et al., 2018). Here, TE asks whether snippet ‘a’ (called the hypothesis) is semantically inferred by snippet ‘b’ (called the premise/text) or not. It is denoted as ‘b → a’.

WMD relies on well-known word embeddings pre-trained on a large corpus and is thus able to capture the semantic relationship between two words. The distance between two texts/snippets measures their dissimilarity. Identical snippets/texts have a WMD of 0.
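The exact WMD requires solving an optimal transport problem. As a lighter illustration, the sketch below computes the relaxed WMD lower bound described in the same work by Kusner et al., in which each word simply moves all of its weight to the nearest word of the other snippet. The two-dimensional embeddings are invented for this example and stand in for real pre-trained vectors.

```python
import math
from collections import Counter

# Hypothetical 2-d embeddings; real use would load pre-trained word vectors.
EMB = {
    "obama": (1.0, 0.0),
    "president": (0.9, 0.1),
    "speaks": (0.0, 1.0),
    "greets": (0.1, 0.9),
    "media": (0.5, 0.5),
    "press": (0.55, 0.45),
}


def nbow(tokens):
    """Normalized bag-of-words weights of a tokenized snippet."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def one_sided_cost(src, dst):
    """Cost when every source word moves entirely to its nearest target word."""
    return sum(
        weight * min(math.dist(EMB[w], EMB[v]) for v in dst)
        for w, weight in src.items()
    )


def relaxed_wmd(tokens_a, tokens_b):
    """Relaxed WMD: the larger of the two one-sided costs (a lower bound on WMD)."""
    a, b = nbow(tokens_a), nbow(tokens_b)
    return max(one_sided_cost(a, b), one_sided_cost(b, a))


s1 = ["obama", "speaks", "media"]
s2 = ["president", "greets", "press"]
s3 = ["speaks", "speaks", "speaks"]
near = relaxed_wmd(s1, s2)
far = relaxed_wmd(s1, s3)
```

Here `s1` and `s2` share no words yet end up close because their embeddings are close, while identical snippets give a distance of 0, in line with the property stated above.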

For textual entailment, the recently proposed BERT (Devlin, Chang, Lee, & Toutanova, 2018) model is used. The motivation for exploring BERT is its strong performance on different NLP problems such as question answering and classification. Moreover, none of the existing MVC techniques has explored a TE-based view. More discussion of the TE-based view is provided in Section 3. Given two snippets, the model provides three values corresponding to three classes: entailment, contradiction, and neutral. A snippet ‘a’ is assigned to a cluster ‘k’ if ‘a’ has a high entailment value in comparison to the other classes. Note that a cluster is a group of similar web-snippets and is represented by its cluster center (medoid), which is itself a web-snippet. An example of assigning a web-snippet to a particular cluster based on TE is shown in Fig. 1. In this example, there are three clusters, Cluster1, Cluster2, and Cluster3, with three snippets, W1, W2, and W3, as their respective cluster centers. As can be seen, the snippet ‘If you help the needy, God will reward you.’ has a high entailment value with respect to Cluster1 and is therefore assigned to Cluster1.
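The assignment rule can be illustrated with a small sketch. The three-class scores below are invented numbers standing in for the output of an entailment model applied to the pair (cluster medoid, snippet), and choosing the cluster with the largest entailment score is one plausible reading of the rule described above, not necessarily the paper's exact procedure.

```python
def assign_by_entailment(te_scores):
    """Return the cluster whose medoid yields the highest entailment score."""
    return max(te_scores, key=lambda name: te_scores[name]["entailment"])


# Hypothetical model outputs for one snippet against three cluster medoids.
te_scores = {
    "Cluster1": {"entailment": 0.86, "contradiction": 0.03, "neutral": 0.11},
    "Cluster2": {"entailment": 0.10, "contradiction": 0.55, "neutral": 0.35},
    "Cluster3": {"entailment": 0.22, "contradiction": 0.08, "neutral": 0.70},
}
assigned = assign_by_entailment(te_scores)
```

In this made-up case entailment dominates the other two classes only for Cluster1, so the snippet is placed there.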

The universal sentence encoder (USE) is a model for encoding sentences as embedding vectors. The model is based on transfer learning and has been applied to various NLP tasks. For more information, the reader can refer to Cer et al. (2018). After generating vectors for the web-snippets, the cosine distance between them is calculated. If two web-snippets have a cosine distance of zero, they are considered exactly similar.
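The distance computation itself is simple, as the following sketch shows. The short vectors are hypothetical stand-ins for sentence embeddings, which in practice are much longer, and the vectors are assumed to be non-zero.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical embeddings of three snippets.
e1 = [1.0, 2.0, 3.0]
e2 = [2.0, 4.0, 6.0]   # same direction as e1
e3 = [3.0, 0.0, -1.0]  # orthogonal to e1
d_same = cosine_distance(e1, e2)
d_orth = cosine_distance(e1, e3)
```

Cosine distance depends only on direction, so two embeddings that differ in length but point the same way are also at distance zero (up to floating-point rounding).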

From an optimization point of view, it has been observed that only two techniques have been explored, NSGA-II (Deb, Pratap, Agarwal, & Meyarivan, 2002) and AMOSA (Bandyopadhyay, Saha, Maulik, & Deb, 2008), developed in 2002 and 2008, respectively. In recent years, however, much research has gone into improving DE (Zhang et al., 2016). The literature shows the efficacy of differential evolution over other optimization algorithms in different real-time applications such as clustering and document summarization (Dong, 2009, Saini et al., 2017, Saini et al., 2019a, Saini, Saha, Jangra et al., 2019, Vesterstrom and Thomsen, 2004, Zhang and Wei, 2014, Zhang et al., 2016). Moreover, DE has been reported to converge faster in finding globally optimal solutions (Saini et al., 2017, Storn and Price, 1997). Therefore, in the current paper, a multi-objective binary differential evolution (MBDE) strategy is designed and implemented for our task.

Owing to the popularity of the self-organizing map-based selection operator (Saini et al., 2019a, Saini, Saha, Chakraborty et al., 2019, Saini, Saha, Jangra et al., 2019, Zhang et al., 2016), it is also explored in combination with our proposed framework. In other words, we also investigate the effect of using the self-organizing multi-objective differential evolutionary algorithm recently introduced by Zhang et al. (2016). Note that Zhang et al. (2016) use the rand/1/bin DE variant; we replace it with the current-to-best/1/bin variant because our proposed framework uses the same.

The key contributions/highlights of this paper are enumerated below:

  • 1. The current paper proposes a novel unsupervised approach (MMOO-DEclus) for search results clustering (SRC).

  • 2. A binary version of multi-objective optimization (MOO) based differential evolution (DE) is used to optimize different cluster quality measures simultaneously, which, to the best of our knowledge, has never been attempted for the SRC task.

  • 3. While performing clustering, the concept of multi-view clustering (MVC) is used. The first view uses syntactic features (the tf-idf (Ramos et al., 2003) representation of the web-snippets). The second view uses semantic features based on any of the following: (a) word mover distance (Kusner et al., 2015); (b) a textual entailment (Saini et al., 2020) based similarity measure; (c) the universal sentence encoder. Note that these features have never been used in the literature (Saha et al., 2018, Wahid et al., 2014, Wang et al., 2013) for solving the MVC task.

  • 4. A study is presented that varies the view combinations, including single-view, two-view, and three-view settings.

  • 5. The proposed approach is designed to automatically detect the number of clusters in a given set of web-snippets.

  • 6. To show the potential of our MOO-based MVC (MMOO-DEclus), we develop two baselines that consider a single view (either the syntactic or the semantic view). The first and second baselines are based on single-objective and multi-objective optimization, and are named SSOO-DEclus and SMOO-DEclus, respectively.

Experiments are performed on three existing benchmark multi-view datasets corresponding to web search results and evaluated using the well-known F1-measure (Saini et al., 2020). Our approach is compared with several multi-objective and single-objective optimization approaches. The results illustrate that our approach outperforms state-of-the-art techniques. The rest of the paper is organized as follows. Section 2 discusses related work. The proposed architecture for SRC is discussed in Section 3. The experimental setup and a discussion of results are covered in Section 4. Finally, the conclusion is provided in Section 5.

Section snippets

Related works

Multi-view clustering explores the multiple views or representations available for the data. Generally, it is of two types: (a) all the available views or representations are used independently to generate partitionings, and the individual partitionings are then combined to obtain a single partitioning, an approach mainly known as distributed multi-view clustering; (b) a partitioning is obtained by considering all the views or representations simultaneously. This technique is

Proposed methodology

In this section, we describe our proposed approach for SRC in detail. The associated flowchart and pseudocode are provided in Fig. 2 and Algorithm 1, respectively.

Experimental setup and discussion of results

The current section discusses the datasets used, the evaluation measures, the parameters used, and the results obtained, followed by their discussion.

Conclusion

In this paper, we have proposed a multi-objective optimization-based multi-view clustering approach. Different views measuring different syntactic and semantic information are considered in our approach. Three new views based on word mover distance, textual entailment, and the universal sentence encoder, measuring semantic similarity, are incorporated into our framework. A binary differential evolution framework, which is an evolutionary algorithm, is used as the underlying optimization strategy.

CRediT authorship contribution statement

Naveen Saini: Conceptualization, Investigation, Methodology, Project administration, Validation, Writing - review & editing. Diksha Bansal: Conceptualization, Investigation, Methodology, Writing - original draft. Sriparna Saha: Conceptualization, Methodology, Project administration, Validation, Supervision, Writing - review & editing. Pushpak Bhattacharyya: Conceptualization, Supervision, Validation, Writing - review & editing.

Declaration of Competing Interest

One or more of the authors of this paper have disclosed potential or pertinent conflicts of interest, which may include receipt of payment, either direct or indirect, institutional support, or association with an entity in the biomedical field which may be perceived to have potential conflict of interest with this work. For full disclosure statements refer to https://doi.org/10.1016/j.eswa.2020.114299.

Acknowledgment

Dr. Sriparna Saha would like to acknowledge the support of the Early Career Research Award of the Science and Engineering Research Board (SERB) of the Department of Science and Technology, India, in carrying out this research.

References (57)

  • Carpineto, C. et al. (2009). A survey of web clustering engines. ACM Computing Surveys.
  • Carpineto, C. et al. Optimal meta search results clustering.
  • Cer, D. et al. (2018). Universal sentence encoder.
  • Crabtree, D. et al. Improving web clustering by cluster selection.
  • Da Silva, J.F. et al. Using localmaxs algorithm for the extraction of contiguous and non-contiguous multiword lexical units.
  • De Sa, V.R. Spectral clustering with two views.
  • Deb, K. (2001). Multi-objective optimization using evolutionary algorithms, Vol. 16.
  • Deb, K. et al. (2002). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation.
  • Devlin, J. et al. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding.
  • Dong, R. Differential evolution versus particle swarm optimization for PID controller design.
  • Foggia, P. et al. A graph-based clustering method and its applications.
  • Haykin, S.S. et al. (2009). Neural networks and learning machines, Vol. 3.
  • He, Z. et al. (2016). Visualization and performance metric in many-objective optimization. IEEE Transactions on Evolutionary Computation.
  • Ibrahimi, M. et al. Robust max-product belief propagation.
  • Jain, A.K. et al. (1988). Algorithms for clustering data.
  • Kumar, A., & Daumé, H. (2011). A co-training approach for multi-view spectral clustering. In Proceedings of the 28th...
  • Kusner, M. et al. From word embeddings to document distances.
  • Liu, J. et al. Gaussian mixture model with local consistency.
