Multi-objective materialized view selection using NSGA-II

Prakash, Jay; Kumar, T. V. Vijay

doi:10.1007/s13198-020-01030-6

Abstract

A data warehouse is constructed with the purpose of supporting decision making. Decision making queries, being long and complex, consume a lot of time when processed against a continuously growing data warehouse. View materialization is one way of improving the response time of such analytical or decision making queries. It involves selecting and materializing views that minimize the analytical query response times while adhering to the resource constraints. This is referred to as the view selection problem, which is an NP-Hard problem. The view selection problem is concerned with simultaneously minimizing the cost of evaluating materialized and non-materialized views. Being a bi-objective optimization problem, it is addressed using NSGA-II in this paper. The proposed approach aims to achieve an acceptable trade-off between the aforementioned two objectives.

1 Introduction

Commercial organizations make business decisions by analyzing their performance using historical business transaction data. The data is usually spread across multiple disparate databases, as the offices of an organization are spread across many locations in the world. Since decision making queries are analytical in nature, processing them is time consuming, as the required data has to be retrieved from multiple data sources, either using the on-demand approach or the in-advance approach for querying (Widom 1995). Data warehousing is based on the latter approach, where data is retrieved in advance from multiple disparate sources and accumulated in a central repository, referred to as a data warehouse, within the organization. The purpose of a data warehouse is to answer analytical queries in order to facilitate management decisions. A data warehouse storing time variant and non-volatile data is always available for querying, even when remote data sources are inaccessible (Inmon 2003; Kimball and Ross 2002).

A data warehouse is designed with the purpose of supporting decision making (Inmon 2003). Since decision making queries are long and complex, processing these queries against a data warehouse is time consuming. A data warehouse addresses this problem by pre-computing and storing the relevant and required data in the form of materialized views (Roussopoulos 1997). For an n-dimensional data set, the possible number of views that can be stored is 2ⁿ. Though materializing all possible views would result in the fastest query response time, it becomes infeasible to store all of them when the value of n is high, due to storage space constraints. Thus, for higher values of n, a subset of views is to be selected that minimizes the analytical query response time subject to the resource constraints (Chirkova et al. 2001; Yousri et al. 2005). This is referred to as the view selection problem, which is a constrained combinatorial optimization problem shown to be NP-Complete (Karloff and Mihail 1999).

1.1 View selection

View selection is key to the effective and efficient design of a data warehouse. Data warehouses are designed to answer decision making queries. These queries take a long time to process when posed against a continuously growing data warehouse, so a mechanism that enables answering them efficiently needs to be devised. One such mechanism is to select a subset of views that have the potential for answering most future queries and to materialize them, so that most future analytical queries are answered using these materialized views. These materialized views, which conform to the storage space constraint, are considerably smaller than the data warehouse and, thus, would process the analytical queries efficiently, resulting in reduced response times. The challenge lies in selecting such views. They cannot be selected arbitrarily, as the set of views may become redundant and may not have the potential of answering analytical queries posed in future, thereby making them an unnecessary space overhead. The objective should be to select those views that have the potential of answering future analytical queries. View selection aims to select views that reduce the response times for analytical queries while conforming to the limited resources available for materialization (Chirkova et al. 2001). Several view selection algorithms exist, most of which are either empirical (Agrawal et al. 2000; Aouiche and Darmont 2009; Baralis et al. 1997; Golfarelli and Rizzi 2000; Lehner et al. 1996; Li et al. 2005; Lin et al. 2007; Luo 2007; Teschke and Ulbrich 1997; Vijay Kumar et al. 2010b; Vijay Kumar and Devi 2012, 2013) or heuristic based (Encinas and Montano 2007; Gupta and Mumick 2005; Gupta et al. 1997; Haider and Vijay Kumar 2011, 2017; Harinarayan et al. 1996; Shah et al. 2006; Valluri et al. 2002; Vijay Kumar 2013; Vijay Kumar and Ghoshal 2009; Vijay Kumar et al. 2010a, 2011; Vijay Kumar and Haider 2010, 2011a, b, 2012, 2015).

Several view selection frameworks, such as the lattice (Harinarayan et al. 1996), Cube (Shukla et al. 1998), AND-OR view graph (Gupta 1997) and Multi-View Processing Plan (Yang et al. 1997), exist for which the view selection problem has been studied. In this paper, a multidimensional lattice framework is considered for view selection.

In a multidimensional lattice framework (Harinarayan et al. 1996), the number of views in the lattice is exponential in the number of dimensions of the star schema from which the lattice has been derived. That is, for a star schema comprising a fact table surrounded by n dimension tables, the corresponding multidimensional lattice would comprise 2ⁿ views. Consider the 3-dimensional lattice shown in Fig. 1, with dimensions P, Q and R, wherein the nodes represent the views grouped by the respective dimensions, the view index is given alongside the view name and the size of the view is given alongside the view node. All the views in the lattice depend on the root view PQR, which is based on the grouping of all the dimensions P, Q and R. That is, queries on any of the views in the lattice can be answered using PQR. Further, queries on a view can be answered by any view above it that is directly or indirectly connected to it through edge(s), i.e. there is a direct or indirect dependency between them.
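The lattice structure described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `build_lattice` and `ancestors` are hypothetical, and each view is represented simply by the set of dimensions it groups by, so that "view A can answer queries on view B" becomes "A's dimensions are a proper superset of B's".

```python
from itertools import combinations

def build_lattice(dimensions):
    """Return every group-by view over the given dimensions as a frozenset, root view first."""
    views = []
    for r in range(len(dimensions), -1, -1):
        for combo in combinations(dimensions, r):
            views.append(frozenset(combo))
    return views

def ancestors(view, views):
    """Views that can answer queries on `view`: those grouping by a proper superset of its dimensions."""
    return [v for v in views if view < v]

lattice = build_lattice(["P", "Q", "R"])   # 2**3 = 8 views
root = lattice[0]                          # the view PQR
```

For three dimensions this yields the eight views of the lattice, and the root PQR appears among the ancestors of every other view, which matches the statement that any query on the lattice can be answered using PQR.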

For higher dimensions, the number of views in the multidimensional lattice grows exponentially with the number of dimensions and, thus, it becomes infeasible to select an optimal set of views that complies with the storage space constraint. Alternatively, a relevant subset of views can be selected from the multidimensional lattice. This being an NP-Complete problem (Karloff and Mihail 1999), it can be appropriately addressed using metaheuristic algorithms that aim to achieve an optimal trade-off between exploration and exploitation. Several metaheuristic view selection algorithms exist, most of which are randomized (Kalnis et al. 2002; Lee and Hammer 2001; Theodoratos et al. 2001; Vijay Kumar and Kumar 2012a, c, 2015), evolutionary based (Horng et al. 1999; Kumar and Vijay Kumar 2018b; Lin and Kuo 2004; Vijay Kumar and Kumar 2012b, 2013, 2014; Wang and Zhang 2005; Yu et al. 2003; Zhang et al. 2001; Zhou et al. 2012) or swarm based (Arun and Vijay Kumar 2015a, b, 2017a, b; Sun and Wang 2009; Vijay Kumar and Arun 2016, 2017; Kumar and Vijay Kumar 2017, 2018a). In (Vijay Kumar and Kumar 2012a, b, c), the materialized view selection (MVS) problem is formulated as a single objective optimization problem with the objective of minimizing the total view evaluation cost (TVEC). However, this problem was expressed as a bi-objective MVS problem in (Prakash and Vijay Kumar 2019a) and was addressed using VEGA in (Prakash and Vijay Kumar 2019a), MOGA in (Prakash and Vijay Kumar 2020) and SPEA-2 in (Prakash and Vijay Kumar 2019b). Among these, the SPEA-2 based MVS algorithm performed comparatively better. In this paper, the Non-dominated Sorting Genetic Algorithm II (NSGA-II) (Deb et al. 2002) has been used to address this bi-objective optimization problem.

1.2 Organization of the paper

This paper is organized as follows: Sect. 2 discusses MVS using NSGA-II followed by an illustrative example in Sect. 3. Experimental results are given in Sect. 4. The conclusions are given in Sect. 5.

2 MVS using NSGA-II

The MVS problem was addressed in (Vijay Kumar and Kumar 2012a, b, c) by minimizing the TVEC, which is defined below:

$$ TVEC(V_{Top - K} ) = \sum\limits_{{i = 1 \wedge SM_{{V_{i} }} = 1}}^{N} {Size(V_{i} )} + \sum\limits_{{i = 1 \wedge SM_{{V_{i} }} = 0}}^{N} {SizeSMA(V_{i} )} $$

where ${\text{Size}}\left({\text{V}}_{\text{i}}\right)$ and ${\text{SizeSMA}}\left({\text{V}}_{\text{i}}\right)$ are the size of the ith view V_i and the size of the smallest materialized ancestor of V_i respectively, ${\text{SM}}_{{\text{V}}_{\text{i}}}$ is the materialization status of V_i, with value 1 if V_i is materialized and 0 otherwise, and N is the number of views in the lattice.

The above MVS problem was expressed as a bi-objective optimization problem with the following objectives (Prakash and Vijay Kumar 2019a):

$$ Minimize\;C_{MV} = \sum\limits_{{i = 1 \wedge SM_{{V_{i} }} = 1}}^{N} {Size(V_{i} )} $$

$$ Minimize\;C_{NMV} = \sum\limits_{{i = 1 \wedge SM_{{V_{i} }} = 0}}^{N} {SizeSMA(V_{i} )} $$
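Read against the TVEC definition above, and assuming (as in that definition and in the worked example of Sect. 3) that C_MV sums Size(V_i) over the materialized views and C_NMV sums SizeSMA(V_i) over the non-materialized views, the two objectives are simply the two partial sums of the single-objective cost:

$$ TVEC(V_{Top - K} ) = C_{MV} + C_{NMV} $$

The bi-objective formulation therefore keeps the two terms separate instead of collapsing them into one scalar.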

Further, since these costs were shown to conflict with each other, as a decrease in C_MV results in an increase in C_NMV and vice versa, Pareto based multi-objective evolutionary algorithms like MOGA (Prakash and Vijay Kumar 2020) and SPEA-2 (Prakash and Vijay Kumar 2019b) have already been used to solve this problem. In this paper, the Pareto based technique NSGA-II (Deb et al. 2002) has been adapted to address this bi-objective MVS problem. NSGA-II is briefly discussed next.

2.1 NSGA-II

Srinivas and Deb proposed the first non-dominated sorting genetic algorithm (NSGA) for multi-objective optimization (Srinivas and Deb 1994). NSGA had limitations such as high computational complexity and the need to specify the sharing parameter a priori (Deb et al. 2002; Deb 2014). In 2002, Deb et al. proposed “a fast and elitist multi-objective genetic algorithm: NSGA-II.” NSGA-II uses fast non-dominated sorting and a diversity preserving mechanism along with elite preservation, which address the limitations of NSGA and do not require the sharing parameter to be chosen a priori (Deb et al. 2002). NSGA-II is one of the most popular and widely used multi-objective evolutionary algorithms for multi-criteria optimization. It generates non-dominated and diversely spread solutions on the Pareto optimal front using rank and crowding distance (Deb et al. 2002). NSGA-II has been shown to generate better trade-off solutions between the conflicting objectives of a multi-objective optimization problem. In this paper, the bi-objective MVS problem has been addressed using NSGA-II. Accordingly, an NSGA-II based MVS algorithm (MVS_NSGA-II) is proposed and is discussed next.

2.2 Algorithm MVS_NSGA-II

The algorithm MVS_NSGA-II takes as input a lattice of views L, the sizes of the views in L, the number N of Top-K view-vectors in the population, the number of objective functions |F|, the crossover probability CP and the mutation probability MP, and outputs the Top-K view-vectors in the first Pareto optimal front. The algorithm MVS_NSGA-II is given in Fig. 2.

The Top-K views in the population, denoted as a set of K view indexes in L, are hereafter referred to as a Top-K view-vector (Top-KVV). Each element of a Top-KVV is a view index in L. An example of a Top-KVV of 5 indexes (K = 5) is given below:

First, the population PTKV of Top-KVVs is randomly generated, in which each Top-KVV is a permutation of K distinct indexes from the lattice L and N is the number of Top-KVVs in PTKV. Next, C_MV and C_NMV are computed for each of the Top-KVVs in PTKV.

Using these computed values, a non-dominated sorting (Deb et al. 2002) is performed on the Top-KVVs in PTKV. For each Top-KVV TKV, the set of Top-KVVs dominated by it, i.e. TKVD, and the number of Top-KVVs dominating it, i.e. DTKV, are computed. Initially, TKVD and DTKV for the Top-KVV TKV are assigned an empty set and zero respectively. Thereafter, for each other Top-KVV TKV' in PTKV, the C_MV and C_NMV values of TKV' and TKV are compared. If TKV dominates TKV', TKV' is added to TKVD. Otherwise, if TKV' dominates TKV, the value of DTKV is incremented. After computation of TKVD and DTKV for each Top-KVV in PTKV, the Top-KVVs having a DTKV value equal to 0 are assigned rank one and are added to the first Pareto front. Next, for each Top-KVV in the first Pareto front, the DTKV value of every Top-KVV in its TKVD is decremented by 1. Thereafter, the Top-KVVs whose DTKV value becomes zero are assigned rank two and are added to the second Pareto front. For the remaining Top-KVVs, ranks and Pareto fronts are computed in a similar manner.

Next, the crowding distance CD of each Top-KVV is computed. The Top-KVVs in each Pareto front are sorted based on the C_MV and C_NMV values. The CD of each boundary Top-KVV is assigned the value ‘∞’. The CD of each of the remaining Top-KVVs is computed as given below:

$$ CD(TKV_{j}) = CD(TKV_{j} ) + \frac{{(TKV_{j + 1} )_{F} - (TKV_{j - 1} )_{F} }}{\max (F) - \min (F)} $$

where CD(TKV_j) is the crowding distance of the jth Top-KVV in the sorted order, (TKV_j)_F is the value of the Fth objective function for the jth Top-KVV, and max(F) and min(F) are the maximum and minimum values respectively of the Fth objective in the corresponding Pareto front.
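The sorting and crowding-distance steps described above can be sketched as follows. This is a minimal Python sketch of the general NSGA-II procedure (Deb et al. 2002), not the authors' MATLAB implementation. The function names are hypothetical, each solution is reduced to a tuple of objective values (C_MV, C_NMV), and the sample values in `objs` are made up purely for illustration.

```python
import math

def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(objs):
    """Return the Pareto fronts as lists of indexes into objs, best front first."""
    n = len(objs)
    dominated = [[] for _ in range(n)]   # plays the role of TKVD
    counts = [0] * n                     # plays the role of DTKV
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(objs[i], objs[j]):
                dominated[i].append(j)
            elif dominates(objs[j], objs[i]):
                counts[i] += 1
    fronts = [[i for i in range(n) if counts[i] == 0]]
    while fronts[-1]:
        nxt = []
        for i in fronts[-1]:
            for j in dominated[i]:
                counts[j] -= 1
                if counts[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
    return fronts[:-1]                   # drop the trailing empty front

def crowding_distance(front, objs):
    """Crowding distance of every index in one front, normalised by that front's range."""
    cd = {i: 0.0 for i in front}
    for m in range(len(objs[front[0]])):
        ordered = sorted(front, key=lambda i: objs[i][m])
        lo, hi = objs[ordered[0]][m], objs[ordered[-1]][m]
        cd[ordered[0]] = cd[ordered[-1]] = math.inf   # boundary solutions
        if hi == lo:
            continue
        for k in range(1, len(ordered) - 1):
            cd[ordered[k]] += (objs[ordered[k + 1]][m] - objs[ordered[k - 1]][m]) / (hi - lo)
    return cd

# Hypothetical (C_MV, C_NMV) pairs, not taken from the paper
objs = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
fronts = non_dominated_sort(objs)
cd_first = crowding_distance(fronts[0], objs)
```

On this toy input the first front contains the three mutually non-dominated points, and the interior point (2, 3) receives a finite crowding distance while the two boundary points receive ∞.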

Next, the mating population of Top-KVVs is selected from PTKV using the crowded comparison operator (Deb et al. 2002). In this method, from amongst two Top-KVVs, the one with the better rank, i.e. the one in the lower-numbered Pareto front, is selected. In case both the Top-KVVs have the same rank, the Top-KVV with the higher CD value is selected. Next, modified crossover (Davis 1985) and random mutation (Goldberg 1989), with probabilities CP and MP respectively, are performed on the Top-KVVs in the mating population to produce the offspring population of Top-KVVs QTKV. Thereafter, the two populations PTKV and QTKV are combined to form an intermediate population ITKV of size 2N. Non-dominated sorting is performed on the Top-KVVs in ITKV, which are then ordered by rank and, within each rank, by CD. Finally, the top N Top-KVVs are selected to form the new population PTKV.
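The selection steps in this paragraph can be illustrated with a short Python sketch. It assumes that each Top-KVV is stored as a dictionary with hypothetical keys `rank` and `cd` (already computed), and it uses a binary tournament to apply the crowded comparison operator, which is how NSGA-II is usually described but is an assumption here. Crossover and mutation are deliberately left out, since their exact form is given only by reference in the text.

```python
import random

def crowded_better(a, b):
    """Crowded comparison: the lower rank wins; ties go to the larger crowding distance."""
    if a["rank"] != b["rank"]:
        return a if a["rank"] < b["rank"] else b
    return a if a["cd"] >= b["cd"] else b

def binary_tournament(population, rng):
    """Build a mating pool of the same size by repeated two-way tournaments (assumed scheme)."""
    return [crowded_better(*rng.sample(population, 2)) for _ in population]

def select_survivors(combined, n):
    """Keep the n best members of the combined parent and offspring population."""
    return sorted(combined, key=lambda s: (s["rank"], -s["cd"]))[:n]

# Hypothetical individuals; only rank and crowding distance matter for selection
a = {"rank": 1, "cd": 0.5}
b = {"rank": 2, "cd": float("inf")}
c = {"rank": 1, "cd": 1.4}
pool = binary_tournament([a, b, c], random.Random(0))
survivors = select_survivors([b, a, c], 2)
```

Sorting by `(rank, -cd)` reproduces the ordering described above: all rank-one vectors come first, and within a rank the less crowded vectors are preferred.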

Steps 2 to 8 of the algorithm are executed for MaxG generations and thereafter, the Top-KVVs in the first Pareto front are produced as output.

An example illustrating the computation of non-dominated Top-KVVs from a three dimensional lattice of Fig. 1 using the algorithm MVS_NSGA-II is given next.

3 An example

Suppose Top-4 views are to be selected from the three dimensional lattice shown in Fig. 1. The randomly generated initial population PTKV of eight Top-4VVs is given in Fig. 3.
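A random initial population of this shape can be generated as in the sketch below. The function name `random_population` is hypothetical, and the sketch assumes that the root view (index 1) is always materialized and is therefore not drawn, which is consistent with the worked computation that follows but is not stated explicitly in the text.

```python
import random

def random_population(num_views, k, n, rng):
    """Generate n Top-K view-vectors, each holding k distinct view indexes."""
    candidates = range(2, num_views + 1)   # assumption: root view 1 is excluded from selection
    return [rng.sample(candidates, k) for _ in range(n)]

# Eight Top-4VVs over a lattice of eight views, as in the example
population = random_population(num_views=8, k=4, n=8, rng=random.Random(42))
```

Because `sample` draws without replacement, every view-vector contains distinct indexes, as required of a Top-KVV.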

For each Top-4VV in the population, the C_MV and C_NMV values are computed. As an example, the computation of the objective values for T4V_1, i.e. [4, 2, 5, 6], in the population PT4V is given below. Note that the root view V_1 is counted in C_MV along with the four selected views:

$$ \begin{aligned} C_{MV} = & \sum\limits_{{i = 1 \wedge SM_{{V_{i} }} = 1}}^{N} {Size(V_{i} )} \\ = & \sum\limits_{{i = 1 \wedge SM_{{V_{i} }} = 1}}^{8} {Size(V_{i} ) = } Size(V_{1} ) + Size(V_{4} ) + Size(V_{2} ) + Size(V_{5} ) + Size(V_{6} ) \\ = & 54 + 50 + 48 + 23 + 19 \\ = & 194 \\ \end{aligned} $$

$$ \begin{aligned} C_{NMV} = & \sum\limits_{{i = 1 \wedge SM_{{V_{i} }} = 0}}^{N} {SizeSMA(V_{i} )} \\ = & \sum\limits_{{i = 1 \wedge SM_{{V_{i} }} = 0}}^{8} {SizeSMA(V_{i} )} = SizeSMA(V_{3} ) + SizeSMA(V_{7} ) + SizeSMA(V_{8} ) \\ = & Size(V_{1} ) + Size(V_{4} ) + Size(V_{6} ) \\ = & 54 + 50 + 19 \\ = & 123 \\ \end{aligned} $$

In a similar manner, the values of C_MV and C_NMV are computed for the remaining Top-4VVs in PT4V and are given in Fig. 4.
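The computation above can be reproduced with a short Python sketch. Two things here are assumptions rather than facts from the paper: the function name `costs`, and the mapping of view indexes to dimension sets in `dims`, which is chosen only so that it is consistent with the smallest materialized ancestors used in the worked example (Fig. 1 itself is not reproduced here). Only the view sizes quoted in the example are used; the sizes of V_3, V_7 and V_8 are not needed because these views are not materialized.

```python
# Sizes quoted in the worked example for the materialized views
size = {1: 54, 2: 48, 4: 50, 5: 23, 6: 19}

# ASSUMED index-to-view mapping, consistent with the example's ancestor choices
dims = {1: "PQR", 2: "PQ", 3: "PR", 4: "QR", 5: "P", 6: "Q", 7: "R", 8: ""}

def costs(selected, size, dims, root=1):
    """Return (C_MV, C_NMV) for a view-vector; the root view is assumed always materialized."""
    materialized = set(selected) | {root}
    c_mv = sum(size[v] for v in materialized)
    c_nmv = 0
    for v in dims:
        if v in materialized:
            continue
        # Materialized ancestors: views grouping by a proper superset of v's dimensions
        anc = [a for a in materialized if set(dims[v]) < set(dims[a])]
        c_nmv += min(size[a] for a in anc)
    return c_mv, c_nmv

c_mv, c_nmv = costs([4, 2, 5, 6], size, dims)
print(c_mv, c_nmv)   # 194 123
```

Under this mapping, V_3 is answered from V_1, V_7 from V_4 and V_8 from V_6, giving the same values of 194 and 123 as the hand computation.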

Next, non-dominated sorting is performed on the Top-4VVs in PT4V. The set T4VD_i, containing the Top-4VVs dominated by the ith Top-4VV T4V_i, and DT4V_i, which is the number of Top-4VVs that dominate the ith Top-4VV T4V_i, are computed. These values are given in Fig. 4.

The Top-4VVs T4V₂, T4V₄, and T4V₈ are not dominated by any other view-vector; thus DT4V₂ = DT4V₄ = DT4V₈ = 0. Therefore, these Top-4VVs are assigned rank one and become the members of the first Pareto front F₁. Next, for computing the Top-4VVs to be assigned to the second Pareto front, the value of DT4V_i of the remaining Top-4VVs, i.e., T4V₁, T4V₃, T4V₅, T4V₆, and T4V₇, which are not in the first Pareto front, is decremented by one. After decrementing DT4V_i, the value of DT4V_i becomes zero for the Top-4VVs T4V₅, T4V₆, and T4V₇. These Top-4VVs are assigned rank two and become the members of the second Pareto front F₂. In a similar manner, the other non-dominated Pareto fronts are computed and are given in Fig. 4.

The Top-4VVs of each Pareto front are sorted on C_MV and C_NMV. The front-wise sorted Top-4VVs for C_MV and C_NMV are given in Fig. 5. In each Pareto front, the boundary Top-4VVs, i.e. those having the minimum and maximum values in the Pareto front, are assigned a crowding distance (CD) value ‘∞’. That is, T4V₁, T4V₂, T4V₃, T4V₆, T4V₇ and T4V₈ are assigned the value ‘∞’. The CD values are computed for the remaining Top-4VVs using the sorted order of the Top-4VVs in the respective front.

The computation of the crowding distance (CD) for the Top-4VV T4V₄ is given below:

Initially, the CD value of T4V₄ is set to zero, i.e. CD[T4V₄] = 0.

$$ CD[T4V_{4} ] = CD[T4V_{4} ] + \frac{{C_{MV} [T4V_{8} ] - C_{MV} [T4V_{2} ]}}{{\max [C_{MV} ] - \min [C_{MV} ]}} = 0 + \frac{215 - 185}{{235 - 185}} = \frac{30}{{50}} = 0.60 $$

$$ CD[T4V_{4} ] = CD[T4V_{4} ] + \frac{{C_{NMV} [T4V_{2} ] - C_{NMV} [T4V_{8} ]}}{{\max [C_{NMV} ] - \min [C_{NMV} ]}} = 0.60 + \frac{127 - 107}{{131 - 107}} = 0.60 + 0.83 = 1.43 $$

In a similar manner, the CD value of T4V₅ is computed as 0.83 and is given in Fig. 6.
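The arithmetic for T4V₄ can be checked with the short Python sketch below. The helper name `cd_term` is hypothetical, the numbers are exactly those appearing in the two equations above, and the second term is taken to be the C_NMV contribution, since its values (127, 107, 131) fall in the range of C_NMV rather than C_MV.

```python
def cd_term(next_value, prev_value, f_max, f_min):
    """One objective's contribution to the crowding distance of an interior solution."""
    return (next_value - prev_value) / (f_max - f_min)

cd = 0.0
cd += cd_term(215, 185, 235, 185)   # C_MV term: 30 / 50 = 0.60
cd += cd_term(127, 107, 131, 107)   # C_NMV term (assumed): 20 / 24 ≈ 0.83
print(round(cd, 2))                 # 1.43
```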

The crowded comparison operator is used to select the mating pool of Top-4VVs from PT4V. This selection is given in Fig. 7.

The offspring population QT4V of Top-4VVs is computed by applying a single point modified crossover, with CP = 0.8, and random mutation, with MP = 0.1, on the mating pool of Top-4VVs and is given in Fig. 8.

Next, the Top-4VVs in PT4V are combined with the offspring population of Top-4VVs in QT4V to form an intermediate population of Top-4VVs IT4V. Thereafter, non-dominated sorting is performed on the Top-4VVs in IT4V to compute PT4V for the next generation. The Top-4VVs in the first Pareto front and the population PT4V for the next generation, after completion of the first iteration, are given in Fig. 9.

An experimental comparison of MVS_NSGA-II with the SPEA-2 based MVS algorithm (MVS_SPEA-2) (Prakash and Vijay Kumar 2019b) is discussed next.

4 Experimental results

The algorithms MVS_NSGA-II and MVS_SPEA-2 were implemented using MATLAB 8.0 on a Windows 7 platform on a PC with an Intel Core i5 processor and 4 GB RAM. First, graphs were plotted to ascertain the crossover (CP) and mutation (MP) probabilities for which MVS_NSGA-II is able to select the Top-10 views with comparatively lower TVEC. The four combinations of (CP, MP) considered are (CP = 0.6, MP = 0.05), (CP = 0.6, MP = 0.10), (CP = 0.8, MP = 0.05) and (CP = 0.8, MP = 0.10). The graphs depicting the TVEC versus generations for selecting Top-10 views for dimensions 5 to 10 were plotted and are shown in Figs. 10, 11, 12, 13, 14 and 15 respectively. After observing these graphs, it can be said that MVS_NSGA-II selects better quality views, having a comparatively lower TVEC, for the combination (CP = 0.8, MP = 0.10) for all dimensions. Therefore, CP = 0.8 and MP = 0.10 are considered for MVS_NSGA-II while comparing it with the algorithm MVS_SPEA-2.

Next, MVS_NSGA-II (CP = 0.8, MP = 0.10) was compared with MVS_SPEA-2 (CP = 0.8, MP = 0.10) (Prakash and Vijay Kumar 2019b) in terms of the C_MV and the C_NMV of the Top-10 views in the first Pareto front, selected by MVS_NSGA-II and MVS_SPEA-2 after 200 generations. The corresponding graphs for dimensions 5 to 10 are shown in Figs. 16, 17, 18, 19, 20 and 21 respectively. After observing these graphs, it can be said that MVS_NSGA-II was able to select a similar number of non-dominated Top-10 views having a comparatively better spread on the Pareto optimal front. This was further substantiated by comparing the trade-off between C_MV and C_NMV of the Top-10 views selected by the two algorithms using the C-measure (Zitzler et al. 1999, 2000, 2003). The values of the C-measure for the selection of the Top-10 views by MVS_NSGA-II and MVS_SPEA-2, for dimensions 5 to 10, are given in Fig. 22. It can be observed that, for all dimensions, C(MVS_NSGA-II, MVS_SPEA-2) has a comparatively higher value than C(MVS_SPEA-2, MVS_NSGA-II), thereby implying that the Top-10 views selected by MVS_NSGA-II have a comparatively wider spread than those selected using MVS_SPEA-2. As a result, it can be inferred that MVS_NSGA-II selects views that achieve a comparatively better trade-off between C_MV and C_NMV.
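For readers unfamiliar with the C-measure, the following Python sketch shows how it is commonly computed: C(A, B) is the fraction of solutions in B that are covered, i.e. weakly dominated, by at least one solution in A. The function names are hypothetical and the two fronts below are made-up (C_MV, C_NMV) pairs, not the experimental results reported in Fig. 22.

```python
def covers(a, b):
    """True if a is no worse than b on every (minimised) objective."""
    return all(x <= y for x, y in zip(a, b))

def c_measure(A, B):
    """Fraction of solutions in B covered by at least one solution in A."""
    return sum(any(covers(a, b) for a in A) for b in B) / len(B)

# Hypothetical first Pareto fronts of two algorithms
front_a = [(180, 130), (200, 110), (220, 100)]
front_b = [(190, 135), (200, 110), (230, 95)]

print(c_measure(front_a, front_b))   # two of three solutions in front_b are covered
print(c_measure(front_b, front_a))   # one of three solutions in front_a is covered
```

The measure is not symmetric, which is why both C(MVS_NSGA-II, MVS_SPEA-2) and C(MVS_SPEA-2, MVS_NSGA-II) are reported and compared.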

5 Conclusions

This paper focuses on the use of the Pareto based multi-objective genetic algorithm NSGA-II to solve the bi-objective MVS problem. Accordingly, MVS_NSGA-II, which minimizes the two costs C_MV and C_NMV simultaneously to select the Top-K views for materialization from a multidimensional lattice, is proposed. An experimental comparison of MVS_NSGA-II is carried out with the existing SPEA-2 based materialized view selection algorithm MVS_SPEA-2 in order to compare the quality, with respect to C_MV and C_NMV, of the non-dominated Top-K views selected by them. The experimental results show that the proposed algorithm MVS_NSGA-II was able to select Top-K views that achieved a comparatively better trade-off between C_MV and C_NMV. Further, it was observed that the Top-K views selected by MVS_NSGA-II had a comparatively better spread on the Pareto optimal front. Thus, it can be stated that the proposed algorithm MVS_NSGA-II is able to select comparatively better quality Top-K views. These selected views, if materialized, would reduce the analytical query response time, thereby resulting in effective decision making.

References

Agrawal S, Chaudhuri S, Narasayya V (2000) Automated selection of materialized views and indexes in SQL databases. In: 26th International conference on very large data bases (VLDB 2000), Cairo, Egypt, pp 496–505
Aouiche K, Darmont J (2009) Data mining-based materialized view and index selection in data warehouse. J Intell Inf Syst 33(1):65–93
Arun B, Vijay Kumar TV (2015a) Materialized view selection using marriage in honey bees optimization. Int J Nat Comput Res 5(3):1–25
Arun B, Vijay Kumar TV (2015b) Materialized view selection using improvement based bee colony optimization. Int J Softw Sci Comput Intell 7(4):35–61
Arun B, Vijay Kumar TV (2017a) Materialized view selection using artificial bee colony optimization. Int J Intell Inf Technol 13(1):26–49
Arun B, Vijay Kumar TV (2017b) Materialized view selection using bumble bee mating optimization. Int J Decis Support Syst Technol 9(3):1–27
Baralis E, Paraboschi S, Teniente E (1997) Materialized view selection in a multidimensional database. In: 23rd International conference on very large data bases (VLDB 1997), Athens, Greece, pp 156–165
Chirkova R, Halevy AY, Suciu D (2001) A formal perspective on the view selection problem. 27th International conference on very large data bases (VLDB 2001). Roma, Italy, pp 59–68
Davis L (1985) Applying adaptive algorithms to epistatic domains. In: Proceedings of the international joint conference on artificial intelligence, Los Angeles, California, pp 162–164.
Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evolut Comput 6(2):182–197. https://doi.org/10.1109/4235.996017
Deb K (2014) Multi-objective optimization using evolutionary algorithms. Wiley, New Delhi
Encinas-Serna and Montano-Hoya (2007) Algorithm for selection of materialized views: based on a costs model. In: Proceedings of ICCT, pp. 18–24
Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning, vol 1. Addison Wesley, Boston. https://doi.org/10.1007/s10589-009-9261-6
Golfarelli M, Rizzi S (2000) View materialization for nested GPSJ queries. In: Proceedings of the international workshop on design and management of data warehouses (DMDW’ 2000), Stockholm, Sweden, pp 1–9
Gupta H (1997) Selection of views to materialize in a data Warehouse. In: Proceedings of the 6th international conference on database theory. Springer-Verlag, London, pp 98–112. Retrieved from https://dl.acm.org/citation.cfm?id=645502.656089
Gupta H, Mumick IS (2005) Selection of views to materialize in a data warehouse. IEEE Trans Knowledge Data Eng 17(1):24–43
Gupta H, Harinarayan V, Rajaraman V, Ullman J (1997) Index Selection for OLAP. In: Proceedings of the 13th international conference on data engineering, ICDE 97, Birmingham, UK, pp 208–219.
Haider M, Vijay Kumar TV (2011) Materialised views selection using size and query frequency. Int J Value Chain Manage 5(2):95–105
Haider M, Vijay Kumar TV (2017) Query frequency based view selection. Int J Bus Anal 4(1):36–55
Harinarayan V, Rajaraman A, Ullman JD (1996) Implementing data cubes efficiently. ACM SIGMOD, Montreal, Canada, pp 205–216
Horng JT, Chang YJ, Liu BJ, Kao CY (1999) Materialized view selection using genetic algorithms in a data warehouse system. In: Proceedings of the 1999 congress on evolutionary computation, vol 3, Washington D. C., USA, IEEE CEC, pp 2221–2227
Inmon WH (2003) Building the data warehouse, 3rd edn. Wiley Dreamtech India Pvt, Ltd
Kalnis P, Mamoulis N, Papadias D (2002) View selection using randomized search. Data Knowledge Eng 42(1):89–111
Karloff H, Mihail M (1999) On the complexity of the view selection problem. In: Proceedings of the eighteenth ACM-SIGMOD-SIGACT-SIGART symposium on principles of database systems (PODS), May 1999, pp. 167–173.
Kimball R, Ross M (2002) The data warehouse toolkit, 2nd edn. Wiley Computer Publishing, New York
Kumar A, Vijay Kumar TV (2017) Improved quality view selection for analytical query performance enhancement using particle swarm optimization. Int J Reliability Qual Safety Eng 24(6):1740001. https://doi.org/10.1142/S0218539317400010
Kumar A, Vijay Kumar TV (2018a) Materialized view selection using set based particle swarm optimization. Int J Cogn Inf Nat Intell 12(3):18–39. https://doi.org/10.4018/IJCINI.2018070102
Kumar S, Vijay Kumar TV (2018b) A novel quantum-inspired evolutionary view selection algorithm. Sādhanā 43:166
Lee M, Hammer J (2001) Speeding up materialized view selection in data warehouses using a randomized algorithm. Int J Cooperative Inf Syst 10(3):327–353
Lehner W, Ruf T, Teschke M (1996) Improving query response time in scientific databases using data aggregation. In: Proceedings of the 7th international conference and workshop on database and expert systems applications, DEXA 96, Zurich, pp 201–206
Li J, Talebi ZA, Chirkova R, Fathi Y (2005) A formal model for the problem of view selection for aggregate queries. In: Eder J, Haav H, Kalja A, Penjam J (eds) Advances in databases and information systems. Springer, Berlin, Heidelberg, pp. 125–138. https://doi.org/10.1007/11547686_10
Lin W, Kuo I (2004) A Genetic algorithm for OLAP data cubes. Int J Knowl Inf Syst 6(1):83–102
Lin Z, Yang D, Song G, Wang T (2007) User-oriented materialized view selection. In: The 7th IEEE International conference on computer and information technology (CIT-2007), IEEE Computer Society, pp. 133–138
Luo G (2007) Partial materialized views. In: International conference on data engineering (ICDE 2007), Istanbul, Turkey, April 2007, pp. 756–765
Prakash J, Vijay Kumar TV (2019a) A multi-objective approach for materialized view selection. Int J Oper Res Inf Syst 10(2):1–19
Prakash J, Vijay Kumar TV (2019b) Multi-objective materialized view selection using improved strength pareto evolutionary algorithm. Int J Artif Intell Mach Learn 9(2):1–21
Prakash J, Vijay Kumar TV (2020) Multi-objective materialized view selection using MOGA. Int J Syst Assurance Eng Manage. https://doi.org/10.1007/s13198-020-00947-2
Roussopoulos N (1997) Materialized views and data warehouse. In: 4th Workshop KRDB, Athens, Greece, August 1997
Shah B, Ramachandran K, Raghavan V (2006) A hybrid approach for data warehouse view selection. Int J Data Warehousing Mining 2(2):1–37
Shukla A, Deshpande PM, Naughton JF (1998) Materialized view selection for multidimensional datasets. In: Proceedings of VLDB, pp. 488–500
Srinivas N, Deb K (1994) Multiobjective optimization using nondominated sorting in genetic algorithms. Evolut Comput 2(3):221–248. https://doi.org/10.1162/evco.1994.2.3.221
Sun X, Wang Z (2009). An efficient materialized views selection algorithm based on PSO. In: Proceedings of the international workshop on intelligent systems and applications, pp. 1–4.
Teschke M, Ulbrich A (1997) Using materialized views to speed up data warehousing, Technical Report, IMMD 6, Universität Erlangen-Nürnberg
Theodoratos D, Dalamagas T, Simitsis A, Stavropoulos M (2001) A randomized approach for the incremental design of an evolving data warehouse. Lect Notes Comput Sci 2224:325–338
Valluri S, Vadapalli S, Karlapalem K (2002) View relevance driven materialized view selection in data warehousing environment. Aust Comput Sci Commun 24(2):187–196
Vijay Kumar TV, Ghoshal A (2009) A reduced lattice greedy algorithm for selecting materialized views. Commun Comput Inf Sci 31:6–18
Vijay Kumar TV, Haider M, Kumar S (2010a) Proposing candidate views for materialization. Commun Comput Inf Sci 54:89–98
Vijay Kumar TV, Goel A, Jain N (2010b) Mining information for constructing materialised views. Int J Inf Commun Technol 2(4):386–405
Vijay Kumar TV, Haider M (2010) A query answering greedy algorithm for selecting materialized views. Lecture Notes in Artificial Intelligence (LNAI), vol 6422, Springer Verlag, Heidelberg, pp. 153–162.
Vijay Kumar TV, Haider M (2011a) Greedy views selection using size and query frequency. Commun Comput Inf Sci 125:11–17
Vijay Kumar TV, Haider M (2011b) Selection of views for materialization using size and query frequency. Commun Comput Inf Sci 147:150–155
Vijay Kumar TV, Haider M, Kumar S (2011) A view recommendation greedy algorithm for materialized views selection. Commun Comput Inf Sci 141:61–70
Vijay Kumar TV, Devi K (2012) Materialized view construction in data warehouse for decision making. Int J Bus Inf Syst 11(4):379–396
Vijay Kumar TV, Haider M (2012) Materialized views selection for answering queries. Lecture Notes in Computer Science (LNCS), vol 6411, Springer Verlag, pp. 43–51
Vijay Kumar TV, Kumar S (2012a) Materialized view selection using iterative improvement. Adv Intell Syst Comput 178:205–214
Vijay Kumar TV, Kumar S (2012) Materialized view selection using genetic algorithm. Commun Comput Inf Sci 306:225–237
Vijay Kumar TV, Kumar S (2012c) Materialized view selection using simulated annealing. Lecture Notes in Computer Science (LNCS), vol. 7678. Springer Verlag, Heidelberg, pp. 168–179.
Vijay Kumar TV (2013) Answering query-based selection of materialised views. Int J Inf Decision Sci 5(1):103–116
Vijay Kumar TV, Devi K (2013) An architectural framework for constructing materialized views in a data warehouse. Int J Innovation Manage Technol 4(2):192–197
Vijay Kumar TV, Kumar S (2013) Materialized view selection using memetic algorithm. Lecture Notes in Artificial Intelligence (LNAI), vol 8284, Springer Verlag, Heidelberg, pp. 316–327
Vijay Kumar TV, Kumar S (2014) Materialized view selection using differential evolution. Int J Innovative Comput Appl 6(2):102–113
Vijay Kumar TV, Haider M (2015) Query answering based view selection. Int J Bus Inf Syst 18(3):338–353
Vijay Kumar TV, Kumar S (2015) Materialized view selection using randomized algorithms. Int J Bus Inf Syst 19(2):224–240
Vijay Kumar TV, Arun B (2016) Materialized view selection using BCO. Int J Bus Inf Syst 22(3):280–301
Vijay Kumar TV, Arun B (2017) Materialized view selection using HBMO. Int J Syst Assurance Eng Manage
Wang Z, Zhang D (2005) Optimal genetic view selection algorithm under space constraint. Int J Inf Technol 11(5):44–51
Widom J (1995) Research problems in data warehousing. In: Proceedings of international conference on information and knowledge management (ICIKM-1995), pp. 25–30
Yousri NAR, Ahmed KM, El-Makky NM (2005) Algorithms for selecting materialized views in a data warehouse. In: The proceedings of international conference on computer systems and applications, AICCSA’ 2005, Cairo, Egypt, pp. 27–34.
Yang J, Karlapalem K, Li Q (1997) Algorithms for materialized view design in data warehousing environment. In: Proceedings of the 23rd international conference on very large data bases (VLDB-1997), August 25–29, 1997, pp. 136–145
Yu JX, Yao X, Choi C, Gou G (2003) Materialized view selection as constrained evolutionary optimization systems. IEEE Trans Syst Man Cybern C Appl Rev 33(4):458–467
Zhang C, Yao X, Yang J (2001) An evolutionary approach to materialized views selection in a data warehouse environment. IEEE Trans Syst Man Cybern 31(3):282–294
Zhou L, He X, Li K (2012) An improved approach for materialized view selection based on genetic algorithm. J Comput 7(7):1591–1598
Zitzler E, Thiele L (1999) Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach. IEEE Trans Evolut Comput 3(4):257–271
Zitzler E, Deb K, Thiele L (2000) Comparison of multiobjective evolutionary algorithms: empirical results. Evolut Comput 8(2):173–195
Zitzler E, Thiele L, Laumanns M, Fonseca CM, da Fonseca VG (2003) Performance assessment of multiobjective optimizers: an analysis and review. IEEE Trans Evolut Comput 7:117–132


Author information

Authors and Affiliations

School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
Jay Prakash & T. V. Vijay Kumar

Authors

Jay Prakash
T. V. Vijay Kumar

Corresponding author

Correspondence to T. V. Vijay Kumar.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Prakash, J., Kumar, T.V.V. Multi-objective materialized view selection using NSGA-II. Int J Syst Assur Eng Manag 11, 972–984 (2020). https://doi.org/10.1007/s13198-020-01030-6

Received: 17 July 2019
Revised: 17 July 2019
Published: 12 September 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s13198-020-01030-6
