Abstract
The quality of a data warehouse (DW) depends largely on the quality of its multidimensional (MD) model. A few researchers have proposed metrics to evaluate MD model quality based on facts, dimensions, hierarchy relationships, foreign keys, multiple hierarchies, etc. However, some aspects of dimension hierarchies, such as the sharing of hierarchy levels within a dimension, the sharing of hierarchy levels among dimensions, and the relationships between dimension levels, have not been given due consideration. These aspects may contribute to the structural complexity of MD models, which in turn can affect their quality. Our previous study defined a set of quality metrics emphasizing dimension hierarchy sharing in MD models for DW. In this paper, we attempt to thoroughly validate a subset (six) of these metrics, theoretically as well as empirically. A careful study of the mathematical properties of the defined metrics, carried out using the Briand framework, reveals that of the six metrics, two are coupling measures, one is a cohesion measure, and the remaining three are size/complexity measures. Further, an empirical validation consisting of Spearman’s rho correlation analysis and linear regression analysis is conducted on 20 MD schemas with 28 subjects to determine the relationship between the metrics and the understandability of MD models. The experimental study shows that four of our metrics are good indicators of the understandability of MD models and can help in predicting the structural complexity of MD schemas for DW.
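As an illustration of the correlation step mentioned in the abstract, the following is a minimal Python sketch of Spearman's rho computed as the Pearson correlation of average ranks. The function names (`average_ranks`, `spearman_rho`) and the sample values are hypothetical and chosen only for demonstration; they are not the paper's data, metric definitions, or tooling.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical example: one metric value per schema and a made-up
# understanding time (in seconds) for the same schemas.
metric_values = [1, 2, 2, 4, 5, 7]
understanding_time = [30.0, 41.5, 39.0, 55.0, 62.5, 80.0]
rho = spearman_rho(metric_values, understanding_time)
print(f"Spearman's rho = {rho:.3f}")
```

A rho close to +1 or -1 would suggest a strong monotonic relationship between a metric and understanding time; a significance test would still be needed before drawing conclusions.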
Notes
The remaining metrics will be investigated in future work.
References
Abelló A, Samos J, Saltor F (2006) YAM2: a multidimensional conceptual model extending UML. Inf Syst 31(6):541–567
Bandi RK, Vaishnavi VK, Turk DE (2003) Predicting maintenance performance using object-oriented design complexity metrics. IEEE Trans Softw Eng 29(1):77–87
Basili VR, Weiss DM (1984) A methodology for collecting valid software engineering data. IEEE Trans Softw Eng 10(6):728–738
Basili VR, Briand LC, Melo WL (1996) A validation of object-oriented design metrics as quality indicators. IEEE Trans Softw Eng 22(10):751–761
Berenguer G, Romero R, Trujillo J, Serrano M, Piattini M (2005) A set of quality indicators and their corresponding metrics for conceptual models of data warehouses. In: Tjoa AM, Trujillo J (eds) Data warehousing and knowledge discovery, Springer, Berlin Heidelberg, pp 95–104
Briand LC, Morasca S, Basili VR (1994) Defining and validating high-level design metrics. Technical Report CS-TR-3301, University of Maryland, Department of Computer Science, College Park, Md
Briand LC, El Emam K, Morasca S (1995) Theoretical and empirical validation of software product measures. International Software Engineering Research Network, Technical Report ISERN-95-03
Briand LC, Morasca S, Basili VR (1996) Property based software engineering measurement. IEEE Trans Softw Eng 22:68–86
Briand LC, Wuest J, Ikonomovski S, Lounis H (1998) A comprehensive investigation of quality factors in object oriented designs–an industrial case study. Technical report ISERN, International Software Engineering Research Network pp 29–98
Briand LC, Morasca S, Basili VR (1999) Defining and validating measures for object-based high-level design. IEEE Trans Softw Eng 25(5):722–743
Briand LC, Wuest J, Daly JW, Porter DV (2000) Exploring the relationships between design measures and software quality in object-oriented systems. J Syst Softw 51(3):245–273
Cabibbo L, Torlone R (1998) A logical approach to multidimensional databases. Advances in database technology—EDBT’98. Springer, Berlin Heidelberg, pp 183–197
Calero C, Piattini M, Pascual C, Serrano MA (2001) Towards data warehouse quality metrics. In: Proceedings of 3rd international workshop on design and management of data warehouse, Interlaken, Switzerland, p 2
Cherfi SS, Prat N (2003) Multidimensional schemas quality: assessing and balancing analyzability and simplicity. Conceptual modeling for novel application domains. Springer, Berlin Heidelberg, pp 140–151
Fenton N (1994) Software measurement: a necessary scientific basis. IEEE Trans Softw Eng 20(3):199–206
Fenton N, Bieman J (2014) Software metrics: a rigorous and practical approach. Boca Raton, Florida: CRC Press
Golfarelli M, Maio D, Rizzi S (1998) The dimensional fact model: a conceptual model for data warehouses. Int J Coop Inf Syst 7(02n03):215–247
Gosain A, Mann S (2014) Empirical validation of metrics for object oriented multidimensional model for data warehouse. Int J Syst Assur Eng Manag 5(3):262–275
Gosain A, Singh J (2015a) Quality metrics for data warehouse multidimensional models with focus on dimension hierarchy sharing. In: El-Alfy ESM, Thampi SM, Takagi H, Piramuthu S, Hanne T (eds) Advances in intelligent informatics, Springer International Publishing, pp 429–443
Gosain A, Singh J (2015b) Conceptual multidimensional modeling for data warehouses: a survey. In: Proceedings of the 3rd international conference on Frontiers of intelligent computing: theory and applications, Springer International Publishing, pp 305–316
Gosain A, Sabharwal S, Nagpal S (2010) Neural network approach to predict quality of data warehouse multidimensional model. In: Proceedings of international conference on advances in computer science, pp 241–244
Gosain A, Nagpal S, Sabharwal S (2011a) Quality metrics for conceptual models for data warehouse focusing on dimension hierarchies. ACM SIGSOFT Softw Eng Notes 36(4):1–5
Gosain A, Sabharwal S, Nagpal S (2011b) Assessment of quality of data warehouse multidimensional model. Int J Inf Qual 2(4):344–358
Gosain A, Nagpal S, Sabharwal S (2013) Validating dimension hierarchy metrics for the understandability of multidimensional models for data warehouse. IET Softw 7(2):93–103
Henderson-Sellers B (1996) Software metrics. Prentice-Hall, Hemel Hempstead
Hüsemann B, Lechtenbörger J, Vossen G (2000) Conceptual data warehouse design. In: Proceedings of 2nd international workshop on design and management of data warehouses (DMDW), Stockholm, Sweden, pp 1–6
Inmon WH (2005) Building the data warehouse, 4th edn. Wiley, New York
ISO (2001) Software product evaluation-quality characteristics and guidelines for their use. ISO/IEC Standard 9126, Geneva
Jagadish HV, Lakshmanan LV, Srivastava D (1999) What can hierarchies do for data warehouses? In: Proceedings of 25th international conference on very large databases, pp 530–541
Jarke M, Lenzerini M, Vassiliou Y, Vassiliadis P (2003) Fundamentals of data warehouses, 2nd edn. Springer Science and Business Media, Berlin
Kimball R, Ross M (2002) The data warehouse toolkit: the complete guide to dimensional modeling. 2nd edn. Wiley, New York
Kitchenham BA, Pfleeger SL, Fenton N (1995) Towards a framework for software measurement validation. IEEE Trans Softw Eng 21(12):929–944
Kitchenham BA, Pfleeger SL, Pickard LM, Jones PW, Hoaglin DC, El Emam K, Rosenberg J (2002) Preliminary guidelines for empirical research in software engineering. IEEE Trans Softw Eng 28(8):721–734
Kumar M, Gosain A, Singh Y (2014) Empirical validation of structural metrics for predicting understandability of conceptual schemas for data warehouse. Int J Syst Assur Eng Manag 5(3):291–306
Lakshmanan KB, Jayaprakash S, Sinha PK (1991) Properties of control-flow complexity measures. IEEE Trans Softw Eng 17(12):1289–1295
Linstedt D, Olschimke M (2015) Building a scalable data warehouse with data vault 2.0. Morgan Kaufmann, Burlington
Lorenz M, Kidd J (1994) Object-oriented software metrics. Object-oriented series. Prentice-Hall, Englewood Cliffs
Lujan-Mora S, Trujillo J, Song IY (2006) A UML profile for multidimensional modeling in data warehouses. Data Knowl Eng 59(3):725–769
Malinowski E, Zimanyi E (2004) OLAP hierarchies: a conceptual perspective. In: Persson A, Stirna J (eds) Advanced information systems engineering, Springer, Berlin Heidelberg, pp 477–491
Malinowski E, Zimanyi E (2006) Hierarchies in a multidimensional model: from conceptual modeling to logical representation. Data Knowl Eng 59(2):348–377
Mansmann S, Scholl MH (2007a) Empowering the OLAP technology to support complex dimension hierarchies. Int J Data Wareh Min 3(4):31–50
Mansmann S, Scholl MH (2007b) Extending the multidimensional data model to handle complex data. J Comput Sci Eng 1(2):125–160
Mazon JN, Lechtenborger J, Trujillo J (2009) A survey on summarizability issues in multidimensional modeling. Data Knowl Eng 68(12):1452–1469
Melton A (1996) Software measurement. International Thomson Computer Press, London
Nagpal S, Gosain A, Sabharwal S (2012) Complexity metric for multidimensional models for data warehouse. In: Potdar V, Mukhopadhyay D (eds) Proceedings of the CUBE international information technology conference, pp 360–365
Nagpal S, Gosain A, Sabharwal S (2013) Theoretical and empirical validation of comprehensive complexity metric for multidimensional models for data warehouse. Int J Syst Assur Eng Manag 4(2):193–204
Pedersen TB, Jensen CS, Dyreson CE (2001) A foundation for capturing and querying complex multidimensional data. Inf Syst 26(5):383–423
Poels G, Dedene G (1999) Distance: a framework for software measure construction. DTEW research report 9937, Department of Applied Economics, Katholieke Universiteit Leuven, Belgium, pp 1–47
Rizzi S, Abello A, Lechtenbörger J, Trujillo J (2006) Research in data warehouse modeling and design: dead or alive? In: Proceedings of the 9th ACM international workshop on data warehousing and OLAP, pp 3–10
Sapia C, Blaschka M, Höfling G, Dinter B (1999) Extending the E/R model for the multidimensional paradigm. In: Kambayashi Y, Lee DL, Lim EP, Mohania M, Masunaga Y (eds) Advances in Database Technologies, Springer Berlin Heidelberg, pp 105–116
Serrano MA (2004) Definition of a set of metrics for assuring data warehouse quality. University of Castilla-La Mancha
Serrano MA, Calero C, Piattini M (2002) Validating metrics for data warehouse. IEE Proc Softw 149(5):161–166
Serrano MA, Calero C, Piattini M (2003) Experimental validation of multidimensional data models metrics. In: Proceedings of 36th annual Hawaii IEEE international conference on system sciences, p 7
Serrano MA, Calero C, Trujillo J, Lujan-Mora S, Piattini M (2004) Empirical validation of metrics for conceptual models of data warehouses. In: Persson A, Stirna J (eds) Advanced information systems engineering, Springer, Berlin Heidelberg, pp 506–520
Serrano MA, Calero C, Piattini M (2005) An experimental replication with data warehouse metrics. Int J Data Wareh Min 1(4):1–21
Serrano MA, Trujillo J, Calero C, Piattini M (2007) Metrics for data warehouse conceptual models understandability. Inf Softw Technol 49(8):851–870
Serrano MA, Calero C, Sahraoui HA, Piattini M (2008) Empirical studies to assess the understandability of data warehouse schemas using structural metrics. Softw Qual J 16(1):79–106
Svahnberg M, Aurum A, Wohlin C (2008) Using students as subjects-an empirical evaluation. In: Proceedings of 2nd ACM-IEEE International symposium on empirical software engineering and measurement, pp 288–290
Tryfona N, Busborg F, Borch-Christiansen JG (1999) starER: a conceptual model for data warehouse design. In: Proceedings of 2nd ACM international workshop on data warehousing and OLAP, pp 3–8
Weyuker EJ (1988) Evaluating software complexity measures. IEEE Trans Softw Eng 14(9):1357–1365
Shadish WR, Cook TD, Campbell DT (2002) Experimental and quasi-experimental designs for generalized causal inference. Cengage Learning, Wadsworth
Wohlin C, Runeson P, Höst M, Ohlsson MC, Regnell B, Wesslén A (2012) Experimentation in software engineering. Springer Science and Business Media, Berlin
Zuse H (1998) A framework of software measurement. Walter de Gruyter, Berlin
Cite this article
Gosain, A., Singh, J. Quality metrics emphasizing dimension hierarchy sharing in multidimensional models for data warehouse: a theoretical and empirical evaluation. Int J Syst Assur Eng Manag 8 (Suppl 2), 1672–1688 (2017). https://doi.org/10.1007/s13198-017-0641-5
DOI: https://doi.org/10.1007/s13198-017-0641-5