Quality metrics emphasizing dimension hierarchy sharing in multidimensional models for data warehouse: a theoretical and empirical evaluation

  • Original Article
International Journal of System Assurance Engineering and Management

Abstract

The quality of a data warehouse (DW) depends largely on the quality of its multidimensional (MD) model. A few researchers have proposed metrics that evaluate MD model quality based on facts, dimensions, hierarchy relationships, foreign keys, multiple hierarchies, etc. However, some aspects of dimension hierarchies, such as the sharing of hierarchy levels within a dimension, the sharing of hierarchy levels among dimensions, and the relationships between dimension levels, have not been given due consideration. These aspects may contribute to the structural complexity of MD models, which in turn can affect their quality. Our previous study defined a set of quality metrics emphasizing dimension hierarchy sharing in MD models for DW. In this paper, we attempt to thoroughly validate a subset of six of these metrics, both theoretically and empirically. A careful study of the mathematical properties of the metrics using the Briand framework reveals that two of the six are coupling measures, one is a cohesion measure, and the remaining three are size/complexity measures. Further, an empirical validation consisting of Spearman’s rho correlation analysis and linear regression analysis is conducted on 20 MD schemas and 28 subjects to determine the relationship between the metrics and the understandability of MD models. The experimental study shows that four of our metrics are good indicators of the understandability of MD models and can help in predicting the structural complexity of MD schemas for DW.
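
The two statistical techniques named in the abstract can be illustrated with a minimal sketch. It computes Spearman’s rho (the Pearson correlation of average ranks) and a least-squares line for one metric against one understandability measure. The data values, the choice of understanding time as the dependent variable, and all function and variable names are assumptions made for illustration; none of them come from the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Hypothetical values for illustration only; not data from the study.
metric_values = [1, 2, 2, 4, 5, 7]                          # one metric, per schema
understanding_time = [30.0, 41.0, 39.0, 52.0, 50.0, 75.0]   # assumed seconds per schema

rho = spearman_rho(metric_values, understanding_time)
slope, intercept = fit_line(metric_values, understanding_time)
print(f"rho={rho:.3f}; time ~ {intercept:.2f} + {slope:.2f} * metric")
```

A rank-based coefficient is a reasonable choice when the metric values are small integer counts with ties and normality cannot be assumed. In practice a statistics package would also report significance levels, which this sketch omits.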

Notes

  1. The remaining metrics will be investigated in future work.

References

  • Abelló A, Samos J, Saltor F (2006) YAM2: a multidimensional conceptual model extending UML. Inf Syst 31(6):541–567

  • Bandi RK, Vaishnavi VK, Turk DE (2003) Predicting maintenance performance using object-oriented design complexity metrics. IEEE Trans Softw Eng 29(1):77–87

  • Basili VR, Weiss DM (1984) A methodology for collecting valid software engineering data. IEEE Trans Softw Eng 10(6):728–738

  • Basili VR, Briand LC, Melo WL (1996) A validation of object-oriented design metrics as quality indicators. IEEE Trans Softw Eng 22(10):751–761

  • Berenguer G, Romero R, Trujillo J, Serrano M, Piattini M (2005) A set of quality indicators and their corresponding metrics for conceptual models of data warehouses. In: Tjoa AM, Trujillo J (eds) Data warehousing and knowledge discovery, Springer, Berlin Heidelberg, pp 95–104

  • Briand LC, Morasca S, Basili VR (1994) Defining and validating high-level design metrics. Technical Report CS-TR-3301, University of Maryland, Department of Computer Science, College Park, Md

  • Briand LC, El Emam K, Morasca S (1995) Theoretical and empirical validation of software product measures. International Software Engineering Research Network, Technical Report ISERN-95-03

  • Briand LC, Morasca S, Basili VR (1996) Property-based software engineering measurement. IEEE Trans Softw Eng 22(1):68–86

  • Briand LC, Wuest J, Ikonomovski S, Lounis H (1998) A comprehensive investigation of quality factors in object-oriented designs: an industrial case study. Technical Report ISERN-98-29, International Software Engineering Research Network

  • Briand LC, Morasca S, Basili VR (1999) Defining and validating measures for object-based high-level design. IEEE Trans Softw Eng 25(5):722–743

  • Briand LC, Wuest J, Daly JW, Porter DV (2000) Exploring the relationships between design measures and software quality in object-oriented systems. J Syst Softw 51(3):245–273

  • Cabibbo L, Torlone R (1998) A logical approach to multidimensional databases. Advances in database technology—EDBT’98. Springer, Berlin Heidelberg, pp 183–197

  • Calero C, Piattini M, Pascual C, Serrano MA (2001) Towards data warehouse quality metrics. In: Proceedings of 3rd international workshop on design and management of data warehouse, Interlaken, Switzerland, p 2

  • Cherfi SS, Prat N (2003) Multidimensional schemas quality: assessing and balancing analyzability and simplicity. Conceptual modeling for novel application domains. Springer, Berlin Heidelberg, pp 140–151

  • Fenton N (1994) Software measurement: a necessary scientific basis. IEEE Trans Softw Eng 20(3):199–206

  • Fenton N, Bieman J (2014) Software metrics: a rigorous and practical approach. Boca Raton, Florida: CRC Press

  • Golfarelli M, Maio D, Rizzi S (1998) The dimensional fact model: a conceptual model for data warehouses. Int J Coop Inf Syst 7(02n03):215–247

  • Gosain A, Mann S (2014) Empirical validation of metrics for object oriented multidimensional model for data warehouse. Int J Syst Assur Eng Manag 5(3):262–275

  • Gosain A, Singh J (2015a) Quality metrics for data warehouse multidimensional models with focus on dimension hierarchy sharing. In: El-Alfy ESM, Thampi SM, Takagi H, Piramuthu S, Hanne T (eds) Advances in intelligent informatics, Springer International Publishing, pp 429–443

  • Gosain A, Singh J (2015b) Conceptual multidimensional modeling for data warehouses: a survey. In: Proceedings of the 3rd international conference on Frontiers of intelligent computing: theory and applications, Springer International Publishing, pp 305–316

  • Gosain A, Sabharwal S, Nagpal S (2010) Neural network approach to predict quality of data warehouse multidimensional model. In: Proceedings of international conference on advances in computer science, pp 241–244

  • Gosain A, Nagpal S, Sabharwal S (2011a) Quality metrics for conceptual models for data warehouse focusing on dimension hierarchies. ACM SIGSOFT Softw Eng Notes 36(4):1–5

  • Gosain A, Sabharwal S, Nagpal S (2011b) Assessment of quality of data warehouse multidimensional model. Int J Inf Qual 2(4):344–358

  • Gosain A, Nagpal S, Sabharwal S (2013) Validating dimension hierarchy metrics for the understandability of multidimensional models for data warehouse. IET Softw 7(2):93–103

  • Henderson-Sellers B (1996) Software metrics. Prentice-Hall, Hemel Hempstead

  • Hüsemann B, Lechtenbörger J, Vossen G (2000) Conceptual data warehouse design. In: Proceedings of 2nd international workshop on design and management of data warehouses (DMDW), Stockholm, Sweden, pp 1–6

  • Inmon WH (2005) Building the data warehouse, 4th edn. Wiley, New York

  • ISO (2001) Software product evaluation-quality characteristics and guidelines for their use. ISO/IEC Standard 9126, Geneva

  • Jagadish HV, Lakshmanan LV, Srivastava D (1999) What can hierarchies do for data warehouses? In: Proceedings of 25th international conference on very large databases, pp 530–541

  • Jarke M, Lenzerini M, Vassiliou Y, Vassiliadis P (2003) Fundamentals of data warehouses, 2nd edn. Springer Science and Business Media, Berlin

  • Kimball R, Ross M (2002) The data warehouse toolkit: the complete guide to dimensional modeling, 2nd edn. Wiley, New York

  • Kitchenham BA, Pfleeger SL, Fenton N (1995) Towards a framework for software measurement validation. IEEE Trans Softw Eng 21(12):929–944

  • Kitchenham BA, Pfleeger SL, Pickard LM, Jones PW, Hoaglin DC, El Emam K, Rosenberg J (2002) Preliminary guidelines for empirical research in software engineering. IEEE Trans Softw Eng 28(8):721–734

  • Kumar M, Gosain A, Singh Y (2014) Empirical validation of structural metrics for predicting understandability of conceptual schemas for data warehouse. Int J Syst Assur Eng Manag 5(3):291–306

  • Lakshmanan KB, Jayaprakash S, Sinha PK (1991) Properties of control-flow complexity measures. IEEE Trans Softw Eng 17(12):1289–1295

  • Linstedt D, Olschimke M (2015) Building a scalable data warehouse with data vault 2.0. Morgan Kaufmann, Burlington

  • Lorenz M, Kidd J (1994) Object-oriented software metrics. Object-oriented series. Prentice-Hall, Englewood Cliffs

  • Lujan-Mora S, Trujillo J, Song IY (2006) A UML profile for multidimensional modeling in data warehouses. Data Knowl Eng 59(3):725–769

  • Malinowski E, Zimanyi E (2004) OLAP hierarchies: a conceptual perspective. In: Persson A, Stirna J (eds) Advanced information systems engineering, Springer Berlin Heidelberg pp 477–491

  • Malinowski E, Zimanyi E (2006) Hierarchies in a multidimensional model: from conceptual modeling to logical representation. Data Knowl Eng 59(2):348–377

  • Mansmann S, Scholl MH (2007a) Empowering the OLAP technology to support complex dimension hierarchies. Int J Data Wareh Min 3(4):31–50

  • Mansmann S, Scholl MH (2007b) Extending the multidimensional data model to handle complex data. J Comput Sci Eng 1(2):125–160

  • Mazon JN, Lechtenbörger J, Trujillo J (2009) A survey on summarizability issues in multidimensional modeling. Data Knowl Eng 68(12):1452–1469

  • Melton A (1996) Software measurement. International Thomson Computer Press, London

  • Nagpal S, Gosain A, Sabharwal S (2012) Complexity metric for multidimensional models for data warehouse. In: Potdar V, Mukhopadhyay D (eds) Proceedings of the CUBE international information technology conference, pp 360–365

  • Nagpal S, Gosain A, Sabharwal S (2013) Theoretical and empirical validation of comprehensive complexity metric for multidimensional models for data warehouse. Int J Syst Assur Eng Manag 4(2):193–204

  • Pedersen TB, Jensen CS, Dyreson CE (2001) A foundation for capturing and querying complex multidimensional data. Inf Syst 26(5):383–423

  • Poels G, Dedene G (1999) Distance: a framework for software measure construction. DTEW research report 9937, Department of Applied Economics, Katholieke Universiteit Leuven, Belgium, pp 1–47

  • Rizzi S, Abelló A, Lechtenbörger J, Trujillo J (2006) Research in data warehouse modeling and design: dead or alive? In: Proceedings of the 9th ACM international workshop on data warehousing and OLAP, pp 3–10

  • Sapia C, Blaschka M, Höfling G, Dinter B (1999) Extending the E/R model for the multidimensional paradigm. In: Kambayashi Y, Lee DL, Lim EP, Mohania M, Masunaga Y (eds) Advances in Database Technologies, Springer Berlin Heidelberg, pp 105–116

  • Serrano MA (2004) Definition of a set of metrics for assuring data warehouse quality. University of Castilla-La Mancha

  • Serrano MA, Calero C, Piattini M (2002) Validating metrics for data warehouse. IEE Proc Softw 149(5):161–166

  • Serrano MA, Calero C, Piattini M (2003) Experimental validation of multidimensional data models metrics. In: Proceedings of 36th annual Hawaii IEEE international conference on system sciences, p 7

  • Serrano MA, Calero C, Trujillo J, Lujan-Mora S, Piattini M (2004) Empirical validation of metrics for conceptual models of data warehouses. In: Persson A, Stirna J (eds) Advanced information systems engineering, Springer, Berlin Heidelberg, pp 506–520

  • Serrano MA, Calero C, Piattini M (2005) An experimental replication with data warehouse metrics. Int J Data Wareh Min 1(4):1–21

  • Serrano MA, Trujillo J, Calero C, Piattini M (2007) Metrics for data warehouse conceptual models understandability. Inf Softw Technol 49(8):851–870

  • Serrano MA, Calero C, Sahraoui HA, Piattini M (2008) Empirical studies to assess the understandability of data warehouse schemas using structural metrics. Softw Qual J 16(1):79–106

  • Svahnberg M, Aurum A, Wohlin C (2008) Using students as subjects-an empirical evaluation. In: Proceedings of 2nd ACM-IEEE International symposium on empirical software engineering and measurement, pp 288–290

  • Tryfona N, Busborg F, Borch-Christiansen JG (1999) starER: a conceptual model for data warehouse design. In: Proceedings of 2nd ACM international workshop on data warehousing and OLAP, pp 3–8

  • Weyuker EJ (1988) Evaluating software complexity measures. IEEE Trans Softw Eng 14(9):1357–1365

  • Shadish WR, Cook TD, Campbell DT (2002) Experimental and quasi-experimental designs for generalized causal inference. Cengage Learning, Wadsworth

  • Wohlin C, Runeson P, Höst M, Ohlsson MC, Regnell B, Wesslén A (2012) Experimentation in software engineering. Springer Science and Business Media, Berlin

  • Zuse H (1998) A framework of software measurement. Walter de Gruyter, Berlin

Corresponding author

Correspondence to Jaspreeti Singh.

Cite this article

Gosain, A., Singh, J. Quality metrics emphasizing dimension hierarchy sharing in multidimensional models for data warehouse: a theoretical and empirical evaluation. Int J Syst Assur Eng Manag 8 (Suppl 2), 1672–1688 (2017). https://doi.org/10.1007/s13198-017-0641-5

  • DOI: https://doi.org/10.1007/s13198-017-0641-5