Measuring size, complexity, and coupling of hypergraph abstractions of software: An information-theory approach

Software Quality Journal

Abstract

Software development is fundamentally based on cognitive processes. Our motivating hypothesis is that the amounts of various kinds of information in software artifacts may have useful statistical relationships with software-engineering attributes. This paper proposes measures of size, complexity, and coupling in terms of the amount of information, building on formal definitions of these software-metric families proposed by Briand, Morasca, and Basili.

Ordinary graphs represent relationships between pairs of nodes. We extend prior work with ordinary graphs to hypergraphs, which represent relationships among sets of nodes. Some software engineering abstractions, such as set-use relations for public variables, are better represented as hypergraphs than as ordinary (binary) graphs.
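As an illustration of why a hypergraph can preserve more structure than a binary graph, the following Python sketch encodes a small set-use relation both ways. It is a minimal sketch under stated assumptions: the variable and function names are hypothetical, and treating functions as nodes and each public variable as one hyperedge is our assumption for the example, not a definition taken from the paper.

```python
from itertools import combinations

# Hypothetical set-use facts: public variable -> functions that set or use it.
set_use = {
    "count": {"init", "update", "report"},
    "mode": {"init", "update"},
}

def as_hyperedges(relation):
    """One hyperedge per variable, connecting all functions that touch it."""
    return {var: frozenset(fns) for var, fns in relation.items()}

def as_binary_edges(relation):
    """Pairwise expansion: an edge for every pair of functions sharing a variable."""
    edges = set()
    for fns in relation.values():
        edges.update(frozenset(pair) for pair in combinations(sorted(fns), 2))
    return edges

hyperedges = as_hyperedges(set_use)
binary_edges = as_binary_edges(set_use)
```

In this example the pairwise expansion yields three edges among init, update, and report, and the pair shared through "mode" is indistinguishable from the same pair shared through "count". The hypergraph form keeps the two variables as two separate hyperedges.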

Traditional software metrics are based on counting. In contrast, we adopt information theory as the basis for measurement, because the design decisions embodied by software are information. This paper proposes software metrics of size, complexity, and coupling based on information in the pattern of incident hyperedges. For comparison, we also define corresponding counting-based metrics.
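To make the contrast between counting and information concrete, the following Python sketch treats each node's pattern of incident hyperedges as a symbol and sums the self-information of those symbols under their empirical distribution. This is an illustrative assumption about how such a measure could be computed, not the paper's definition of its size, complexity, or coupling metrics; the function names and example hypergraphs are hypothetical.

```python
from collections import Counter
from math import log2

def incidence_patterns(nodes, hyperedges):
    """Map each node to a tuple of 0/1 flags, one flag per hyperedge."""
    return {n: tuple(int(n in e) for e in hyperedges) for n in nodes}

def pattern_information(nodes, hyperedges):
    """Total bits needed to encode the nodes' incidence patterns,
    using the empirical proportion of each distinct pattern."""
    patterns = incidence_patterns(nodes, hyperedges)
    counts = Counter(patterns.values())
    total = len(nodes)
    return sum(-log2(counts[p] / total) for p in patterns.values())

def counting_size(nodes, hyperedges):
    """Counting-based analogue used here for comparison: the number of nodes."""
    return len(nodes)

# Two hypothetical hypergraphs over the same four nodes.
nodes = ["f", "g", "h", "k"]
varied = [{"f", "g", "h"}, {"g", "h"}]   # three distinct incidence patterns
uniform = [{"f", "g", "h", "k"}]         # every node has the same pattern
```

Both hypergraphs have a counting size of 4, yet the information in the incidence patterns is 6 bits for the varied example and 0 bits for the uniform one. The sketch shows only the general kind of distinction that a counting metric cannot make.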

Three exploratory case studies illustrate some of the distinctive features of the proposed metrics. The case studies found that information theory-based software metrics make distinctions that counting metrics do not, which may be relevant to software engineering quality and process. We also identify situations in which information theory-based metrics are simply proportional to the corresponding counting metrics.

Notes

  1. As a simplifying assumption, our analysis excludes from measurement the design decisions of how many nodes and how many hyperedges the graph includes.

References

  • Abd-El-Hafiz, S.K. 2001. Entropies as measures of software information. In: Proceedings IEEE International Conference on Software Maintenance, Florence, Italy. IEEE Computer Society, pp. 110–117.

  • Allen, E.B. 1995. Information theory and software measurement. PhD thesis, Florida Atlantic University, Boca Raton, Florida. Advised by Taghi M. Khoshgoftaar.

  • Allen, E.B. 2002. Measuring graph abstractions of software: An information-theory approach. In: Proceedings: Eighth IEEE Symposium on Software Metrics, Ottawa, Canada. IEEE Computer Society, pp. 182–193.

  • Allen, E.B., Khoshgoftaar, T.M. 1999. Measuring coupling and cohesion: An information-theory approach. In: Proceedings of the Sixth International Software Metrics Symposium, Boca Raton, Florida. IEEE Computer Society, pp. 119–127.

  • Allen, E.B., Khoshgoftaar, T.M., Chen, Y. 2001. Measuring coupling and cohesion of software modules: An information-theory approach. In: Proceedings: Seventh International Software Metrics Symposium, London, England. IEEE Computer Society, pp. 124–134.

  • Andersson, C., Thelin, T., Runeson, P., Dzamashvili, N. 2003. An experimental evaluation of inspection and testing for detection of design faults. In: Proceedings: 2003 International Symposium on Empirical Software Engineering, Rome, Italy. IEEE Computer Society, pp. 174–184.

  • Bansiya, J., Davis, C.G., Etzkorn, L. 1999. An entropy based complexity measure for object-oriented designs. Theory and Practice of Object Systems 5(2):1–9.

  • Bell Canada 2000a. Datrix Abstract Semantic Graph Reference-Manual (Version 1.4).

  • Bell Canada 2000b. Datrix Metric Reference Manual. Montreal, Quebec, Canada, version 4.0 edition. For Datrix version 3.6.9.

  • Birov, L., Prokofiev, A., Bartenev, Y., Vargin, A., Purkayastha, A., Skjellum, A., Dandass, Y., Erzunov, V., Shanikova, E., Ovechkin, V., Bangalore, P., Shuvalov, E., Orlov, N.F.A., Egorov, S. 1999. The Parallel Mathematical Libraries Project (PMLP): Overview, design innovations, and preliminary results. In: Proceedings of the Fifth International Conference on Parallel Computing Technologies.

  • Briand, L.C., Daly, J.W., Wüst, J. 1997a. A unified framework for cohesion measurement in object-oriented systems. In: Proceedings of the Fourth International Symposium on Software Metrics, Albuquerque, New Mexico. IEEE Computer Society, pp. 43–53.

  • Briand, L.C., Daly, J.W., Wüst, J.K. 1999. A unified framework for coupling measurement in object-oriented systems. IEEE Transactions on Software Engineering 25(1):91–121.

  • Briand, L.C., El Emam, K., Morasca, S. 1996a. On the application of measurement theory in software engineering. Empirical Software Engineering: An International Journal 1(1):61–88. (See Briand et al., 1997b; Zuse, 1997a).

  • Briand, L.C., El Emam, K., Morasca, S. 1997b. Reply to Comments to the paper: Briand, El Emam, Morasca: On the application of measurement theory in software engineering. Empirical Software Engineering: An International Journal 2(3):317–322. (See Briand et al., 1996a; Zuse, 1997a).

  • Briand, L.C., Morasca, S., Basili, V.R. 1996b. Property-based software engineering measurement. IEEE Transactions on Software Engineering 22(1):68–85. See comments in Briand et al. (1997c), Poels and Dedene (1997), Zuse (1997c).

  • Briand, L.C., Morasca, S., Basili, V.R. 1997c. Response to: Comments on Property-based software engineering measurement: Refining the additivity properties. IEEE Transactions on Software Engineering 23(3):196–197. (See Briand et al., 1996b; Poels and Dedene, 1997).

  • Chaitin, G.J. 1966. On the length of programs for computing finite binary sequences. Journal of the Association for Computing Machinery 13(4):547–569.

  • Chaitin, G.J. 1975. A theory of program size formally identical to information theory. Journal of the Association for Computing Machinery 22(3):329–340.

  • Chapin, N. 2002. Entropy-metric for systems with COTS software. In: Proceedings: Eighth IEEE Symposium on Software Metrics, Ottawa, Canada. IEEE Computer Society, pp. 173–181.

  • Chen, Y. 2000. Measurement of coupling and cohesion of software. Master's thesis, Florida Atlantic University, Boca Raton, Florida. Advised by Taghi M. Khoshgoftaar.

  • Chidamber, S.R., Kemerer, C.F. 1994. A metrics suite for object oriented design. IEEE Transactions on Software Engineering 20(6):476–493.

  • Cover, T.M., Thomas, J.A. 1991. Elements of Information Theory. John Wiley & Sons, New York.

  • Davis, J.S., LeBlanc, R.J. 1988. A study of the applicability of complexity measures. IEEE Transactions on Software Engineering 14(9):1366–1372.

  • Dean, T., Malton, A., Holt, R. 2001. Union schemas as the basis for a C++ extractor. In: Proceedings: Working Conference on Reverse Engineering, Stuttgart, Germany.

  • El Emam, K., Benlarbi, S., Goel, N., Rai, S.N. 2001. The confounding effect of class size on the validity of object-oriented metrics. IEEE Transactions on Software Engineering 27(7):630–650. (See Evanco, 2003).

  • Evanco, W.M. 2003. Comments on ‘The confounding effect of class size on the validity of object-oriented metrics’. IEEE Transactions on Software Engineering 29(7):670–672. (See El Emam et al., 2001).

  • Fenton, N.E., Pfleeger, S.L. 1997. Software Metrics: A Rigorous and Practical Approach, 2nd edn. PWS Publishing, London.

  • Gottipati, S. 2003. Empirical validation of the usefulness of information theory-based software metrics. Master's thesis, Mississippi State University, Mississippi State, Mississippi. Advised by Edward B. Allen.

  • Govindarajan, R. 2004. An empirical validation of information theory-based software metrics in comparison to counting-based metrics: A case study approach. Master's thesis, Mississippi State University, Mississippi State, Mississippi. Advised by Edward B. Allen.

  • Hatton, L. 1997. Reexamining the fault density-component size connection. IEEE Software 14(2):89–97.

  • Hilgard, E.R., Atkinson, R.C., Atkinson, R.L. 1971. Introduction to Psychology. Harcourt Brace Jovanovich, New York.

  • Khoshgoftaar, T.M., Allen, E.B. 1994. Applications of information theory to software engineering measurement. Software Quality Journal 3(2):79–103.

  • Kim, K., Shin, Y., Wu, C. 1995. Complexity measures for object oriented program based on the entropy. In: Proceedings: 1995 Asia Pacific Software Engineering Conference, Brisbane, Australia. IEEE Computer Society, pp. 127–136.

  • Kitchenham, B.A., Pfleeger, S.L., Fenton, N.E. 1995. Towards a framework for software measurement validation. IEEE Transactions on Software Engineering 21(12):929–944. (See comments in Kitchenham et al. 1997, Morasca et al., 1997).

  • Kitchenham, B.A., Pfleeger, S.L., Fenton, N.E. 1997. Reply to: Comments on ‘Towards a framework for software measurement validation’. IEEE Transactions on Software Engineering 23(3):189. (See Kitchenham et al., 1995; Morasca et al., 1997; Weyuker, 1988).

  • Kolmogorov, A.N. 1965. Three approaches for defining the concept of information quantity. Problems in Information Transmission 1(1):1–7.

  • Kolmogorov, A.N. 1968. Logical basis for information theory and probability theory. IEEE Transactions on Information Theory IT-14(5):662–664.

  • Lapierre, S., Laguë, B., Leduc, C. 2001. Datrix source code model and its interchange format: Lessons learned and considerations for future work. ACM SIGSOFT Software Engineering Notes 26(1):53–60.

  • Lew, K.S., Dillon, T.S., Forward, K.E. 1988. Software complexity and its impact on software reliability. IEEE Transactions on Software Engineering 14(11):1645–1655.

  • Li, M., Vitányi, P.M.B. 1988. Two decades of applied Kolmogorov complexity. In: Proceedings of the Third Annual Structure in Complexity Theory Conference, Washington, DC, pp. 80–101.

  • Mayrand, J., Coallier, F. 1996. System acquisition based on software product assessment. In: Proceedings of the Eighteenth International Conference on Software Engineering, Berlin. IEEE Computer Society, pp. 210–219.

  • McCabe, T.J. 1976. A complexity measure. IEEE Transactions on Software Engineering SE-2(4):308–320.

  • Miller, G.A. 1956. The magical number seven plus or minus two: Some limits on our capacity for processing information. Psychological Review 63(2):81–97.

  • Mohanty, S.N. 1979. Models and measurements for quality assessment of software. Computing Surveys 11(3):251–275.

  • Mohanty, S.N. 1981. Entropy metrics for software design evaluation. Journal of Systems and Software 2:39–46.

  • Morasca, S., Briand, L.C. 1997. Towards a theoretical framework for measuring software attributes. In: Proceedings of the Fourth International Symposium on Software Metrics, Albuquerque, New Mexico. IEEE Computer Society, pp. 119–126.

  • Morasca, S., Briand, L.C., Basili, V.R., Weyuker, E.J., Zelkowitz, M.V. 1997. Comments on ‘Towards a framework for software measurement validation’. IEEE Transactions on Software Engineering 23(3):187–188. (See Kitchenham et al., 1995; Weyuker, 1988).

  • Munson, J.C., Khoshgoftaar, T.M. 1989. The dimensionality of program complexity. In: Proceedings of the Eleventh International Conference on Software Engineering, Pittsburgh, Pennsylvania. IEEE Computer Society, pp. 245–253.

  • Oviedo, E.I. 1980. Control flow, data flow and program complexity. In: Proceedings: The IEEE Computer Society's Fourth International Computer Software and Applications Conference, Chicago, Illinois. IEEE Computer Society, pp. 146–152.

  • Poels, G., Dedene, G. 1997. Comments on ‘Property-based software engineering measurement’: Refining the additivity properties. IEEE Transactions on Software Engineering 23(3):190–195. (See Briand et al., 1996b).

  • Runeson, P., Andersson, C., Thelin, T., Andrews, A., Berling, T. 2006. What do we know about defect detection methods? IEEE Software 23(3):82–90.

  • Schütt, D. 1977. On a hypergraph oriented measure for applied computer science. In: Digest of Papers: COMPCON 77 Fall, Washington, DC. IEEE Computer Society, pp. 295–296. Abstract only.

  • Shannon, C.E., Weaver, W. 1949. The Mathematical Theory of Communication. University of Illinois Press, Urbana, Illinois.

  • Shereshevsky, M., Ammari, H., Gradetsky, N., Mili, A., Ammar, H.H. 2001. Information theoretic metrics for software architecture. In: Proceedings 25th Annual International Computer Software and Applications Conference, Chicago. IEEE Computer Society, pp. 151–157.

  • Solomonoff, R.J. 1964. A formal theory of inductive inference, part 1 and part 2. Information and Control 7:1–22, 224–254.

  • University of Waterloo 2004. CPPX: Open source C++ fact extractor. http://swag.uwaterloo.ca/~cppx. (Current July 7, 2006).

  • van Emden, M.H. 1970. Hierarchical decomposition of complexity. Machine Intelligence 5:361–380. (See also van Emden, 1971 for details).

  • van Emden, M.H. 1971. An Analysis of Complexity. Number 35 in Mathematical Centre Tracts. Mathematisch Centrum, Amsterdam.

  • Visaggio, G. 1997. Structural information as a quality metric in software systems organization. In: Proceedings International Conference on Software Maintenance, Bari, Italy. IEEE Computer Society, pp. 92–99.

  • Watanabe, S. 1960. Information theoretical analysis of multivariate correlation. IBM Journal of Research and Development 4(1):66–82.

  • Weyuker, E.J. 1988. Evaluating software complexity measures. IEEE Transactions on Software Engineering 14(9):1357–1365.

  • Zuse, H. 1997a. Comments to the paper: Briand, Emam, Morasca: On the application of measurement theory in software engineering. Empirical Software Engineering: An International Journal 2(3):313–316. (See Briand et al., 1996a, 1997b).

  • Zuse, H. 1997b. A Framework for Software Measurement. Walter de Gruyter and Co., Berlin.

  • Zuse, H. 1997c. Reply to: ‘Property-based software engineering measurement’. IEEE Transactions on Software Engineering 23(8):533. (See Briand et al., 1996b).

Acknowledgments

This work was supported in part by grant CCR-0098024 from the National Science Foundation. We thank Bell Canada for an academic license to use Datrix, a software measurement tool. We thank the Software Architecture Group of the University of Waterloo for providing the open-source tool cppx. We thank Shiva Juluru for providing the physics data manipulation program's source code. We thank Anthony Skjellum for providing pmlp source code. We thank Yoginder Dandass and Archana Chilukuri for help with pmlp measurement. We thank the Empirical Software Engineering research group at Mississippi State University for helpful discussions. We thank the anonymous reviewers for their helpful suggestions, which significantly strengthened the paper.

Author information

Corresponding author

Correspondence to Edward B. Allen.

About this article

Cite this article

Allen, E.B., Gottipati, S. & Govindarajan, R. Measuring size, complexity, and coupling of hypergraph abstractions of software: An information-theory approach. Software Qual J 15, 179–212 (2007). https://doi.org/10.1007/s11219-006-9010-3

  • DOI: https://doi.org/10.1007/s11219-006-9010-3
