Abstract
Software development is fundamentally based on cognitive processes. Our motivating hypothesis is that amounts of various kinds of information in software artifacts may have useful statistical relationships with software-engineering attributes. This paper proposes measures of size, complexity and coupling in terms of the amount of information, building on formal definitions of these software-metric families proposed by Briand, Morasca, and Basili.
Ordinary graphs represent relationships between pairs of nodes. We extend prior work with ordinary graphs to hypergraphs representing relationships among sets of nodes. Some software engineering abstractions, such as set-use relations for public variables, are better represented as hypergraphs than ordinary (binary) graphs.
Traditional software metrics are based on counting. In contrast, we adopt information theory as the basis for measurement, because the design decisions embodied by software are information. This paper proposes software metrics of size, complexity, and coupling based on information in the pattern of incident hyperedges. For comparison, we also define corresponding counting-based metrics.
Three exploratory case studies illustrate some of the distinctive features of the proposed metrics. The case studies found that information theory-based software metrics make distinctions that counting metrics do not, and these distinctions may be relevant to software engineering quality and process. We also identify situations in which information theory-based metrics are simply proportional to corresponding counting metrics.
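The contrast between counting and information in the pattern of incident hyperedges can be illustrated with a small sketch. The abstract does not state the formulas, so everything here is an assumption made for illustration: the function names are hypothetical, the counting metric is taken to be the number of nodes with at least one incident hyperedge, and the information metric is taken to be the sum over nodes of the negative base-2 logarithm of the proportion of nodes sharing that node's incidence pattern. It should not be read as the paper's exact definitions.

```python
from collections import Counter
from math import log2


def incidence_patterns(nodes, hyperedges):
    """Map each node to the set of hyperedge names incident to it.

    `hyperedges` maps a hyperedge name to the set of nodes it connects,
    e.g. a public variable mapped to the modules that set or use it.
    """
    return {
        node: frozenset(name for name, members in hyperedges.items()
                        if node in members)
        for node in nodes
    }


def counting_size(patterns):
    """Assumed counting metric: nodes with at least one incident hyperedge."""
    return sum(1 for pattern in patterns.values() if pattern)


def information_size(patterns):
    """Assumed information metric, in bits.

    Each node contributes -log2(p), where p is the proportion of nodes
    that share the node's incidence pattern.
    """
    frequency = Counter(patterns.values())
    total = len(patterns)
    return sum(-log2(frequency[p] / total) for p in patterns.values())


nodes = ["a", "b", "c", "d"]

# One variable shared by every module: all incidence patterns are identical.
shared = incidence_patterns(nodes, {"x": {"a", "b", "c", "d"}})

# Three variables shared by overlapping pairs: every pattern is distinct.
chained = incidence_patterns(
    nodes, {"x": {"a", "b"}, "y": {"b", "c"}, "z": {"c", "d"}}
)

assert counting_size(shared) == counting_size(chained) == 4
assert information_size(shared) == 0
assert information_size(chained) == 8
```

Under these assumptions both hypothetical designs have the same count of connected nodes, yet the information measure separates them: identical patterns carry no information, while four distinct patterns carry two bits per node.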
Notes
As a simplifying assumption, our analysis excludes from measurement the design decisions of how many nodes and how many hyperedges the graph includes.
References
Abd-El-Hafiz, S.K. 2001. Entropies as measures of software information. In: Proceedings IEEE International Conference on Software Maintenance, Florence, Italy. IEEE Computer Society, pp. 110–117.
Allen, E.B. 1995. Information theory and software measurement. PhD thesis, Florida Atlantic University, Boca Raton, Florida. Advised by Taghi M. Khoshgoftaar.
Allen, E.B. 2002. Measuring graph abstractions of software: An information-theory approach. In: Proceedings: Eighth IEEE Symposium on Software Metrics, Ottawa, Canada. IEEE Computer Society, pp. 182–193.
Allen, E.B., Khoshgoftaar, T.M. 1999. Measuring coupling and cohesion: An information-theory approach. In: Proceedings of the Sixth International Software Metrics Symposium, Boca Raton, Florida. IEEE Computer Society, pp. 119–127.
Allen, E.B., Khoshgoftaar, T.M., Chen, Y. 2001. Measuring coupling and cohesion of software modules: An information-theory approach. In: Proceedings: Seventh International Software Metrics Symposium, London, England. IEEE Computer Society, pp. 124–134.
Andersson, C., Thelin, T., Runeson, P., Dzamashvili, N. 2003. An experimental evaluation of inspection and testing for detection of design faults. In: Proceedings: 2003 International Symposium on Empirical Software Engineering, Rome, Italy. IEEE Computer Society, pp. 174–184.
Bansiya, J., Davis, C.G., Etzkorn, L. 1999. An entropy based complexity measure for object-oriented designs. Theory and Practice of Object Systems 5(2):1–9.
Bell Canada 2000a. Datrix Abstract Semantic Graph Reference-Manual (Version 1.4).
Bell Canada 2000b. Datrix Metric Reference Manual. Montreal, Quebec, Canada, version 4.0 edition. For Datrix version 3.6.9.
Birov, L., Prokofiev, A., Bartenev, Y., Vargin, A., Purkayastha, A., Skjellum, A., Dandass, Y., Erzunov, V., Shanikova, E., Ovechkin, V., Bangalore, P., Shuvalov, E., Orlov, N.F.A., Egorov, S. 1999. The Parallel Mathematical Libraries Project (PMLP): Overview, design innovations, and preliminary results. In: Proceedings of the Fifth International Conference on Parallel Computing Technologies.
Briand, L.C., Daly, J.W., Wüst, J. 1997a. A unified framework for cohesion measurement in object-oriented systems. In: Proceedings of the Fourth International Symposium on Software Metrics, Albuquerque, New Mexico. IEEE Computer Society, pp. 43–53.
Briand, L.C., Daly, J.W., Wüst, J.K. 1999. A unified framework for coupling measurement in object-oriented systems. IEEE Transactions on Software Engineering 25(1):91–121.
Briand, L.C., El Emam, K., Morasca, S. 1996a. On the application of measurement theory in software engineering. Empirical Software Engineering: An International Journal 1(1):61–88. (See Briand et al., 1997b; Zuse, 1997a).
Briand, L.C., El Emam, K., Morasca, S. 1997b. Reply to Comments to the paper: Briand, El Emam, Morasca: On the application of measurement theory in software engineering. Empirical Software Engineering: An International Journal 2(3):317–322. (See Briand et al., 1996a; Zuse, 1997a).
Briand, L.C., Morasca, S., Basili, V.R. 1996b. Property-based software engineering measurement. IEEE Transactions on Software Engineering 22(1):68–85. See comments in Briand et al. (1997c), Poels and Dedene (1997), Zuse (1997c).
Briand, L.C., Morasca, S., Basili, V.R. 1997c. Response to: Comments on Property-based software engineering measurement: Refining the additivity properties. IEEE Transactions on Software Engineering 23(3):196–197. (See Briand et al., 1996b; Poels and Dedene, 1997).
Chaitin, G.J. 1966. On the length of programs for computing finite binary sequences. Journal of the Association for Computing Machinery 13(4):547–569.
Chaitin, G.J. 1975. A theory of program size formally identical to information theory. Journal of the Association for Computing Machinery 22(3):329–340.
Chapin, N. 2002. Entropy-metric for systems with COTS software. In: Proceedings: Eighth IEEE Symposium on Software Metrics, Ottawa, Canada. IEEE Computer Society, pp. 173–181.
Chen, Y. 2000. Measurement of coupling and cohesion of software. Master's thesis, Florida Atlantic University, Boca Raton, Florida. Advised by Taghi M. Khoshgoftaar.
Chidamber, S.R., Kemerer, C.F. 1994. A metrics suite for object oriented design. IEEE Transactions on Software Engineering 20(6):476–493.
Cover, T.M., Thomas, J.A. 1991. Elements of Information Theory. John Wiley & Sons, New York.
Davis, J.S., LeBlanc, R.J. 1988. A study of the applicability of complexity measures. IEEE Transactions on Software Engineering 14(9):1366–1372.
Dean, T., Malton, A., Holt, R. 2001. Union schemas as the basis for a C++ extractor. In: Proceedings: Working Conference on Reverse Engineering, Stuttgart, Germany.
El Emam, K., Benlarbi, S., Goel, N., Rai, S.N. 2001. The confounding effect of class size on the validity of object-oriented metrics. IEEE Transactions on Software Engineering 27(7):630–650. (See Evanco, 2003).
Evanco, W.M. 2003. Comments on ‘The confounding effect of class size on the validity of object-oriented metrics’. IEEE Transactions on Software Engineering 29(7):670–672. (See El Emam et al., 2001).
Fenton, N.E., Pfleeger, S.L. 1997. Software Metrics: A Rigorous and Practical Approach, 2nd edn. PWS Publishing, London.
Gottipati, S. 2003. Empirical validation of the usefulness of information theory-based software metrics. Master's thesis, Mississippi State University, Mississippi State, Mississippi. Advised by Edward B. Allen.
Govindarajan, R. 2004. An empirical validation of information theory-based software metrics in comparison to counting-based metrics: A case study approach. Master's thesis, Mississippi State University, Mississippi State, Mississippi. Advised by Edward B. Allen.
Hatton, L. 1997. Reexamining the fault density-component size connection. IEEE Software 14(2):89–97.
Hilgard, E.R., Atkinson, R.C., Atkinson, R.L. 1971. Introduction to Psychology. Harcourt Brace Jovanovich, New York.
Khoshgoftaar, T.M., Allen, E.B. 1994. Applications of information theory to software engineering measurement. Software Quality Journal 3(2):79–103.
Kim, K., Shin, Y., Wu, C. 1995. Complexity measures for object oriented program based on the entropy. In: Proceedings: 1995 Asia Pacific Software Engineering Conference, Brisbane, Australia. IEEE Computer Society, pp. 127–136.
Kitchenham, B.A., Pfleeger, S.L., Fenton, N.E. 1995. Towards a framework for software measurement validation. IEEE Transactions on Software Engineering 21(12):929–944. (See comments in Kitchenham et al., 1997; Morasca et al., 1997).
Kitchenham, B.A., Pfleeger, S.L., Fenton, N.E. 1997. Reply to: Comments on ‘Towards a framework for software measurement validation’. IEEE Transactions on Software Engineering 23(3):189. (See Kitchenham et al., 1995; Morasca et al., 1997; Weyuker, 1988).
Kolmogorov, A.N. 1965. Three approaches for defining the concept of information quantity. Problems in Information Transmission 1(1):1–7.
Kolmogorov, A.N. 1968. Logical basis for information theory and probability theory. IEEE Transactions on Information Theory IT-14(5):662–664.
Lapierre, S., Laguë, B., Leduc, C. 2001. Datrix source code model and its interchange format: Lessons learned and considerations for future work. ACM SIGSOFT Software Engineering Notes 26(1):53–60.
Lew, K.S., Dillon, T.S., Forward, K.E. 1988. Software complexity and its impact on software reliability. IEEE Transactions on Software Engineering 14(11):1645–1655.
Li, M., Vitányi, P.M.B. 1988. Two decades of applied Kolmogorov complexity. In: Proceedings of the Third Annual Structure in Complexity Theory Conference, Washington, DC, pp. 80–101.
Mayrand, J., Coallier, F. 1996. System acquisition based on software product assessment. In: Proceedings of the Eighteenth International Conference on Software Engineering, Berlin. IEEE Computer Society, pp. 210–219.
McCabe, T.J. 1976. A complexity measure. IEEE Transactions on Software Engineering SE-2(4):308–320.
Miller, G.A. 1956. The magical number seven plus or minus two: Some limits on our capacity for processing information. Psychological Review 63(2):81–97.
Mohanty, S.N. 1979. Models and measurements for quality assessment of software. Computing Surveys 11(3):251–275.
Mohanty, S.N. 1981. Entropy metrics for software design evaluation. Journal of Systems and Software 2:39–46.
Morasca, S., Briand, L.C. 1997. Towards a theoretical framework for measuring software attributes. In: Proceedings of the Fourth International Symposium on Software Metrics, Albuquerque, New Mexico, IEEE Computer Society, pp. 119–126.
Morasca, S., Briand, L.C., Basili, V.R., Weyuker, E.J., Zelkowitz, M.V. 1997. Comments on ‘Towards a framework for software measurement validation’. IEEE Transactions on Software Engineering 23(3):187–188. (See Kitchenham et al., 1995; Weyuker, 1988).
Munson, J.C., Khoshgoftaar, T.M. 1989. The dimensionality of program complexity. In: Proceedings of the Eleventh International Conference on Software Engineering, Pittsburgh, Pennsylvania. IEEE Computer Society, pp. 245–253.
Oviedo, E.I. 1980. Control flow, data flow and program complexity. In: Proceedings: The IEEE Computer Society's Fourth International Computer Software and Applications Conference, Chicago, Illinois. IEEE Computer Society, pp. 146–152.
Poels, G., Dedene, G. 1997. Comments on ‘Property-based software engineering measurement’: Refining the additivity properties. IEEE Transactions on Software Engineering 23(3):190–195. (See Briand et al., 1996b).
Runeson, P., Andersson, C., Thelin, T., Andrews, A., Berling, T. 2006. What do we know about defect detection methods? IEEE Software 23(3):82–90.
Schütt, D. 1977. On a hypergraph oriented measure for applied computer science. In: Digest of Papers: COMPCON 77 Fall, Washington, DC. IEEE Computer Society, pp. 295–296. Abstract only.
Shannon, C.E., Weaver, W. 1949. The Mathematical Theory of Communication. University of Illinois Press, Urbana, Illinois.
Shereshevsky, M., Ammari, H., Gradetsky, N., Mili, A., Ammar, H.H. 2001. Information theoretic metrics for software architecture. In: Proceedings 25th Annual International Computer Software and Applications Conference, Chicago. IEEE Computer Society, pp. 151–157.
Solomonoff, R.J. 1964. A formal theory of inductive inference, part 1 and part 2. Information and Control 7:1–22, 224–254.
University of Waterloo 2004. CPPX: Open source C++ fact extractor. http://swag.uwaterloo.ca/~cppx. (Current July 7, 2006).
van Emden, M.H. 1970. Hierarchical decomposition of complexity. Machine Intelligence 5:361–380. (See also van Emden, 1971 for details).
van Emden, M.H. 1971. An Analysis of Complexity. Number 35 in Mathematical Centre Tracts. Mathematisch Centrum, Amsterdam.
Visaggio, G. 1997. Structural information as a quality metric in software systems organization. In: Proceedings International Conference on Software Maintenance, Bari, Italy. IEEE Computer Society, pp. 92–99.
Watanabe, S. 1960. Information theoretical analysis of multivariate correlation. IBM Journal of Research and Development 4(1):66–82.
Weyuker, E.J. 1988. Evaluating software complexity measures. IEEE Transactions on Software Engineering 14(9):1357–1365.
Zuse, H. 1997a. Comments to the paper: Briand, Emam, Morasca: On the application of measurement theory in software engineering. Empirical Software Engineering: An International Journal 2(3):313–316. (See Briand et al., 1996a, 1997b).
Zuse, H. 1997b. A Framework for Software Measurement. Walter de Gruyter and Co., Berlin.
Zuse, H. 1997c. Reply to: ‘Property-based software engineering measurement’. IEEE Transactions on Software Engineering 23(8):533. (See Briand et al., 1996b).
Acknowledgments
This work was supported in part by grant CCR-0098024 from the National Science Foundation. We thank Bell Canada for an academic license to use Datrix, a software measurement tool. We thank the Software Architecture Group of the University of Waterloo for providing the open-source tool cppx. We thank Shiva Juluru for providing the physics data manipulation program's source code. We thank Anthony Skjellum for providing pmlp source code. We thank Yoginder Dandass and Archana Chilukuri for help with pmlp measurement. We thank the Empirical Software Engineering research group at Mississippi State University for helpful discussions. We thank the anonymous reviewers for their helpful suggestions, which significantly strengthened the paper.
Cite this article
Allen, E.B., Gottipati, S. & Govindarajan, R. Measuring size, complexity, and coupling of hypergraph abstractions of software: An information-theory approach. Software Qual J 15, 179–212 (2007). https://doi.org/10.1007/s11219-006-9010-3
DOI: https://doi.org/10.1007/s11219-006-9010-3