Measuring the Utility of Functional-Based Software Using Centroid-Adjusted Class Labelling

Computational Intelligence and Quantitative Software Engineering

Part of the book series: Studies in Computational Intelligence (SCI, volume 617)

Abstract

The functional programming paradigm involves stateless computation on immutable data constructs. Although the paradigm’s origins date back to the early twentieth century, with lambda calculus and the formal study of computability and function definition, functional programming has seen a resurgence, especially in the area of predictive analytics. New, purely functional languages have recently emerged, and functional extensions have been added to several popular programming languages. It is sometimes difficult to estimate the overall utility and extensibility of functional programming software components. At the same time, many software metrics exist that attempt to quantify various qualitative attributes of software components. Here, we apply a computational intelligence strategy that uses a set of software metrics to predict the qualitative utility of a software system’s underlying components. Centroid-adjusted class labelling is a pattern classification preprocessing method that compensates for the possible imprecision of an established external reference test (gold standard) by adjusting, when necessary, design pattern class labels while maintaining the reference test’s discriminatory power. The adjusted design labels incorporate within-class centroid information using robust measures of location and dispersion. The method is applied to a biomedical data analysis software system written in a functional programming style. Using this preprocessing method yields a significant improvement in the discriminatory power of the classifier.
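
The general idea of adjusting labels from within-class centroid information can be illustrated with a minimal sketch. It is not the chapter's algorithm, which the abstract does not specify. The sketch assumes a coordinate-wise median as the robust location estimate, the median distance to that centroid as the robust dispersion estimate, and a hypothetical `cutoff` of three spreads as the rule for deciding when an adjustment is "necessary". The function names and the toy data are likewise illustrative assumptions. The functions are written as pure functions over immutable tuples, in keeping with the functional style discussed above.

```python
from math import dist
from statistics import median


def robust_centroid(points):
    """Coordinate-wise median of equal-length tuples (assumed location estimate)."""
    return tuple(median(column) for column in zip(*points))


def robust_spread(points, centre):
    """Median distance of the points to the centre (assumed dispersion estimate)."""
    return median(dist(p, centre) for p in points)


def adjust_labels(patterns, labels, cutoff=3.0):
    """Return a new label list. A label changes only if the pattern lies more
    than `cutoff` spreads from its own class centroid and some other class
    centroid is nearer in spread units; otherwise the reference label is kept."""
    classes = sorted(set(labels))
    groups = {c: [p for p, l in zip(patterns, labels) if l == c] for c in classes}
    centres = {c: robust_centroid(groups[c]) for c in classes}
    # Fall back to 1.0 if a class has zero spread, to avoid division by zero.
    spreads = {c: robust_spread(groups[c], centres[c]) or 1.0 for c in classes}

    def scaled(p, c):
        return dist(p, centres[c]) / spreads[c]

    def relabel(p, label):
        if scaled(p, label) <= cutoff:
            return label
        return min(classes, key=lambda c: scaled(p, c))

    return [relabel(p, l) for p, l in zip(patterns, labels)]


# Toy data: two well-separated classes, with the last pattern mislabelled "b".
patterns = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
            (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
            (0.4, 0.6)]
labels = ["a"] * 5 + ["b"] * 5 + ["b"]
adjusted = adjust_labels(patterns, labels)
print(adjusted[-1])  # prints "a"
```

Only the outlying pattern is relabelled. Every pattern within the cutoff keeps its reference label, which is one simple way to preserve most of the reference test's information while correcting labels that the class geometry strongly contradicts.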

Acknowledgment

The Natural Sciences and Engineering Research Council of Canada (NSERC) is gratefully acknowledged for its support of this investigation.

Corresponding author

Correspondence to Nick J. Pizzi.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Pizzi, N.J. (2016). Measuring the Utility of Functional-Based Software Using Centroid-Adjusted Class Labelling. In: Pedrycz, W., Succi, G., Sillitti, A. (eds) Computational Intelligence and Quantitative Software Engineering. Studies in Computational Intelligence, vol 617. Springer, Cham. https://doi.org/10.1007/978-3-319-25964-2_6

  • DOI: https://doi.org/10.1007/978-3-319-25964-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25962-8

  • Online ISBN: 978-3-319-25964-2

  • eBook Packages: Engineering, Engineering (R0)
