Abstract
The functional programming paradigm involves stateless computation on immutable data constructs. Although the paradigm’s roots date back to the early twentieth century, with the lambda calculus and the formal study of computability and function definition, functional programming has seen a resurgence, especially in the area of predictive analytics. New, purely functional languages have recently emerged, and functional extensions have been added to several popular programming languages. It is sometimes difficult to estimate the overall utility and extensibility of functional programming software components. At the same time, many software metrics exist that attempt to quantify various qualitative attributes of software components. Here, we apply a computational intelligence strategy that uses a set of software metrics to predict the qualitative utility of a software system’s underlying components. Centroid-adjusted class labelling is a pattern classification preprocessing method that compensates for the possible imprecision of an established external reference test (gold standard) by adjusting, when necessary, design pattern class labels while maintaining the reference test’s discriminatory power. The adjusted design labels incorporate within-class centroid information using robust measures of location and dispersion. This method is applied to a biomedical data analysis software system written in a functional programming style. It is shown that the preprocessing method significantly improves the discriminatory power of the classifier.
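The idea of adjusting class labels from within-class centroid information can be illustrated with a small sketch. The abstract does not specify the algorithm, so everything below is an assumption made for illustration: the coordinate-wise median stands in for the robust measure of location, the median distance to that centroid stands in for the robust measure of dispersion, and the threshold `k`, the `Sample` type, and the function names are hypothetical. The sketch is written in a stateless style over immutable tuples, in keeping with the paradigm the abstract describes.

```python
from statistics import median
from typing import NamedTuple, Tuple


class Sample(NamedTuple):
    """An immutable pattern: a feature vector plus its reference-test label."""
    features: Tuple[float, ...]
    label: str


def robust_centroid(rows):
    # Assumed location measure: coordinate-wise median of the class members.
    return tuple(median(col) for col in zip(*rows))


def distance(a, b):
    # Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def robust_spread(rows, centre):
    # Assumed dispersion measure: median distance of members to the centroid.
    return median(distance(r, centre) for r in rows)


def adjust_labels(samples, k=3.0):
    """Return a new tuple of samples; a label changes only when the sample is
    far from its own class centroid and close to another class centroid."""
    labels = sorted({s.label for s in samples})
    by_class = {c: tuple(s.features for s in samples if s.label == c)
                for c in labels}
    centroids = {c: robust_centroid(rows) for c, rows in by_class.items()}
    spreads = {c: robust_spread(by_class[c], centroids[c]) for c in labels}

    def relabel(s):
        nearest = min(labels, key=lambda c: distance(s.features, centroids[c]))
        far_from_own = (distance(s.features, centroids[s.label])
                        > k * spreads[s.label])
        close_to_other = (nearest != s.label and
                          distance(s.features, centroids[nearest])
                          <= k * spreads[nearest])
        return s._replace(label=nearest) if far_from_own and close_to_other else s

    return tuple(relabel(s) for s in samples)


# Hypothetical two-class data; the last "low" sample sits inside the "high" cluster.
samples = (
    Sample((1.0, 1.0), "low"), Sample((1.2, 0.9), "low"),
    Sample((0.9, 1.1), "low"), Sample((1.1, 1.0), "low"),
    Sample((5.0, 5.1), "low"),
    Sample((5.0, 5.0), "high"), Sample((5.2, 4.9), "high"),
    Sample((4.9, 5.1), "high"), Sample((5.1, 5.0), "high"),
)
adjusted = adjust_labels(samples)
```

Because the centroid and spread are medians, the single suspect sample barely shifts the "low" class statistics, which is what lets it be detected. The original `samples` tuple is never modified; `adjust_labels` returns a new tuple, so the reference labels remain available for comparison.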
Acknowledgment
The Natural Sciences and Engineering Research Council of Canada (NSERC) is gratefully acknowledged for its support of this investigation.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Pizzi, N.J. (2016). Measuring the Utility of Functional-Based Software Using Centroid-Adjusted Class Labelling. In: Pedrycz, W., Succi, G., Sillitti, A. (eds) Computational Intelligence and Quantitative Software Engineering. Studies in Computational Intelligence, vol 617. Springer, Cham. https://doi.org/10.1007/978-3-319-25964-2_6
DOI: https://doi.org/10.1007/978-3-319-25964-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25962-8
Online ISBN: 978-3-319-25964-2
eBook Packages: Engineering, Engineering (R0)