Measuring the Utility of Functional-Based Software Using Centroid-Adjusted Class Labelling

Pizzi, Nick J.

doi:10.1007/978-3-319-25964-2_6

Part of the book series: Studies in Computational Intelligence (SCI, volume 617)

Abstract

The functional programming paradigm involves stateless computation on immutable data constructs. While this paradigm’s history dates back to the early twentieth century, with lambda calculus and the formal study of computability and function definition, there has been a resurgence in functional programming, especially in the area of predictive analytics. New, purely functional languages have recently emerged, and functional extensions have been added to several popular programming languages. It is sometimes difficult to estimate the overall utility and extensibility of functional programming software components. At the same time, many software metrics exist that attempt to quantify various qualitative attributes of software components. Here, we apply a computational intelligence strategy that uses a set of software metrics to predict the qualitative utility of a software system’s underlying components. Centroid-adjusted class labelling is a pattern classification preprocessing method that compensates for the possible imprecision of an established external reference test (gold standard) by adjusting, when necessary, design pattern class labels while maintaining the reference test’s discriminatory power. The adjusted design labels incorporate within-class centroid information using robust measures of location and dispersion. This method is applied to a biomedical data analysis software system written in a functional programming style. It is shown that a significant improvement in the discriminatory power of the classifier is obtained when using this preprocessing method.
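
The abstract does not give the adjustment procedure itself, so the following is only a rough sketch of the general idea of relabelling against robust class centroids, not the chapter's algorithm. It assumes the coordinate-wise median as the robust measure of location and the median absolute deviation (MAD) as the robust measure of dispersion; the function names and the `margin` threshold are hypothetical and chosen purely for illustration.

```python
from statistics import median


def robust_centroids(X, y):
    """Per class: coordinate-wise median (location) and MAD (dispersion).

    Assumption: these particular robust estimators are illustrative choices.
    """
    stats = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        cols = list(zip(*rows))
        loc = [median(c) for c in cols]
        # Fall back to 1.0 when a feature has zero MAD, to avoid division by zero.
        mad = [median(abs(v - m) for v in c) or 1.0 for c, m in zip(cols, loc)]
        stats[label] = (loc, mad)
    return stats


def scaled_distance(x, loc, mad):
    """Euclidean distance to a centroid after scaling each feature by its MAD."""
    return sum(((v - m) / s) ** 2 for v, m, s in zip(x, loc, mad)) ** 0.5


def adjust_labels(X, y, margin=2.0):
    """Relabel a sample only when another class centroid is clearly closer.

    `margin` is a hypothetical threshold: the sample's distance to its own
    class centroid must exceed `margin` times its distance to the nearest
    other centroid before the reference label is overridden.
    """
    stats = robust_centroids(X, y)
    adjusted = []
    for x, label in zip(X, y):
        dist = {lab: scaled_distance(x, *stats[lab]) for lab in stats}
        nearest = min(dist, key=dist.get)
        if nearest != label and dist[label] > margin * dist[nearest]:
            adjusted.append(nearest)
        else:
            adjusted.append(label)
    return adjusted
```

Called on a feature matrix `X` (a list of metric vectors) and reference labels `y`, `adjust_labels` returns a label list of the same length in which most reference labels are kept and only samples lying far from their own class centroid, relative to another class, are changed. The conservative margin reflects the abstract's requirement that the reference test's discriminatory power be maintained.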

Acknowledgment

The Natural Sciences and Engineering Research Council of Canada (NSERC) is gratefully acknowledged for its support of this investigation.

Author information

Authors and Affiliations

InfoMagnetics Technologies Corporation, Research and Technology Development, Winnipeg, MB, R3C 3Z5, Canada
Nick J. Pizzi
Department of Computer Science, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada
Nick J. Pizzi

Authors

Nick J. Pizzi

Corresponding author

Correspondence to Nick J. Pizzi.

Editor information

Editors and Affiliations

Department of Electrical and Computer En, University of Alberta, Edmonton, Alberta, Canada
Witold Pedrycz
Innopolis University, Bolzano, Italy
Giancarlo Succi
Center for Applied Software Engineering, Bolzano, Italy
Alberto Sillitti

About this chapter

Cite this chapter

Pizzi, N.J. (2016). Measuring the Utility of Functional-Based Software Using Centroid-Adjusted Class Labelling. In: Pedrycz, W., Succi, G., Sillitti, A. (eds) Computational Intelligence and Quantitative Software Engineering. Studies in Computational Intelligence, vol 617. Springer, Cham. https://doi.org/10.1007/978-3-319-25964-2_6

DOI: https://doi.org/10.1007/978-3-319-25964-2_6
Published: 15 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25962-8
Online ISBN: 978-3-319-25964-2
eBook Packages: Engineering, Engineering (R0)
