
Towards a Software Change Classification System: A Rough Set Approach

Software Quality Journal

Abstract

This paper presents two methods for designing a practical software change classification system based on data mining methods from rough set theory. These methods incorporate recent advances in rough set theory for coping with the uncertainty in making change decisions, either during software development or after a software system has been deployed. Two well-known software engineering data sets are used to benchmark the proposed classification methods and to facilitate comparison with other published studies on the same data sets. Two computational intelligence (CI) technologies are used in the design of the software change classification systems described in this paper, namely rough sets (a granular computing technology) and genetic algorithms. Using a 10-fold cross-validated paired t-test, this paper also compares the rough set classification learning method with the Waikato Environment for Knowledge Analysis (WEKA) classification learning method. The main contribution of this paper is the presentation of two models for software change classification based on these two CI technologies.
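
The 10-fold cross-validated paired t-test mentioned in the abstract can be illustrated with a minimal sketch. It assumes that both classifiers are evaluated on the same ten folds and that the per-fold error rates are already available. The function name `paired_t_statistic` and the error values below are hypothetical and are not taken from the paper. With differences $p_i = e^{A}_i - e^{B}_i$ over $k$ folds, the statistic is $t = \bar{p}\sqrt{k} / s_p$, where $s_p$ is the sample standard deviation of the differences.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(errors_a, errors_b):
    """Paired t statistic for per-fold error rates of two classifiers.

    Both lists must come from the same cross-validation folds, so that
    entry i of each list refers to the same test partition.
    """
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two equally long lists with at least 2 folds")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    k = len(diffs)
    sd = stdev(diffs)  # sample standard deviation (divides by k - 1)
    if sd == 0.0:
        raise ValueError("all differences are equal; t is undefined")
    return mean(diffs) * math.sqrt(k) / sd


# Two-sided 5% critical value of Student's t with 9 degrees of freedom,
# which is the case for k = 10 folds.
T_CRIT_DF9 = 2.262

# Hypothetical per-fold error rates, for illustration only.
rough_set_errors = [0.20, 0.25, 0.15, 0.30, 0.20, 0.25, 0.20, 0.15, 0.25, 0.20]
baseline_errors = [0.25, 0.25, 0.20, 0.30, 0.30, 0.25, 0.25, 0.20, 0.30, 0.25]

t = paired_t_statistic(rough_set_errors, baseline_errors)
significant = abs(t) > T_CRIT_DF9
print(f"t = {t:.3f}, difference significant at 5% level: {significant}")
```

A negative statistic here means the first classifier had the lower mean error over the folds. The null hypothesis of equal error rates is rejected at the 5% level when the absolute value of the statistic exceeds the critical value for $k - 1$ degrees of freedom.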




About this article

Cite this article

Peters, J.F., Ramanna, S. Towards a Software Change Classification System: A Rough Set Approach. Software Quality Journal 11, 121–147 (2003). https://doi.org/10.1023/A:1023764510838


  • DOI: https://doi.org/10.1023/A:1023764510838
