Abstract
This paper presents a general approach for identifying objects in procedural programs. The approach is based on neural architectures that perform unsupervised learning of clusters. We describe two such neural architectures, explain how to use them to identify objects in software systems, and briefly describe a prototype tool that implements the clustering algorithms. With the aid of several examples, we explain how our approach can identify abstract data types as well as groups of routines that reference a common set of data. The clustering results are compared with those of many other object identification techniques. Finally, several case studies were performed on existing programs to evaluate the object identification approach. Results for two representative programs and their generated clusters are discussed.
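To make the idea of grouping routines by the data they reference concrete, the following is a minimal sketch and not the paper's method. The abstract does not name its two neural architectures, so the sketch assumes a simple winner-take-all competitive learning network as a stand-in. The function names (`encode`, `competitive_clusters`), the binary feature encoding, the farthest-point seeding of the units, and the parameters `n_units`, `epochs`, and `lr` are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple


def encode(references: Dict[str, Set[str]]) -> Tuple[List[str], List[List[float]]]:
    """Map each routine to a binary vector over all referenced data items."""
    data_items = sorted(set().union(*references.values()))
    routines = sorted(references)
    vectors = [[1.0 if d in references[r] else 0.0 for d in data_items]
               for r in routines]
    return routines, vectors


def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def competitive_clusters(references: Dict[str, Set[str]],
                         n_units: int = 2,
                         epochs: int = 20,
                         lr: float = 0.3) -> List[List[str]]:
    """Cluster routines with winner-take-all competitive learning.

    Assumption: this is an illustrative stand-in, not the architectures
    described in the paper.
    """
    routines, vectors = encode(references)

    # Deterministic seeding (an assumption): the first unit copies the first
    # vector; each further unit copies the vector farthest from existing units.
    weights = [list(vectors[0])]
    while len(weights) < n_units:
        far = max(vectors, key=lambda v: min(sq_dist(v, w) for w in weights))
        weights.append(list(far))

    # Unsupervised training: only the winning unit moves toward the input.
    for _ in range(epochs):
        for v in vectors:
            win = min(range(n_units), key=lambda i: sq_dist(v, weights[i]))
            weights[win] = [w + lr * (x - w) for w, x in zip(weights[win], v)]

    # Each routine joins the cluster of its nearest unit; empty units are dropped.
    clusters: List[List[str]] = [[] for _ in range(n_units)]
    for r, v in zip(routines, vectors):
        win = min(range(n_units), key=lambda i: sq_dist(v, weights[i]))
        clusters[win].append(r)
    return [c for c in clusters if c]
```

Given hypothetical stack routines (`push`, `pop`, `top`) that reference `stack` and `sp`, and queue routines (`enqueue`, `dequeue`) that reference `queue` together with `tail` or `head`, calling `competitive_clusters` with `n_units=2` separates the two groups, each of which is a candidate object. The number of units is fixed in advance here, which is a simplification of this sketch.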
Cite this article
Abd-El-Hafiz, S.K. Identifying Objects in Procedural Programs Using Clustering Neural Networks. Automated Software Engineering 7, 239–261 (2000). https://doi.org/10.1023/A:1008718105516
DOI: https://doi.org/10.1023/A:1008718105516