How to write comments suitable for automatic software indexing¹
Motivation
The traditional purpose of comments is to make code easier to understand. However, a recent trend, originating from research on software reuse (see, for instance, Shafer et al., 1994; Systematic reuse, 1994; Mili et al., 1995; Software reuse, 1995), attempts to use comments in the software indexing process as well, with the final aim of building a software catalog.
Previous efforts to build software catalogs can be roughly classified into three basic groups
A free-text automatic indexing scheme
The indexing scheme we refer to here is the one based on lexical affinities (LAs) proposed by Maarek et al. (1991), where an LA is the co-occurrence of a pair of words (more precisely, of their inflectional roots), say (w1, w2), within a single sentence of a document. Specifically, the authors consider as meaning-bearing only those LAs involving open-class words (namely, nouns, verbs, adjectives, and adverbs). The LA extraction algorithm they adopt (Fig. 3) takes advantage of the empirical finding that 98% of all LAs relate to words
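To make the notion of an LA concrete, the following is a minimal sketch of pair extraction, not the algorithm of Fig. 3. It rests on several assumptions that are not taken from the text: the window size `WINDOW`, the small closed-class word list `CLOSED_CLASS`, and the crude suffix-stripping helper `root` (a stand-in for proper inflectional-root extraction) are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Assumption: a tiny closed-class (stop) word list. A real system would use a
# fuller list or a part-of-speech tagger to keep only open-class words.
CLOSED_CLASS = {
    "a", "an", "the", "of", "to", "in", "on", "and", "or", "is", "are",
    "it", "its", "for", "from", "with", "by", "that", "this", "each",
}

# Assumption: maximum distance (in word positions) between the two words.
WINDOW = 5


def root(word):
    """Crude, hypothetical stand-in for inflectional-root extraction."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_las(text, window=WINDOW):
    """Count unordered pairs of open-class word roots that co-occur within
    `window` positions of each other inside the same sentence."""
    counts = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z]+", sentence)
        for i, first in enumerate(words):
            if first in CLOSED_CLASS:
                continue
            for second in words[i + 1 : i + 1 + window]:
                if second in CLOSED_CLASS:
                    continue
                pair = tuple(sorted((root(first), root(second))))
                if pair[0] != pair[1]:
                    counts[pair] += 1
    return counts


if __name__ == "__main__":
    sample = "Sort the lines of files. Sorting lines is fast."
    for pair, freq in extract_las(sample).most_common():
        print(pair, freq)
```

Pairs are stored in sorted order so that (w1, w2) and (w2, w1) count as the same LA, and sentence boundaries are never crossed. On a text as short as a typical comment, most pairs occur only once, which is the kind of situation the experiment on short texts is concerned with.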
The natural-language documentation investigated
As mentioned in Section 1, the natural-language documentation used to carry out the experiment reported in the next section concerns two different categories of text files selected ad hoc:
1. The text files of 20 Unix commands, as available online through the command man. These text files are structured in terms of 8 items (Fig. 4).
2. The text files reproducing the typeset sections given as natural-language documentation of 20 IMSL routines being
An experiment of automatic indexing
The aim of the experiment reported below, which concerns both categories of text files described in Section 3, is twofold, namely to give evidence:
1. of the already mentioned impossibility of making predictions about the meaningfulness of the results arising from the application of automatic free-text indexing schemes to a short text;
2. that the appropriateness of the derived
A new scenario
Since the quality of a text cannot be expressed simply by the values of parameters capturing its lexical structure, we propose a novel way of writing comments that gives direct control over the final result of the indexing process. This strategy keeps the two objectives of comments distinct (Section 1); moreover, it relies on an automatic tool that reports, at any given moment, how appropriate the profile extractable from the comments is.
The idea behind the
Conclusions
Besides making the code easier to understand, comments should also be suitable for indexing the software (a basic step in the construction of a software catalog, which is essential for speeding up the process of locating reusable software components). In this paper we have defended the thesis that, in order to achieve both objectives, comments, like programs, have to be written according to a given discipline that fixes the comments' specifications as well as the procedure to meet them. In this way, the
Unlinked References
Ralston and Rabinowitz, 1978
Acknowledgements
We are grateful to two anonymous referees whose comments deeply influenced the presentation of our work.
References (14)
- et al. Using English to retrieve software. J. Systems Software (1995)
- Reusability of mathematical software: a contribution. IEEE Trans. Software Eng. (1993)
- Di Felice, P., Fonzi, G., 1995. On automatic software indexing. Technical Rep. No.47–95, Univ. of...
- Frakes, W.B., Nejmeh, B.A., 1987. Software reuse through information retrieval. In: Proceedings of the Twentieth Annual...
- et al. Proteus: A reuse library system that supports multiple representation methods. ACM SIGIR Forum (1990)
- et al. An empirical study of representation methods for reusable software components. IEEE Trans. Software Eng. (1994)
- et al. An information retrieval approach for automatically constructing software libraries. IEEE Trans. Software Eng. (1991)
Paolino Di Felice is an associate professor of computer science at the Department of Electrical Engineering of the University of L'Aquila, Italy.
He has published articles in the areas of programming methodologies, visual programming, scientific software, relational databases, and object-oriented data modeling. His current research interests concern software reuse, spatial relations, and approximate spatial reasoning.
He is an affiliate member of the IEEE Computer Society and the Association for Computing Machinery. He is a founder member of the ACM special interest group on applied computing (SIGAPP) and a program committee member of the annual symposium of the SIGAPP.
Goffredo Fonzi received the Dr. Ing. degree in Electronic Engineering from the University of L'Aquila, Italy, in 1995. His main research interests concern software reuse. He is an affiliate member of the IEEE Computer Society.
¹ Work supported by the M.U.R.S.T.