Abstract
This paper presents a method that discovers the structure of open source programs from their developer mailing lists. Our goal is to help subsequent developers understand the structure and components of an open source program even when its documentation is insufficient. Our method consists of two phases: (1) producing a mapping between the source files and the emails, and (2) constructing a lattice from that mapping and then reducing it with a novel algorithm, PRUNIA (PRUNing Algorithm Based on Introduced Attributes), to obtain a more compact structure. We performed experiments on open source projects that originated in or are popular in Japan, such as Namazu and Ruby. The experimental results indicate that the extracted structures reflect important parts of the hidden structures of the programs well.
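The two phases can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: the mapping is built by plain substring matching of file names in email bodies, and the lattice is taken to be a concept lattice in the sense of formal concept analysis. The function names, file names, and email texts are hypothetical, and the PRUNIA reduction step is not shown because the abstract does not describe it.

```python
from itertools import combinations


def build_context(emails, source_files):
    """Phase 1 (assumed form): map each source file to the ids of the
    emails whose body mentions the file name as a substring."""
    return {
        f: {eid for eid, body in emails.items() if f in body}
        for f in source_files
    }


def formal_concepts(context):
    """Phase 2, lattice construction only (assumed to be a concept lattice).

    Objects are source files and attributes are email ids. Every subset of
    objects is tried, so this is only suitable for toy inputs.
    """
    objects = sorted(context)
    all_attrs = set().union(*context.values()) if context else set()
    concepts = set()
    for r in range(len(objects) + 1):
        for subset in combinations(objects, r):
            # Intent: emails shared by every file in the subset.
            intent = set(all_attrs)
            for obj in subset:
                intent &= context[obj]
            # Extent: every file mentioned in all emails of the intent.
            extent = {o for o in objects if intent <= context[o]}
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


# Hypothetical toy data, not taken from the paper's experiments.
emails = {
    "m1": "Fix crash in parser.c and lexer.c",
    "m2": "parser.c: handle empty input",
    "m3": "Update index.c and lexer.c",
}
files = ["parser.c", "lexer.c", "index.c"]

context = build_context(emails, files)
concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Each printed pair groups a set of files with the emails that mention all of them. Ordering these pairs by inclusion of their file sets gives the lattice that a reduction step would then compact.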
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nguyen, D.A., Doi, K., Yamamoto, A. (2009). Discovering the Structures of Open Source Programs from Their Developer Mailing Lists. In: Gama, J., Costa, V.S., Jorge, A.M., Brazdil, P.B. (eds) Discovery Science. DS 2009. Lecture Notes in Computer Science, vol 5808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04747-3_19
DOI: https://doi.org/10.1007/978-3-642-04747-3_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04746-6
Online ISBN: 978-3-642-04747-3
eBook Packages: Computer Science (R0)