Discovering the Structures of Open Source Programs from Their Developer Mailing Lists

  • Conference paper
Discovery Science (DS 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5808)

Abstract

This paper presents a method for discovering the structure of open source programs from their developer mailing lists. Our goal is to help later developers understand the structure and components of an open source program even when its documentation is insufficient. The method consists of two phases: (1) producing a mapping between the source files and the emails, and (2) constructing a lattice from that mapping and then reducing it with a novel algorithm called PRUNIA (PRUNing Algorithm Based on Introduced Attributes) to obtain a more compact structure. We performed experiments with several open source projects that originate from or are popular in Japan, such as Namazu and Ruby. The experimental results show that the extracted structures closely reflect important parts of the hidden structures of the programs.
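
To make the two phases concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a source file is linked to an email whenever the file name appears in the email text (the abstract does not say how the mapping is produced), and it assumes that files play the role of objects and emails the role of attributes in the lattice. The function names `build_mapping` and `formal_concepts` and the sample data are hypothetical. PRUNIA itself is not reproduced here, since the abstract does not describe it; the sketch stops at enumerating the nodes of the unreduced concept lattice by brute force.

```python
from itertools import combinations


def build_mapping(emails, files):
    """Phase 1 (assumed heuristic): link a file to every email that mentions its name."""
    return {f: {eid for eid, body in emails.items() if f in body} for f in files}


def formal_concepts(mapping):
    """Phase 2 (unreduced): enumerate all (file set, email set) concepts by brute force.

    A concept pairs a set of files with the set of emails shared by all of
    them, such that no further file could be added without losing an email.
    The enumeration is exponential in the number of files, so it only suits
    tiny illustrative inputs.
    """
    files = sorted(mapping)
    all_emails = set().union(*mapping.values()) if mapping else set()
    concepts = set()
    for size in range(len(files) + 1):
        for subset in combinations(files, size):
            # Emails that mention every file in the subset.
            shared = set(all_emails)
            for f in subset:
                shared &= mapping[f]
            # Closure: every file that is mentioned in all of those emails.
            extent = frozenset(f for f in files if shared <= mapping[f])
            concepts.add((extent, frozenset(shared)))
    return concepts


if __name__ == "__main__":
    # Hypothetical sample data, invented for illustration only.
    emails = {
        "m1": "fix crash in parser.c and lexer.c",
        "m2": "parser.c refactoring",
        "m3": "update index.c",
    }
    files = ["parser.c", "lexer.c", "index.c"]
    for extent, intent in sorted(formal_concepts(build_mapping(emails, files)),
                                 key=lambda c: (len(c[0]), sorted(c[0]))):
        print(sorted(extent), sorted(intent))
```

In this toy input, `parser.c` and `lexer.c` end up in a common concept because they are discussed together in one email, which is the kind of grouping a lattice over the file/email mapping can surface. A reduction step such as PRUNIA would then operate on the resulting lattice.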

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nguyen, D.A., Doi, K., Yamamoto, A. (2009). Discovering the Structures of Open Source Programs from Their Developer Mailing Lists. In: Gama, J., Costa, V.S., Jorge, A.M., Brazdil, P.B. (eds) Discovery Science. DS 2009. Lecture Notes in Computer Science, vol 5808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04747-3_19

  • DOI: https://doi.org/10.1007/978-3-642-04747-3_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04746-6

  • Online ISBN: 978-3-642-04747-3

  • eBook Packages: Computer Science, Computer Science (R0)
