Clustering Dynamic Class Coupling Data to Measure Class Reusability Pattern

Parashar, Anshu; Chhabra, Jitender Kumar

doi:10.1007/978-3-642-22577-2_17

Part of the book series: Communications in Computer and Information Science (CCIS, volume 169)

Included in the following conference series:

International Conference on High Performance Architecture and Grid Computing

Abstract

Identifying reusable components is an essential activity during software development. Data mining techniques can be applied to identify sets of software components that depend on one another. This paper attempts to identify groups of mutually dependent classes that exist in the same repository. We explore a document clustering technique based on tf-idf weighting to cluster classes from a large collection of class coupling data for a particular Java project or program. First, dynamic analysis of the Java application is performed using UML diagrams to collect class import coupling data. Second, the coupling data of each class is treated as a document and represented in the vector space model (VSM) using term frequency (TF) and inverse document frequency (IDF). Third, the basic k-means clustering technique is applied to find clusters of classes. Finally, each cluster is ranked for its goodness.
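The second and third steps of the abstract (coupling data as documents, tf-idf vectors, k-means) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class names and coupling lists are invented, and the length-normalised TF, the log(N/df) form of IDF, Euclidean distance, and seeding with the first k vectors are all assumptions, since the abstract does not specify them. Collection of the coupling data and ranking of clusters are not covered.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build tf-idf vectors from {class_name: [imported class names]}."""
    vocab = sorted({term for terms in docs.values() for term in terms})
    n_docs = len(docs)
    # Document frequency: number of classes whose coupling list contains the term.
    df = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for name, terms in docs.items():
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for empty lists
        # Assumed weighting: (count / list length) * log(N / df).
        vectors[name] = [
            (counts[t] / total) * math.log(n_docs / df[t]) for t in vocab
        ]
    return vocab, vectors


def kmeans(vectors, k, iterations=20):
    """Basic k-means with Euclidean distance; seeds are the first k vectors."""
    names = list(vectors)
    centroids = [list(vectors[n]) for n in names[:k]]
    assignment = {}
    for _ in range(iterations):
        # Assign every class to its nearest centroid.
        new_assignment = {
            n: min(range(k), key=lambda c: math.dist(vectors[n], centroids[c]))
            for n in names
        }
        if new_assignment == assignment:
            break  # converged: no class changed cluster
        assignment = new_assignment
        # Recompute each centroid as the mean of its member vectors.
        for c in range(k):
            members = [vectors[n] for n in names if assignment[n] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Hypothetical import coupling data: one "document" per class.
coupling = {
    "OrderService": ["Order", "Customer", "Order", "Invoice"],
    "ReportPanel": ["JFrame", "JButton", "JFrame"],
    "InvoiceService": ["Invoice", "Order", "Customer"],
    "LoginDialog": ["JButton", "JFrame", "JTextField"],
}

vocab, vectors = tfidf_vectors(coupling)
clusters = kmeans(vectors, k=2)
for name, cluster_id in clusters.items():
    print(name, cluster_id)
```

In this toy data the two service classes share one cluster and the two UI classes share the other, because each pair imports largely the same classes. Cosine similarity is a common alternative to Euclidean distance for tf-idf vectors and could be swapped in at the distance call.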

Author information

Authors and Affiliations

Department of Computer Engineering, National Institute of Technology, Kurukshetra, Kurukshetra, 136 119, India
Anshu Parashar & Jitender Kumar Chhabra

Editor information

Editors and Affiliations

Chitkara University, 160 009, Chandigarh, India
Archana Mantri, Suman Nandi & Sandeep Kumar
Chitkara University, 160 009, Chandigarh, India
Gaurav Kumar

About this paper

Cite this paper

Parashar, A., Chhabra, J.K. (2011). Clustering Dynamic Class Coupling Data to Measure Class Reusability Pattern. In: Mantri, A., Nandi, S., Kumar, G., Kumar, S. (eds) High Performance Architecture and Grid Computing. HPAGC 2011. Communications in Computer and Information Science, vol 169. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22577-2_17

DOI: https://doi.org/10.1007/978-3-642-22577-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22576-5
Online ISBN: 978-3-642-22577-2
eBook Packages: Computer Science, Computer Science (R0)
