Abstract
Software information sites such as Stack Overflow and Ask Ubuntu allow programmers to post questions and share knowledge online. These sites usually require tags that describe the key content of each question. Tags play an important role in organizing and indexing user posts efficiently, and they provide accurate summaries of complicated technical problems. Users attach tags to their questions according to their own experience and knowledge. Because users express themselves differently and may not fully understand the software involved, choosing accurate tags is not easy. In this paper, we propose CUT, an automatic tag recommendation approach that recommends appropriate tags after users post their questions. The approach incorporates code fragments, text content, users’ tag preferences, and relations between tags in the recommendation process. We evaluated CUT through comparative experiments on the Stack Overflow dataset. The results show that CUT achieves 69.9% recall@5 and 81.6% recall@10, outperforming the latest relevant approach.
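The recall@5 and recall@10 figures above can be read with the following minimal sketch. It assumes one common definition of recall@k (the fraction of a question's true tags that appear among the top k recommended tags, averaged over all questions); the exact definition used for CUT is not stated in this abstract and may differ, for example in how questions with more than k true tags are handled. The function names `recall_at_k` and `mean_recall_at_k` are hypothetical and chosen for illustration only.

```python
from typing import Sequence, Set


def recall_at_k(ranked_tags: Sequence[str], true_tags: Set[str], k: int) -> float:
    """Fraction of a question's true tags found among the top-k recommendations.

    Assumed definition: |top_k ∩ true_tags| / |true_tags|.
    """
    if not true_tags:
        raise ValueError("true_tags must be non-empty")
    top_k = set(ranked_tags[:k])
    return len(top_k & true_tags) / len(true_tags)


def mean_recall_at_k(
    predictions: Sequence[Sequence[str]],
    ground_truth: Sequence[Set[str]],
    k: int,
) -> float:
    """Average recall@k over all questions in an evaluation set."""
    if len(predictions) != len(ground_truth) or not predictions:
        raise ValueError("inputs must be non-empty and of equal length")
    scores = [recall_at_k(p, t, k) for p, t in zip(predictions, ground_truth)]
    return sum(scores) / len(scores)


# Toy data (invented for illustration, not taken from the paper's dataset).
predictions = [
    ["python", "pandas", "numpy"],
    ["java", "spring", "maven"],
]
ground_truth = [
    {"python", "numpy", "csv"},
    {"java"},
]
score_at_2 = mean_recall_at_k(predictions, ground_truth, k=2)
score_at_3 = mean_recall_at_k(predictions, ground_truth, k=3)
```

Under this definition, raising k can only keep recall the same or increase it, which is consistent with the recall@10 figure being higher than the recall@5 figure.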
Y. Li: This work is supported by the Key Program of the National Natural Science Foundation of China (Grant No. 61232005) and the VMware UR project.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Yang, Y., Li, Y., Yue, Y., Wu, Z., Shao, W. (2016). CUT: A Combined Approach for Tag Recommendation in Software Information Sites. In: Lehner, F., Fteimi, N. (eds) Knowledge Science, Engineering and Management. KSEM 2016. Lecture Notes in Computer Science, vol 9983. Springer, Cham. https://doi.org/10.1007/978-3-319-47650-6_47
DOI: https://doi.org/10.1007/978-3-319-47650-6_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47649-0
Online ISBN: 978-3-319-47650-6
eBook Packages: Computer Science (R0)