
CUT: A Combined Approach for Tag Recommendation in Software Information Sites

  • Conference paper
Knowledge Science, Engineering and Management (KSEM 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9983)

Abstract

Software information sites such as Stack Overflow and Ask Ubuntu allow programmers to post questions and share knowledge online. These sites usually require tags that describe the key content of each question. Tags play an important role in organizing and indexing user posts efficiently, and they provide accurate summaries of complicated technical problems. Users attach tags to their questions according to their own experience and knowledge. Because users express themselves differently and may not fully understand the software involved, choosing accurate tags is not easy. In this paper, we propose CUT, an automatic tag recommendation approach that recommends appropriate tags after users post their questions. The approach incorporates code fragments, text content, users' tag preferences, and relations between tags into the recommendation process. We evaluated CUT in comparative experiments on the Stack Overflow dataset. The results show that CUT achieves 69.9% recall@5 and 81.6% recall@10, outperforming the latest relevant approach.
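
For readers unfamiliar with the recall@k metric reported above, the following is a minimal sketch of one common definition: for each post, the fraction of its true tags that appear among the top k recommended tags, averaged over all posts. This is an assumption for illustration only. The abstract does not state the exact formula used in the paper, and the function names `recall_at_k` and `mean_recall_at_k` are hypothetical.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], true_tags: Set[str], k: int) -> float:
    """Fraction of a post's true tags found among the top-k recommended tags.

    Assumed definition; the paper's exact formula may differ.
    """
    if not true_tags:
        raise ValueError("true_tags must be non-empty")
    top_k = set(ranked[:k])
    return len(top_k & true_tags) / len(true_tags)


def mean_recall_at_k(
    predictions: Sequence[Sequence[str]],
    ground_truth: Sequence[Set[str]],
    k: int,
) -> float:
    """Average recall@k over a collection of posts."""
    if len(predictions) != len(ground_truth) or not predictions:
        raise ValueError("need one non-empty prediction list per post")
    scores = [recall_at_k(r, t, k) for r, t in zip(predictions, ground_truth)]
    return sum(scores) / len(scores)
```

Under this definition, a post tagged `python` and `numpy` whose top two recommendations are `python` and `list` has a recall@2 of one half, because one of its two true tags was recovered.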

Y. Li—The work is supported by Key Program of National Natural Science Foundation of China (Grant No. 61232005), and VMware UR project.



Author information

Corresponding author

Correspondence to Ying Li .

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Yang, Y., Li, Y., Yue, Y., Wu, Z., Shao, W. (2016). CUT: A Combined Approach for Tag Recommendation in Software Information Sites. In: Lehner, F., Fteimi, N. (eds) Knowledge Science, Engineering and Management. KSEM 2016. Lecture Notes in Computer Science, vol 9983. Springer, Cham. https://doi.org/10.1007/978-3-319-47650-6_47

  • DOI: https://doi.org/10.1007/978-3-319-47650-6_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47649-0

  • Online ISBN: 978-3-319-47650-6

  • eBook Packages: Computer Science, Computer Science (R0)
