Abstract
Software information sites are widely used to help developers share and communicate knowledge. Tags on these sites play an important role in classifying and organizing information. However, developers' insufficient understanding of software objects and lack of relevant knowledge can lead to incorrect tags, and automatic tag recommendation techniques have been proposed to address this. Two major factors still affect the quality of tag recommendation: tag explosion and tag synonyms. Prior studies have found that deep learning techniques are effective for mining software information sites. Inspired by recent deep learning research, we propose TagDeepRec, a new tag recommendation approach for software information sites based on an attention-based Bi-LSTM. The attention-based Bi-LSTM model mines deep latent semantics, so it can accurately infer tags for new software objects by learning the relationships between historical software objects and their corresponding tags. Given a new software object, TagDeepRec computes a confidence probability for each tag and then recommends the top-k tags by ranking these probabilities. We evaluate TagDeepRec on datasets from six software information sites of different scales. The experimental results show that TagDeepRec outperforms the state-of-the-art approaches TagMulRec and FastTagRec in terms of Recall@k, Precision@k and F1-score@k.
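The ranking and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the example probabilities, and the ground-truth tags are hypothetical, and the metrics use the common set-based definitions for a single software object, which may differ in detail from the definitions used in the paper.

```python
from typing import Dict, List, Set, Tuple


def recommend_top_k(tag_probs: Dict[str, float], k: int) -> List[str]:
    """Rank tags by confidence probability (descending) and keep the first k.

    Ties are broken alphabetically so the output is deterministic.
    """
    ranked = sorted(tag_probs.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:k]]


def precision_recall_f1_at_k(
    recommended: List[str], true_tags: Set[str], k: int
) -> Tuple[float, float, float]:
    """Set-based Precision@k, Recall@k and F1-score@k for one software object.

    Assumed definitions (not confirmed by the source):
      precision = hits / k
      recall    = hits / number of true tags
      f1        = harmonic mean of precision and recall
    """
    hits = len(set(recommended[:k]) & true_tags)
    precision = hits / k
    recall = hits / len(true_tags) if true_tags else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical model output for one post: a probability per candidate tag.
probs = {"python": 0.62, "pandas": 0.21, "numpy": 0.09, "java": 0.05, "c++": 0.03}
top3 = recommend_top_k(probs, k=3)

# Hypothetical ground-truth tags assigned by the post's author.
precision, recall, f1 = precision_recall_f1_at_k(top3, {"python", "numpy"}, k=3)
print(top3, round(precision, 3), round(recall, 3), round(f1, 3))
```

In an evaluation over a whole dataset, these per-object values would typically be averaged across all test objects for each value of k.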
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, C., Xu, L., Yan, M., He, J., Zhang, Z. (2019). TagDeepRec: Tag Recommendation for Software Information Sites Using Attention-Based Bi-LSTM. In: Douligeris, C., Karagiannis, D., Apostolou, D. (eds) Knowledge Science, Engineering and Management. KSEM 2019. Lecture Notes in Computer Science, vol 11776. Springer, Cham. https://doi.org/10.1007/978-3-030-29563-9_2
DOI: https://doi.org/10.1007/978-3-030-29563-9_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29562-2
Online ISBN: 978-3-030-29563-9
eBook Packages: Computer Science, Computer Science (R0)