Abstract
Social annotation systems let users annotate large-scale text collections with tags, which provide a convenient way to discover, share, and organize rich information. However, manually annotating massive amounts of text is labor-intensive. Automatic annotation by tag prediction can therefore greatly improve the efficiency of semantic identification of social content. In this paper, we propose a tag prediction model based on a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network, through which the tags of a text can be predicted efficiently and accurately. Experiments on real-world datasets from a social Q&A community show that the proposed CNN-BiLSTM model achieves state-of-the-art accuracy for tag prediction.
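The abstract names the two building blocks of the model but not how they are wired together. The following is a minimal, dependency-free sketch of one plausible data flow, not the authors' implementation. It assumes that a 1-D convolution over word embeddings feeds a BiLSTM, and that the concatenated final hidden states pass through a sigmoid layer with one output per tag. The layer order, all sizes, the absence of bias terms, the 0.5 threshold, and every function and parameter name (`predict_tag_scores`, `init_params`, and so on) are illustrative assumptions. The weights are random and untrained, so the scores carry no meaning beyond showing the shapes involved.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def conv1d_relu(seq, filters, width):
    """Apply each filter to every window of `width` consecutive vectors, then ReLU."""
    out = []
    for t in range(len(seq) - width + 1):
        window = [v for vec in seq[t:t + width] for v in vec]  # flatten the window
        out.append([max(0.0, a) for a in matvec(filters, window)])
    return out


def lstm_last_hidden(seq, W, hidden):
    """Run one LSTM direction over seq and return the final hidden state.

    W maps the concatenation [x; h] to the stacked input, forget, output
    and candidate gates (bias terms omitted for brevity).
    """
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in seq:
        z = matvec(W, x + h)
        i = [sigmoid(v) for v in z[0:hidden]]
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]
        g = [math.tanh(v) for v in z[3 * hidden:4 * hidden]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


def predict_tag_scores(token_ids, params):
    """Embed -> CNN -> BiLSTM -> sigmoid layer; returns one score per tag."""
    embedded = [params["embed"][t] for t in token_ids]
    features = conv1d_relu(embedded, params["conv"], params["width"])
    fwd = lstm_last_hidden(features, params["lstm_fwd"], params["hidden"])
    bwd = lstm_last_hidden(features[::-1], params["lstm_bwd"], params["hidden"])
    return [sigmoid(v) for v in matvec(params["dense"], fwd + bwd)]


def init_params(vocab=50, embed_dim=8, n_filters=6, width=3, hidden=4,
                n_tags=5, seed=0):
    """All sizes here are hypothetical and chosen only to keep the example small."""
    rng = random.Random(seed)
    return {
        "width": width,
        "hidden": hidden,
        "embed": rand_matrix(vocab, embed_dim, rng),
        "conv": rand_matrix(n_filters, width * embed_dim, rng),
        "lstm_fwd": rand_matrix(4 * hidden, n_filters + hidden, rng),
        "lstm_bwd": rand_matrix(4 * hidden, n_filters + hidden, rng),
        "dense": rand_matrix(n_tags, 2 * hidden, rng),
    }


params = init_params()
scores = predict_tag_scores([3, 17, 42, 8, 21, 5], params)  # six made-up token ids
predicted = [k for k, p in enumerate(scores) if p >= 0.5]   # assumed 0.5 threshold
```

Independent sigmoid outputs are used here because a text in a Q&A community can carry several tags at once. Whether the paper treats the task this way is not stated in the abstract.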
Acknowledgements
This work was supported by a National Natural Science Foundation of China (NSFC) grant funded by the Chinese government, Ministry of Science and Technology (No. 61672108).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Li, B., Wang, Q., Wang, X., Li, W. (2018). Tag Prediction in Social Annotation Systems Based on CNN and BiLSTM. In: Tan, Y., Shi, Y., Tang, Q. (eds) Advances in Swarm Intelligence. ICSI 2018. Lecture Notes in Computer Science, vol 10942. Springer, Cham. https://doi.org/10.1007/978-3-319-93818-9_32
DOI: https://doi.org/10.1007/978-3-319-93818-9_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93817-2
Online ISBN: 978-3-319-93818-9
eBook Packages: Computer Science (R0)