Abstract
Searching microblog short texts by meaning is challenging because the information in social networks is semantically sparse. Extended search approaches, which facilitate short text understanding and search by enriching the short text, are commonly adopted. However, they analyze only the literal semantics of short text and make little use of the social characteristics of social networks, which also carry semantic information. To better capture the rich semantics of microblog short text, we propose a new extended search method based on a semantic hashtag graph that combines social and conceptual information. The method enriches each short text with concepts and associated hashtags so as to represent its full semantic features. Taking the microblog context into account, we introduce concepts through Wikipedia and exploit the semantic consistency of hashtags. Specifically, for conceptual semantics, we propose a conceptual analysis method that merges explicit and implicit information in Wikipedia. For the social semantics in hashtags, we put forward a semantic hashtag graph, which combines social and conceptual information, to generate semantically associated hashtags. Experimental results show that our method clearly outperforms existing state-of-the-art approaches in semantic understanding and search of short text.
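To make the idea of a hashtag graph that mixes social and conceptual signals more concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that the social signal is the Jaccard co-occurrence of two hashtags across posts, that the conceptual signal is the cosine similarity of their Wikipedia-style concept vectors, and that the two are blended linearly with a weight `alpha`. The function names, the `alpha` parameter, and the weighting scheme are all hypothetical; the abstract does not specify them.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors (dicts)."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_hashtag_graph(posts, concepts, alpha=0.5):
    """Build a weighted, undirected hashtag graph.

    posts:    list of sets of hashtags, one set per post
    concepts: hashtag -> {concept: weight} (assumed to be precomputed)
    alpha:    assumed blend between the social and conceptual signals
    """
    count = defaultdict(int)       # number of posts containing each hashtag
    pair_count = defaultdict(int)  # number of posts containing both hashtags
    for tags in posts:
        for tag in tags:
            count[tag] += 1
        for a, b in combinations(sorted(tags), 2):
            pair_count[(a, b)] += 1

    graph = defaultdict(dict)
    for a, b in combinations(sorted(count), 2):
        co = pair_count.get((a, b), 0)
        social = co / (count[a] + count[b] - co)  # Jaccard co-occurrence
        conceptual = cosine(concepts.get(a, {}), concepts.get(b, {}))
        weight = alpha * social + (1 - alpha) * conceptual
        if weight > 0:
            graph[a][b] = weight
            graph[b][a] = weight
    return graph


def associated_hashtags(graph, seed_tags, k=2):
    """Return up to k hashtags most strongly linked to the seed hashtags."""
    scores = defaultdict(float)
    for seed in seed_tags:
        for tag, weight in graph.get(seed, {}).items():
            if tag not in seed_tags:
                scores[tag] += weight
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

In this sketch, a short text tagged with a single hashtag would be enriched with the neighbours returned by `associated_hashtags`. Two hashtags that never co-occur can still be linked through the conceptual term, which is the kind of association a purely co-occurrence-based graph would miss.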
Acknowledgments
This work was supported by the National Natural Science Foundation of China (NSFC) under Grants No. 61320106006, No. 61532006, No. 61772083, and No. 61502042.
This article belongs to the Topical Collection: Special Issue on Web and Big Data
Guest Editors: Junjie Yao, Bin Cui, Christian S. Jensen, and Zhe Zhao
Cui, W., Du, J., Wang, D. et al. Extended search method based on a semantic hashtag graph combining social and conceptual information. World Wide Web 22, 2589–2610 (2019). https://doi.org/10.1007/s11280-018-0584-z
DOI: https://doi.org/10.1007/s11280-018-0584-z