Extended search method based on a semantic hashtag graph combining social and conceptual information

World Wide Web

Abstract

Searching microblog short texts by meaning is challenging because the information in social networks is semantically sparse. Extended search approaches, which facilitate short text understanding and search by enriching the short text, are commonly accepted. However, they analyze only the literal semantics of the short text and make little use of the distinctive social characteristics of social networks, which also carry semantic information. To better capture the rich semantics of microblog short text, we propose a new extended search method based on a semantic hashtag graph that combines social and conceptual information. The method enriches each short text with concepts and associated hashtags so that its full semantic features are represented. Considering the microblog context, we introduce concepts through Wikipedia and exploit the semantic consistency of hashtags. Specifically, for conceptual semantics, we propose a conceptual analysis method that merges explicit and implicit information in Wikipedia. For the social semantics in hashtags, we put forward a semantic hashtag graph, which combines social and conceptual information, to generate semantically associated hashtags. Experimental results show that our method outperforms existing state-of-the-art approaches in semantic understanding and search of short text.
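
As a rough illustration of the idea of enriching a short text with concepts and associated hashtags, the following minimal Python sketch builds a toy hashtag graph and uses it to expand a post. It is not the authors' method: the edge weight (a linear mix of normalized co-occurrence and Jaccard overlap of concept sets), the mixing parameter `alpha`, the hand-written `concepts` mapping (standing in for concepts that would come from Wikipedia), and all function names are assumptions made only for this example.

```python
from collections import Counter
from itertools import combinations


def build_hashtag_graph(posts, concepts, alpha=0.5):
    """Return {(tag_a, tag_b): weight} for hashtags that co-occur in a post.

    posts    -- list of hashtag lists, one per post (toy data)
    concepts -- dict mapping hashtag -> set of concept labels (hypothetical)
    alpha    -- assumed mixing weight between social and conceptual scores
    """
    cooccur = Counter()
    for tags in posts:
        # Count each unordered hashtag pair once per post.
        for a, b in combinations(sorted(set(tags)), 2):
            cooccur[(a, b)] += 1
    max_count = max(cooccur.values(), default=1)

    graph = {}
    for (a, b), count in cooccur.items():
        social = count / max_count  # normalized co-occurrence
        ca, cb = concepts.get(a, set()), concepts.get(b, set())
        union = ca | cb
        conceptual = len(ca & cb) / len(union) if union else 0.0  # Jaccard
        graph[(a, b)] = alpha * social + (1 - alpha) * conceptual
    return graph


def associated_hashtags(graph, tag, top_k=2):
    """Return the top_k neighbours of tag, strongest edge first."""
    scored = []
    for (a, b), weight in graph.items():
        if a == tag:
            scored.append((weight, b))
        elif b == tag:
            scored.append((weight, a))
    # Sort by descending weight, breaking ties alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_k]]


def expand(terms, tags, graph, concepts, top_k=2):
    """Append concepts and associated hashtags of each tag to the terms."""
    expanded = list(terms)
    for tag in tags:
        extras = sorted(concepts.get(tag, set()))
        extras += associated_hashtags(graph, tag, top_k)
        for extra in extras:
            if extra not in expanded:
                expanded.append(extra)
    return expanded


posts = [["nba", "lakers"], ["nba", "lakers", "lebron"],
         ["nba", "finals"], ["oscars", "film"]]
concepts = {
    "nba": {"basketball", "sports league"},
    "lakers": {"basketball", "los angeles"},
    "lebron": {"basketball"},
    "finals": {"sports league"},
    "oscars": {"film award"},
    "film": {"film award", "cinema"},
}
graph = build_hashtag_graph(posts, concepts)
print(expand(["game", "tonight"], ["nba"], graph, concepts))
```

In this toy setup a post tagged only with one hashtag gains both its concept labels and its most strongly connected hashtags, so a term-based retrieval model would have more to match against. The linear mix is just one simple way to combine the two signals.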

Notes

  1. http://lucene.apache.org

  2. https://en.wikipedia.org/wiki/Wikipedia:Database_download

  3. http://medialab.di.unipi.it/wiki/Wikipedia_Extractor

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC) under Grants No. 61320106006, No. 61532006, No. 61772083, and No. 61502042.

Author information

Corresponding author

Correspondence to Junping Du.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Web and Big Data

Guest Editors: Junjie Yao, Bin Cui, Christian S. Jensen, and Zhe Zhao

About this article

Cite this article

Cui, W., Du, J., Wang, D. et al. Extended search method based on a semantic hashtag graph combining social and conceptual information. World Wide Web 22, 2589–2610 (2019). https://doi.org/10.1007/s11280-018-0584-z

  • DOI: https://doi.org/10.1007/s11280-018-0584-z
