Integrating social annotations into topic models for personalized document retrieval

  • Methodologies and Application
  • Soft Computing

Abstract

Social annotations are valuable resources generated by users on the Web, and they encode abundant information on user preferences for certain documents. Social annotation-based information retrieval has been studied in recent years as a way to personalize search results and fulfill user information needs. However, because each social annotation is associated with a user, a document and a tag simultaneously, it remains a great challenge to fully capture the potentially useful information for improving retrieval performance. To meet this challenge, we propose a novel method that integrates social annotations into topic models for personalized document retrieval. Our method first reconstructs the candidate documents for a given query using the social tags of those documents, so that the reconstructed documents are tailored to user preferences. We then generalize latent Dirichlet allocation-based topic models by considering the relationships among users, social tags and documents in social annotations. The modified topic model optimizes the distribution of latent topics of documents for different users to meet user information needs. Experimental results show that our method significantly outperforms state-of-the-art baseline models in personalized retrieval.
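The abstract does not spell out how candidate documents are reconstructed from social tags, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes annotations are available as (user, document, tag) triples and that a document's bag of words is augmented with its tags, with the querying user's own tags counted more heavily. The function name `reconstruct_document` and the parameters `user_weight` and `other_weight` are hypothetical.

```python
from collections import Counter


def reconstruct_document(doc_id, doc_tokens, annotations, user,
                         user_weight=2, other_weight=1):
    """Build a user-tailored bag of words for one document.

    annotations: iterable of (annotator, doc_id, tag) triples.
    The weights are illustrative assumptions, not values from the paper.
    """
    bag = Counter(doc_tokens)
    for annotator, annotated_doc, tag in annotations:
        if annotated_doc != doc_id:
            continue  # tag belongs to a different document
        # Tags assigned by the querying user count more than other users' tags.
        bag[tag] += user_weight if annotator == user else other_weight
    return bag


annotations = [
    ("alice", "d1", "python"),
    ("bob", "d1", "tutorial"),
    ("alice", "d2", "cooking"),
]
bag = reconstruct_document("d1", ["intro", "to", "python"], annotations, user="alice")
print(bag)
```

A bag of words built this way could then be passed to any LDA-style topic model in place of the original document text. How the paper itself weights tags, or couples users and tags inside the topic model, is not stated in the abstract.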

Notes

  1. https://del.icio.us/.

  2. https://www.jianguoyun.com/p/DZWqIdgQ_66nBxiThagB.

Acknowledgements

This work is partially supported by grants from the Natural Science Foundation of China (Nos. 61632011, 61572102, 61602078, 61572098), the Ministry of Education Humanities and Social Science Project (No. 19YJCZH199), the China Postdoctoral Science Foundation (No. 2018M641691), the Fundamental Research Funds for the Central Universities (No. DUT18ZD102) and the National Key Research and Development Program of China (No. 2016YFB1001103).

Author information

Corresponding authors

Correspondence to Bo Xu or Hongfei Lin.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and Animal Rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xu, B., Lin, H., Lin, Y. et al. Integrating social annotations into topic models for personalized document retrieval. Soft Comput 24, 1707–1716 (2020). https://doi.org/10.1007/s00500-019-03998-1

  • DOI: https://doi.org/10.1007/s00500-019-03998-1