Tag-Based Paper Retrieval: Minimizing User Effort with Diversity Awareness

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9049)

Abstract

As the number of scientific papers being published is likely to soar, most modern paper management systems (e.g., ScienceWise, Mendeley, CiteULike) support tag-based retrieval. In this setting, each paper is associated with a set of tags, and users search for relevant papers by formulating tag-based queries against the system. One of the most critical issues in tag-based retrieval is that users often have difficulty formulating their information need precisely. To address this issue, our paper tackles the problem of automatically suggesting new tags to users as they formulate a query. The set of tags is selected so as to resolve query ambiguity in two respects: informativeness and diversity. While the former reduces the user effort needed to find the desired papers, the latter enhances the variety of information shown to the user. By studying the theoretical properties of this problem, we propose a heuristic-based algorithm with several salient performance guarantees. We also demonstrate the efficiency of our approach through extensive experimentation using real-world datasets.
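
To make the setting concrete, the following is a minimal illustrative sketch in Python. It is not the algorithm proposed in the paper, whose details are not given in the abstract. The sketch assumes that papers are stored as a mapping from paper id to a set of tags, that a query matches a paper when the paper carries every query tag, and that candidate tags are scored greedily by a hypothetical weighted sum of two stand-in measures: how evenly a tag splits the current result set (a proxy for informativeness) and one minus its maximum Jaccard overlap with the tags already chosen (a proxy for diversity). The function names, the scoring formula, the `diversity_weight` parameter, and the toy data are all assumptions introduced here for illustration.

```python
from typing import Dict, List, Set


def matching_papers(papers: Dict[str, Set[str]], query: Set[str]) -> Set[str]:
    """Return the ids of papers whose tag set contains every query tag."""
    return {pid for pid, tags in papers.items() if query <= tags}


def suggest_tags(
    papers: Dict[str, Set[str]],
    query: Set[str],
    k: int,
    diversity_weight: float = 0.5,  # hypothetical trade-off knob, not from the paper
) -> List[str]:
    """Greedily pick up to k tags to refine `query` (illustrative heuristic only)."""
    results = matching_papers(papers, query)
    if not results or k <= 0:
        return []

    # Map each candidate tag to the matching papers that carry it.
    cover: Dict[str, Set[str]] = {}
    for pid in results:
        for tag in papers[pid] - query:
            cover.setdefault(tag, set()).add(pid)

    chosen: List[str] = []
    while len(chosen) < k and len(chosen) < len(cover):
        best_tag, best_score = None, float("-inf")
        for tag in sorted(cover):  # sorted for deterministic tie-breaking
            if tag in chosen:
                continue
            docs = cover[tag]
            # Informativeness proxy: 1.0 when the tag halves the result set.
            info = 1.0 - abs(2.0 * len(docs) / len(results) - 1.0)
            # Diversity proxy: penalize overlap with tags already chosen.
            overlap = max(
                (len(docs & cover[c]) / len(docs | cover[c]) for c in chosen),
                default=0.0,
            )
            score = (1.0 - diversity_weight) * info + diversity_weight * (1.0 - overlap)
            if score > best_score:
                best_tag, best_score = tag, score
        chosen.append(best_tag)
    return chosen


# Toy collection invented for this example.
TOY_PAPERS = {
    "p1": {"retrieval", "tagging", "diversity"},
    "p2": {"retrieval", "tagging", "ranking"},
    "p3": {"retrieval", "ranking", "graphs"},
    "p4": {"retrieval", "graphs"},
    "p5": {"databases"},
}

print(suggest_tags(TOY_PAPERS, {"retrieval"}, k=2))
```

On this toy collection the query `{"retrieval"}` matches four papers, and the sketch suggests `graphs` first and `tagging` second: both split the result set in half, and they cover disjoint papers, so the second pick is preferred over `ranking`, which overlaps with `graphs`.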

Author information

Corresponding author

Correspondence to Quoc Viet Hung Nguyen.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Nguyen, Q.V.H., Do, S.T., Nguyen, T.T., Aberer, K. (2015). Tag-Based Paper Retrieval: Minimizing User Effort with Diversity Awareness. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9049. Springer, Cham. https://doi.org/10.1007/978-3-319-18120-2_30

  • DOI: https://doi.org/10.1007/978-3-319-18120-2_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18119-6

  • Online ISBN: 978-3-319-18120-2

  • eBook Packages: Computer Science (R0)
