Abstract
With the number of published scientific papers expected to soar, most modern paper management systems (e.g., ScienceWise, Mendeley, CiteULike) support tag-based retrieval. In this setting, each paper is associated with a set of tags, allowing users to search for relevant papers by formulating tag-based queries against the system. One of the most critical issues in tag-based retrieval is that users often have difficulty formulating their information need precisely. To address this issue, our paper tackles the problem of automatically suggesting new tags to a user who is formulating a query. The suggested tags are selected so as to resolve query ambiguity with respect to two aspects: informativeness and diversity. The former reduces the user effort needed to find the desired papers, while the latter enhances the variety of information shown to the user. By studying the theoretical properties of this problem, we propose a heuristic-based algorithm with several salient performance guarantees. We also demonstrate the efficiency of our approach through extensive experiments on real-world datasets.
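To make the setting concrete, the following is a minimal Python sketch of tag-based retrieval with a greedy tag suggestion step. It is an illustration under stated assumptions and not the algorithm proposed in the paper: the toy corpus `PAPERS`, the functions `search` and `suggest_tags`, and the scoring rule (a "balance" term standing in for informativeness and a "novelty" term standing in for diversity) are all hypothetical choices made for this example.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical toy corpus (not from the paper): paper id -> set of tags.
PAPERS: Dict[str, FrozenSet[str]] = {
    "p1": frozenset({"retrieval", "tagging", "diversity"}),
    "p2": frozenset({"retrieval", "tagging", "ranking"}),
    "p3": frozenset({"retrieval", "graphs"}),
    "p4": frozenset({"retrieval", "graphs", "diversity"}),
}


def search(query: Set[str], papers: Dict[str, FrozenSet[str]] = PAPERS) -> Set[str]:
    """Return the ids of papers that carry every tag in the query."""
    return {pid for pid, tags in papers.items() if query <= tags}


def suggest_tags(
    query: Set[str], k: int, papers: Dict[str, FrozenSet[str]] = PAPERS
) -> List[str]:
    """Greedily pick up to k tags to refine the query.

    Assumed scoring, for illustration only:
      balance = how evenly a tag splits the current result set
                (a stand-in for informativeness);
      novelty = how many matching papers are not yet covered by
                earlier suggestions (a stand-in for diversity).
    """
    results = search(query, papers)
    # Candidate tags co-occur with the query; sorting makes ties deterministic.
    candidates = sorted({t for pid in results for t in papers[pid]} - query)
    covered: Set[str] = set()
    chosen: List[str] = []
    for _ in range(k):
        best, best_score = None, -1
        for tag in candidates:
            if tag in chosen:
                continue
            hits = {pid for pid in results if tag in papers[pid]}
            balance = min(len(hits), len(results) - len(hits))
            novelty = len(hits - covered)
            score = balance + novelty
            if score > best_score:
                best, best_score = tag, score
        if best is None:  # no candidate tags left
            break
        chosen.append(best)
        covered |= {pid for pid in results if best in papers[pid]}
    return chosen


if __name__ == "__main__":
    print(search({"retrieval", "graphs"}))
    print(suggest_tags({"retrieval"}, k=2))
```

In this sketch a query is a set of tags and a paper matches when it carries all of them; adding a suggested tag narrows the result set. Summing the two terms with equal weight is an arbitrary choice, and a real system would need to define and weight informativeness and diversity as the paper does.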
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Nguyen, Q.V.H., Do, S.T., Nguyen, T.T., Aberer, K. (2015). Tag-Based Paper Retrieval: Minimizing User Effort with Diversity Awareness. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9049. Springer, Cham. https://doi.org/10.1007/978-3-319-18120-2_30
DOI: https://doi.org/10.1007/978-3-319-18120-2_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18119-6
Online ISBN: 978-3-319-18120-2
eBook Packages: Computer Science (R0)