Abstract
Entity-centric searches are now a common information-gathering task. Given the huge amount of available information, however, the entity alone is often not sufficient to find suitable results. Users usually search for entities within a specific search context, which is important for their relevance assessment. Digital library providers therefore need to consider this search context as well to enable high-quality retrieval. In this paper we present an approach that enables context searches for chemical entities. Chemical entities play a major role in many domains, ranging from biomedicine and biology to materials science. Since most domain-specific documents lack suitable context annotations, we present a similarity measure that uses cross-domain knowledge gathered from Wikipedia. We show that structure-based similarity measures are not suitable for chemical context searches and introduce a similarity measure that combines entity and context similarity. Our experiments show that our measure outperforms structure-based similarity measures for chemical entities. We compare against two baseline approaches: a Boolean retrieval model and a model that uses statistical query expansion for the context term. We evaluate the measures by computing mean average precision (MAP) over a set of queries with manual relevance assessments from domain experts. Overall, MAP increases by 30 percentage points (from 31% to 61%). Furthermore, we present a personalized retrieval system that leads to a further increase of around 10%.
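The evaluation metric named in the abstract, mean average precision, can be illustrated with a minimal sketch. This is the standard textbook definition of MAP, not the authors' evaluation code. The function names, query ids, and document ids below are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]], judgments: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all queries."""
    if not rankings:
        return 0.0
    total = sum(
        average_precision(ranking, judgments.get(query, set()))
        for query, ranking in rankings.items()
    )
    return total / len(rankings)


# Hypothetical toy data: two queries with manually judged relevant documents.
rankings = {
    "q1": ["d1", "d2", "d3", "d4"],
    "q2": ["d5", "d6", "d7"],
}
judgments = {
    "q1": {"d1", "d3"},
    "q2": {"d6"},
}

map_score = mean_average_precision(rankings, judgments)
print(f"MAP = {map_score:.4f}")
```

For the toy data, the first query has relevant documents at ranks 1 and 3, giving an average precision of (1/1 + 2/3) / 2, and the second has its only relevant document at rank 2, giving 1/2. The MAP is the mean of these two values.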
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Köhncke, B., Balke, WT. (2013). Context-Sensitive Ranking Using Cross-Domain Knowledge for Chemical Digital Libraries. In: Aalberg, T., Papatheodorou, C., Dobreva, M., Tsakonas, G., Farrugia, C.J. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2013. Lecture Notes in Computer Science, vol 8092. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40501-3_29
DOI: https://doi.org/10.1007/978-3-642-40501-3_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40500-6
Online ISBN: 978-3-642-40501-3
eBook Packages: Computer Science (R0)