Finding hidden relevant documents buried in scientific documents by terminological paraphrases

Multimedia Tools and Applications

Abstract

Technical terms serve as effective queries for many users searching scientific databases. However, authors of scientific literature often use alternative expressions, that is, Terminological Paraphrases (TPs), to convey the meanings of specific terms. As a result, some relevant documents are not captured by searches on the conventional terms. In this paper, we propose an effective way to retrieve these “de facto relevant documents”, which contain only TPs and cannot be found by conventional models in an environment restricted to controlled vocabularies, by adapting the Predicate Argument Tuple (PAT). The experiment confirms that PAT-based document retrieval is an effective and promising method for discovering such documents and for improving the recall of terminology-based scientific information access models.
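As a rough illustration of the idea of matching on predicate-argument tuples rather than on the term itself, the following is a minimal sketch, not the authors' implementation. It assumes the tuples have already been extracted by a parser; the `PAT` type, the `pat_overlap` and `rank` functions, the overlap-based score, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, NamedTuple


class PAT(NamedTuple):
    """Hypothetical predicate-argument tuple: (predicate, subject, object)."""
    predicate: str
    subject: str
    obj: str


def pat_overlap(query_pats: List[PAT], doc_pats: List[PAT]) -> float:
    """Fraction of distinct query tuples that also occur in the document."""
    query_set = set(query_pats)
    if not query_set:
        return 0.0
    return len(query_set & set(doc_pats)) / len(query_set)


def rank(query_pats: List[PAT], docs: Dict[str, List[PAT]]) -> List[str]:
    """Return ids of documents sharing at least one tuple, best score first."""
    scored = [(pat_overlap(query_pats, pats), doc_id)
              for doc_id, pats in docs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored if score > 0]


# Toy data (assumed): tuples standing in for a paraphrase of a technical term.
query = [PAT("retrieve", "system", "document")]
docs = {
    # d1 paraphrases the term without naming it.
    "d1": [PAT("retrieve", "system", "document"),
           PAT("match", "system", "query")],
    # d2 is unrelated.
    "d2": [PAT("parse", "parser", "sentence")],
}
print(rank(query, docs))
```

In this toy setting a document is retrieved because it shares a tuple with the paraphrase, even though the technical term never appears in it. A real system would need tuple normalization and a more careful scoring scheme.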

Notes

  1. Google Scholar (http://scholar.google.com/), PubMed (http://www.ncbi.nlm.nih.gov/pubmed), Microsoft Academic Search (http://academic.research.microsoft.com/)

  2. The controlled vocabulary of PubMed (http://www.ncbi.nlm.nih.gov/pubmed) is MeSH (Medical Subject Headings). ACM (http://portal.acm.org) maintains CCS (Computing Classification System), and LC (http://catalog.loc.gov/) controls access to its content through LCSH (Library of Congress Subject Headings).

  3. http://www-tsujii.is.s.u-tokyo.ac.jp/enju/

  4. http://www.ndsl.kr/index.do

References

  1. Abdou S, Ruck P, Savoy J (2005) Evaluation of stemming, query expansion and manual indexing approaches for the genomic task. In: The Fourteenth Text REtrieval Conference Proceedings (TREC 2005), vol. 501, pp. 863–871

  2. Aronson AR (1996) The effect of textual variation on concept based information retrieval. Proceedings: a conference of the American Medical Informatics Association. pp. 373–377

  3. Bacchin M, Melucci M (2005) Symbol-based query expansion experiments at TREC 2005 genomics track

  4. Choi S-P, Song S, Jung H, Geierhos M, Myaeng S-H (2012) Scientific literature retrieval based on terminological paraphrases using predicate argument tuple. In: SoftTech 2012

  5. Cohen J (1968) Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull 70(4):687–699

  6. InfoTerm (2010) Terminology standardization. http://www.infoterm.info/standardization/index.php

  7. Lavrenko V, Croft WB (2001) Relevance based language models. Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, New Orleans, Louisiana, United States, pp. 120–127

  8. Lu Z, Kim W, Wilbur WJ (2009) Evaluation of query expansion using MeSH in PubMed. Inf Retr 12(1):69–80

  9. Macdonald C, Ounis I (2007) Using relevance feedback in expert search. Proceedings of the 29th European conference on IR research. Springer-Verlag, Rome, Italy, pp. 431–443

  10. Miyao Y, Tsujii J (2008) Feature forest models for probabilistic HPSG parsing. Comput Linguist 34(1):35–80

  11. Ponte JM, Croft WB (1998) A language modeling approach to information retrieval. Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval. ACM, Melbourne, Australia, pp. 275–281

  12. Srinivasan P (1996) Query expansion and MEDLINE. Inf Process Manag 32(4):431–443

  13. Turtle H, Croft WB (1991) Evaluation of an inference network-based retrieval model. ACM Trans Inf Syst 9(3):187–222

Author information

Corresponding author

Correspondence to Daesung Lee.

Additional information

This paper is a substantially extended version of the paper accepted and presented at SoftTech 2012 [4]. The extensions include additional experiments, analysis, and details about the proposed approaches.

About this article

Cite this article

Choi, SP., Shin, SH., Jung, H. et al. Finding hidden relevant documents buried in scientific documents by terminological paraphrases. Multimed Tools Appl 74, 8729–8743 (2015). https://doi.org/10.1007/s11042-013-1484-y

  • DOI: https://doi.org/10.1007/s11042-013-1484-y
