Finding hidden relevant documents buried in scientific documents by terminological paraphrases

Choi, Sung-Pil; Shin, Sung-Ho; Jung, Hanmin; Lee, Daesung

doi:10.1007/s11042-013-1484-y

Published: 05 May 2013

Volume 74, pages 8729–8743 (2015)

Multimedia Tools and Applications

Sung-Pil Choi^1,2, Sung-Ho Shin^1,2, Hanmin Jung^1 & Daesung Lee^2


Abstract

Technical terms serve as effective queries for many users who search scientific databases. However, authors of scientific literature often use alternative expressions, that is, Terminological Paraphrases (TPs), to convey the meanings of specific terms. As a result, some relevant documents are not captured by the conventional terms. In this paper, we propose an effective way to retrieve “de facto relevant documents”, which contain only those TPs and cannot be found by conventional models in an environment with only controlled vocabularies, by adapting the Predicate Argument Tuple (PAT). The experiment confirms that PAT-based document retrieval is an effective and promising method for discovering such documents and for improving the recall of terminology-based scientific information access models.
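The idea in the abstract can be illustrated with a minimal sketch. It shows a document that a literal term lookup misses but that shares a predicate argument tuple with the term. Everything in it is an assumption made for illustration: the `PAT` type, the `keyword_match` and `pat_overlap` helpers, the overlap score, and the hand-written tuples are hypothetical, and no parser is called. It is not the matching model evaluated in the paper.

```python
from __future__ import annotations

from typing import NamedTuple


class PAT(NamedTuple):
    """A predicate argument tuple: (predicate, first argument, second argument)."""
    predicate: str
    arg1: str
    arg2: str


def keyword_match(term: str, text: str) -> bool:
    """Conventional lookup: is the literal term present in the text?"""
    return term.lower() in text.lower()


def pat_overlap(term_pats: set[PAT], doc_pats: set[PAT]) -> float:
    """Fraction of the term's tuples that also occur in the document.

    This overlap score is an illustrative assumption, not the paper's model.
    """
    if not term_pats:
        return 0.0
    return len(term_pats & doc_pats) / len(term_pats)


# Hypothetical data, written by hand to stand in for parser output.
TERM = "query expansion"
TERM_PATS = {PAT("add", "term", "query")}

DOC_TEXT = "We add related terms to the original query to improve recall."
DOC_PATS = {PAT("add", "term", "query"), PAT("improve", "we", "recall")}

literal_hit = keyword_match(TERM, DOC_TEXT)   # the term itself never appears
pat_score = pat_overlap(TERM_PATS, DOC_PATS)  # the paraphrase shares a tuple
print(literal_hit, pat_score)
```

In this toy setting the document paraphrases the term without naming it, so only the tuple comparison surfaces it. This is the kind of recall gain the abstract describes.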


Notes

Google Scholar (http://scholar.google.com/), PubMed (http://www.ncbi.nlm.nih.gov/pubmed), Microsoft Academic Search (http://academic.research.microsoft.com/)
The controlled vocabulary of PubMed (http://www.ncbi.nlm.nih.gov/pubmed) is MeSH (Medical Subject Headings). ACM (http://portal.acm.org) maintains CCS (Computing Classification System), and LC (http://catalog.loc.gov/) controls access to its content through LCSH (Library of Congress Subject Headings).
http://www-tsujii.is.s.u-tokyo.ac.jp/enju/
http://www.ndsl.kr/index.do

References

1. Abdou S, Ruck P, Savoy J (2005) Evaluation of stemming, query expansion and manual indexing approaches for the genomic task. In: The Fourteenth Text REtrieval Conference Proceedings (TREC 2005), vol. 501, pp. 863–871
2. Aronson AR (1996) The effect of textual variation on concept based information retrieval. Proceedings: a conference of the American Medical Informatics Association, pp. 373–377
3. Bacchin M, Melucci M (2005) Symbol-based query expansion experiments at TREC 2005 genomics track
4. Choi S-P, Song S, Jung H, Geierhos M, Myaeng S-H (2012) Scientific literature retrieval based on terminological paraphrases using predicate argument tuple. In: SoftTech 2012
5. Cohen J (1968) Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull 70(4):687–699
6. InfoTerm (2010) Terminology standardization. http://www.infoterm.info/standardization/index.php
7. Lavrenko V, Croft WB (2001) Relevance based language models. Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, New Orleans, Louisiana, United States, pp. 120–127
8. Lu Z, Kim W, Wilbur WJ (2009) Evaluation of query expansion using MeSH in PubMed. Inf Retr 12(1):69–80
9. Macdonald C, Ounis I (2007) Using relevance feedback in expert search. Proceedings of the 29th European conference on IR research. Springer-Verlag, Rome, Italy, pp. 431–443
10. Miyao Y, Tsujii J (2008) Feature forest models for probabilistic HPSG parsing. Comput Linguist 34(1):35–80
11. Ponte JM, Croft WB (1998) A language modeling approach to information retrieval. Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval. ACM, Melbourne, Australia, pp. 275–281
12. Srinivasan P (1996) Query expansion and MEDLINE. Inf Process Manag 32(4):431–443
13. Turtle H, Croft WB (1991) Evaluation of an inference network-based retrieval model. ACM Trans Inf Syst 9(3):187–222


Author information

Authors and Affiliations

Korea Institute of Science and Technology Information (KISTI), 335 Gwahangno, Yuseong-gu, Daejeon, South Korea, 305-806
Sung-Pil Choi, Sung-Ho Shin & Hanmin Jung
School of Applied Science, Department of Computer Engineering, Catholic University of Pusan, #9, Bugok 3-dong, Geumjeong-gu, Pusan, South Korea, 609-757
Sung-Pil Choi, Sung-Ho Shin & Daesung Lee


Corresponding author

Correspondence to Daesung Lee.

Additional information

This paper is a substantially extended version of the paper accepted and presented at SoftTech 2012 [4]. The extensions include additional experiments, analysis, and details about the proposed approaches.


About this article

Cite this article

Choi, SP., Shin, SH., Jung, H. et al. Finding hidden relevant documents buried in scientific documents by terminological paraphrases. Multimed Tools Appl 74, 8729–8743 (2015). https://doi.org/10.1007/s11042-013-1484-y


Published: 05 May 2013
Issue Date: October 2015
DOI: https://doi.org/10.1007/s11042-013-1484-y


