Multi-words Terminology Recognition Using Web Search

Song, Sa-Kwang; Choi, Yun-Soo; Chun, Hong-Woo; Jeong, Chang-Hoo; Choi, Sung-Pil; Sung, Won-Kyung

doi:10.1007/978-3-642-27210-3_29

Part of the book series: Communications in Computer and Information Science (CCIS, volume 264)

Included in the following conference series:

International Conference on U- and E-Service, Science and Technology

Abstract

Terminology recognition, a fundamental research topic for Technology Opportunity Discovery (TOD), has been studied intensively in only a limited range of domains, especially the biomedical domain. Previous approaches are difficult to apply to general domains because the resources they rely on are domain specific. We therefore propose a domain-independent terminology recognition system based on machine learning that uses a dictionary, syntactic features, and Web search results. The proposed approach achieved an F-score of 80.4, a 6.4% improvement over C-value, a widely used related approach based on local domain frequencies. In a second experiment with various combinations of unithood features, the method combined with NGD (Normalized Google Distance) showed the best performance, with an F-score of 81.5. Of the two machine learning methods we applied, logistic regression and SVMs, the SVMs gave the best score.
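For readers unfamiliar with NGD, the following is a minimal sketch of the Normalized Google Distance as defined by Cilibrasi and Vitanyi, computed from search hit counts. The abstract does not say how the authors obtained hit counts or turned NGD into a unithood feature, so the function name `ngd`, its parameters, and the numbers in the usage lines are illustrative assumptions rather than the paper's implementation.

```python
import math


def ngd(fx: int, fy: int, fxy: int, n: int) -> float:
    """Normalized Google Distance from hit counts (hypothetical helper).

    fx, fy -- number of pages containing term x, term y
    fxy    -- number of pages containing both x and y
    n      -- assumed total number of indexed pages
    """
    # Terms that never occur, or never co-occur, are treated as infinitely far apart.
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    denom = math.log(n) - min(log_fx, log_fy)
    if denom <= 0:
        raise ValueError("n must exceed the smaller of the two hit counts")
    return (max(log_fx, log_fy) - log_fxy) / denom


# Illustrative hit counts only; a real system would take them from a search engine.
always_together = ngd(1000, 1000, 1000, 10**6)   # words that always co-occur
rarely_together = ngd(1000, 100, 10, 10**6)      # words that seldom co-occur
```

A smaller distance means the two words co-occur more often than their individual frequencies would suggest, which is why such a score could plausibly serve as evidence that a word sequence forms a single unit.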

References

Yoon, B.: On the development of a technology intelligence tool for identifying technology opportunity. Expert Systems with Applications 35, 124–135 (2008)
Church, K., Hanks, P.: Word association norms, mutual information, and lexicography. Computational Linguistics 16(1), 22–29 (1990)
Cortes, C., Vapnik, V.: Support-Vector Networks. Machine Learning 20(3), 273–297 (1995)
Dunning, T.: Accurate methods for the statistics of surprise and coincidence. Computational Linguistics 19(1), 61–74 (1993)
Smadja, F., McKeown, K.R., Hatzivassiloglou, V.: Translating collocations for bilingual lexicons: A statistical approach. Computational Linguistics 22(1), 1–38 (1996)
Wermter, J., Hahn, U.: Paradigmatic Modifiability Statistics for the Extraction of Complex Multi-Word Terms. In: HLT 2005 Proceedings of the Conference on Human Language Technology and Empirical Methods in NLP (2005)
Hilbe, J.M.: Logistic Regression Models. Chapman & Hall/CRC Press (2009)
Justeson, J.S., Katz, S.M.: Technical terminology: some linguistic properties and an algorithm for identification in text. Natural Language Engineering 1(1), 9–27 (1995)
Frantzi, K., Ananiadou, S., Mima, H.: Automatic recognition of multi-word terms: the C-value/NC-value method. International Journal on Digital Libraries 3(2), 115–130 (2000)
Nakagawa, H., Mori, T.: Automatic term recognition based on statistics of compound nouns and their components. Terminology 9(2), 201–219 (2003)
Cilibrasi, R., Vitanyi, P.: The Google Similarity Distance. IEEE Trans. Knowledge and Data Engineering 19(3), 370–383 (2007)
Zeng, Q.T., Tse, T., et al.: Term identification methods for consumer health vocabulary development. Journal of Medical Internet Research 9(1) (2007)
Tseng, Y., Lin, C., Lin, Y.: Text mining techniques for patent analysis. Information Processing and Management 43(5), 1216–1247 (2007)

Author information

Authors and Affiliations

Korea Institute of Science and Technology Information, Daejeon, Korea
Sa-Kwang Song, Yun-Soo Choi, Hong-Woo Chun, Chang-Hoo Jeong, Sung-Pil Choi & Won-Kyung Sung

Editor information

Editors and Affiliations

Multimedia Engineering Department, Hannam University, 133 Ojeong-dong, Daeduk-gu, Daejeon, Korea
Tai-hoon Kim
The Ohio State University, 470 Hitchcock Hall, 2070 Neil Avenue, 43210-1275, Columbus, OH, USA
Hojjat Adeli
Hosei University, 184-8584, Tokyo, Japan
Jianhua Ma
National Chiao Tung University, Hsinchu, Taiwan, R.O.C.
Wai-chi Fang
School of Computing and Information Systems, University of Tasmania, Hobart, TAS, Australia
Byeong-Ho Kang
Hannam University, Daejeon, Korea
Byungjoo Park
Oslo University College, Norway
Frode Eika Sandnes
Sungkyunkwan University, 110-745, Seoul, Republic of Korea
Kun Chang Lee

About this paper

Cite this paper

Song, SK., Choi, YS., Chun, HW., Jeong, CH., Choi, SP., Sung, WK. (2011). Multi-words Terminology Recognition Using Web Search. In: Kim, Th., et al. U- and E-Service, Science and Technology. UNESST 2011. Communications in Computer and Information Science, vol 264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27210-3_29

DOI: https://doi.org/10.1007/978-3-642-27210-3_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27209-7
Online ISBN: 978-3-642-27210-3
eBook Packages: Computer Science, Computer Science (R0)
