Abstract
This paper introduces CoreNet, a Korean-Japanese-Chinese aligned wordnet. For the purpose of this paper, the term “wordnet” refers to a network of words. CoreNet is built on a shared semantic hierarchy that originates from NTT Goidaikei (Lexical Hierarchical System). The Korean wordnet was constructed by assigning a semantic category to every meaning of each Korean word in a dictionary. Word senses of verbs and adjectives are assigned to the same semantic hierarchy as those of nouns. Each verb sense was examined in corpora for its usage and compared with its Japanese translation. The Chinese wordnet, which uses the same semantic hierarchy, was built by comparison with the Korean wordnet: each sense of a Chinese verb is matched to its Korean counterpart along with its argument structure.
Using the same semantic hierarchy for nouns and predicates has several advantages. First, nouns and predicates often share similar surface forms, especially in Chinese. In Korean and Japanese, predicates are typically formed as “Noun+hada” and “Noun+suru” respectively, comparable to “do+Noun” in English. Second, language generation from conceptual structures is free to realize a concept as either a noun phrase or a verb phrase.
CoreNet has been constructed according to the following principles: mapping word senses to concepts, a corpus-based approach, multilingualism, and a single concept system for multiple languages. The overall construction flow consists of dictionary-based bootstrapping, incremental similarity-based classification, and manual post-editing. Three points of consideration are introduced: multiple concept mapping, verbal nouns, and concept splitting. In multiple concept mapping, a word is mapped to several concepts that together cover its respective meanings. For example, school, an “institution for the instruction of students,” is mapped to three concepts: location, organization, and facility. For verbal nouns, a verb is assigned to concepts after it has been transformed into a noun. For example, “write” is transformed into its noun form “writing,” which is mapped to the concept writing under event. In concept splitting, a node may be added whenever an inconsistency among concept nodes is discovered. CoreNet differs from NTT Goidaikei in that it maps word senses, not just words, to concepts. This work has been ongoing since 1994.
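The sense-to-concept mapping described in the abstract can be illustrated with a minimal Python sketch. It is not CoreNet's actual data model: the class names `WordSense` and `SharedHierarchy`, the `VERBAL_NOUNS` lookup table, the `root` node, and the gloss given for “writing” are all assumptions made for illustration. Only the two examples (school mapped to location, organization, and facility; “write” mapped via “writing” to a concept under event) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordSense:
    """One sense of a word (hypothetical structure, not CoreNet's own)."""
    lemma: str  # surface word, e.g. "school"
    gloss: str  # definition of this particular sense


class SharedHierarchy:
    """Toy concept tree shared by nouns and predicates."""

    def __init__(self):
        self.parent = {}   # concept name -> parent concept name (None for the root)
        self.mapping = {}  # WordSense -> set of concept names

    def add_concept(self, name, parent=None):
        if parent is not None and parent not in self.parent:
            raise KeyError(f"unknown parent concept: {parent}")
        self.parent[name] = parent

    def map_sense(self, sense, concepts):
        """Multiple concept mapping: one sense may point to several concepts."""
        for concept in concepts:
            if concept not in self.parent:
                raise KeyError(f"unknown concept: {concept}")
        self.mapping.setdefault(sense, set()).update(concepts)

    def concepts_of(self, lemma):
        """Collect the concepts of every sense that has this lemma."""
        found = set()
        for sense, concepts in self.mapping.items():
            if sense.lemma == lemma:
                found |= concepts
        return found

    def path_to_root(self, concept):
        """Return the concept followed by its ancestors up to the root."""
        path = []
        while concept is not None:
            path.append(concept)
            concept = self.parent[concept]
        return path


# Assumed lookup table from a verb to its verbal noun.
VERBAL_NOUNS = {"write": "writing"}


def map_verb(hierarchy, verb, gloss, concepts):
    """Verbal noun handling: nominalize the verb, then map the noun sense."""
    sense = WordSense(VERBAL_NOUNS[verb], gloss)
    hierarchy.map_sense(sense, concepts)
    return sense


hierarchy = SharedHierarchy()
hierarchy.add_concept("root")  # assumed top node
for name in ("location", "organization", "facility", "event"):
    hierarchy.add_concept(name, "root")
hierarchy.add_concept("writing", "event")

school = WordSense("school", "institution for the instruction of students")
hierarchy.map_sense(school, ["location", "organization", "facility"])

# The gloss below is illustrative only.
writing = map_verb(hierarchy, "write", "the act of writing", ["writing"])

print(sorted(hierarchy.concepts_of("school")))
print(hierarchy.path_to_root("writing"))
```

Keying the mapping on word senses rather than on bare lemmas mirrors the stated difference from NTT Goidaikei: two senses of the same word can be attached to different parts of the hierarchy without interfering with each other.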
This work has been supported by the Ministry of Science and Technology in Korea. The results of this work are enhanced and distributed through the Bank of Language Resources, supported by grant No. R21-2003-000-10042-0 from the Korea Science & Technology Foundation.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Choi, KS., Bae, HS. (2003). A Korean-Japanese-Chinese Aligned Wordnet with Shared Semantic Hierarchy. In: Sembok, T.M.T., Zaman, H.B., Chen, H., Urs, S.R., Myaeng, SH. (eds) Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access. ICADL 2003. Lecture Notes in Computer Science, vol 2911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24594-0_79
DOI: https://doi.org/10.1007/978-3-540-24594-0_79
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20608-8
Online ISBN: 978-3-540-24594-0
eBook Packages: Springer Book Archive