Semantic representation of Korean numeral classifier and its ontology building for HLT applications

Language Resources and Evaluation

Abstract

The complexity of Korean numeral classifiers demands semantic as well as computational approaches that employ natural language processing (NLP) techniques. The classifier is a universal linguistic device, having the two functions of quantifying and classifying nouns in noun phrase constructions. Many linguistic studies have focused on the fact that numeral classifiers afford decisive clues to categorizing nouns. However, few studies have dealt with the semantic categorization of classifiers and their semantic relations to the nouns they quantify and categorize in building ontologies. In this article, we propose the semantic recategorization of the Korean numeral classifiers in the context of classifier ontology based on large corpora and KorLex Noun 1.5 (Korean wordnet; Korean Lexical Semantic Network), considering its high applicability in the NLP domain. In particular, the classifier can be effectively used to predict the semantic characteristics of nouns and to process them appropriately in NLP. The major challenge is to make such semantic classification and the attendant NLP techniques efficient. Accordingly, a Korean numeral classifier ontology (KorLexClas 1.0), including semantic hierarchies and relations to nouns, was constructed.
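
As a rough illustration of how a classifier ontology can link classifiers to noun classes, the following Python sketch walks a toy hypernym chain to pick a classifier for a noun and, in the other direction, reads off the noun class that a classifier implies. The hierarchy, the mappings, and the function names (ancestors, choose_classifier, predicted_class) are assumptions made for this example; they are not taken from KorLexClas 1.0 or KorLex Noun 1.5.

```python
# Toy hypernym links (child -> parent). Hypothetical data, not KorLex content.
HYPERNYM = {
    "student": "person",
    "teacher": "person",
    "dog": "animal",
    "cat": "animal",
    "novel": "book",
    "bus": "vehicle",
    "apple": "fruit",
    "person": "entity",
    "animal": "entity",
    "fruit": "entity",
    "book": "artifact",
    "vehicle": "artifact",
    "artifact": "entity",
}

# Classifier -> noun class it selects (illustrative subset only).
CLASSIFIER_CLASS = {
    "myeong": "person",
    "mari": "animal",
    "gwon": "book",
    "dae": "vehicle",
}

# Generic classifier used as a fallback for nouns with no specialized match.
GENERIC = "gae"


def ancestors(noun):
    """Return the noun followed by its hypernyms, nearest first."""
    chain = [noun]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def choose_classifier(noun):
    """Pick the classifier attached to the nearest hypernym, else the generic one."""
    class_to_classifier = {cls: cl for cl, cls in CLASSIFIER_CLASS.items()}
    for node in ancestors(noun):
        if node in class_to_classifier:
            return class_to_classifier[node]
    return GENERIC


def predicted_class(classifier):
    """Return the noun class a classifier implies, or None if it is uninformative."""
    return CLASSIFIER_CLASS.get(classifier)
```

In this sketch, a specialized classifier narrows the semantic class of an unknown noun, whereas the generic classifier carries no such information, which is why predicted_class returns None for it.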

Notes

  1. In this article, romanization follows Notice 2000-8 of the Korean Ministry of Culture and Tourism (7 July 2000) and was produced with the converter developed by the Korean Language Processing Laboratory of Pusan National University.

  2. The meanings of the second and third types of classifiers are based on their use in Korean.

  3. The Standard Korean Dictionary is published by the National Institute of the Korean Language (1999).

  4. The list of high-frequency Korean words is based on "A Survey of Frequency in Use of Modern Korean Words", conducted by the National Institute of the Korean Language (2002).

  5. Currently, KorLex (Korean WordNet) 1.5 is composed of nouns, verbs, adjectives and adverbs, and KorLex Noun 1.5 contains 58,656 synsets and 41,368 word senses.

  6. The formalization and implementation of the ontological relations of Korean classifiers using OWL triples, comprising the referred nouns and various lexical information, are described in more detail in Jung et al. (2006).

  7. In most classifier languages, in addition to the semantically specialized classifiers used in referring to particular kinds of entities, there is purported to be a semantically ‘neutral’ classifier capable of being employed with reference to all sorts of entities. This semantically neutral classifier, like the generic classifier used in the present study, tends to be restricted to ‘nonpersonal’, or even ‘inanimate’, entities (Lyons 1977, vol. 2, p. 461). For example, ‘ge’ in Mandarin Chinese and ‘gae’ in Korean are used in this way.

  8. The numbers after the English words, such as ‘1’ in ‘melon1’ and ‘2’ in ‘watermelon2’, indicate sense IDs in the noun synsets of the Princeton WordNet (ver. 2.0).
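
The sense-tagged forms in note 8 and the triple-based representation mentioned in note 6 can be sketched together in Python. This is a minimal illustration, not the formalization of Jung et al. (2006): the function names (parse_sense_tag, classifier_triples) and the predicate label "quantifies" are hypothetical, and pairing ‘gae’ with these nouns is only an example.

```python
import re

# A lemma made of letters or underscores, followed by a numeric sense ID.
SENSE_TAG = re.compile(r"([A-Za-z_]+)(\d+)")


def parse_sense_tag(token):
    """Split a tagged form such as 'melon1' into (lemma, sense_id)."""
    match = SENSE_TAG.fullmatch(token)
    if match is None:
        raise ValueError(f"not a sense-tagged form: {token!r}")
    return match.group(1), int(match.group(2))


def classifier_triples(classifier, tagged_nouns, predicate="quantifies"):
    """Build (subject, predicate, object) triples for a classifier.

    The predicate name is a placeholder chosen for this sketch.
    """
    return [(classifier, predicate, parse_sense_tag(t)) for t in tagged_nouns]


triples = classifier_triples("gae", ["melon1", "watermelon2"])
```

Keeping the sense ID as an integer alongside the lemma lets each triple point at one specific synset rather than at an ambiguous word form.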

Abbreviations

ACC: Accusative
ADJ: Adjective
CL: Classifier
DEM: Demonstrative
GEN: Genitive
HLT: Human language technology
KCL-M: Korean numeral classifier module
LUB: Least upper bound
MT: Machine translation
NLP: Natural language processing
NOM: Nominative
NP: Noun phrase
OWL: Web Ontology Language
PAST: Past
PRST: Present
Q: Numeral quantifier
TOP: Topic
WSD: Word sense disambiguation

References

  • Allan, K. (1977). Classifiers. Language, 53(2), 285–311.

  • Allan, K. (2001). Natural language semantics. Oxford: Blackwell.

  • Alani, H., Kim, S., Millard, D. E., Weal, M. J., Hall, W., Lewis, P. H., & Shadbolt, N. R. (2003). Automatic ontology-based knowledge extraction from web documents. IEEE Intelligent Systems, 18(1), 14–21.

  • Biq, Y.-O., Tai, J., & Thompson, S. (1996). Recent developments in functional approaches to Chinese. In C.-T. J. Huang & Y.-H. A. Li (Eds.), New horizons in Chinese linguistics (pp. 97–140). Dordrecht: Kluwer Academic Publishers.

  • Bond, F., Ogura, K., & Ikehara, S. (1996). Classifiers in Japanese-to-English machine translation. Paper presented at the 16th International Conference on Computational Linguistics: COLING-1996, Copenhagen, pp. 125–130.

  • Bond, F., & Paik, K. (1997). Classifying correspondence in Japanese and Korean. Paper presented at the 3rd Pacific Association for Computational Linguistics Conference: PACLING-97, Tokyo, pp. 58–67.

  • Bond, F., & Paik, K. (2000). Reusing an ontology to generate numeral classifiers. Paper presented at the 18th Conference on Computational Linguistics: COLING-2000, Saarbrücken, pp. 90–96.

  • Buscaldi, D., Rosso, P., & Arnal, E. S. (2006). WordNet as a geographical information resource. Paper presented at the 3rd Global WordNet Conference, Jeju Island, pp. 37–42.

  • Chae, W. (1983). A study on numerals and numeral classifier constructions in Korean. Linguistics Study, 19(1), 19–34.

  • Croft, W. (1994). Semantic universals in classifier systems. Word, 45(2), 145–171.

  • Downing, P. (1993). Pragmatic and semantic constraints on numeral quantifier position in Japanese. Linguistics, 29, 65–93.

  • Fellbaum, C. (Ed.) (1998). WordNet—An electronic lexical database. Cambridge: MIT Press.

  • Garcia, R. V., Nieves, D. C., Breis, J. F., & Vicente, P. V. (2006). A methodology for extracting ontological knowledge from Spanish documents. Paper presented at the 7th Computational Linguistics and Intelligent Text Processing: CICLING 2006, Mexico-City, pp. 71–80.

  • Goddard, C. (1998). Semantic analysis: A practical introduction. Oxford: Oxford University Press.

  • Goddard, C., & Wierzbicka, A. (1994). Semantic and lexical universals. Amsterdam/Philadelphia: John Benjamins Publishing Company.

  • Gruber, T. R. (1993). A translation approach to portable ontologies. Knowledge Acquisition, 5(2), 199–220.

  • Guo, Ch. W. (2000). The comparison of Korean and Chinese classifiers. Korean Semantics, The Society of Korean Semantics, 7, 1–28.

  • Guo, H., & Zhong, H. (2005). Chinese classifier assignment using SVMs. Paper presented at the 4th SIGHAN Workshop on Chinese Language Processing, Jeju Island, pp. 25–31.

  • Hovy, E. H. (2005). Methodologies for the reliable construction of ontological knowledge. Paper presented at the International Conference on Computational Science: ICCS-2005, Atlanta, pp. 91–106.

  • Huang, C. R., & Ahrens, K. (2003). Individuals, kinds and events: Classifier coercion of nouns. Language Sciences, 25, 353–373.

  • Hwang, S. H., Jung, Y. I., Yoon, A. S., & Kwon, H. C. (2006). Building Korean classifier ontology based on Korean WordNet. Paper presented at the 9th International Conference on Text, Speech and Dialogue, Brno, pp. 261–268.

  • Hwang, S. H., Yoon, A. S., & Kwon, H. C. (2007). Semantic feature-based Korean classifier module for MT Systems. Paper presented at the 6th International Conference on Advanced Language Processing and Web Information Technology, Luoyang, pp. 146–154.

  • Jackendoff, R. (1983). Semantics and cognition. Cambridge: MIT Press.

  • Jung, Y. I., Hwang, S. H., Yoon, A. S., & Kwon, H. C. (2006). Formalization of ontological relations of Korean numeral classifiers. Paper presented at the 19th Australian Joint Conference on Artificial Intelligence, Hobart, pp. 1106–1110.

  • Kim, S. H. (2005). Korean classifiers and grammaticalization. Korean Linguistics, The Association for Korean Linguistics, 27, 107–123.

  • Lakoff, G. (1986). Classifiers as a reflection of mind. In C. Craig (Ed.), Noun classes and categorization (pp. 13–51). Amsterdam/Philadelphia: John Benjamins Publishing Company.

  • Lyons, J. (1977). Semantics (Vol. 2). Cambridge: Cambridge University Press.

  • Matsumoto, Y. (1993). Japanese numeral classifiers: A study of semantic categories and lexical organization. Linguistics, 31(4), 667–713.

  • Nam, J. S. (2006). Étude sur les noms de mesure en coréen pour construire une base de données franco-coréenne des expressions à quantificateur en vue de la traduction automatique. Revue d’études françaises (Association coréenne d’études françaises), 54, 1–28 (written in Korean).

  • Nichols, E., Bond, F., & Flickinger, D. (2005). Robust ontology acquisition from machine-readable dictionaries. Paper presented at the 19th International Joint Conference on Artificial Intelligence, Edinburgh, pp. 1111–1116.

  • Nida, E. A. (1975). Componential analysis of meaning. The Hague: Mouton.

  • Nirenburg, S., & Raskin, V. (2004). Ontological semantics. Cambridge: MIT Press.

  • Paik, K., & Bond, F. (2001). Multilingual generation of numeral classifiers using a common ontology. Paper presented at the 19th International Conference on Computer Processing of Oriental Languages: ICCPOL-2001, Taichung, pp. 141–147.

  • Paul, M., Sumita, E., & Yamamoto, S. (2002). Corpus-based generation of numeral classifier using phrase alignment. Paper presented at the 19th Conference on Computational Linguistics: COLING-2002, Taipei, pp. 779–785.

  • Philpot, A. G., Fleischman, M., & Hovy, E. H. (2003). Semi-automatic construction of a general purpose ontology. Paper presented at the International Lisp Conference, New York, pp. 1–8.

  • Sinopalnikova, A. (2004). Word association thesaurus as a resource for building WordNet. Paper presented at the 3rd Global WordNet Conference, Jeju Island, pp. 199–205.

  • Sornlertlamvanich, V., Pantachat, W., & Meknavin, S. (1994). Classifier assignment by corpus-based approach. Paper presented at the International Conference on Computational Linguistics: COLING-1994, Kyoto, pp. 152–159.

  • Sowa, J. F. (2000). Knowledge representation. Pacific Grove, CA: Brooks Cole Publishing Co.

  • Sundheim, B. M., Mardis, S., & Burger, J. (2006). Gazetteer linkage to WordNet. Paper presented at the 3rd Global WordNet Conference, Jeju Island, pp. 103–104.

Acknowledgements

This work was supported by a Korea Science and Engineering Foundation (KOSEF) grant funded by the Korean government (MOST) (No. R01-2007-000-20517-0). The authors thank the anonymous reviewers for their interest in our research and for their valuable comments, which helped us revise and improve this article.

Author information

Corresponding author

Correspondence to Hyuk-Chul Kwon.

About this article

Cite this article

Hwang, S., Yoon, A. & Kwon, HC. Semantic representation of Korean numeral classifier and its ontology building for HLT applications. Lang Resources & Evaluation 42, 151–172 (2008). https://doi.org/10.1007/s10579-007-9047-3

  • DOI: https://doi.org/10.1007/s10579-007-9047-3
