Abstract
In Vietnamese sentences, function words and word order patterns (WOPs) determine semantic meaning and grammatical word class. We study the most common WOPs and identify candidates for new Vietnamese words (NVWs) using the phrase and word segmentation algorithm of [7]. The best WOPs, which are used for recognizing and tagging NVWs, are selected according to their support and confidence. The same two measures are used to decide whether a word belongs to a given word class.
Our experiments were run on a large corpus of more than 50 million sentences. Four sets of WOPs are studied for recognizing and tagging nouns, verbs, adjectives, and pronouns. Our new dictionary contains 6,385 NVWs, comprising 2,791 new noun taggings, 1,436 new verb taggings, 682 new adjective taggings, and 1,476 new pronoun taggings.
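The abstract does not define support and confidence, so the following is only a minimal sketch that assumes the usual association-rule definitions: support is the share of adjacent word pairs in the corpus that match a pattern, and confidence is the share of those matches whose slot word already carries the target class. The pattern shape (one function word followed by one slot), the function name `pattern_stats`, the tag labels, and the toy data are all hypothetical and are not taken from the chapter.

```python
# Toy, pre-segmented sentences (hypothetical data, not from the chapter's corpus).
SENTENCES = [
    ["những", "sinh_viên", "đang", "học"],
    ["những", "ngôi_nhà", "rất", "đẹp"],
    ["tôi", "đang", "đọc", "sách"],
]

# Hypothetical seed lexicon mapping known words to their word classes.
LEXICON = {
    "sinh_viên": {"N"}, "ngôi_nhà": {"N"}, "sách": {"N"},
    "học": {"V"}, "đọc": {"V"}, "đẹp": {"A"},
}


def pattern_stats(sentences, left_word, word_class, lexicon):
    """Return (support, confidence) for the pattern `left_word X -> word_class`.

    Assumed definitions (standard association-rule style):
      support    = pattern matches / all adjacent word pairs in the corpus
      confidence = matches whose slot word has `word_class` / pattern matches
    Only slot words already present in the lexicon are counted as matches.
    """
    matches = 0
    hits = 0
    total_pairs = 0
    for tokens in sentences:
        for prev, cur in zip(tokens, tokens[1:]):
            total_pairs += 1
            if prev == left_word and cur in lexicon:
                matches += 1
                if word_class in lexicon[cur]:
                    hits += 1
    support = matches / total_pairs if total_pairs else 0.0
    confidence = hits / matches if matches else 0.0
    return support, confidence


if __name__ == "__main__":
    for left, cls in [("những", "N"), ("đang", "V"), ("đang", "N")]:
        print(left, cls, pattern_stats(SENTENCES, left, cls, LEXICON))
```

Under these assumptions, a pattern with high confidence for a class could then be used to propose that class for an unknown word found in the same slot; the thresholds and the exact pattern inventory would have to come from the chapter itself.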
References
Hao, C.X.: Vietnamese: Draft of Functional Grammar. Education Publisher, Hanoi (2004) (in Vietnamese)
Hao, C.X.: Vietnamese - Some Questions on Phonetics, Syntax and Semantics. Education Publisher, Hanoi (2000) (in Vietnamese)
Ban, D.Q.: Vietnamese Grammar. Education Publisher, Hanoi (2004) (in Vietnamese)
Chu, M.N., Nghieu, V.D., Phien, H.T.: Linguistics Foundation Vietnamese. Education Publisher, Hanoi (1997) (in Vietnamese)
Thuyet, N.M., Hiep, N.V.: Components of Vietnamese Sentence. Hanoi National University Publisher (1998) (in Vietnamese)
Halliday, M.A.K.: Introduction to functional grammar, 2nd edn. Edward Arnold, London (1994)
Le Trung, H., Le Anh, V., Le Trung, K.: An Unsupervised Learning and Statistical Approach for Vietnamese Word Recognition and Segmentation. In: Nguyen, N.T., Le, M.T., Świątek, J. (eds.) ACIIDS 2010, Part II. LNCS (LNAI), vol. 5991, pp. 195–204. Springer, Heidelberg (2010)
Chau, Q.N., Tuoi, T.P.: A Pattern-based Approach to Vietnamese Key Phrase Extraction. In: Addendum Contributions of the 5th International IEEE Conference on Computer Sciences - RIVF 2007, pp. 41–46 (2007)
Chau, Q.N., Tuoi, T.P.: A Hybrid Approach to Vietnamese Part-Of-Speech Tagging. In: Proceedings of the 9th International Oriental COCOSDA Conference (O-COCOSDA 2006), Malaysia, pp. 157–160 (2006)
Hai, L.M., Tuoi, P.T.: Vietnamese lexical functional grammar. In: The 1st International Conference on Knowledge and Systems Engineering, pp. 168–171 (2009)
Dien, D., Kiem, H., Toan, N.V.: Vietnamese Word Segmentation. In: The Sixth Natural Language Processing Pacific Rim Symposium, Tokyo, Japan, pp. 749–756 (2001)
Ha, L.A.: A method for word segmentation in Vietnamese. In: Proceedings of Corpus Linguistics 2003, Lancaster, UK (2003)
Phuong, L.H., Huyên, N.T.M., Roussanaly, A., Vinh, H.T.: A Hybrid Approach to Word Segmentation of Vietnamese Texts. In: Martín-Vide, C., Otto, F., Fernau, H. (eds.) LATA 2008. LNCS, vol. 5196, pp. 240–249. Springer, Heidelberg (2008)
Oanh, T.T., Cuong, L.A., Thuy, H.Q., Quynh, L.H.: An Experimental Study on Vietnamese POS Tagging. In: International Conference on Asian Language Processing, IALP 2009, pp. 23–27 (2009)
Nguyen, C.T., Nguyen, T.K., Phan, X.H., Nguyen, L.M., Ha, Q.T.: Vietnamese word segmentation with CRFs and SVMs: An investigation. In: Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation (PACLIC 2006), Wuhan, CH (2006)
Huyen, N.T.M., Luong, V.X., Phuong, L.H.: A case study of the probabilistic tagger QTAG for Tagging Vietnamese Texts. In: Proceedings of the First National Symposium on Research, Development and Application of Information and Communication Technology, Vietnam (2003)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Le Trung, H., Le Anh, V., Dang, VH., Vo Hoang, H. (2013). Recognizing and Tagging Vietnamese Words Based on Statistics and Word Order Patterns. In: Nguyen, N., Trawiński, B., Katarzyniak, R., Jo, GS. (eds) Advanced Methods for Computational Collective Intelligence. Studies in Computational Intelligence, vol 457. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34300-1_1
DOI: https://doi.org/10.1007/978-3-642-34300-1_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34299-8
Online ISBN: 978-3-642-34300-1
eBook Packages: Engineering, Engineering (R0)