Abstract
This study aims to automatically extract a set of simple modifier-head chunks from a large-scale corpus. Based on an analysis of how simple modifier-head chunks are distributed in usage, a set of formal chunk-extraction rules is formulated and a rule-based automatic extraction algorithm is designed. In a random-sampling experiment, the precision of the extraction results obtained with this method reaches 82.63%, which sheds light on knowledge extraction from large-scale corpora.
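As a rough illustration of what a rule-based extraction step and a sampled precision check might look like, the sketch below matches one surface pattern (disyllabic adjective + 的 (de) + disyllabic noun) over POS-tagged tokens. The tag labels ("a", "n", "u"), the function names, and the example sentence are all hypothetical and chosen only for illustration; the authors' actual rule set is not given here and is likely broader.

```python
from typing import List, Set, Tuple

# A token is (word, POS tag). The tag labels are assumptions for this sketch,
# not the tag set of any particular corpus.
Token = Tuple[str, str]


def extract_adj_de_noun(tokens: List[Token]) -> List[str]:
    """Collect spans matching: disyllabic adjective + 的 + disyllabic noun."""
    chunks = []
    for i in range(len(tokens) - 2):
        (w1, t1), (w2, _), (w3, t3) = tokens[i:i + 3]
        is_match = (
            t1 == "a" and len(w1) == 2      # disyllabic adjective
            and w2 == "的"                   # structural particle de
            and t3 == "n" and len(w3) == 2  # disyllabic noun
        )
        if is_match:
            chunks.append(w1 + w2 + w3)
    return chunks


def precision(extracted: List[str], judged_correct: Set[str]) -> float:
    """Share of extracted chunks that a human judge marked as correct."""
    if not extracted:
        return 0.0
    hits = sum(1 for chunk in extracted if chunk in judged_correct)
    return hits / len(extracted)


# Hypothetical tagged sentence: "this is (a) beautiful city".
sentence = [("这", "r"), ("是", "v"), ("美丽", "a"), ("的", "u"), ("城市", "n")]
found = extract_adj_de_noun(sentence)
print(found)
print(precision(found, {"美丽的城市"}))
```

In a sampled evaluation, `extracted` would be a random sample of the algorithm's output and `judged_correct` the subset that annotators accept, so the returned ratio corresponds to the kind of precision figure reported in the abstract.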
Notes
1. “Adjective + noun” modifier-head chunks in Chinese mainly consist of collocations of the form “disyllabic adjective + de + disyllabic noun”, for example a phrase glossed as magnificent + de + architecture (“a magnificent building”).
Acknowledgement
This study was supported by the National Natural Social Foundation of China (16AYY007), the Beijing Advanced Innovation Center for Language Resources (TYR17001), the Graduate Innovation Foundation in 2019 (19YCX117), and the funds of Major Projects of Key Research Bases of Humanities and Social Sciences of the Ministry of Education (16JJD740004).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Chengwen, W., Zheng, Z., Gaoqi, R., Endong, X., Jingjing, M. (2020). Research on Extraction of Simple Modifier-Head Chunks Based on Corpus. In: Hong, JF., Zhang, Y., Liu, P. (eds) Chinese Lexical Semantics. CLSW 2019. Lecture Notes in Computer Science, vol 11831. Springer, Cham. https://doi.org/10.1007/978-3-030-38189-9_37
DOI: https://doi.org/10.1007/978-3-030-38189-9_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-38188-2
Online ISBN: 978-3-030-38189-9
eBook Packages: Computer Science, Computer Science (R0)