Abstract
Academic Phrasebank is an important resource for academic writers, composed of neutral and generic phrases. In this paper, we call these neutral and generic phrases reusable phrases; student writers use them to organize their research articles. Because of its limited size, Academic Phrasebank cannot meet all academic writing needs, and many reusable phrases remain in authentic research articles. To compensate for this limitation, we propose a reusable phrase extraction model based on constituency parsing and dependency parsing that automatically extracts reusable phrases from unlabelled research articles. The model has three main components: a reusable words corpus module, a sentence simplification module, and a syntactic parsing module. We created a reusable words corpus of 2129 words to help judge whether a word is neutral and generic, and built two datasets under two scenarios to verify the feasibility of the proposed model.
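To make the role of the reusable words corpus concrete, the following is a toy sketch only, not the model described above. It skips sentence simplification and syntactic parsing entirely and keeps contiguous runs of words found in a small word set. The word set `REUSABLE_WORDS`, the function `reusable_spans`, and the `min_len` parameter are all hypothetical names introduced for illustration; the actual corpus contains 2129 words and its contents are not given here.

```python
import re

# Hypothetical stand-in for the reusable words corpus (assumption:
# the real corpus has 2129 words; these few are illustrative only).
REUSABLE_WORDS = {
    "in", "this", "paper", "we", "propose", "a", "an", "the",
    "based", "on", "to", "of", "results", "show", "that",
}


def reusable_spans(sentence, vocab=REUSABLE_WORDS, min_len=2):
    """Return maximal runs of consecutive vocabulary words.

    Punctuation is dropped during tokenization, so a run may continue
    across a comma. Runs shorter than `min_len` words are discarded.
    """
    tokens = re.findall(r"[A-Za-z]+", sentence.lower())
    spans, current = [], []
    for tok in tokens:
        if tok in vocab:
            current.append(tok)
        else:
            # A topic-specific word ends the current run.
            if len(current) >= min_len:
                spans.append(" ".join(current))
            current = []
    if len(current) >= min_len:
        spans.append(" ".join(current))
    return spans


if __name__ == "__main__":
    example = "In this paper, we propose a graph model based on attention."
    print(reusable_spans(example))
    # prints ['in this paper we propose a', 'based on']
```

The output shows why a word list alone is not enough: the first span ends on a dangling article, which is the kind of boundary problem that a parsing-based approach is better placed to handle.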
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Duan, X., Zan, H., Bai, X., Zahner, C. (2020). Reusable Phrase Extraction Based on Syntactic Parsing. In: Sun, M., Li, S., Zhang, Y., Liu, Y., He, S., Rao, G. (eds) Chinese Computational Linguistics. CCL 2020. Lecture Notes in Computer Science, vol 12522. Springer, Cham. https://doi.org/10.1007/978-3-030-63031-7_33
DOI: https://doi.org/10.1007/978-3-030-63031-7_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63030-0
Online ISBN: 978-3-030-63031-7
eBook Packages: Computer Science, Computer Science (R0)