Abstract
In this paper, we report a qualitative and quantitative evaluation of a hand-crafted set of discourse features and their interaction with different text types. Specifically, we compare two distinct text types (scientific abstracts and their accompanying full texts) in terms of linguistic properties that include, among others, sentence length, coreference information, noun density, self-mentions, noun phrase count, and noun phrase complexity. Our findings suggest that abstracts and full texts differ in three mechanisms, each bound to the size and purpose of the text. In abstracts, nouns tend to be more densely distributed: because abstracts are compact, the distance between noun occurrences is smaller. Abstracts also show a higher frequency of the personal and possessive pronouns that authors use to refer to themselves. Full texts, in contrast, show a higher frequency of noun phrases. These findings are a first attempt to identify linguistic features motivated by text type that can help draw clearer text type boundaries. Such features could serve as parameters in writing evaluation systems that assist both tutors and students in text analysis, or as guides in linguistically controllable neural text generation systems.
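To make three of the named properties concrete, the following is a minimal sketch of how self-mention frequency, sentence length, and the distance between noun occurrences might be computed. It is not the procedure used in the paper: the pronoun inventory, the regex tokenizer, the naive sentence splitter, and the function names are all assumptions made for illustration, and the noun measure expects tokens that have already been tagged by some part-of-speech tagger.

```python
import re

# Assumed inventory of self-mentions: first-person personal and
# possessive pronouns. The paper's actual list may differ.
SELF_MENTIONS = {"i", "me", "my", "mine", "we", "us", "our", "ours"}


def tokenize(text):
    """Lowercase word tokens; a crude regex stand-in for a real tokenizer."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def self_mention_rate(text):
    """Self-mentions per 100 tokens (hypothetical normalisation)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in SELF_MENTIONS)
    return 100.0 * hits / len(tokens)


def mean_sentence_length(text):
    """Mean tokens per sentence, splitting naively on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


def mean_noun_gap(tagged_tokens):
    """Mean distance in tokens between consecutive nouns.

    Expects (token, tag) pairs; tags starting with "N" are treated as
    nouns (an assumption matching Penn Treebank style tags).
    Returns None when fewer than two nouns are present.
    """
    positions = [i for i, (_, tag) in enumerate(tagged_tokens)
                 if tag.startswith("N")]
    if len(positions) < 2:
        return None
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return sum(gaps) / len(gaps)
```

Under these assumptions, a smaller `mean_noun_gap` corresponds to denser noun distribution, and a higher `self_mention_rate` to more frequent authorial self-reference. Comparing the values for an abstract against those for its full text would mirror the kind of contrast the abstract describes.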
Acknowledgements
I would like to thank Dr. Niko Schenk for his constant support, inspiration, and supervision of the extracted and analyzed data. I also extend my sincere gratitude to Prof. Gert Webelhuth, Dr. Janina Radó, and Prof. Manfred Sailer for their guidance and assistance in the interpretation of the research results.
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ivanova, I. (2024). Assessing the Effect of Text Type on the Choice of Linguistic Mechanisms in Scientific Publications. In: Pavlova, A., Pedersen, M.Y., Bernardi, R. (eds) Selected Reflections in Language, Logic, and Information. ESSLLI 2019. Lecture Notes in Computer Science, vol 14354. Springer, Cham. https://doi.org/10.1007/978-3-031-50628-4_9
DOI: https://doi.org/10.1007/978-3-031-50628-4_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50627-7
Online ISBN: 978-3-031-50628-4
eBook Packages: Computer Science, Computer Science (R0)