
Assessing the Effect of Text Type on the Choice of Linguistic Mechanisms in Scientific Publications

  • Conference paper
Selected Reflections in Language, Logic, and Information (ESSLLI 2019)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14354)


Abstract

In this paper, we report a qualitative and quantitative evaluation of a hand-crafted set of discourse features and their interaction with different text types. Specifically, we compared two distinct text types, scientific abstracts and their accompanying full texts, in terms of linguistic properties including, among others, sentence length, coreference information, noun density, self-mentions, noun phrase count, and noun phrase complexity. Our findings suggest that abstracts and full texts differ in three mechanisms bound to their size and purpose. In abstracts, nouns are more densely distributed: because abstracts are compact, the distance between successive noun occurrences is smaller. Abstracts also show a higher frequency of the personal and possessive pronouns authors use to refer to themselves, whereas full texts show a higher frequency of noun phrases. These findings are a first attempt to identify text-type-motivated linguistic features that can help draw clearer text type boundaries. Such features could serve as parameters in writing-evaluation systems that assist both tutors and students in text analysis, or as guides in linguistically controllable neural text generation systems.
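To illustrate how two of the surface features compared in the paper (average sentence length and self-mention frequency) might be computed, here is a minimal sketch. The tokenizer, sentence splitter, and pronoun list below are simplifying assumptions for illustration only; the paper's actual pipeline relies on Stanford CoreNLP for annotation.

```python
import re
from statistics import mean

# Personal/possessive pronouns treated as self-mentions (an illustrative
# assumption; the paper's exact inventory may differ).
SELF_MENTIONS = {"i", "we", "me", "us", "my", "our", "mine", "ours"}

def tokenize(text):
    """Lowercased word tokens (a crude stand-in for a real tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def sentences(text):
    """Naive sentence split on terminal punctuation."""
    return [s for s in re.split(r"[.!?]+\s+", text.strip()) if s]

def avg_sentence_length(text):
    """Mean number of word tokens per sentence."""
    sents = sentences(text)
    return mean(len(tokenize(s)) for s in sents) if sents else 0.0

def self_mention_rate(text):
    """Self-mentions per 1,000 word tokens."""
    toks = tokenize(text)
    if not toks:
        return 0.0
    hits = sum(1 for t in toks if t in SELF_MENTIONS)
    return 1000 * hits / len(toks)

# Toy stand-ins for an abstract and a full-text passage.
abstract = "We propose a new model. Our results improve accuracy."
full_text = ("The model is trained on parsed sentences. "
             "Evaluation follows standard practice.")

print(avg_sentence_length(abstract))                             # 4.5
print(self_mention_rate(abstract) > self_mention_rate(full_text))  # True
```

With such per-document feature values in hand, the two text types can be compared distributionally, e.g. with a paired significance test over abstract/full-text pairs.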


Notes

  1. https://openai.com/blog/better-language-models/
  2. https://transformer.huggingface.co/ (accessed 4 July 2020)
  3. https://aclanthology.org/
  4. https://stanfordnlp.github.io/CoreNLP/index.html



Acknowledgements

I would like to thank Dr. Niko Schenk for his constant support, inspiration, and supervision of the data extraction and analysis. I am also sincerely grateful to Prof. Gert Webelhuth, Dr. Janina Radó, and Prof. Manfred Sailer for their guidance and assistance in interpreting the research results.

Author information


Corresponding author

Correspondence to Iverina Ivanova.


Electronic supplementary material

Supplementary material 1 (zip, 231 KB)


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Ivanova, I. (2024). Assessing the Effect of Text Type on the Choice of Linguistic Mechanisms in Scientific Publications. In: Pavlova, A., Pedersen, M.Y., Bernardi, R. (eds) Selected Reflections in Language, Logic, and Information. ESSLLI 2019. Lecture Notes in Computer Science, vol 14354. Springer, Cham. https://doi.org/10.1007/978-3-031-50628-4_9


  • DOI: https://doi.org/10.1007/978-3-031-50628-4_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-50627-7

  • Online ISBN: 978-3-031-50628-4

  • eBook Packages: Computer Science (R0)
