A Sentence-Wide Collocation Recommendation System with Error Detection for Academic Writing

  • Conference paper
Innovative Technologies and Learning (ICITL 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11003)

Abstract

Collocation plays an important role in English article writing. This research builds a collocation corpus for academic writing in the engineering and science fields and, based on that corpus, establishes a sentence-wide collocation recommendation and error detection system. The corpus is built from Science Citation Index (SCI) papers and industry-field theses, which are collected and processed by a formal procedure developed in this research. The first step of the procedure uses the Stanford Parser to parse those papers and theses and retrieve collocations sentence by sentence. The second step classifies the collected collocations into different types and gathers their information to establish a collocation corpus specifically for academic writing. The corpus is accessed through a web-based collocation system built in this study. Unlike other collocation systems currently available on the web, the system can detect collocation errors and make recommendations across a full sentence. Several experiments show that the system is capable of giving satisfactory feedback and recommendations to authors of scientific articles. Although the collocation corpus is not yet complete enough to give the most precise results, the formal procedure can keep enhancing the corpus and improving the system by automatically collecting articles from various fields.
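
The two-step procedure and the sentence-wide check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the parsing step has already produced (relation, head, dependent) triples in the style of Stanford dependencies, and the names `RELATION_TYPES`, `build_corpus`, and `check_sentence`, the collocation type labels, and the rule "flag a collocation that never occurs in the corpus" are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict

# Assumed mapping from dependency relations to collocation types.
# The relation labels follow Stanford-style dependencies; the type
# names are illustrative and not taken from the paper.
RELATION_TYPES = {"dobj": "verb-noun", "amod": "adjective-noun"}


def build_corpus(parsed_sentences):
    """Count (type, head, dependent) collocations over pre-parsed sentences."""
    counts = Counter()
    for triples in parsed_sentences:
        for relation, head, dependent in triples:
            ctype = RELATION_TYPES.get(relation)
            if ctype is not None:
                counts[(ctype, head.lower(), dependent.lower())] += 1
    return counts


def check_sentence(triples, counts, top_n=3):
    """Flag every collocation in one sentence that is unseen in the corpus.

    For each flagged pair, suggest the most frequent heads that the
    corpus attests with the same dependent and collocation type.
    """
    heads_by_dependent = defaultdict(Counter)
    for (ctype, head, dependent), n in counts.items():
        heads_by_dependent[(ctype, dependent)][head] += n

    report = []
    for relation, head, dependent in triples:
        ctype = RELATION_TYPES.get(relation)
        if ctype is None:
            continue
        if counts[(ctype, head.lower(), dependent.lower())] == 0:
            ranked = heads_by_dependent[(ctype, dependent.lower())].most_common(top_n)
            report.append((head, dependent, [h for h, _ in ranked]))
    return report


# Toy data standing in for parser output; not drawn from the real corpus.
corpus_sentences = [
    [("dobj", "conduct", "experiment"), ("amod", "experiment", "preliminary")],
    [("dobj", "conduct", "experiment")],
    [("dobj", "perform", "experiment")],
]
counts = build_corpus(corpus_sentences)
report = check_sentence([("dobj", "make", "experiment")], counts)
print(report)
```

On this toy data, "make experiment" is flagged because the pair never occurs, and the heads attested with "experiment" are offered as alternatives. A real system would presumably rank candidates with an association measure rather than raw counts, but the abstract does not specify one.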

Author information

Correspondence to Tzone-I Wang.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Chu, YL., Wang, TI. (2018). A Sentence-Wide Collocation Recommendation System with Error Detection for Academic Writing. In: Wu, TT., Huang, YM., Shadiev, R., Lin, L., Starčič, A. (eds) Innovative Technologies and Learning. ICITL 2018. Lecture Notes in Computer Science, vol 11003. Springer, Cham. https://doi.org/10.1007/978-3-319-99737-7_33

  • DOI: https://doi.org/10.1007/978-3-319-99737-7_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99736-0

  • Online ISBN: 978-3-319-99737-7

  • eBook Packages: Computer Science, Computer Science (R0)
