
Automatic Identification of Cause-Effect Relations in Tamil Using CRFs

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6608)

Abstract

We present our work on the automatic identification of cause-effect relations in Tamil text. From an analysis of causal constructions in Tamil, we identified a set of causal markers and derived the features used to build our model. We manually annotated a Tamil corpus of 8648 sentences for cause-effect relations and used it to train a model that identifies causal relations with Conditional Random Fields (CRFs), a machine learning technique. The experimental results are encouraging. An error analysis showed that the errors can be attributed to structural interdependencies between closely occurring causal relations. After comparing these structures in Tamil and English, we claim that, at the discourse level, the structural interdependencies between causal relations are more complex in Tamil than in English because of Tamil's free word order.
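
As an illustration only, the sketch below shows one common way that span annotations for cause, marker, and effect can be turned into per-token labels for a sequence labeller such as a CRF. The BIO tagging scheme, the label names (`CAUSE`, `MARKER`, `EFFECT`), the function names, and the English gloss tokens are all assumptions made for this example; the abstract does not state which encoding or features the authors used.

```python
from typing import List, Tuple

# (start, end_exclusive, label): a hypothetical span annotation format.
Span = Tuple[int, int, str]


def bio_encode(n_tokens: int, spans: List[Span]) -> List[str]:
    """Convert labelled token spans into one BIO tag per token."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        if not (0 <= start < end <= n_tokens):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end]):
            # A flat BIO scheme cannot represent overlapping spans.
            raise ValueError(f"overlapping span: {(start, end, label)}")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def to_columns(tokens: List[str], tags: List[str]) -> str:
    """Render tokens and tags as tab-separated lines, one token per line."""
    return "\n".join(f"{tok}\t{tag}" for tok, tag in zip(tokens, tags))


# Hypothetical English gloss of a causal sentence, not taken from the corpus.
tokens = ["heavy", "rain", "because-of", "match", "was", "cancelled"]
spans = [(0, 2, "CAUSE"), (2, 3, "MARKER"), (3, 6, "EFFECT")]
tags = bio_encode(len(tokens), spans)
print(to_columns(tokens, tags))
```

A flat tagging of this kind rejects overlapping spans, so nested or interleaved relations would need a richer encoding than the one assumed here.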

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

S., M., Rao, P.R.K., Lalitha Devi, S. (2011). Automatic Identification of Cause-Effect Relations in Tamil Using CRFs. In: Gelbukh, A.F. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19400-9_25

  • DOI: https://doi.org/10.1007/978-3-642-19400-9_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19399-6

  • Online ISBN: 978-3-642-19400-9

  • eBook Packages: Computer Science, Computer Science (R0)