Clause Boundary Identification Using Conditional Random Fields

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4919)

Abstract

This paper discusses the detection of clause boundaries using a hybrid approach. Conditional Random Fields (CRFs), which take linguistic rules as features, identify the boundaries initially. The marked boundaries are then checked for false boundary markings using an Error Pattern Analyser. The false boundary markings are re-analysed using linguistic rules. Experiments with our approach show encouraging results that are comparable with those of other approaches.
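
As a rough illustration of the pipeline the abstract outlines, the sketch below shows how a linguistic rule might be exposed as a per-token feature for a CRF, and how a simple post-check might flag suspicious boundary labels. Everything in it is an assumption made for illustration and is not taken from the paper: the `SUBORDINATORS` word list, the `S`/`E`/`O` label scheme (clause start, clause end, other), and the functions `token_features` and `unmatched_boundaries` are hypothetical, and the check only stands in for the idea of an Error Pattern Analyser.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# Hypothetical word list standing in for one "linguistic rule" feature.
SUBORDINATORS = {"that", "because", "although", "when", "which", "who", "if"}


def token_features(sent: List[Token], i: int) -> Dict[str, object]:
    """Build a feature dict for token i, mixing lexical and rule-based cues."""
    word, pos = sent[i]
    return {
        "word": word.lower(),
        "pos": pos,
        # Rule-derived feature: does this word typically open a clause?
        "rule_subordinator": word.lower() in SUBORDINATORS,
        # Context features from the neighbouring tokens.
        "prev_pos": sent[i - 1][1] if i > 0 else "BOS",
        "next_pos": sent[i + 1][1] if i < len(sent) - 1 else "EOS",
    }


def unmatched_boundaries(labels: List[str]) -> List[int]:
    """Return positions of clause starts or ends that have no partner.

    Assumes labels are "S" (clause start), "E" (clause end) or "O" (other).
    """
    open_starts: List[int] = []
    errors: List[int] = []
    for i, label in enumerate(labels):
        if label == "S":
            open_starts.append(i)
        elif label == "E":
            if open_starts:
                open_starts.pop()  # this end closes the latest open start
            else:
                errors.append(i)  # an end with no start before it
    # Any start still open at the end of the sentence is also suspicious.
    return sorted(errors + open_starts)


sentence = [("He", "PRP"), ("said", "VBD"), ("that", "IN"),
            ("she", "PRP"), ("left", "VBD")]
features = [token_features(sentence, i) for i in range(len(sentence))]
suspicious = unmatched_boundaries(["S", "O", "S", "O", "E"])
```

In a setup like this, the feature dicts would be passed to a CRF toolkit for training and labelling, and the positions returned by `unmatched_boundaries` would be the ones handed back to rule-based re-analysis.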

Editor information

Alexander Gelbukh

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ram, R.V.S., Lalitha Devi, S. (2008). Clause Boundary Identification Using Conditional Random Fields. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2008. Lecture Notes in Computer Science, vol 4919. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78135-6_13

  • DOI: https://doi.org/10.1007/978-3-540-78135-6_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78134-9

  • Online ISBN: 978-3-540-78135-6

  • eBook Packages: Computer Science, Computer Science (R0)
