Abstract
Evaluation of text chunking is revisited. The proposed method analyzes the errors made by a chunker and formulates an evaluation strategy that brings out the strengths and weaknesses of a chunker more clearly than existing methods based on precision, recall, and F-score, or their variants. A tree-matching algorithm of linear time complexity is designed, analyzed, and illustrated with examples. The correctness of the algorithm is checked using a chunker and a set of test sentences.
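
For orientation, the conventional baseline that the abstract contrasts with can be sketched as follows. This is a minimal illustration of exact-match chunk precision, recall, and F-score over BIO-tagged sentences, not the tree-matching algorithm proposed in the paper. The BIO tag format, the function names `chunks_from_bio` and `precision_recall_f1`, and the sample tags are all assumptions made for the example.

```python
from typing import List, Set, Tuple

# A chunk is (start token index, end token index exclusive, label).
Chunk = Tuple[int, int, str]


def chunks_from_bio(tags: List[str]) -> Set[Chunk]:
    """Collect chunk spans from BIO tags such as B-NP, I-NP, O (assumed format)."""
    chunks: Set[Chunk] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes the last chunk
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                chunks.add((start, i, label))  # close the open chunk
            if tag == "O":
                start, label = None, None
            else:
                start, label = i, tag[2:]  # B-X (or a stray I-X) opens a chunk
    return chunks


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Exact-match chunk precision, recall, and F-score for one sentence."""
    gold_chunks = chunks_from_bio(gold)
    pred_chunks = chunks_from_bio(pred)
    correct = len(gold_chunks & pred_chunks)
    precision = correct / len(pred_chunks) if pred_chunks else 0.0
    recall = correct / len(gold_chunks) if gold_chunks else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical sentence: the chunker splits the final noun phrase in two.
gold = ["B-NP", "I-NP", "B-VP", "B-NP", "I-NP"]
pred = ["B-NP", "I-NP", "B-VP", "B-NP", "B-NP"]
print(precision_recall_f1(gold, pred))
```

In this sketch a single boundary mistake lowers both precision and recall, yet the three numbers do not say what kind of error occurred. That is the sort of limitation the abstract refers to when it argues for an error-oriented evaluation.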

Acknowledgments
The authors sincerely thank the anonymous reviewers of this paper. We are also grateful to one of the reviewers, who appreciated our work and pointed out the need to revisit chunking in the context of noisy text (SMS, tweets, blogs, email, etc.) analysis.
Cite this article
Maiti, S., Garain, U., Dhar, A. et al. A novel method for performance evaluation of text chunking. Lang Resources & Evaluation 49, 215–226 (2015). https://doi.org/10.1007/s10579-013-9250-3
DOI: https://doi.org/10.1007/s10579-013-9250-3