A Cascade Multiple Classifier System for Document Categorization

  • Conference paper
Multiple Classifier Systems (MCS 2009)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5519)

Abstract

This paper presents a novel cascade multiple classifier system (MCS) for document image classification. It consists of two classifiers that use different feature sets. The preceding classifier uses image features, learns the physical representation of the document, and outputs a set of candidate class labels for the second classifier. The succeeding classifier is a hierarchical classification model based on textual features. The candidate label set from the first classifier identifies the subtrees of the hierarchical tree that the second classifier searches to derive a final classification decision. Restricting the search in this way reduces the computational complexity and improves the classification accuracy of the second classifier. We test the proposed cascade MCS on a large-scale tax document classification task. The experimental results show improved classification performance over the individual classifiers.
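The cascade described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tree layout, the labels (A1, B1, ...), the scores, the top-k candidate rule, and the names `Node`, `top_k_candidates`, and `cascade_classify` are all hypothetical, chosen only to show how a candidate label set from a first stage can restrict which subtrees a hierarchical second stage searches.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Node:
    """A node of the (hypothetical) class hierarchy; leaves are final classes."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> Set[str]:
        """Return the labels of all leaf classes below (or at) this node."""
        if not self.children:
            return {self.label}
        found: Set[str] = set()
        for child in self.children:
            found |= child.leaves()
        return found


def top_k_candidates(image_scores: Dict[str, float], k: int) -> Set[str]:
    """Stage 1 (assumed rule): keep the k leaf labels with the highest image score."""
    ranked = sorted(image_scores, key=image_scores.get, reverse=True)
    return set(ranked[:k])


def cascade_classify(root: Node, candidates: Set[str],
                     text_score: Callable[[str], float]) -> str:
    """Stage 2: descend the tree, visiting only subtrees that contain a candidate."""
    node = root
    while node.children:
        allowed = [c for c in node.children if c.leaves() & candidates]
        if not allowed:
            raise ValueError("no candidate label lies under node " + node.label)
        # Among the allowed children, follow the one the text model prefers.
        node = max(allowed, key=lambda c: text_score(c.label))
    return node.label


# Toy hierarchy and made-up scores (illustrative only, not from the paper).
tree = Node("root", [
    Node("A", [Node("A1"), Node("A2")]),
    Node("B", [Node("B1"), Node("B2")]),
])
image_scores = {"A1": 0.5, "A2": 0.3, "B1": 0.15, "B2": 0.05}
text_scores = {"A": 0.4, "B": 0.6, "A1": 0.2, "A2": 0.8, "B1": 0.9, "B2": 0.1}

candidates = top_k_candidates(image_scores, k=2)
pruned = cascade_classify(tree, candidates, text_scores.get)
unpruned = cascade_classify(tree, tree.leaves(), text_scores.get)
```

In this toy setup the text model alone would descend into subtree B, while the candidate set from the image stage confines the search to subtree A, so the two calls return different leaves. Fewer nodes are scored in the pruned search, which is the source of the complexity reduction the abstract mentions.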

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xu, JW., Singh, V., Govindaraju, V., Neogi, D. (2009). A Cascade Multiple Classifier System for Document Categorization. In: Benediktsson, J.A., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2009. Lecture Notes in Computer Science, vol 5519. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02326-2_46

  • DOI: https://doi.org/10.1007/978-3-642-02326-2_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02325-5

  • Online ISBN: 978-3-642-02326-2

  • eBook Packages: Computer Science, Computer Science (R0)
