Investigation of feature selection for historical document layout analysis

Abstract:

In this paper we investigate the importance of individual features for the task of document layout analysis, in particular for the classification of document pixels. The feature set consists of numerous state-of-the-art features, including color, gradient, and local binary patterns (LBP). To deal with the high dimensionality of the feature set, we propose a cascade of an adapted forward selection and a genetic selection. We evaluated our feature selection method on three historical document datasets. For classification, we used machine learning methods that assign each pixel to one of four classes: periphery, background, text block, or decoration. The proposed cascading feature selection method reduced the number of features significantly while preserving the cross-validation performance. Furthermore, it selected fewer features than conventional feature selection methods while achieving comparable performance. In our analysis we found that LBP features are consistently selected by all feature selection methods on all three datasets. This indicates that LBP features correlate with the pixel classes much more strongly than any other type of feature. These findings provide a clue for the design of document layout analysis methods in general.
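
The abstract does not say how the forward selection was adapted or how the genetic selection was configured, so the following is only a minimal sketch of the general idea of cascading the two stages, not the authors' method. Everything in it is an assumption made for illustration: the synthetic data, the nearest-centroid classifier used for scoring, the single hold-out split (standing in for cross-validation), the per-feature penalty, and all function names and parameter values.

```python
import numpy as np

def score(X_tr, y_tr, X_va, y_va, idx):
    """Hold-out accuracy of a nearest-centroid classifier on feature subset idx."""
    if len(idx) == 0:
        return 0.0
    classes = np.unique(y_tr)
    centroids = np.stack([X_tr[y_tr == c][:, idx].mean(axis=0) for c in classes])
    # Squared distance from every validation sample to every class centroid.
    d = ((X_va[:, idx][:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[d.argmin(axis=1)] == y_va).mean())

def forward_select(X_tr, y_tr, X_va, y_va, max_features=10):
    """Stage 1: greedily add the feature that most improves the score."""
    selected, best = [], 0.0
    remaining = list(range(X_tr.shape[1]))
    while remaining and len(selected) < max_features:
        gains = [score(X_tr, y_tr, X_va, y_va, selected + [f]) for f in remaining]
        top = int(np.argmax(gains))
        if gains[top] <= best:
            break  # no remaining candidate improves the score
        best = gains[top]
        selected.append(remaining.pop(top))
    return selected

def genetic_select(X_tr, y_tr, X_va, y_va, candidates, rng,
                   pop_size=20, generations=15, penalty=0.01):
    """Stage 2: evolve bit masks over the stage-1 survivors only."""
    n = len(candidates)
    cand = np.asarray(candidates)

    def fitness(mask):
        idx = cand[mask].tolist()
        # Hypothetical objective: accuracy minus a small cost per kept feature.
        return score(X_tr, y_tr, X_va, y_va, idx) - penalty * len(idx)

    pop = rng.random((pop_size, n)) < 0.5
    pop[0] = True  # seed the population with the full stage-1 subset
    for _ in range(generations):
        fit = np.array([fitness(m) for m in pop])
        parents = pop[np.argsort(fit)[::-1][: pop_size // 2]]  # keep the best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            child = np.where(rng.random(n) < 0.5, a, b)  # uniform crossover
            child ^= rng.random(n) < 0.1                 # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    fit = np.array([fitness(m) for m in pop])
    return cand[pop[int(fit.argmax())]].tolist()

# Synthetic stand-in for per-pixel feature vectors with four classes.
rng = np.random.default_rng(0)
n_samples, n_features = 400, 12
y = rng.integers(0, 4, size=n_samples)
X = rng.normal(size=(n_samples, n_features))
X[:, 0] += y                # feature 0 carries class information
X[:, 1] += (y % 2) * 2.0    # feature 1 carries partial class information
X_tr, y_tr, X_va, y_va = X[:300], y[:300], X[300:], y[300:]

stage1 = forward_select(X_tr, y_tr, X_va, y_va)
stage2 = genetic_select(X_tr, y_tr, X_va, y_va, stage1, rng)
print("forward selection kept:", stage1)
print("genetic selection kept:", stage2)
```

The cheap greedy stage prunes the full feature set first, so the more expensive genetic search only has to explore subsets of the survivors. Because the best half of each generation is carried over unchanged, the final subset never scores worse under the penalized objective than the full stage-1 subset it was seeded with.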

Published in: 2014 4th International Conference on Image Processing Theory, Tools and Applications (IPTA)

Date of Conference: 14-17 October 2014

Date Added to IEEE Xplore: 08 January 2015

DOI: 10.1109/IPTA.2014.7001961

Conference Location: Paris, France
