Automatic extraction of non-textual information in web document and their classification

Abstract:

This paper deals with research on the automatic extraction and classification of textual and non-textual information. The main idea is to create a robust method for extracting image and text segments in order to obtain a short web document. The developed method consists of two types of data extraction, image and text, both of which use the Document Object Model (DOM) tree. Extracted objects are saved in separate databases, followed by image analysis that defines and describes the image objects from a semantic point of view. The semantic descriptions of all modal objects are then used to create the short web document. For accurate object classification, a fast and powerful hybrid segmentation algorithm based on the Mean Shift and Belief Propagation principles is also presented in this paper. The image segmentation algorithm was integrated with the SIFT descriptor. Finally, SVM classification is used to obtain a semantic description of objects in a static image. The developed method was tested on both unsegmented and segmented real images.
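The extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "image segments" means the `src` attributes of `<img>` elements and that "text segments" means visible text nodes, and it walks the markup in document order with Python's standard-library streaming parser instead of building a full DOM tree. The names `SegmentExtractor` and `extract_segments` are hypothetical, and the two returned lists stand in for the separate image and text databases.

```python
from html.parser import HTMLParser


class SegmentExtractor(HTMLParser):
    """Collects image sources and visible text while walking the markup.

    Hypothetical helper for illustration; the paper's method works on a DOM tree.
    """

    SKIPPED = {"script", "style"}  # elements whose text is not visible content

    def __init__(self):
        super().__init__()
        self.images = []      # src attributes of <img> elements
        self.text = []        # non-empty visible text segments
        self._skip_depth = 0  # greater than 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        segment = data.strip()
        if segment and self._skip_depth == 0:
            self.text.append(segment)


def extract_segments(html):
    """Return (image_sources, text_segments) for an HTML string."""
    parser = SegmentExtractor()
    parser.feed(html)
    parser.close()
    return parser.images, parser.text
```

Calling `extract_segments(page_html)` on a page's markup returns the two lists in document order. The later stages named in the abstract (segmentation, SIFT description, SVM classification) would operate on the images fetched from the first list and are not sketched here.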

Published in: 2012 35th International Conference on Telecommunications and Signal Processing (TSP)

Date of Conference: 03-04 July 2012

Date Added to IEEE Xplore: 02 August 2012

DOI: 10.1109/TSP.2012.6256398

Conference Location: Prague, Czech Republic
