Sub-word based Persian OCR Using Auto-Encoder Features and Cascade Classifier | IEEE Conference Publication | IEEE Xplore

Abstract:

In Persian text, unlike English, letters are joined together and their shapes vary depending on their position in the word. This makes it difficult to distinguish individual letters in Persian OCR. One way to overcome this problem is to recognize sub-words rather than letters. This paper presents a new approach to Persian OCR that relies on a complete sub-word image dictionary. Sub-word images are recognized by two cascaded SVM classifiers trained on features extracted by an auto-encoder. The recognized sub-word texts are then assembled into words using the results of a word segmentation pre-processing step. The resulting text is refined by a fast post-processing algorithm that uses a word dictionary.
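
The abstract does not specify how the dictionary-based post-processing works. The sketch below is only one plausible illustration of the general idea, not the authors' algorithm: each recognized word that is missing from a word dictionary is replaced by its closest dictionary entry. The function name `correct_words`, the `cutoff` value, and the use of Python's `difflib` similarity measure are all assumptions made for this example.

```python
from difflib import get_close_matches


def correct_words(words, dictionary, cutoff=0.6):
    """Hypothetical post-processing step (an assumption, not the paper's method).

    Words already in the dictionary are kept. Any other word is replaced by
    its most similar dictionary entry, provided the similarity ratio reaches
    `cutoff`; otherwise the word is left unchanged.
    """
    vocab = set(dictionary)  # set gives fast exact-membership checks
    corrected = []
    for word in words:
        if word in vocab:
            corrected.append(word)
            continue
        # get_close_matches returns at most n candidates, best match first
        matches = get_close_matches(word, dictionary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return corrected


if __name__ == "__main__":
    # Illustrative data: the first word has its last letter misrecognized.
    ocr_output = ["کتات", "مدرسه"]
    word_dictionary = ["کتاب", "مدرسه"]
    print(correct_words(ocr_output, word_dictionary))
```

A linear scan with `difflib` is simple but scales poorly with dictionary size, so a fast post-processor of the kind the abstract mentions would presumably use an indexed lookup instead.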
Date of Conference: 17-19 December 2018
Date Added to IEEE Xplore: 07 March 2019
Conference Location: Tehran, Iran
