
OCR-Diff: A Two-Stage Deep Learning Framework for Optical Character Recognition Using Diffusion Model in Industrial Internet of Things


Abstract:

Optical character recognition (OCR) is one of the key enabling technologies in the industrial Internet of Things (IIoT) for extracting and utilizing useful textual information, but it is technically challenging due to poor environmental conditions. To deal with such challenges, in this letter, we propose a novel two-stage deep learning framework for OCR using a generative diffusion model, namely, OCR-Diff. In the first stage, our customized conditional U-Net is pretrained jointly with a feature extractor with the aid of the forward diffusion process, such that the quality of a low-resolution text image is improved via the reverse diffusion process. In the second stage, the pretrained conditional U-Net and feature extractor are jointly fine-tuned so that an off-the-shelf text recognizer can precisely recognize the text in the image. Experimental results on the TextZoom dataset substantiate the superiority and effectiveness of the proposed scheme.
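
For readers unfamiliar with the forward diffusion process mentioned in the abstract, the following is a minimal sketch of the standard DDPM-style closed-form noising step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. It is not taken from the paper: the linear noise schedule, the 1000 steps, the function names, and the flat list standing in for a text image are all illustrative assumptions, and the conditional U-Net, feature extractor, and text recognizer are not modeled here.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; the abstract does not specify one."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha_bar(betas):
    """Running product of (1 - beta_s) for s = 0..t."""
    alpha_bar, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) in closed form.

    Returns the noised pixels and the Gaussian noise that was added;
    in DDPM-style training the denoising network learns to predict that noise.
    """
    signal_scale = math.sqrt(alpha_bar[t])
    noise_scale = math.sqrt(1.0 - alpha_bar[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal_scale * p + noise_scale * e for p, e in zip(x0, noise)]
    return x_t, noise


# Hypothetical usage: a flattened 16x64 "image" with pixels in [-1, 1].
rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bar = cumulative_alpha_bar(betas)
x0 = [rng.uniform(-1.0, 1.0) for _ in range(16 * 64)]
x_t, noise = forward_diffuse(x0, 500, alpha_bar, rng)
```

As t grows, alpha_bar_t shrinks toward zero, so x_t approaches pure Gaussian noise. The reverse diffusion process described in the abstract runs in the opposite direction, guided by the low-resolution input.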
Published in: IEEE Internet of Things Journal (Volume: 11, Issue: 15, 01 August 2024)
Page(s): 25997 - 26000
Date of Publication: 18 April 2024
