Abstract:
Most real-time end-to-end text spotting methods employ sequence models as their recognition heads. However, these models generate characters one by one, which is inefficient when a text instance contains many characters. To solve this problem, we propose a Character Feature Summarization (CFS) Model, which predicts a fixed-length character sequence in parallel, regardless of the text length. Specifically, we propose a Character Feature Summarization Module (CFSM), consisting of a Global Feature Capture and a Historical Feature Summarizer, that extracts and summarizes global character features so that characters can be obtained by simple linear prediction. We use Multi-stage Testing, which cascades multiple CFSMs to obtain global character features summarized at multiple stages, and thus several predictions, for better convergence. A Result Selector then selects the most likely result. Experiments on the Total-Text dataset show that CFS achieves a 3.53% improvement on the "Full" metric while being 3.6 times faster than ABCNet v2’s head.
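As a rough illustration of the idea of parallel fixed-length prediction followed by result selection, the sketch below applies one linear layer to all character slots at once and then keeps the stage whose prediction is most confident. It is not the authors' implementation: the function names, the tensor shapes, and the mean-confidence selection rule are all assumptions made for this example, and the abstract does not specify how the Result Selector scores candidates.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the class axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def predict_parallel(features, weight, bias):
    """Predict all T character slots in one matrix product (no per-character loop).

    features: (T, D) summarized character features (hypothetical shape)
    weight:   (D, C) linear classifier weights, bias: (C,)
    Returns class indices (T,) and their probabilities (T,).
    """
    probs = softmax(features @ weight + bias)
    return probs.argmax(axis=-1), probs.max(axis=-1)

def select_result(stage_features, weight, bias):
    """Pick the stage with the highest mean confidence (assumed selection rule)."""
    best = None
    for feats in stage_features:
        chars, conf = predict_parallel(feats, weight, bias)
        score = float(conf.mean())
        if best is None or score > best[1]:
            best = (chars, score)
    return best

# Toy usage: two stages, three character slots, four classes.
weight, bias = np.eye(4), np.zeros(4)
stage1 = np.eye(4)[[0, 1, 2]] * 1.0    # weak evidence, low confidence
stage2 = np.eye(4)[[3, 2, 1]] * 10.0   # strong evidence, high confidence
chars, score = select_result([stage1, stage2], weight, bias)
```

In this toy setup the second stage wins because its per-slot probabilities are higher; the cost of prediction is a single matrix product per stage, independent of how many characters the text contains.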
Published in: 2023 IEEE International Conference on Visual Communications and Image Processing (VCIP)
Date of Conference: 04-07 December 2023
Date Added to IEEE Xplore: 29 January 2024