Orthogonal Feature Alignment Network for Cross-Domain Text Detection

Authors:
Yong Hu, Beijing University of Posts and Telecommunications, China (ORCID: 0009-0007-4306-0205)
Xueming Li, Beijing University of Posts and Telecommunications, China (ORCID: 0000-0003-1058-2799)
Yue Zhang, Beijing University of Posts and Telecommunications, China (ORCID: 0000-0002-6327-5023)

ICIGP '24: Proceedings of the 2024 7th International Conference on Image and Graphics Processing, January 2024, Pages 301–307. https://doi.org/10.1145/3647649.3647697

Published: 03 May 2024

ABSTRACT

Deep-learning-based scene text detection methods have achieved remarkable success. Because manually annotating datasets is laborious and time-consuming, large amounts of synthetic data have been created and used for training. However, owing to the domain discrepancy between synthetic and real scene data, models trained on synthetic data may suffer performance degradation when applied to real scenes. To tackle this domain shift, we propose the Orthogonal Feature Alignment Network (OFAN), designed specifically for text objects. OFAN incorporates an orthogonal feature enhancement module that strengthens the edge features of text instances and thereby emphasizes the text objects, and it employs adversarial training to align text instances across domains. In addition, a multi-transform self-training mixture technique further improves detection performance in the target domain by mitigating the adverse effects of false positives and false negatives. We evaluate OFAN extensively on four benchmark datasets, and the experimental results demonstrate the effectiveness of our approach.
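
The abstract does not specify how the self-training step chooses its pseudo-labels, so the following is only a minimal sketch of one generic way to reduce false positives when several transformed views of the same target-domain image are available. It is not the authors' method. The function names (`iou`, `consensus_pseudo_labels`), the thresholds, and the assumption that detections from every view have already been mapped back to the original image coordinates are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# A detection is (x1, y1, x2, y2, score) in original-image coordinates.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def consensus_pseudo_labels(
    views: List[List[Box]],
    iou_thr: float = 0.5,    # hypothetical matching threshold
    min_votes: int = 2,      # hypothetical number of agreeing views
    score_thr: float = 0.6,  # hypothetical mean-confidence threshold
) -> List[Box]:
    """Keep a box only if enough views agree on it with enough confidence."""
    kept: List[Box] = []
    for i, dets in enumerate(views):
        for box in dets:
            # Skip boxes that duplicate an already accepted pseudo-label.
            if any(iou(box, k) >= iou_thr for k in kept):
                continue
            matches = [box]
            for j, other in enumerate(views):
                if j == i:
                    continue
                # Best-overlapping detection in the other view, if any.
                best = max(other, key=lambda o: iou(box, o), default=None)
                if best is not None and iou(box, best) >= iou_thr:
                    matches.append(best)
            mean_score = sum(m[4] for m in matches) / len(matches)
            if len(matches) >= min_votes and mean_score >= score_thr:
                kept.append(box)
    return kept
```

Under these assumptions, a box detected in only one view is discarded as a likely false positive, while a box that a single view misses can still be recovered from the other views, which is one simple way to limit false negatives.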

Index Terms

Computing methodologies → Artificial intelligence → Computer vision → Computer vision problems → Object detection

Published in

ICIGP '24: Proceedings of the 2024 7th International Conference on Image and Graphics Processing
January 2024
480 pages
ISBN:9798400716720
DOI:10.1145/3647649

Copyright © 2024 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Domain adaptation
Scene text detection
Qualifiers
- research-article
- Research
- Refereed limited