DOI: 10.1145/3633637.3633715
research-article

Printed Multilingual Document Image Retrieval Based On Improved SURF

Published: 28 February 2024 Publication History

Abstract

In printed document image retrieval, the traditional approach, the SURF algorithm combined with brute-force matching, suffers from low retrieval accuracy and low retrieval efficiency. This paper proposes FAST + PCA-SURF combined with a FLANN-based KNN algorithm for multilingual document image retrieval. Feature points are detected with FAST, and dimensionality-reduced feature descriptors are extracted with PCA-SURF. The FLANN-based KNN algorithm is then used for feature matching, and finally the appropriate matching results are output. The experimental results show that the proposed algorithm improves on the traditional algorithm in both time cost and retrieval accuracy. The traditional algorithm averages 0.1783 s and 71.8% retrieval accuracy, while the proposed algorithm averages 0.0464 s and 77.8%, indicating that the proposed algorithm achieves better experimental results in multilingual document image retrieval.
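The descriptor reduction and matching stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: FAST detection and SURF extraction are not shown, so random 64-dimensional vectors stand in for SURF descriptors; an exact 2-nearest-neighbour search stands in for FLANN's approximate search; and the 16 retained components and the 0.7 ratio threshold are hypothetical values, not taken from the paper.

```python
import numpy as np


def fit_pca(descriptors, n_components):
    """Return (mean, basis); (x - mean) @ basis projects x onto the top components."""
    mean = descriptors.mean(axis=0)
    centered = descriptors - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components].T


def knn_ratio_match(query, train, ratio=0.7):
    """2-nearest-neighbour matching with a ratio test (exact search, not FLANN)."""
    # Pairwise squared Euclidean distances, shape (len(query), len(train)).
    d2 = ((query[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :2]
    matches = []
    for qi, (best, second) in enumerate(nearest):
        # Keep a match only if the best neighbour is clearly closer than the second.
        if np.sqrt(d2[qi, best]) < ratio * np.sqrt(d2[qi, second]):
            matches.append((qi, int(best)))
    return matches


rng = np.random.default_rng(0)
# Synthetic stand-ins for 64-D SURF descriptors of a stored document image.
train_desc = rng.normal(size=(200, 64))
# The query image re-observes the first 50 keypoints with small noise.
query_desc = train_desc[:50] + rng.normal(scale=0.05, size=(50, 64))

# Reduce both descriptor sets with a PCA basis fitted on the stored descriptors.
mean, basis = fit_pca(train_desc, n_components=16)
train_low = (train_desc - mean) @ basis
query_low = (query_desc - mean) @ basis

matches = knn_ratio_match(query_low, train_low)
correct = sum(1 for qi, ti in matches if qi == ti)
print(f"{len(matches)} matches kept, {correct} correct")
```

Shorter descriptors make each distance computation cheaper, which is one plausible source of the reported speed-up. In a real pipeline the exact search above would be replaced by an approximate index such as FLANN.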


    Published In

    ICCPR '23: Proceedings of the 2023 12th International Conference on Computing and Pattern Recognition
    October 2023
    589 pages
    ISBN:9798400707988
    DOI:10.1145/3633637

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. FAST
    2. FLANN matching
    3. PCA-SURF
    4. Printed multilingual document images
    5. Keyword search

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Natural Science Foundation of Xinjiang Science and Technology Department

    Conference

    ICCPR 2023
