Tag Information Recognition Approaches and Algorithms for Cross-Border Products Checking

Authors:
Dunsheng Chen, Software School, Fudan University, Shanghai, China
Yinsheng Li, Software School, Fudan University, Shanghai, China
Xu Liang, Software School, Fudan University, Shanghai, China

ICCSE'19: Proceedings of the 4th International Conference on Crowd Science and Engineering, October 2019, Pages 58–62, https://doi.org/10.1145/3371238.3371248

Published: 18 October 2019

ABSTRACT

Images with fixed layouts, such as ID cards, driving licenses, and invoices, can be recognized using prior knowledge of the layout [1]-[7]. However, for images without a fixed layout, such as product labels at ports, extracting structured data is very difficult, because the formats and contents of tags vary widely across countries and products [8]. The process is complex and the error rate is high.

This paper exploits the characteristics of cross-border product labels, namely a complex overall format but a simple local structure (top-to-bottom and left-to-right), and proposes a method for identifying and structuring port commodity label information. The method first establishes a template library of keyword and data unit information for commodity labels according to the port commodity classification. It then separates the keywords from the data in the multi-line text, with accurate location information, recognized by the OCR engine. Finally, keywords and data are paired according to the local layout pattern between them, yielding structured cross-border product information.
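
The three steps described in the abstract (keyword template, keyword/data separation on located OCR lines, pairing by local layout) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `TextBox` structure, the `KEYWORDS` template, the field names, and the "right neighbor first, then nearest box below" rule are all assumptions made for illustration, and the OCR output is taken as given rather than produced by a real engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBox:
    """One OCR text line with its bounding box (hypothetical structure)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


# Hypothetical keyword template for one commodity class: keyword -> field name.
KEYWORDS = {
    "product name": "name",
    "net weight": "net_weight",
    "origin": "origin",
}


def match_keyword(text):
    """Return (field, remainder) if the line starts with a template keyword."""
    stripped = text.strip()
    lowered = stripped.lower()
    for kw in sorted(KEYWORDS, key=len, reverse=True):  # longest keyword first
        if lowered.startswith(kw):
            remainder = stripped[len(kw):].lstrip(" :").strip()
            return KEYWORDS[kw], remainder
    return None


def structure(boxes):
    """Pair keywords with data using a left-to-right, top-to-bottom layout rule."""
    result, keys, data = {}, [], []
    for box in boxes:
        hit = match_keyword(box.text)
        if hit is None:
            data.append(box)            # plain data line
        elif hit[1]:
            result[hit[0]] = hit[1]     # keyword and value share one OCR line
        else:
            keys.append((hit[0], box))  # keyword alone; value is a neighbor

    keys.sort(key=lambda kb: (kb[1].y, kb[1].x))  # reading order
    used = set()
    for field, k in keys:
        best = None
        for i, d in enumerate(data):
            if i in used:
                continue
            same_row = abs((d.y + d.h / 2) - (k.y + k.h / 2)) <= k.h / 2
            if same_row and d.x >= k.x + k.w:
                score = (0, d.x - (k.x + k.w))   # right neighbor, preferred
            elif d.y >= k.y + k.h and d.x < k.x + k.w and d.x + d.w > k.x:
                score = (1, d.y - (k.y + k.h))   # box below, overlapping columns
            else:
                continue
            if best is None or score < best[0]:
                best = (score, i)
        if best is not None:
            used.add(best[1])
            result[field] = data[best[1]].text.strip()
    return result


# Example input: a value to the right, a value below, and an inline value.
boxes = [
    TextBox("Product name", 10, 10, 100, 20),
    TextBox("Olive Oil", 120, 12, 80, 20),
    TextBox("Origin", 10, 40, 60, 20),
    TextBox("Italy", 10, 65, 50, 20),
    TextBox("Net weight: 750 ml", 10, 100, 150, 20),
]
fields = structure(boxes)
```

In this sketch a keyword claims the nearest unused data box on its own row, and falls back to the nearest box directly below it, which mirrors the left-to-right and top-to-bottom local structure the abstract mentions. A real label would also need fuzzy keyword matching to tolerate OCR errors.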

References

[1] Z. Zhang, C. Zhang, W. Shen, C. Yao, W. Liu, and X. Bai (2016). "Multi-oriented text detection with fully convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4159–4167.
[2] S. Tian, Y. Pan, C. Huang, S. Lu, K. Yu, and C. Lim Tan (2015). "Text flow: A unified text detection system in natural scene images," in Proceedings of the IEEE International Conference on Computer Vision, pp. 4651–4659.
[3] M. Jaderberg, K. Simonyan, A. Vedaldi, and A. Zisserman (2016). "Reading text in the wild with convolutional neural networks," International Journal of Computer Vision, vol. 116, no. 1, pp. 1–20.
[4] P. He, W. Huang, Y. Qiao, C. C. Loy, and X. Tang (2016). "Reading scene text in deep convolutional sequences," in AAAI, vol. 16, pp. 3501–3508.
[5] T. He, W. Huang, Y. Qiao, and J. Yao (2016). "Accurate text localization in natural image with cascaded convolutional text network," arXiv preprint arXiv:1603.09423.
[6] A. Gupta, A. Vedaldi, and A. Zisserman (2016). "Synthetic data for text localisation in natural images," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2315–2324.
[7] W. Huang, Y. Qiao, and X. Tang (2014). "Robust scene text detection with convolution neural network induced MSER trees," in European Conference on Computer Vision, pp. 497–511, Springer.
[8] J.-Y. Ramel, M. Crucianu, N. Vincent, and C. Faure (2006). "Detection, extraction and representation of tables," in Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR'03).
[9] Y. Shinyama and S. Sekine (2006). "Preemptive information extraction using unrestricted relation discovery," in Proceedings of the Main Conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, pp. 304–311, Association for Computational Linguistics.
[10] H. Dejean (2015). "Extracting structured data from unstructured document with incomplete resources," in Document Analysis and Recognition (ICDAR), 2015 13th International Conference on, pp. 271–275, IEEE.
[11] D. A. Ferrucci (2012). "Introduction to 'This is Watson'," IBM Journal of Research and Development, vol. 56, no. 3.4, pp. 1–1.
[12] A. Arasu and H. Garcia-Molina (2003). "Extracting structured data from web pages," in Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, pp. 337–348, ACM.
[13] B. Liu, S. Zhang, Z. Hong, and X. Ye (2018). "A horizontal tilt correction method for ship license numbers recognition," Journal of Physics: Conference Series, IOP Publishing.
[14] M. Busta, L. Neumann, and J. Matas (2015). "FASText: Efficient unconstrained scene text detector," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1206–1214.
[15] Y. Ye, S. Zhu, J. Wang, Q. Du, Y. Yang, D. Tu, L. Wang, and J. Luo (2018). "A unified scheme of text localization and structured data extraction for joint OCR and data mining," in 2018 IEEE International Conference on Big Data (Big Data).

Index Terms

Tag Information Recognition Approaches and Algorithms for Cross-Border Products Checking
1. Applied computing
  1. Document management and text processing
    1. Document capture
      1. Optical character recognition
  2. Electronic commerce
    1. E-commerce infrastructure
2. Computing methodologies
  1. Computer graphics
    1. Image manipulation
      1. Image processing

Published in

ICCSE'19: Proceedings of the 4th International Conference on Crowd Science and Engineering
October 2019
246 pages
ISBN:9781450376402
DOI:10.1145/3371238

Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
OCR
Structured
commodity label
location information
relationship pattern
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
- ICCSE'19 paper acceptance rate: 35 of 92 submissions, 38%
- Overall acceptance rate: 92 of 247 submissions, 37%
