
An AI-based approach to auto-analyzing historical handwritten business documents: as applied to the Kanebo database

  • Research Article
  • Published in the Journal of Computational Social Science

Abstract

Matching salient points is a key step in many visual tasks. However, many widely used feature representation methods, such as the scale-invariant feature transform (SIFT), lack representation invariance. This shortcoming limits image representation stability and salient-point matching performance, particularly when processing images that contain a great deal of noise (e.g., historical documents). We propose a general and effective transformation approach called RIFT (reversal-invariant feature transformation) for robust feature representation. RIFT achieves gradient binning invariance for feature extraction by transforming the conventional gradient into a polar one. Experimental results on the Kanebo database and three fine-grained classification benchmark datasets demonstrate that RIFT robustly improves the performance of local descriptors for image classification without sacrificing computational efficiency.
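The reversal-invariance idea at the heart of RIFT can be illustrated with a toy sketch. This is not the paper's implementation (the `orientation_histogram` function and the `|gx|` folding are illustrative choices, not RIFT's polar transformation): a horizontal flip sends every gradient orientation θ to π − θ, so a plain orientation histogram changes when an image is mirrored; folding each orientation onto a reversal-symmetric representative makes the binning invariant, which is the effect RIFT's polar gradient achieves for descriptor binning.

```python
import numpy as np

def orientation_histogram(img, bins=8, fold=False):
    """Magnitude-weighted gradient-orientation histogram of a grayscale image.

    With fold=True, each orientation theta is identified with its
    horizontal-mirror partner pi - theta by using |gx|, which makes
    the binning invariant to left-right image reversal.
    """
    img = np.asarray(img, dtype=float)
    # Central differences on the interior only, so a flipped image yields
    # exactly mirrored gradients (gx -> -gx, gy unchanged).
    gx = img[1:-1, 2:] - img[1:-1, :-2]
    gy = img[2:, 1:-1] - img[:-2, 1:-1]
    mag = np.hypot(gx, gy)
    if fold:
        # atan2(gy, |gx|) maps theta and pi - theta to the same angle,
        # exactly in floating point, since |-gx| == |gx| bitwise.
        theta = np.arctan2(gy, np.abs(gx))
    else:
        theta = np.arctan2(gy, gx)
    hist, _ = np.histogram(theta, bins=bins, range=(-np.pi, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-12)  # normalize to unit mass

rng = np.random.default_rng(0)
img = rng.random((16, 16))
mirrored = img[:, ::-1]
print(np.allclose(orientation_histogram(img, fold=True),
                  orientation_histogram(mirrored, fold=True)))   # True
print(np.allclose(orientation_histogram(img, fold=False),
                  orientation_histogram(mirrored, fold=False)))  # typically False
```

The folded histogram is identical for an image and its mirror, while the plain histogram generally is not; RIFT applies the same principle at the descriptor level so that local features match across reversed or noisy document images.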



Acknowledgements

This project was supported in part by the project funder of OAIR at Kobe University (No. JINSYA3), by PRESTO, JST (Grant No. JPMJPR15D2), and by JSPS KAKENHI (Grant Nos. JP17H01995 and JP16H02032).

Author information

Corresponding author

Correspondence to Jinhui Chen.


About this article


Cite this article

Chen, J., Takiguchi, T., Takatsuki, Y., et al. An AI-based approach to auto-analyzing historical handwritten business documents: as applied to the Kanebo database. J Comput Soc Sci 1, 167–185 (2018). https://doi.org/10.1007/s42001-017-0009-2

