Bengali Place Name Recognition - Comparative Analysis Using Different CNN Architectures

  • Conference paper
Computer Vision and Image Processing (CVIP 2020)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1377)

Abstract

Optical Character Recognition (OCR) has been deployed in application areas such as automatic transcription and indexing of document images, reading aids for visually impaired persons, and postal automation. However, its performance has often been unimpressive because character segmentation is itself a difficult and error-prone operation, and erroneous segmentation degrades the whole system. Hence, for many applications (such as document indexing and postal automation) where full character-wise transcription is not required, word recognition is now the preferred method. This article investigates the recognition of Bengali place names as word images using five different traditional CNN architectures. Experiments were conducted on word images of Bengali place names from 608 classes. Encouraging results were obtained in all instances.
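The argument for word-level recognition can be illustrated with a small sketch. It assumes, purely for illustration, that character errors are independent, so a character-wise pipeline reads a word correctly only if every character is correct. The 95% figure and the helper names `word_accuracy_from_characters` and `predict_word_class` are hypothetical and are not taken from the paper. Only the 608-class count comes from the abstract.

```python
from typing import Sequence

# Number of place-name classes stated in the abstract.
NUM_CLASSES = 608


def word_accuracy_from_characters(char_accuracy: float, word_length: int) -> float:
    """Chance that a character-wise pipeline gets a whole word right.

    Assumes independent per-character errors (a simplifying assumption),
    so the word-level accuracy is char_accuracy ** word_length.
    """
    if not 0.0 <= char_accuracy <= 1.0:
        raise ValueError("char_accuracy must be in [0, 1]")
    if word_length < 1:
        raise ValueError("word_length must be at least 1")
    return char_accuracy ** word_length


def predict_word_class(scores: Sequence[float]) -> int:
    """Holistic recognition: return the index of the highest-scoring class.

    `scores` stands in for the per-class outputs of a word-image
    classifier; no segmentation into characters is involved.
    """
    if len(scores) == 0:
        raise ValueError("scores must not be empty")
    return max(range(len(scores)), key=lambda i: scores[i])


if __name__ == "__main__":
    # Hypothetical 95% per-character accuracy, for illustration only.
    for length in (3, 6, 9):
        acc = word_accuracy_from_characters(0.95, length)
        print(f"{length} characters -> word accuracy {acc:.3f}")

    # A dummy score vector with one clear winner among the 608 classes.
    scores = [0.0] * NUM_CLASSES
    scores[42] = 1.0
    print("predicted class index:", predict_word_class(scores))
```

Under this assumption the word-level accuracy of a character-wise pipeline falls as words get longer, whereas a holistic classifier makes a single decision per word image.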

The dataset will be made available on request. Please contact the first author via email.

Notes

  1. A sign which is used in the consonant alphabet of the script.

Author information

Corresponding author

Correspondence to Prashant Kumar Prasad.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Prasad, P.K., Banerjee, P., Chanda, S., Pal, U. (2021). Bengali Place Name Recognition - Comparative Analysis Using Different CNN Architectures. In: Singh, S.K., Roy, P., Raman, B., Nagabhushan, P. (eds) Computer Vision and Image Processing. CVIP 2020. Communications in Computer and Information Science, vol 1377. Springer, Singapore. https://doi.org/10.1007/978-981-16-1092-9_29

  • DOI: https://doi.org/10.1007/978-981-16-1092-9_29

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-1091-2

  • Online ISBN: 978-981-16-1092-9

  • eBook Packages: Computer Science (R0)
