DeepSpacy-NER: an efficient deep learning model for named entity recognition for Punjabi language

Singh, Navdeep; Kumar, Munish; Singh, Bavalpreet; Singh, Jaskaran

doi:10.1007/s12530-022-09453-1

Original Paper
Published: 03 August 2022

Evolving Systems, volume 14, pages 673–683 (2023)

Navdeep Singh¹, Munish Kumar², Bavalpreet Singh³ & Jaskaran Singh³

Abstract

Named entity recognition (NER) is a technique for extracting named entities from text and classifying them into entity types. Much of the existing research on Punjabi has addressed the Shahmukhi script, with less emphasis on the Gurmukhi script. This paper proposes a novel technique for extracting named entities from sentences written in the Gurmukhi script of Punjabi and categorizing them into six entity types. The work uses 15k sentences drawn from the Punjabi data of the Indic data corpus and from various newspapers, annotated with Doccano, an open-source annotation tool. The authors also propose and publicly release an annotated benchmark corpus for the Gurmukhi script. The model was trained with the spaCy framework on 12k sentences selected at random from the Punjabi data corpus, and the results were validated on the remaining 3k sentences, with F1-score as the evaluation metric. The experimental results are analyzed, and the article reports the details of the technique.
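
The evaluation protocol described in the abstract (a random 12k/3k split and an F1-score) can be illustrated with a minimal sketch. This is not the authors' code. It assumes exact-match scoring of (start, end, label) spans with micro-averaging, which the abstract does not specify. The function names, the seed, and the labels "PER", "LOC" and "ORG" are hypothetical placeholders, since the six entity types are not named here.

```python
import random


def split_corpus(sentences, n_train, seed=0):
    """Shuffle a copy of the corpus and split it into train and validation parts."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def span_f1(gold, predicted):
    """Micro-averaged precision, recall and F1 over exact span matches.

    gold and predicted are lists holding one set of (start, end, label)
    tuples per sentence. Exact matching is an assumption of this sketch.
    """
    true_pos = sum(len(g & p) for g, p in zip(gold, predicted))
    n_pred = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Integer ids stand in for the 15k annotated sentences.
corpus = range(15_000)
train, valid = split_corpus(corpus, n_train=12_000)

# Toy spans with hypothetical labels: one of three predictions has the wrong type.
gold = [{(0, 2, "PER"), (5, 6, "LOC")}, {(1, 3, "ORG")}]
pred = [{(0, 2, "PER"), (5, 6, "ORG")}, {(1, 3, "ORG")}]
precision, recall, f1 = span_f1(gold, pred)
```

In the toy example two of the three predicted spans match the gold spans exactly, so precision, recall and F1 are each 2/3. A span with the right boundaries but the wrong type counts as an error under this scheme.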

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Punjabi University, Patiala, Punjab, India
Navdeep Singh
Department of Computational Sciences, Maharaja Ranjit Singh Punjab Technical University, Bathinda, Punjab, India
Munish Kumar
Tatras Data Services Pvt. Ltd., Mohali, Punjab, India
Bavalpreet Singh & Jaskaran Singh

Corresponding author

Correspondence to Navdeep Singh.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Singh, N., Kumar, M., Singh, B. et al. DeepSpacy-NER: an efficient deep learning model for named entity recognition for Punjabi language. Evolving Systems 14, 673–683 (2023). https://doi.org/10.1007/s12530-022-09453-1

Received: 15 April 2022
Accepted: 07 July 2022
Published: 03 August 2022
Issue Date: August 2023
DOI: https://doi.org/10.1007/s12530-022-09453-1
