Digital document analytics using logistic regressive and deep transition-based dependency parsing

The Journal of Supercomputing

Abstract

The selection of text features is a fundamental task that plays an important role in digital document analysis. Conventional methods of text feature extraction necessitate indigenous features. Obtaining an efficient feature is an extensive process, and producing a new, real-time representation of features in text data is challenging. Deep learning is making inroads in digital document mining. A significant distinction between deep learning and traditional methods is that deep learning learns the features of a digital document automatically. In this paper, logistic regression and deep dependency parsing (LR-DDP) methods are proposed. The logistic regression token generation model generates robust tokens by means of Napierian grammar. Using these robust tokens, a deep transition-based dependency parser based on duplex long short-term memory is designed. Experimental results demonstrate that the dependency parser achieves comparable performance to existing methods in terms of digital document parsing accuracy, parsing time and overhead. Hence, the proposed methods are found to be computationally efficient and accurate.
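
For readers unfamiliar with the term, the general idea behind transition-based dependency parsing can be sketched as follows. This is a minimal illustration of the standard arc-standard transition system, not the LR-DDP parser itself. The function name `arc_standard_parse`, the three-token example, and the use of gold head indices as an oracle (standing in for a learned classifier such as the long short-term memory model mentioned in the abstract) are all assumptions made for illustration.

```python
from typing import List, Tuple


def arc_standard_parse(heads: List[int]) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Replay an arc-standard derivation for a projective tree.

    heads[i] is the head of token i + 1 (tokens are 1-indexed, 0 is ROOT).
    Returns the action sequence and the (head, dependent) arcs it builds.
    Hypothetical helper: a real parser would predict each action with a
    trained model instead of reading it off the gold heads.
    """
    n = len(heads)
    head = {i + 1: h for i, h in enumerate(heads)}
    # pending[h] counts dependents of h that have not been attached yet.
    pending = {i: 0 for i in range(n + 1)}
    for h in heads:
        pending[h] += 1

    stack = [0]                      # starts with ROOT only
    buffer = list(range(1, n + 1))   # tokens still to be read
    actions: List[str] = []
    arcs: List[Tuple[int, int]] = []

    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            # LEFT-ARC: the top of the stack is the head of the item below it.
            if below != 0 and head[below] == top and pending[below] == 0:
                arcs.append((top, below))
                pending[top] -= 1
                stack.pop(-2)
                actions.append("LEFT-ARC")
                continue
            # RIGHT-ARC: the item below is the head of the top, and the top
            # has already collected all of its own dependents.
            if head[top] == below and pending[top] == 0:
                arcs.append((below, top))
                pending[below] -= 1
                stack.pop()
                actions.append("RIGHT-ARC")
                continue
        if not buffer:
            raise ValueError("tree is non-projective or the heads are invalid")
        # SHIFT: move the next token from the buffer onto the stack.
        stack.append(buffer.pop(0))
        actions.append("SHIFT")
    return actions, arcs


# Example: "She reads documents" with "reads" as the root.
actions, arcs = arc_standard_parse([2, 0, 2])
print(actions)
print(arcs)
```

In a learned parser, the gold-head lookups above would be replaced by a classifier that scores the three actions from features of the current stack and buffer.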

Acknowledgements

We thank Springer Nature Services for linguistic editing.

Funding

The authors received no specific funding for this study.

Author information

Corresponding author

Correspondence to D. Rekha.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest to report regarding the present study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rekha, D., Sangeetha, J. & Ramaswamy, V. Digital document analytics using logistic regressive and deep transition-based dependency parsing. J Supercomput 78, 2580–2596 (2022). https://doi.org/10.1007/s11227-021-03973-4

  • DOI: https://doi.org/10.1007/s11227-021-03973-4
