Abstract
Selecting text features is a fundamental task in digital document analysis. Conventional text feature extraction methods require hand-engineered features. Obtaining effective features is a lengthy process, and producing new feature representations of text data in real time remains challenging. Deep learning is making inroads in digital document mining. A significant distinction between deep learning and traditional methods is that deep learning learns the features of a digital document automatically. In this paper, we propose a logistic regression and deep dependency parsing (LR-DDP) method. The logistic regression token generation model generates robust tokens by means of Napierian grammar. From these tokens, a deep transition-based dependency parser using duplex long short-term memory is built. Experimental results show that our dependency parser achieves performance comparable to existing methods in digital document parsing accuracy, parsing time and overhead. We therefore find the proposed methods to be computationally efficient and accurate.
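To make the term "transition-based dependency parsing" concrete, the following is a minimal sketch of a generic arc-standard transition system. It is not the LR-DDP parser described in this paper: the function name `apply_transitions`, the action labels and the hand-written transition sequence are illustrative assumptions. In a learned parser, a classifier (here, presumably the duplex long short-term memory network) would choose each transition instead.

```python
from typing import List, Tuple

# Illustrative action labels (assumed names, not taken from the paper).
SHIFT, LEFT_ARC, RIGHT_ARC = "SHIFT", "LEFT-ARC", "RIGHT-ARC"


def apply_transitions(n_words: int, transitions: List[str]) -> List[Tuple[int, int]]:
    """Run arc-standard transitions and return (head, dependent) arcs.

    Words are numbered 1..n_words; index 0 is an artificial ROOT node.
    """
    stack = [0]                             # starts with ROOT only
    buffer = list(range(1, n_words + 1))    # words still to be read
    arcs: List[Tuple[int, int]] = []

    for action in transitions:
        if action == SHIFT:
            if not buffer:
                raise ValueError("SHIFT requires a non-empty buffer")
            stack.append(buffer.pop(0))
        elif action == LEFT_ARC:
            # Top of stack becomes head of the item below it; ROOT cannot be a dependent.
            if len(stack) < 2 or stack[-2] == 0:
                raise ValueError("LEFT-ARC requires two stack items, second not ROOT")
            head = stack[-1]
            dependent = stack.pop(-2)
            arcs.append((head, dependent))
        elif action == RIGHT_ARC:
            # Item below the top becomes head of the top item, which is removed.
            if len(stack) < 2:
                raise ValueError("RIGHT-ARC requires two stack items")
            dependent = stack.pop()
            head = stack[-1]
            arcs.append((head, dependent))
        else:
            raise ValueError(f"unknown transition: {action}")
    return arcs


# "She reads books": 1=She, 2=reads, 3=books; the sequence is written by hand.
example_arcs = apply_transitions(
    3, [SHIFT, SHIFT, LEFT_ARC, SHIFT, RIGHT_ARC, RIGHT_ARC]
)
print(example_arcs)
```

Each arc is a (head, dependent) pair of word indices, so the example attaches "She" and "books" to "reads" and "reads" to ROOT.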






Acknowledgements
We thank Springer Nature Services for linguistic editing.
Funding
The authors received no specific funding for this study.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest to report regarding the present study.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Rekha, D., Sangeetha, J. & Ramaswamy, V. Digital document analytics using logistic regressive and deep transition-based dependency parsing. J Supercomput 78, 2580–2596 (2022). https://doi.org/10.1007/s11227-021-03973-4
DOI: https://doi.org/10.1007/s11227-021-03973-4