
DeepInsight: a CNN-based approach for machine reading comprehension in query answering systems and its applications

  • 1203: Applications of Advanced Artificial Intelligence in Multimedia and Information Security
Multimedia Tools and Applications

Abstract

Understanding and reading an unstructured text to answer queries posed on it, also called Machine Reading Comprehension (MRC), has been a hot research topic worldwide over the past few years. The problem has been tackled in many ways, each with its own strengths and pitfalls. MRC receives close attention because of its many applications, which are a pressing need given the ever-increasing volume of data in the modern world; however, no prior work has focused on the applications of MRC themselves. We propose DeepInsight, an efficient CNN-based machine reading comprehension model inspired by QANet, together with an assemblage of three case studies. Each case study describes a specific application of DeepInsight, along with the details of its implementation, the results of its analysis, and its shortcomings, so that developers can refer to it when building similar applications. The first case study is a fully implemented, end-to-end Android application that uploads documents to a server and accepts queries on them. The second is a mobile application that helps people with visual impairment comprehend documents. The third is a video query application that answers questions posed on video data, using a deep captioning model as its core. DeepInsight achieves an EM/F1 score of 77.0/86.3, outperforming present state-of-the-art models while keeping the inference time at 0.535 seconds, which is justifiable for real-world applications.
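
For readers unfamiliar with the EM/F1 figures quoted above, the following is a minimal sketch of how SQuAD-style Exact Match and token-level F1 are typically computed for extractive answers. It illustrates the standard metric definitions only; the function names and example strings below are illustrative and do not come from the authors' evaluation code.

import re
import string
from collections import Counter

def normalize_answer(s):
    """Lowercase, drop punctuation and articles, collapse whitespace (SQuAD convention)."""
    s = "".join(ch for ch in s.lower() if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(prediction, ground_truth):
    """EM is 1.0 when the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(ground_truth))

def f1_score(prediction, ground_truth):
    """Token-level F1 between a predicted span and a reference answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: a predicted span compared against a reference answer.
print(exact_match("the Eiffel Tower", "Eiffel Tower"))              # 1.0 after normalization
print(round(f1_score("Eiffel Tower in Paris", "Eiffel Tower"), 2))  # 0.67

Per-question scores are averaged over the dataset (taking the maximum over the available reference answers for each question), which is how aggregate numbers such as 77.0 EM / 86.3 F1 are obtained.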

Author information

Corresponding author

Correspondence to Venkanna U.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest and that no human or animal testing or participation was involved in this research.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shukla, A., Chourasia, K., Jain, G. et al. DeepInsight: a CNN-based approach for machine reading comprehension in query answering systems and its applications. Multimed Tools Appl 83, 3313–3333 (2024). https://doi.org/10.1007/s11042-023-17732-5
