A Information Retrieval Based on Question and Answering and NER for Unstructured Information Without Using SQL

Wireless Personal Communications

Abstract

Unstructured data is now available in abundance, most often in the form of natural language text. For a defense establishment, intelligence data or other sensitive information is most useful when it can be extracted efficiently and easily. The proposed model applies wherever the influx of text-heavy unstructured data is high, such as information from the World Wide Web, documents related to a particular domain, or any other source of natural language text. The proposed Natural Language Information Interpretation and Representation System (NLIIRS) accepts information as natural language text, processes it, and lets the user retrieve information by posing questions in natural language. NLIIRS responds to these questions with factoid or phrase-based answers. Compared with conventional question answering systems, NLIIRS combines the advantages of named entity recognition (NER) and an answer search technique based on sequential pattern matching. This avoids the use of structured query language (SQL) at the back end for information processing, storage and extraction: user queries need not be converted to SQL statements, and the unstructured text need not be stored in relational tables. With this approach in our novel text processing algorithm, pattern matching and answer extraction become more concise and faster after every execution step. The whole system has been built on the natural language toolkit of Stanford University, which was used to generate part-of-speech tags, tokenize the data, and form tree structures. The novel text processing algorithm uses the lemmatizer, stemmer and ne_chunker to prepare the text for information retrieval via Q&A. An advantage of this system is that it does not need training. It enables users to retrieve any information of their choice from the available unstructured information.
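As a rough illustration of the idea described in the abstract, the following sketch answers factoid questions directly over raw text, with no SQL and no relational tables. It is not the NLIIRS algorithm: the gazetteer, the question-word mapping, the crude suffix stripper and the keyword-overlap scoring are all assumptions introduced here, standing in for a real NER component, a lemmatizer or stemmer, and the paper's sequential pattern matching.

```python
import re

# Assumption: question word -> entity type expected in the answer.
QUESTION_TO_ENTITY = {"who": "PERSON", "where": "LOCATION", "when": "DATE"}

# Assumption: toy gazetteers standing in for a real named entity recognizer.
GAZETTEER = {
    "PERSON": {"Alice", "Bob"},
    "LOCATION": {"Paris", "Delhi"},
}
DATE_PATTERN = re.compile(r"(?:19|20)\d{2}")  # four-digit years only

STOPWORDS = {"who", "where", "when", "did", "does", "do", "is", "was",
             "the", "a", "an", "in", "to", "of"}


def tokenize(text):
    """Split text into alphanumeric tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9]+", text)


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def normalize(token):
    """Lowercase and strip one common suffix (a crude stand-in for stemming)."""
    token = token.lower()
    for suffix in ("ing", "ed", "es", "s", "e"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[:-len(suffix)]
    return token


def tag_entities(sentence):
    """Return (token, entity_type) pairs found in the sentence."""
    entities = []
    for token in tokenize(sentence):
        for label, names in GAZETTEER.items():
            if token in names:
                entities.append((token, label))
        if DATE_PATTERN.fullmatch(token):
            entities.append((token, "DATE"))
    return entities


def answer(question, text):
    """Pick the entity of the expected type from the best-matching sentence."""
    q_tokens = tokenize(question)
    if not q_tokens:
        return None
    wanted = QUESTION_TO_ENTITY.get(q_tokens[0].lower())
    if wanted is None:
        return None
    keywords = {normalize(t) for t in q_tokens if t.lower() not in STOPWORDS}
    best, best_score = None, 0
    for sentence in split_sentences(text):
        words = {normalize(t) for t in tokenize(sentence)}
        score = len(keywords & words)  # keyword overlap with the question
        candidates = [tok for tok, label in tag_entities(sentence)
                      if label == wanted and normalize(tok) not in keywords]
        if candidates and score > best_score:
            best, best_score = candidates[0], score
    return best


corpus = "Alice moved to Paris in 2015. Bob visited Delhi in 2019."
print(answer("Where did Alice move?", corpus))
print(answer("Who visited Delhi?", corpus))
print(answer("When did Bob visit Delhi?", corpus))
```

The question word selects the entity type to look for, and the sentence sharing the most normalized keywords with the question supplies the answer, so the text is queried in place rather than being loaded into tables first.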

Acknowledgements

This publication is an outcome of R&D work undertaken in a project under the Visvesvaraya PhD Scheme of the Ministry of Electronics & Information Technology (MeitY), Government of India, implemented by Digital India Corporation. The research was carried out at the Research Project Lab of the National Institute of Technology (NIT), Durgapur, India. Financial support was received from the Visvesvaraya PhD Scheme, DeitY, Govt. of India (Order Number: PHD-MLA/4 (29)/2014_2015 Dated- 27/4/2015). The authors would like to thank the Department of Computer Science and Engineering, NIT, Durgapur, for academically supporting this research work. The authors would also like to thank the Department of Computer Science and Engineering, Jaypee University of Engineering & Technology, Guna, MP.

Author information

Corresponding author

Correspondence to Partha Sarathy Banerjee.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Banerjee, P.S., Chakraborty, B., Tripathi, D. et al. A Information Retrieval Based on Question and Answering and NER for Unstructured Information Without Using SQL. Wireless Pers Commun 108, 1909–1931 (2019). https://doi.org/10.1007/s11277-019-06501-z

  • DOI: https://doi.org/10.1007/s11277-019-06501-z
