Enhancing Interaction with Social Networking Sites for Visually Impaired People by Using Textual and Visual Question Answering

  • Conference paper
Computational Intelligence, Communications, and Business Analytics (CICBA 2018)

Abstract

Question Answering (QA) is a widely studied problem in Natural Language Processing (NLP). This paper extends QA to cover both textual and visual input and integrates it with social networking sites (SNS) to improve their interaction with visually impaired people. In the proposed work, a text accompanied by an image is fed into a hybrid model that combines a CNN and an LSTM, and the answer with the highest probability is returned. Both questions and answers are open-ended visual and textual queries that target specific, diverse regions of an image as well as the fine details of a text. The resulting framework therefore requires a detailed understanding of the image, which is a more complex task than generating image captions alone. The model achieved better results than the other models it was compared with. Using this model, we enhanced interaction with SNS with greater efficiency.
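The final step described in the abstract, choosing the answer with the highest probability from combined image and text features, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the feature vectors stand in for the outputs of a CNN (image) and an LSTM (question), the element-wise product is just one common way to fuse them, and the names `softmax`, `pick_answer`, `weights`, and `answers` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_answer(image_features, question_features, weights, answers):
    """Fuse two feature vectors and return the most probable answer.

    image_features    -- stand-in for a CNN image encoding (assumed)
    question_features -- stand-in for an LSTM question encoding (assumed)
    weights           -- one row of classifier weights per candidate answer
    answers           -- candidate answer strings, aligned with weights
    """
    # Assumed fusion: element-wise product of the two encodings.
    fused = [i * q for i, q in zip(image_features, question_features)]
    # Linear classifier: one score per candidate answer.
    scores = [sum(w * f for w, f in zip(row, fused)) for row in weights]
    probs = softmax(scores)
    best = max(range(len(answers)), key=lambda k: probs[k])
    return answers[best], probs[best]


# Toy values chosen by hand; a real system would learn these.
image_features = [0.9, 0.1, 0.4]
question_features = [1.0, 0.5, 0.2]
weights = [
    [2.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]
answers = ["dog", "cat", "ball"]

label, prob = pick_answer(image_features, question_features, weights, answers)
print(label, round(prob, 3))
```

In a trained system the two encoders and the classifier weights would be learned jointly; the sketch only shows how a single answer is selected once the features exist.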

Corresponding author

Correspondence to Akshit Pradhan.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Pradhan, A., Shukla, P., Patra, P., Pathak, R., Jena, A.K. (2019). Enhancing Interaction with Social Networking Sites for Visually Impaired People by Using Textual and Visual Question Answering. In: Mandal, J., Mukhopadhyay, S., Dutta, P., Dasgupta, K. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2018. Communications in Computer and Information Science, vol 1031. Springer, Singapore. https://doi.org/10.1007/978-981-13-8581-0_1

  • DOI: https://doi.org/10.1007/978-981-13-8581-0_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-8580-3

  • Online ISBN: 978-981-13-8581-0

  • eBook Packages: Computer Science, Computer Science (R0)
