Artificial intelligence snapchat: Visual conversation agent

Arsovski, Sasa; Cheok, Adrian David; Govindarajoo, Kirthana; Salehuddin, Nurizzaty; Vedadi, Somaiyeh

doi:10.1007/s10489-019-01621-2

Published: 26 February 2020

Volume 50, pages 2040–2049, (2020)

Applied Intelligence

Sasa Arsovski ORCID: orcid.org/0000-0001-5981-9473¹,
Adrian David Cheok¹,
Kirthana Govindarajoo¹,
Nurizzaty Salehuddin¹ &
Somaiyeh Vedadi¹

Abstract

Visual conversation is a dialog in which the parties exchange visual information. The key novelty of this paper is an artificial intelligence-driven method for automating visual conversation. We present a state-of-the-art Artificial Intelligence Snapchat Visual Conversation Agent (AISVCA). AISVCA uses the proposed method to caption a received image and to generate an appropriate visual response. These functions are achieved by combining a Convolutional Neural Network (CNN), a Long Short-Term Memory (LSTM) neural network, and Latent Semantic Indexing (LSI). The CNN and LSTM create image captions, and LSI assesses the semantic similarity between captions generated from a personalized image dataset and captions extracted from the content of the received image. We show that AISVCA, using the proposed method, can generate a visual response that is difficult to distinguish from a human visual response. To evaluate the proposed approach, we measured the accuracy of the proposed system and conducted a user study to test communication quality. In the user study, we analyzed the source credibility and interpersonal attraction of AISVCA. The results showed no significant differences in communication quality between a visual conversation with AISVCA and a visual conversation with a human agent.
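The caption-matching step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the captions already exist (the CNN and LSTM captioner is not reproduced), and it uses a made-up personalized caption set, a hypothetical stopword list, and plain term counts. To run on the standard library alone, it obtains the truncated SVD behind LSI by power iteration on the caption Gram matrix. The names `CaptionIndex`, `bag_of_words`, `top_eigenpairs`, and the rank `k` are illustrative choices.

```python
import math

# Assumed stopword list; the abstract does not describe any preprocessing.
STOPWORDS = {"a", "an", "the", "of", "on", "in", "with", "and"}


def bag_of_words(caption):
    """Map a caption to {term: count}, ignoring case and stopwords."""
    counts = {}
    for word in caption.lower().split():
        if word not in STOPWORDS:
            counts[word] = counts.get(word, 0) + 1
    return counts


def overlap(bag_a, bag_b):
    """Dot product of two sparse term-count vectors."""
    return sum(n * bag_b.get(term, 0) for term, n in bag_a.items())


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    norm = math.sqrt(dot(u, u) * dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def top_eigenpairs(gram, k, iters=500):
    """Top-k eigenpairs of a symmetric PSD matrix: power iteration plus deflation."""
    n = len(gram)
    g = [row[:] for row in gram]
    pairs = []
    for _ in range(k):
        v = [float(i + 1) for i in range(n)]  # deterministic, non-uniform start
        for _ in range(iters):
            w = [dot(row, v) for row in g]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                raise ValueError("k exceeds the rank of the caption matrix")
            v = [x / norm for x in w]
        lam = dot(v, [dot(row, v) for row in g])
        pairs.append((lam, v))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                g[i][j] -= lam * v[i] * v[j]
    return pairs


class CaptionIndex:
    """Rank-k LSI space built from the captions of a personalized image set."""

    def __init__(self, captions, k):
        self.bags = [bag_of_words(c) for c in captions]
        # Gram matrix A^T A of the term-by-caption count matrix A.
        gram = [[float(overlap(a, b)) for b in self.bags] for a in self.bags]
        # Eigenpairs (sigma_i^2, v_i) of A^T A give the right singular vectors of A.
        self.pairs = top_eigenpairs(gram, k)
        # Caption j has latent coordinates (v_1[j], ..., v_k[j]).
        self.doc_coords = [[v[j] for _, v in self.pairs]
                           for j in range(len(captions))]

    def fold_in(self, caption):
        """Project a new caption: q_hat = Sigma^-2 V^T (A^T q)."""
        bag = bag_of_words(caption)
        overlaps = [float(overlap(bag, b)) for b in self.bags]
        return [dot(v, overlaps) / lam for lam, v in self.pairs]

    def similarities(self, caption):
        q = self.fold_in(caption)
        return [cosine(q, d) for d in self.doc_coords]


# Hypothetical captions of images in the personalized dataset.
personal_captions = [
    "a dog running on the beach",
    "a dog playing with a ball on the beach",
    "a plate of pizza on a table",
    "a cup of coffee on a table",
]
index = CaptionIndex(personal_captions, k=3)

# Hypothetical caption generated for the received image.
sims = index.similarities("a pizza on a plate")
best = max(range(len(sims)), key=sims.__getitem__)
print(best, personal_captions[best])  # selects index 2, the pizza caption
```

The image whose caption scores highest would be returned as the visual response. In practice a linear-algebra library would replace the hand-rolled eigen-solver, and the rank `k` would be tuned on the actual caption collection.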

Author information

Authors and Affiliations

Imagineering Institute, Iskandar Puteri, Malaysia
Sasa Arsovski, Adrian David Cheok, Kirthana Govindarajoo, Nurizzaty Salehuddin & Somaiyeh Vedadi

Corresponding author

Correspondence to Sasa Arsovski.

About this article

Cite this article

Arsovski, S., Cheok, A.D., Govindarajoo, K. et al. Artificial intelligence snapchat: Visual conversation agent. Appl Intell 50, 2040–2049 (2020). https://doi.org/10.1007/s10489-019-01621-2

Issue Date: July 2020
