Abstract
A deepfake (or, as we call it, a digital persona) is synthetic media in which a person in an existing image or video is replaced with someone who was never present in it. The term refers to manipulated videos and other digital representations, produced by sophisticated artificial intelligence (AI), that yield fabricated images and sounds which appear real.
Deepfakes have generally been used to defame people, a setting in which user experience is of little concern. Our work, in contrast, demonstrates a beneficial use of the technique. We created a digital persona of a renowned deceased artist with the aim of providing an enriching human experience: visitors converse with the persona, projected on a 3D holographic stage in a museum, and it answers questions about the artist's journey and artwork in his own voice. To ensure that the audience would be immersed in, even awed by, the resulting persona, we adopted the human centered computing methodology, which aims at radically changing standard software development practice. In this work, the key elements of human centered computing are: (a) technology, (b) cognitive psychology and ergonomics, (c) social and organizational psychology, (d) design and arts, (e) interaction, and (f) analysis for the design of systems with a human focus from beginning to end. We present the usage, details, and outcomes of these focus areas in our design of deepfakes for good, together with the results of a social experiment conducted with children during their interaction with the digital persona.
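The conversational loop the abstract describes — a visitor asks a question, the system matches it to curated knowledge about the artist, and the answer is rendered in the artist's cloned voice — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `KNOWLEDGE_BASE`, `answer_question`, and `speak` are hypothetical stand-ins, and the paper's actual system relies on a language-understanding service for intent detection and a neural TTS model for voice cloning.

```python
# Minimal sketch of the visitor-question pipeline (illustrative only).
from difflib import get_close_matches

# Hypothetical knowledge base: curated answers about the artist's journey.
KNOWLEDGE_BASE = {
    "when did you start painting": "I first picked up a brush as a child.",
    "what inspired your artwork": "Dreams and the landscapes of my youth.",
    "what is your most famous work": "Visitors most often ask about my murals.",
}

def answer_question(question: str) -> str:
    """Match a visitor's question to the closest curated answer."""
    key = question.lower().strip("?! .")
    matches = get_close_matches(key, KNOWLEDGE_BASE, n=1, cutoff=0.6)
    if matches:
        return KNOWLEDGE_BASE[matches[0]]
    return "I am not sure, but ask me about my art journey."

def speak(text: str) -> bytes:
    """Placeholder for cloned-voice TTS; a real system would condition a
    neural vocoder on an embedding of the artist's voice."""
    return text.encode("utf-8")  # stand-in for synthesized audio

if __name__ == "__main__":
    reply = answer_question("What inspired your artwork?")
    audio = speak(reply)
    print(reply)
```

In the deployed system, the fuzzy string match above would be replaced by a trained intent classifier, and `speak` by the voice-cloning pipeline, but the request–resolve–synthesize structure is the same.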
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Cite this paper
Ramachandra, N., Ahuja, M., Rao, R.M., Dubash, N. (2021). Human Centered Computing in Digital Persona Generation. In: Fu, W., Xu, Y., Wang, SH., Zhang, Y. (eds) Multimedia Technology and Enhanced Learning. ICMTEL 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 388. Springer, Cham. https://doi.org/10.1007/978-3-030-82565-2_32
Print ISBN: 978-3-030-82564-5
Online ISBN: 978-3-030-82565-2