Abstract
A deepfake (or, as we call it, a digital persona) is synthetic media in which a person in an existing image or video is replaced with someone who was never present in it. The term refers to manipulated videos and other digital representations, produced by sophisticated artificial intelligence (AI), that yield fabricated images and sounds which appear real.
Deepfakes have generally been used to defame people, a setting in which user experience is of little concern. Our work, in contrast, demonstrates a beneficial use of the technique. We created a digital persona of a renowned deceased artist with the aim of providing an enriching human experience: visitors converse with the persona, projected on a 3D holographic stage in a museum, and it answers questions about the artist's journey and artwork in his own voice. To ensure that the audience would be immersed in, even awed by, the resulting persona, we adopted the human centered computing methodology, which aims at radically changing standard software development practice. In this work, the key elements of human centered computing are: (a) technology, (b) cognitive psychology and ergonomics, (c) social and organizational psychology, (d) design and arts, (e) interaction, and (f) analysis for the design of systems with a human focus from beginning to end. We present the usage, details, and outcomes of these focus areas in our design of deepfakes for good, together with the results of a social experiment conducted with children during their interaction with the digital persona.
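The conversational loop the abstract describes — a visitor asks a question, the system matches it to curated knowledge about the artist, and the answer is rendered in the artist's cloned voice — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `KNOWLEDGE_BASE`, `answer_question`, and `speak` are hypothetical stand-ins, and the paper's actual system relies on a language-understanding service for intent detection and a neural TTS model for voice cloning.

```python
# Minimal sketch of the visitor-question pipeline (illustrative only).
from difflib import get_close_matches

# Hypothetical knowledge base: curated answers about the artist's journey.
KNOWLEDGE_BASE = {
    "when did you start painting": "I first picked up a brush as a child.",
    "what inspired your artwork": "Dreams and the landscapes of my youth.",
    "what is your most famous work": "Visitors most often ask about my murals.",
}

def answer_question(question: str) -> str:
    """Match a visitor's question to the closest curated answer."""
    key = question.lower().strip("?! .")
    matches = get_close_matches(key, KNOWLEDGE_BASE, n=1, cutoff=0.6)
    if matches:
        return KNOWLEDGE_BASE[matches[0]]
    return "I am not sure, but ask me about my art journey."

def speak(text: str) -> bytes:
    """Placeholder for cloned-voice TTS; a real system would condition a
    neural vocoder on an embedding of the artist's voice."""
    return text.encode("utf-8")  # stand-in for synthesized audio

if __name__ == "__main__":
    reply = answer_question("What inspired your artwork?")
    audio = speak(reply)
    print(reply)
```

In the deployed system, the fuzzy string match above would be replaced by a trained intent classifier, and `speak` by the voice-cloning pipeline, but the request–resolve–synthesize structure is the same.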
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Cite this paper
Ramachandra, N., Ahuja, M., Rao, R.M., Dubash, N. (2021). Human Centered Computing in Digital Persona Generation. In: Fu, W., Xu, Y., Wang, SH., Zhang, Y. (eds) Multimedia Technology and Enhanced Learning. ICMTEL 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 388. Springer, Cham. https://doi.org/10.1007/978-3-030-82565-2_32
Print ISBN: 978-3-030-82564-5
Online ISBN: 978-3-030-82565-2