Human Centered Computing in Digital Persona Generation

  • Conference paper
Multimedia Technology and Enhanced Learning (ICMTEL 2021)

Abstract

Deepfake technology (which we term Digital Persona) has become widely used to create synthetic media in which a person in an existing image or video is replaced with someone who is not present in that media. The term refers to manipulated videos, or other digital representations produced by sophisticated artificial intelligence (AI), that yield fabricated images and sounds appearing to be real.

Deepfakes have generally been used to defame someone, a setting in which user experience is of little concern. Our work, however, demonstrates the use of this technique for a beneficial purpose. We created a digital persona of a renowned deceased artist with the aim of delivering an enriching human experience: visitors converse with the persona projected on a 3D holographic stage in a museum, and the persona responds, in the voice of the deceased artist, to questions about his artistic journey and artwork. To ensure that the audience would be immersed in, or awed by, the resulting digital persona, we adopted the human-centered computing methodology, which aims to radically change standard software development practice. In this work, the key elements of human-centered computing are: (a) technology, (b) cognitive psychology and ergonomics, (c) social and organizational psychology, (d) design and arts, (e) interaction, and (f) analysis, for the design of systems with a human focus from beginning to end. We present the usage, details, and outcomes of these focus areas in our design of deepfakes for good. We also present the results of a social experiment conducted with children during their interaction with the digital persona.
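The interaction loop the abstract describes (a visitor asks a question; the persona answers in the artist's cloned voice on the holographic stage) can be sketched minimally as follows. Every name, intent, and file path below is an illustrative assumption, not the authors' implementation; a deployed system would use a natural-language-understanding service for intent matching rather than keyword lookup.

```python
# Hypothetical sketch of the visitor-question -> persona-response loop.
# Intents, answer texts, and media file names are invented for illustration.

from dataclasses import dataclass


@dataclass
class Response:
    text: str        # answer spoken by the persona
    audio_clip: str  # pre-synthesised, voice-cloned audio file
    video_clip: str  # deepfake video segment shown on the holographic stage


# A small intent -> response table. Keyword matching stands in here for a
# real intent classifier.
INTENTS = {
    "art journey": Response(
        "My journey as an artist began long ago ...",
        "journey.wav", "journey.mp4"),
    "artwork": Response(
        "This piece explores dreams and memory ...",
        "artwork.wav", "artwork.mp4"),
}

FALLBACK = Response(
    "Could you ask me about my art journey or my artwork?",
    "fallback.wav", "fallback.mp4")


def answer(question: str) -> Response:
    """Match a visitor's question to a canned, voice-cloned response."""
    q = question.lower()
    for keyword, response in INTENTS.items():
        if keyword in q:
            return response
    return FALLBACK
```

Pre-rendering the voice-cloned audio and deepfake video for each intent, as assumed here, keeps on-stage latency low at the cost of restricting the persona to a fixed answer set.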



Author information


Corresponding author

Correspondence to Nisha Ramachandra.


Copyright information

© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper


Cite this paper

Ramachandra, N., Ahuja, M., Rao, R.M., Dubash, N. (2021). Human Centered Computing in Digital Persona Generation. In: Fu, W., Xu, Y., Wang, S.-H., Zhang, Y. (eds.) Multimedia Technology and Enhanced Learning. ICMTEL 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol. 388. Springer, Cham. https://doi.org/10.1007/978-3-030-82565-2_32

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-82565-2_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-82564-5

  • Online ISBN: 978-3-030-82565-2

  • eBook Packages: Computer Science, Computer Science (R0)
