DOI: 10.1145/3463944.3469269
research-article

Multimodal Virtual Avatars for Investigative Interviews with Children

Published: 21 August 2021

Abstract

In this article, we present our ongoing work on training police officers who conduct investigative interviews with abused children. The objectives of these interviews are to protect vulnerable children from abuse, facilitate the prosecution of offenders, and ensure that innocent adults are not accused of criminal acts. There is therefore a need for more data that can be used to improve interviewer training and equip police with the skills to conduct high-quality interviews. To support this task, we propose to research a training program that combines system components and multimodal data from the field of artificial intelligence, such as chatbots, generation of visual content, text-to-speech, and speech-to-text. This program will be able to generate an almost unlimited amount of interview and training data. The goal of combining these technologies and data types is to create an immersive, interactive child avatar that responds realistically. The avatar is intended to support the training of police interviewers, and it can also produce synthetic data of interview situations that can be used to address other problems in the same domain.

Cited By

  • (2025) Providing feedback in simulated investigative interviews with adult witness avatars increases the use of free recall and open questions. International Journal of Police Science & Management. DOI: 10.1177/14613557241310014. Online publication date: 24-Jan-2025
  • (2024) Editorial: Technological solutions helping to train specialists' interviewing skills of possible victims and witnesses. Frontiers in Psychology 15. DOI: 10.3389/fpsyg.2024.1406867. Online publication date: 2-May-2024
  • (2024) Using an AI-based avatar for interviewer training at Children’s Advocacy Centers: Proof of Concept. Child Maltreatment. DOI: 10.1177/10775595241263017. Online publication date: 18-Jun-2024

        Published In

        ICDAR '21: Proceedings of the 2021 ACM Workshop on Intelligent Cross-Data Analysis and Retrieval
        August 2021
        72 pages
        ISBN:9781450385299
        DOI:10.1145/3463944

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. artificial intelligence
        2. chatbots
        3. generated video
        4. interview training
        5. speech-to-text-to-speech
        6. virtual avatar

        Qualifiers

        • Research-article

        Funding Sources

        • Norwegian Research Council

        Conference

        ICMR '21

        Citations

        Cited By

        • (2024) A Serious Game with Avatar Suspects Can Be Used to Train Naive Participants in the Strategic Use of Evidence. Journal of Forensic Psychology Research and Practice (1-26). DOI: 10.1080/24732850.2023.2299492. Online publication date: 4-Jan-2024
        • (2023) Enhancing investigative interview training using a child avatar system: a comparative study of interactive environments. Scientific Reports 13:1. DOI: 10.1038/s41598-023-47368-2. Online publication date: 21-Nov-2023
        • (2023) A field assessment of child abuse investigators' engagement with a child-avatar to develop interviewing skills. Child Abuse & Neglect 143 (106324). DOI: 10.1016/j.chiabu.2023.106324. Online publication date: Sep-2023
        • (2023) The Use and Productivity of Visual Aids as Retrieval Support in Police Interviews of Preschool-Aged Victims of Abuse. Journal of Police and Criminal Psychology 39:2 (289-302). DOI: 10.1007/s11896-023-09627-w. Online publication date: 15-Dec-2023
        • (2022) Breakout Rooms Serve as a Suitable Tool for Interprofessional Pre-Service Online Training among Students within Health, Social, and Education Study Programs. Education Sciences 12:12 (871). DOI: 10.3390/educsci12120871. Online publication date: 28-Nov-2022
        • (2022) Synthesizing a Talking Child Avatar to Train Interviewers Working with Maltreated Children. Big Data and Cognitive Computing 6:2 (62). DOI: 10.3390/bdcc6020062. Online publication date: 1-Jun-2022
        • (2022) Human vs. GPT-3: The challenges of extracting emotions from child responses. 2022 14th International Conference on Quality of Multimedia Experience (QoMEX) (1-4). DOI: 10.1109/QoMEX55416.2022.9900885. Online publication date: 5-Sep-2022
