VOX system: a semantic embodied conversational agent exploiting linked data

Multimedia Tools and Applications

Abstract

In the last few years, the use of ontologies has spread thanks to the emergence of the Semantic Web. They have become a crucial tool in information systems because they explicitly state the meaning of information, making it possible to share it and to achieve higher levels of interoperability. However, as knowledge representation models, ontologies can also benefit other fields. In particular, in the context of Embodied Conversational Agents (ECAs), they can be used to provide agents with semantic knowledge and, therefore, enhance their intellectual skills. In this paper, we propose an approach to explore the synergies between these technologies. To that end, we have developed a multimodal ECA that exploits the knowledge provided by the Linked Data initiative to help users in their information search tasks. Based on a semantic-guided keyword search, our approach is flexible enough to: 1) deal with different Linked Data repositories and 2) handle different search/knowledge domains in a multilingual way. To illustrate the potential of our approach, we have focused on the case of DBpedia, as it mirrors the information stored in Wikipedia, providing a semantic entry point to it.
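As a rough illustration of what a keyword search over a Linked Data repository can look like, the following minimal sketch builds a SPARQL 1.1 query that matches keywords against `rdfs:label` values in a given language. It is not the VOX implementation: the function names `escape_literal` and `build_keyword_query`, the plain substring matching, and the default language `"es"` are assumptions made only for this example, and no semantic guidance is applied.

```python
def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for use inside a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_keyword_query(keywords, lang="es", limit=10):
    """Build a SPARQL query returning resources whose label contains every keyword.

    Hypothetical helper: a plain label match, not the semantic-guided
    search described in the paper.
    """
    # One FILTER per keyword, so every keyword must occur in the label.
    filters = "\n".join(
        f'  FILTER(CONTAINS(LCASE(STR(?label)), "{escape_literal(k.lower())}"))'
        for k in keywords
    )
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?resource ?label WHERE {\n"
        "  ?resource rdfs:label ?label .\n"
        # Keep only labels tagged with the requested language.
        f'  FILTER(LANGMATCHES(LANG(?label), "{lang}"))\n'
        f"{filters}\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(build_keyword_query(["Goya", "pintor"], lang="es", limit=5))
```

The resulting string could be sent to any SPARQL endpoint, which is one way the same search front end might be pointed at different repositories; changing `lang` selects labels in another language.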

Notes

  1. http://schema.org

  2. http://www.wikipedia.org

  3. OWL Web Ontology Language, http://www.w3.org/TR/owl-primer/

  4. In fact, they might have several associated URLs, each of them corresponding to the article in different languages.

  5. RDF Resource Description Framework, http://www.w3.org/TR/rdf-primer/

  6. SKOS Simple Knowledge Organization System, http://www.w3.org/TR/skos-primer/

  7. The DBpedia Ontology, http://wiki.dbpedia.org/Ontology

  8. YAGO Ontology, http://www.mpi-inf.mpg.de/yago-naga/yago/

  9. Speech Recognition Grammar Specification, http://www.w3.org/TR/speech-grammar/

  10. We use an OWL version of SKOS to enable seamless reasoning over the SKOS taxonomy.

  11. We assume that the results can be cached because the data stored in DBpedia is quite stable over time: seven months elapsed between the releases of DBpedia 3.6 and 3.7, and a year between 3.7 and the latest version, 3.8.

  12. X-SAMPA Extended Speech Assessment Methods Phonetic Alphabet, http://en.wikipedia.org/wiki/X-SAMPA

  13. http://www.web3d.org/x3d/

  14. Loquendo ASR, http://www.loquendo.com/en/products/speech-recognition/

  15. Speech Recognition Grammar Specification, http://www.w3.org/TR/speech-grammar/

  16. Loquendo TTS, http://www.loquendo.com/en/products/text-to-speech/

  17. JFlex Fast Scanner Generator for Java, http://jflex.de

  18. CUP Parser Generator for Java, http://www2.cs.tum.edu/projects/cup/

  19. OWL API, http://owlapi.sourceforge.net/

  20. Jena API, http://jena.apache.org/

  21. Protégé, http://protege.stanford.edu/

  22. Virtuoso repository, http://virtuoso.openlinksw.com/

  23. SPARQL Query Language, http://www.w3.org/TR/rdf-sparql-query/ superseded by http://www.w3.org/TR/sparql11-overview/

  24. They are in Spanish, as the prototype is developed to work in this language.

Acknowledgements

This work has been partly financed by:

  • The Spanish “Dirección General de Investigación, Ministerio de Economía y Competitividad”, contract number: TIN2011-24660/REPLIKANTS.

  • The Spanish “Ministerio de Economía y Competitividad”, CICYT projects TIN2010-21387-C02-02 and TIN2013-46238-C4-4-R.

  • The Spanish “Ministerio de Industria, Energía y Turismo”, contract number: AVANZA TSI-020606-2012-4/CONTSEM.

  • European Commission: ALFA_GAVIOTA DCI-ALA/19.09.01/10/21526/245-654/ALFAIII (2010) 149.

  • European Commission: 519332-LLP-1-2011-1-PT-KA3-KA3NW/SEGAN.

  • The DGA (Aragonese Government), projects INNOVA-A1-064/13 and DGA-FSE.

We thank Guillermo Esteban, Daniel Martínez and Javier Marco Rubio for their collaboration, under contract, in the development of this project. We also thank Eduardo Mena for his contributions during the design and development of the SENED module.

Author information

Corresponding author

Correspondence to Carlos Bobed.

Additional information

Reasons for the project’s name

The video http://www.youtube.com/watch?v=4eouFz770I4 shows a Vox, an entity possessing a “compendium of all human knowledge”. This video clip is taken from The Time Machine (2002), directed by Simon Wells. The film was a co-production of DreamWorks and Warner Bros. in association with Arnold Leibovit Entertainment, who obtained the rights to George Pal's original Time Machine (1960).

About this article

Cite this article

Serón, F.J., Bobed, C. VOX system: a semantic embodied conversational agent exploiting linked data. Multimed Tools Appl 75, 381–404 (2016). https://doi.org/10.1007/s11042-014-2295-5

  • DOI: https://doi.org/10.1007/s11042-014-2295-5
