Abstract
SmartWeb aims to provide intuitive multimodal access to a rich selection of Web-based information services. We report on the current prototype, which offers a smartphone client interface to the Semantic Web. An advanced ontology-based representation of facts and media structures serves as the central description for rich media content. The underlying content is accessed through conventional web service middleware, which connects the ontological knowledge base with an intelligent web service composition module for external web services. This module translates between ordinary XML-based data structures and explicit semantic representations of user queries and system responses. The presentation module renders the media content and the results generated by the services, and it provides a detailed description of the content and its layout to the fusion module. The user can then employ multiple modalities, such as speech and gestures, to interact with the presented multimedia material.
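The translation step between XML-based data structures and explicit semantic representations can be illustrated with a minimal sketch. It is not the SmartWeb implementation: the sample payload, the `ex:` prefix, and the function name `xml_to_triples` are assumptions made for illustration, and the output is a plain list of subject-predicate-object tuples rather than a full ontology instance.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payload standing in for an external web service response.
SAMPLE_XML = """
<forecast id="f1">
  <city>Berlin</city>
  <condition>sunny</condition>
</forecast>
"""


def xml_to_triples(xml_text, prefix="ex:"):
    """Flatten a one-level XML record into (subject, predicate, object) triples.

    The prefix and the naming scheme are illustrative assumptions only.
    """
    root = ET.fromstring(xml_text.strip())
    # Use the record's id attribute as subject, falling back to the tag name.
    subject = prefix + root.attrib.get("id", root.tag)
    # First triple states the type of the record, derived from the root tag.
    triples = [(subject, "rdf:type", prefix + root.tag.capitalize())]
    # Each child element becomes one property of the subject.
    for child in root:
        triples.append((subject, prefix + child.tag, (child.text or "").strip()))
    return triples


triples = xml_to_triples(SAMPLE_XML)
for triple in triples:
    print(triple)
```

A real system would map element names onto concepts and relations of a shared ontology instead of reusing the tag names directly; the sketch only shows the direction of the mapping.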
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Sonntag, D. et al. (2007). SmartWeb Handheld — Multimodal Interaction with Ontological Knowledge Bases and Semantic Web Services. In: Huang, T.S., Nijholt, A., Pantic, M., Pentland, A. (eds) Artifical Intelligence for Human Computing. Lecture Notes in Computer Science, vol 4451. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72348-6_14
DOI: https://doi.org/10.1007/978-3-540-72348-6_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72346-2
Online ISBN: 978-3-540-72348-6
eBook Packages: Computer Science (R0)