Abstract
In this paper we describe a fully implemented system for speech and natural language control of 3D animation and computer games. The experimental framework emulates features of the popular DOOM™ computer game. It implements an integrated parser based on a linguistic formalism tailored to the specific natural language instructions required to control a player character. This parser outputs structured messages to the animation layer, which interprets them further to generate behaviours for the scene objects. We have found that interactive control significantly affects the behavioural interpretation of natural language semantics. Besides imposing stringent response-time requirements on the natural language processing step, it determines the level of autonomy that the animated character should possess, which in turn influences the generation of behavioural scripts from natural language instructions.
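The pipeline outlined in the abstract (instruction, structured message, behavioural script) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `ActionMessage` fields, the `VERBS` table, and the `parse_instruction` and `interpret` functions are hypothetical, and simple keyword matching stands in for the paper's grammar-based parser.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical verb lexicon mapping surface verbs to action names.
VERBS = {"open": "OPEN", "take": "PICK_UP", "shoot": "FIRE_AT", "go": "MOVE_TO"}
STOP_WORDS = {"the", "a", "an", "to", "at"}


@dataclass
class ActionMessage:
    """Hypothetical structured message sent from the parser to the animation layer."""
    action: str
    target: Optional[str] = None


def parse_instruction(text: str) -> Optional[ActionMessage]:
    """Toy stand-in for the parser: a verb followed by an optional target phrase."""
    words = [w for w in text.lower().split() if w not in STOP_WORDS]
    if not words or words[0] not in VERBS:
        return None  # instruction outside the toy sublanguage
    target = " ".join(words[1:]) or None
    return ActionMessage(action=VERBS[words[0]], target=target)


def interpret(message: ActionMessage) -> List[str]:
    """Toy stand-in for the animation layer: expand a message into a script.

    An autonomous character is assumed to approach the target by itself,
    so the user does not have to spell out the navigation step.
    """
    script = []
    if message.target is not None and message.action != "MOVE_TO":
        script.append(f"walk_to({message.target})")
    script.append(f"{message.action.lower()}({message.target})")
    return script


if __name__ == "__main__":
    msg = parse_instruction("open the red door")
    print(msg)
    print(interpret(msg))
```

In this sketch the extra `walk_to` step is where the character's assumed level of autonomy shows up: the more the character can do unprompted, the shorter the instruction the user needs to give.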
Cite this article
Cavazza, M., Palmer, I. Natural language control of interactive 3D animation and computer games. Virtual Reality 4, 85–102 (1999). https://doi.org/10.1007/BF01408588
DOI: https://doi.org/10.1007/BF01408588