Abstract
Sign language (SL) recognition modules in human-computer interaction systems need to be both fast and reliable. In cases where multiple sets of features are extracted from the SL data, the recognition system can speed up processing by taking only a subset of the extracted features as its input. However, this speed-up should not come at the expense of recognition accuracy. By training different recognizers for different subsets of features, we can formulate the problem as the task of planning the sequence of recognizer actions to apply to SL data, while accounting for the trade-off between recognition speed and accuracy. Partially observable Markov decision processes (POMDPs) provide a principled mathematical framework for such planning problems. A POMDP explicitly models the probabilities of observing various outputs from the individual recognizers and thus maintains a probability distribution (or belief) over the set of possible SL input sentences. It then computes a policy that maps every belief to an action. This allows the system to select actions in real time during online policy execution, adapting its behaviour according to the observations encountered. We illustrate the POMDP approach with a simple sentence recognition problem and show experimentally the advantages of this approach over “fixed action” systems that do not adapt their behaviour in real time.
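The belief maintenance described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the signed sentence stays fixed while recognizers are applied, so the belief update reduces to Bayes' rule with no state transition step. The function name `update_belief`, the recognizer names `fast` and `slow`, the sentence labels, and all probabilities are hypothetical values chosen for illustration.

```python
from typing import Dict

# Hypothetical observation model: obs_model[action][sentence][output] is the
# probability that recognizer `action` reports `output` when the true signed
# sentence is `sentence`. All numbers are made up for illustration.
ObsModel = Dict[str, Dict[str, Dict[str, float]]]

def update_belief(belief: Dict[str, float], obs_model: ObsModel,
                  action: str, observation: str) -> Dict[str, float]:
    """Bayes update of the belief over sentences after one recognizer output.

    Assumes the hidden sentence does not change between recognizer actions.
    """
    unnormalized = {
        s: p * obs_model[action][s].get(observation, 0.0)
        for s, p in belief.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return {s: p / total for s, p in unnormalized.items()}

# Two candidate sentences and two recognizers: a cheap, less accurate one
# and a costlier, more accurate one (hypothetical numbers).
obs_model: ObsModel = {
    "fast": {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.3, "B": 0.7}},
    "slow": {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}},
}

prior = {"A": 0.5, "B": 0.5}
after_fast = update_belief(prior, obs_model, "fast", "A")
after_slow = update_belief(after_fast, obs_model, "slow", "A")
```

A policy in this setting would take a belief such as `after_fast` and return the next action, for example running the slower recognizer or committing to a sentence. Computing such a policy requires a POMDP solver and is outside the scope of this sketch.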
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ong, S.C.W., Hsu, D., Lee, W.S., Kurniawati, H. (2009). Partially Observable Markov Decision Process (POMDP) Technologies for Sign Language Based Human-Computer Interaction. In: Stephanidis, C. (eds) Universal Access in Human-Computer Interaction. Applications and Services. UAHCI 2009. Lecture Notes in Computer Science, vol 5616. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02713-0_61
DOI: https://doi.org/10.1007/978-3-642-02713-0_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02712-3
Online ISBN: 978-3-642-02713-0
eBook Packages: Computer Science, Computer Science (R0)