Abstract
We address noise-robust “auditory scene understanding” for a robot, defined as extracting 6W (What, When, Where, Who, Why, hoW) information about the surrounding environment. Although such robots have been studied in the field of robot audition, only the first four Ws have been in scope, leaving “why” and “how” unaddressed. This paper therefore focuses on extracting “how” information, in particular from cooking scenes, with the aim of realizing a cooking support robot. In this setting, “how” information is regarded as a cooking procedure, and we construct sound-based cooking procedure recognition from two models. One is a conventional statistical model, the Gaussian Mixture Model (GMM), which serves as an acoustic model to recognize cooking sound events such as stirring and cutting. The other is a Hierarchical Hidden Markov Model (HHMM), which serves as a recipe model to recognize a sequence of cooking events, i.e., a cooking procedure. We constructed a prototype system for cooking recipe and procedure recognition. Preliminary results showed that the proposed GMM-HHMM based system outperformed a conventional GMM-HMM based system in noise robustness for cooking recipe recognition. In cooking procedure recognition, the system was also able to correct misrecognized cooking sound events by using the recipe model.
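The idea that a recipe model can correct misrecognized sound events can be illustrated with a small sketch. The code below is not the authors' system: it swaps the HHMM for a flat left-to-right HMM decoded with Viterbi, and it replaces GMM scores with hard-coded per-frame likelihoods. The event "frying", the three-step recipe, and every probability are assumptions made up for illustration.

```python
import math

# Hypothetical event set: the abstract names stirring and cutting;
# "frying" is added here only for illustration.
EVENTS = ["cutting", "stirring", "frying"]


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(frame_loglik, log_init, log_trans):
    """Most likely event sequence given per-frame event log-likelihoods."""
    n = len(EVENTS)
    score = [log_init[s] + frame_loglik[0][s] for s in range(n)]
    backpointers = []
    for obs in frame_loglik[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            # Best predecessor of state s under the recipe transitions.
            best = max(range(n), key=lambda p: prev[p] + log_trans[p][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + obs[s])
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return [EVENTS[s] for s in path]


# Assumed toy recipe: cutting -> stirring -> frying, never going back.
LOG_INIT = [safe_log(p) for p in (1.0, 0.0, 0.0)]
LOG_TRANS = [
    [safe_log(p) for p in (0.7, 0.3, 0.0)],
    [safe_log(p) for p in (0.0, 0.7, 0.3)],
    [safe_log(p) for p in (0.0, 0.0, 1.0)],
]

# Made-up stand-ins for acoustic-model likelihoods, one row per frame.
# Frame 3 is "noisy": cutting scores highest in the middle of stirring.
FRAME_LIK = [
    (0.8, 0.1, 0.1),
    (0.7, 0.2, 0.1),
    (0.1, 0.8, 0.1),
    (0.5, 0.4, 0.1),
    (0.1, 0.7, 0.2),
    (0.1, 0.2, 0.7),
    (0.1, 0.1, 0.8),
]
FRAME_LOGLIK = [[safe_log(p) for p in row] for row in FRAME_LIK]

# Frame-wise decision without any recipe model.
framewise = [EVENTS[max(range(3), key=lambda s: row[s])] for row in FRAME_LIK]
# Decision constrained by the recipe model.
decoded = viterbi(FRAME_LOGLIK, LOG_INIT, LOG_TRANS)

print("frame-wise:", framewise)
print("with recipe:", decoded)
```

In this toy setup the frame-wise decision labels the noisy fourth frame as cutting, while the recipe-constrained decoding relabels it as stirring because the assumed recipe never returns to cutting. A hierarchical model would add further levels (for example, recipes above procedures above events), which this flat sketch does not attempt.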
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kojima, R., Sugiyama, O., Nakadai, K. (2015). Scene Understanding Based on Sound and Text Information for a Cooking Support Robot. In: Ali, M., Kwon, Y., Lee, CH., Kim, J., Kim, Y. (eds) Current Approaches in Applied Artificial Intelligence. IEA/AIE 2015. Lecture Notes in Computer Science, vol 9101. Springer, Cham. https://doi.org/10.1007/978-3-319-19066-2_64
DOI: https://doi.org/10.1007/978-3-319-19066-2_64
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19065-5
Online ISBN: 978-3-319-19066-2
eBook Packages: Computer Science, Computer Science (R0)