Abstract
Smart homes are increasingly needed to provide a comfortable lifestyle for the elderly and to ease the burden on future caretakers. An important component of such systems is the recognition of human activities and scenarios. As wireless technologies advance, they offer a low-cost, non-intrusive, and privacy-conscious solution to activity recognition. In more complex environments, however, scenarios must be identified from subtle cues such as eye gaze. These situations call for a complementary vision-based solution, and we present a robust scenario recognition system that follows the objects observed along eye-gaze trajectories. In this paper, we propose a probabilistic hierarchical model for scenario recognition based on environmental elements such as the objects in the scene. We exploit the fact that any scenario can be decomposed recursively into constituent tasks and activities, down to the level of atomic actions and objects. Viewed bottom-up, the scenario recognition problem can therefore be solved hierarchically by identifying the objects seen and combining them into coarse-grained, higher-level activities. Recognizing complete scenarios solely on the basis of the objects seen is a novel contribution. We performed experiments on the standard Georgia Tech Egocentric Activities (GTEA-Gaze) dataset and on unconstrained videos collected “in the wild”, and trained an Artificial Neural Network that achieves a precision of 73.84% and an accuracy of 92.27%.
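To make the bottom-up idea concrete, the following minimal sketch (not the authors' implementation) shows how objects observed along a gaze trajectory could be combined probabilistically into activity scores and then into scenario scores; all object names, labels, and probabilities are hypothetical placeholders.

# Minimal sketch of bottom-up hierarchical scenario recognition from
# objects seen along a gaze trajectory. Labels and probabilities are
# hypothetical and would in practice be estimated from annotated video.

from collections import defaultdict

# Hypothetical conditional likelihoods P(object | activity) and
# P(activity | scenario).
P_OBJ_GIVEN_ACT = {
    "make_tea":   {"kettle": 0.5, "cup": 0.3, "teabag": 0.2},
    "make_toast": {"bread": 0.5, "toaster": 0.4, "knife": 0.1},
}
P_ACT_GIVEN_SCN = {
    "prepare_breakfast": {"make_tea": 0.5, "make_toast": 0.5},
    "clean_kitchen":     {"make_tea": 0.1, "make_toast": 0.1},
}

def activity_scores(objects):
    """Accumulate a simple additive evidence score for each activity
    from the objects observed along the gaze trajectory."""
    scores = defaultdict(float)
    for act, obj_probs in P_OBJ_GIVEN_ACT.items():
        for obj in objects:
            scores[act] += obj_probs.get(obj, 1e-3)  # smooth unseen objects
    return scores

def scenario_scores(objects):
    """Aggregate activity scores one level up to coarse-grained scenarios."""
    acts = activity_scores(objects)
    scores = defaultdict(float)
    for scn, act_probs in P_ACT_GIVEN_SCN.items():
        for act, s in acts.items():
            scores[scn] += act_probs.get(act, 1e-3) * s
    return scores

if __name__ == "__main__":
    gaze_objects = ["kettle", "cup", "bread"]  # objects fixated in order
    best = max(scenario_scores(gaze_objects).items(), key=lambda kv: kv[1])
    print(best)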
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Noor, S., Uddin, V. & Jabeen, De. Probabilistic Hierarchical Model Using First Person Vision for Scenario Recognition. Wireless Pers Commun 106, 2179–2193 (2019). https://doi.org/10.1007/s11277-018-5933-9