Abstract
This paper proposes a new technique for clustering-based human action representation built on optical flow analysis and the random sample consensus (RANSAC) method. The apparent motion of the human subject with respect to the background is detected and localized using optical flow analysis. The action is then characterized through the frequent movement of the optical flow points, or interest points, in different regions of the moving subject. The RANSAC algorithm filters out unwanted interest points scattered around the scene, keeping only those related to the subject's motion. From the remaining salient interest points, the area of the human body within the frame is estimated, and the rectangular region surrounding the body is segmented both horizontally and vertically. The percentage of change of interest points in each horizontal and vertical segment is then estimated from frame to frame. Similar results are obtained for different persons performing the same action, and the corresponding values are averaged for the respective segments. The matrix constructed by this strategy serves as a feature vector for that particular action. Similar data are computed for each block created at the intersections of the horizontal and vertical segments. In addition, the change in the position of the person along the X- and Y-axes is accumulated over an action and included in the feature vectors. For recognition using the extracted feature vectors, a distance-based similarity measure and a support vector machine (SVM)-based classifier are exploited, and several combinations of the feature vectors are examined. Extensive experimentation on benchmark motion databases shows that the proposed method offers not only a very high degree of accuracy but also considerable computational savings.
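The filtering and feature-extraction steps summarized above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a constant-translation motion model for RANSAC and uses hypothetical helper names (`ransac_consensus`, `segment_percentages`); the paper itself operates on optical flow points extracted from video frames.

```python
import numpy as np

def ransac_consensus(displacements, n_iter=200, thresh=0.5, seed=0):
    """RANSAC with a constant-translation model: sample one displacement
    as the model, count displacements agreeing with it (inliers), and
    keep the largest consensus set."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(displacements), dtype=bool)
    for _ in range(n_iter):
        model = displacements[rng.integers(len(displacements))]
        inliers = np.linalg.norm(displacements - model, axis=1) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return best

def segment_percentages(points, n_segments=3):
    """Split the bounding box of the kept interest points into vertical
    and horizontal bands; return the percentage of points per band."""
    x, y = points[:, 0], points[:, 1]
    col = np.histogram(x, bins=n_segments, range=(x.min(), x.max()))[0]
    row = np.histogram(y, bins=n_segments, range=(y.min(), y.max()))[0]
    return 100.0 * col / len(points), 100.0 * row / len(points)

# Synthetic demo: 40 subject points sharing one motion, 10 background outliers.
rng = np.random.default_rng(1)
subject = np.tile([2.0, 0.0], (40, 1)) + rng.normal(0, 0.05, (40, 2))
outliers = rng.uniform(-5, 5, (10, 2))
disp = np.vstack([subject, outliers])

inliers = ransac_consensus(disp)          # consensus set = the moving subject
pts = rng.normal(0, 1, (100, 2))          # stand-in interest-point positions
col_pct, row_pct = segment_percentages(pts)
```

Tracking `col_pct` and `row_pct` frame to frame, and averaging over several subjects performing the same action, would yield the kind of segment-wise feature matrix the abstract describes.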




Cite this article
Mahbub, U., Imtiaz, H. & Ahad, M.A.R. Action recognition based on statistical analysis from clustered flow vectors. SIViP 8, 243–253 (2014). https://doi.org/10.1007/s11760-013-0533-3