
An improved open-view human action recognition with unsupervised domain adaptation

Published in: Multimedia Tools and Applications

Abstract

One of the primary concerns in open-view human action recognition (HAR) is the large difference between the data distributions of the target and source views. This difference gives rise to the data-shift problem, which in turn degrades the performance of the system. The problem stems from the fact that real-world applications involve unconstrained conditions, such as differences in camera resolution, field of view, and non-uniform illumination, that are not present in constrained datasets. The primary goal of this paper is to improve open-view HAR by proposing an unsupervised domain adaptation approach. In particular, we demonstrate that the balanced weighted unified discriminant and distribution alignment (BW-UDDA) method can handle datasets with significant differences across views, such as the MCAD dataset. Using the MCAD dataset under two types of cross-view evaluation, our proposed technique outperformed other unsupervised domain adaptation methods with average accuracies of 13.38% and 61.45%. Additionally, we applied our method to the constrained multi-view IXMAS dataset and achieved an average accuracy of 90.91%. These results confirm the superiority of the proposed technique.
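To make the idea of unsupervised domain adaptation across views concrete, the sketch below shows a generic kernel-MMD projection that maps source-view and target-view features into a shared subspace before classification. This is an illustrative sketch only, not the paper's BW-UDDA method: the function `mmd_projection` and its parameters are hypothetical placeholders, and it implements a plain transfer-component-analysis-style marginal alignment rather than the balanced weighted discriminant and distribution alignment proposed here.

```python
# Illustrative sketch of MMD-based unsupervised domain adaptation.
# NOT the paper's BW-UDDA; a generic TCA-style projection for intuition only.
import numpy as np
import scipy.linalg


def mmd_projection(Xs, Xt, dim=30, lam=1.0):
    """Learn a shared subspace that reduces the maximum mean discrepancy (MMD)
    between source features Xs (ns x d) and target features Xt (nt x d)."""
    X = np.vstack([Xs, Xt]).T                          # d x n feature matrix
    X = X / np.linalg.norm(X, axis=0, keepdims=True)   # unit-norm each sample
    ns, nt = Xs.shape[0], Xt.shape[0]
    n = ns + nt

    # MMD coefficient matrix: penalizes the distance between domain means.
    e = np.vstack([np.ones((ns, 1)) / ns, -np.ones((nt, 1)) / nt])
    M = e @ e.T

    # Centering matrix: keeps data variance in the learned subspace.
    H = np.eye(n) - np.ones((n, n)) / n

    # Linear kernel; an RBF kernel could be substituted here.
    K = X.T @ X

    # Minimize MMD (plus regularization) subject to a variance constraint:
    # smallest generalized eigenvectors of (K M K + lam I) w = mu (K H K) w.
    A = K @ M @ K + lam * np.eye(n)
    B = K @ H @ K + 1e-6 * np.eye(n)   # small ridge keeps B positive definite
    vals, vecs = scipy.linalg.eigh(A, B)               # ascending eigenvalues
    W = vecs[:, :dim]

    Z = K @ W                                          # n x dim embedded data
    return Z[:ns], Z[ns:]
```

In a cross-view setting, one would extract the same features from both camera views, project them with a routine like this, train a standard classifier (e.g. nearest-neighbour or SVM) on the projected source features, and evaluate it on the projected target features.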





Funding

The authors thank the Ministry of Education Malaysia and Universiti Teknologi Malaysia (UTM) for their support under the Fundamental Research Scheme, grant number R.J130000.7851.5F179.

Author information


Corresponding author

Correspondence to M. S. Rizal Samsudin.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Samsudin, M.S.R., Abu-Bakar, S.A.R. & Mokji, M.M. An improved open-view human action recognition with unsupervised domain adaptation. Multimed Tools Appl 81, 28479–28507 (2022). https://doi.org/10.1007/s11042-022-12822-2


