Customer behavior classification using surveillance camera for marketing

Multimedia Tools and Applications

Abstract

Analyzing customer behavior from surveillance cameras is one of the most important open topics in marketing. Traditionally, retailers analyze customers' buying behavior using cash register or credit card records. However, these records cannot reveal the behavior of a customer who shows interest in front of a merchandise shelf but does not buy. Such behavior can be recorded and analyzed with a surveillance camera. We propose a system that classifies customer behavior in front of the shelf into six categories reflecting increasing interest in the products: no interest, viewing, turning body to shelf, touching, picking and returning to shelf, and picking and putting into basket. The proposed system integrates multiple cues for customer behavior recognition: head orientation, body orientation, and arm action. It discretizes the customer's head and body orientation into 8 directions to estimate whether the customer is looking at or turning toward the merchandise shelf. A semi-supervised learning method is applied to optimize the training dataset and to generate an accurate classifier. In addition, a temporal constraint and a human physical model constraint are considered in the joint estimation of body and head orientation. For arm action recognition, a novel Combined Hand Feature (CHF), which includes the hand trajectory, the tracking status, and the relative position between the hand and the shopping basket, is proposed to classify different arm actions. Hand tracking is performed by an improved particle filter. The CHF is classified by a Dynamic Bayesian Network (DBN) to output different types of arm actions. A series of experiments demonstrates the effectiveness of the proposed techniques and the performance of the developed system.
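
As an illustration of the 8-direction discretization and the six interest levels described in the abstract, a minimal Python sketch follows. It is not the authors' implementation: the bin layout (bins centred on 0°, 45°, ..., 315°), the one-bin tolerance for "facing the shelf", the arm action labels (`touch`, `pick_and_return`, `pick_to_basket`), and the simple rule-based fusion are all assumptions made for this example. The paper itself estimates orientation with a learned classifier and recognizes arm actions with a DBN over the CHF.

```python
NUM_BINS = 8                    # the abstract states 8 discrete directions
BIN_WIDTH = 360.0 / NUM_BINS    # 45 degrees per bin (assumed uniform layout)

# The six behaviors listed in the abstract, in order of increasing interest.
BEHAVIORS = (
    "no interest",
    "viewing",
    "turning body to shelf",
    "touching",
    "picking and returning to shelf",
    "picking and putting into basket",
)

# Hypothetical arm action labels; the paper's actual label set is not given here.
ARM_TO_BEHAVIOR = {
    "touch": BEHAVIORS[3],
    "pick_and_return": BEHAVIORS[4],
    "pick_to_basket": BEHAVIORS[5],
}


def discretize_orientation(angle_deg: float) -> int:
    """Map an angle in degrees to a bin index 0..7, bins centred on 0, 45, ..., 315."""
    shifted = (angle_deg % 360.0) + BIN_WIDTH / 2.0
    return int(shifted // BIN_WIDTH) % NUM_BINS


def faces_shelf(bin_index: int, shelf_bin: int, tolerance: int = 1) -> bool:
    """Return True if the bin lies within `tolerance` bins of the shelf direction.

    The distance is circular, so bin 7 is adjacent to bin 0.
    The tolerance value is an assumption for this sketch.
    """
    diff = abs(bin_index - shelf_bin) % NUM_BINS
    return min(diff, NUM_BINS - diff) <= tolerance


def classify_behavior(head_deg: float, body_deg: float,
                      arm_action: str, shelf_deg: float) -> str:
    """Fuse the three cues with simple priority rules (assumed, not from the paper)."""
    # Arm actions indicate the strongest interest, so they take priority.
    if arm_action in ARM_TO_BEHAVIOR:
        return ARM_TO_BEHAVIOR[arm_action]
    shelf_bin = discretize_orientation(shelf_deg)
    if faces_shelf(discretize_orientation(body_deg), shelf_bin):
        return BEHAVIORS[2]
    if faces_shelf(discretize_orientation(head_deg), shelf_bin):
        return BEHAVIORS[1]
    return BEHAVIORS[0]
```

For example, under these assumptions `classify_behavior(head_deg=10, body_deg=180, arm_action="none", shelf_deg=0)` yields "viewing", because only the head falls within one bin of the shelf direction.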

Acknowledgments

The authors thank Envirosell Japan, Inc. for providing the test video. The customers' faces are blurred to protect their privacy. This research was permitted by the Compliance Committee of The University of Tokyo.

Author information

Corresponding author

Correspondence to Jingwen Liu.

About this article

Cite this article

Liu, J., Gu, Y. & Kamijo, S. Customer behavior classification using surveillance camera for marketing. Multimed Tools Appl 76, 6595–6622 (2017). https://doi.org/10.1007/s11042-016-3342-1

  • DOI: https://doi.org/10.1007/s11042-016-3342-1
