Abstract
During live events, organizers often want to deploy audience engagement systems to analyze people's behaviour, perform user profiling, or adapt the show according to participant feedback. Such systems usually rely on computer vision algorithms whose performance is severely affected by constraints such as illumination and camera position. In this paper we present a fully automatic audience engagement system, optimized for live music events with rapidly changing illumination conditions. The system uses a multi-modal approach that combines wireless-based person detection with computer vision algorithms for pose and face analysis. We show that this hybrid approach, while running in real time, performs better than standard approaches that employ only computer vision techniques. The system has been tested both in a laboratory environment and in a concert hall, and it will be deployed in distributed live events.
Notes
1. Indoor person localization hybrid system in live events: https://bit.ly/3cYmvz2
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Sanz-Narrillos, M., Masneri, S., Zorrilla, M. (2021). A Multi-modal Audience Engagement Measurement System. In: Rocha, A.P., Steels, L., van den Herik, J. (eds) Agents and Artificial Intelligence. ICAART 2020. Lecture Notes in Computer Science, vol 12613. Springer, Cham. https://doi.org/10.1007/978-3-030-71158-0_17
DOI: https://doi.org/10.1007/978-3-030-71158-0_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71157-3
Online ISBN: 978-3-030-71158-0
eBook Packages: Computer Science, Computer Science (R0)