Abstract
In affective computing, researchers have sought to improve the performance of models and algorithms by exploiting the complementarity of multimodal information. However, new modalities have emerged faster than suitable datasets, making it difficult for data resources to keep pace with advances in modal sensing technology, and the collection and analysis of multimodal data remain intricate, labor-intensive tasks. To address the challenge of partially missing data in the research community, we curate a novel homogeneous multimodal gesture emotion recognition dataset, augmenting existing datasets through careful analysis. This dataset not only fills the gaps in homogeneous multimodal data but also opens new avenues for emotion recognition research. We further propose a pseudo dual-flow network built on this dataset, demonstrating its potential application in the affective computing community. Experimental results confirm the feasibility of visual emotion recognition using both traditional visual information and spiking visual information derived from homogeneous multimodal data.
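The pseudo dual-flow design pairs a conventional frame stream with a spiking (event) stream. As a rough, hypothetical sketch of how such a two-branch model could be wired up, the PyTorch code below runs a small CNN over RGB frames alongside a leaky integrate-and-fire branch over accumulated event frames and fuses the two by concatenation; all layer sizes, the neuron model, and the late-fusion scheme are illustrative assumptions, not the architecture described in the paper.

# Hypothetical sketch of a pseudo dual-flow network: one branch for
# conventional frames, one for spiking-style event frames. Sizes,
# neuron model, and fusion are assumptions for illustration only.
import torch
import torch.nn as nn

class FrameBranch(nn.Module):
    """Conventional CNN over RGB frames."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, out_dim),
        )

    def forward(self, x):          # x: (B, 3, H, W)
        return self.net(x)

class EventBranch(nn.Module):
    """Simplified spiking branch: a leaky integrate-and-fire layer
    unrolled over T event frames (2 polarity channels)."""
    def __init__(self, out_dim=128, decay=0.5, threshold=1.0):
        super().__init__()
        self.conv = nn.Conv2d(2, 32, 3, stride=2, padding=1)
        self.fc = nn.Linear(32, out_dim)
        self.decay, self.threshold = decay, threshold

    def forward(self, x):          # x: (B, T, 2, H, W)
        mem, rate = 0.0, 0.0
        for t in range(x.size(1)):
            mem = self.decay * mem + self.conv(x[:, t])
            # Hard threshold; training a real SNN would use a
            # surrogate gradient through this step.
            spikes = (mem >= self.threshold).float()
            mem = mem - spikes * self.threshold    # soft reset
            rate = rate + spikes
        feat = rate.mean(dim=(2, 3)) / x.size(1)   # per-channel firing rate
        return self.fc(feat)

class PseudoDualFlowNet(nn.Module):
    def __init__(self, n_classes=7):
        super().__init__()
        self.frames, self.events = FrameBranch(), EventBranch()
        self.head = nn.Linear(256, n_classes)      # late fusion by concat

    def forward(self, frames, events):
        z = torch.cat([self.frames(frames), self.events(events)], dim=1)
        return self.head(z)

if __name__ == "__main__":
    net = PseudoDualFlowNet()
    logits = net(torch.randn(4, 3, 64, 64), torch.rand(4, 8, 2, 64, 64))
    print(logits.shape)  # torch.Size([4, 7])

Late fusion by concatenation is chosen here only for simplicity; intermediate feature exchange between the two flows would be an equally plausible reading of a "dual-flow" design.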
Cite this paper
Wang, B., Liang, X. (2024). Incorporating Spiking Neural Network for Dynamic Vision Emotion Analysis. In: Liu, Q., et al. (eds.) Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol. 14437. Springer, Singapore. https://doi.org/10.1007/978-981-99-8558-6_29