Abstract
Humans understand the external world through a variety of perceptual processes such as sight, sound, touch and smell. Simulating such biological multi-sensory fusion decisions with a computational model is important for both computer science and neuroscience research. Spiking Neural Networks (SNNs) mimic the neural dynamics of the brain and are expected to help reveal the biological mechanisms of multimodal perception. However, existing work on multimodal SNNs is still limited: most of it focuses only on audiovisual fusion and lacks a systematic comparison of model performance and robustness. In this paper, we propose a novel fusion module called Cross-modality Current Integration (CMCI) for multimodal SNNs and systematically compare it with other fusion methods on visual, auditory and olfactory fusion recognition tasks. In addition, a regularization technique called Modality-wise Dropout (ModDrop) is introduced to further improve the robustness of multimodal SNNs when modalities are missing. Experimental results show that our method is superior to the compared methods in both modality-complete and modality-missing conditions without any additional networks or parameters.
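The abstract does not say how ModDrop is applied in this work, so the following is only an illustrative sketch of the general idea of modality-wise dropout: during training, whole modalities are zeroed out at random for each sample so the network cannot rely on any single input stream. The function name `modality_dropout`, the dictionary-of-arrays input format, the drop probability `p_drop`, and the rule that every sample keeps at least one modality are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def modality_dropout(inputs, p_drop, rng, training=True):
    """Randomly zero out whole modalities per sample (hypothetical helper).

    inputs: dict mapping a modality name to an array of shape (batch, ...).
    p_drop: probability of dropping each modality independently.
    rng:    a numpy.random.Generator.
    """
    if not training or p_drop <= 0.0:
        # At evaluation time the inputs pass through unchanged.
        return dict(inputs)

    names = list(inputs)
    batch = next(iter(inputs.values())).shape[0]

    # keep[i, j] is True when sample i keeps modality j.
    keep = rng.random((batch, len(names))) >= p_drop

    # Assumption: a sample never loses every modality; if it would,
    # re-enable one modality chosen uniformly at random.
    empty_rows = np.flatnonzero(~keep.any(axis=1))
    keep[empty_rows, rng.integers(len(names), size=empty_rows.size)] = True

    out = {}
    for j, name in enumerate(names):
        x = inputs[name]
        # Reshape the per-sample mask so it broadcasts over feature axes.
        mask = keep[:, j].reshape((batch,) + (1,) * (x.ndim - 1))
        out[name] = x * mask
    return out


# Example with three modalities of different feature sizes.
rng = np.random.default_rng(0)
batch = {
    "visual": np.ones((4, 8)),
    "auditory": np.ones((4, 16)),
    "olfactory": np.ones((4, 6)),
}
dropped = modality_dropout(batch, p_drop=0.5, rng=rng)
```

Dropping per sample rather than per batch means a single mini-batch mixes complete and incomplete inputs, which is one plausible way to expose a model to missing-modality conditions during training.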
Supported by the National Key Research and Development Program of China under Grant 2020AAA0105900 and the National Natural Science Foundation of China under Grant 62236007.
Notes
1. The sixth location is excluded due to the missing data.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jiang, R., Han, J., Xue, Y., Wang, P., Tang, H. (2024). CMCI: A Robust Multimodal Fusion Method for Spiking Neural Networks. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14449. Springer, Singapore. https://doi.org/10.1007/978-981-99-8067-3_12
DOI: https://doi.org/10.1007/978-981-99-8067-3_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8066-6
Online ISBN: 978-981-99-8067-3
eBook Packages: Computer Science, Computer Science (R0)