CMCI: A Robust Multimodal Fusion Method for Spiking Neural Networks

Jiang, Runhao; Han, Jianing; Xue, Yingying; Wang, Ping; Tang, Huajin

doi:10.1007/978-981-99-8067-3_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14449)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Humans understand the external world through a variety of perceptual processes such as sight, sound, touch and smell. Simulating such biological multi-sensory fusion decisions with a computational model is important for both computer science and neuroscience research. Spiking Neural Networks (SNNs) mimic the neural dynamics of the brain and are therefore expected to help reveal the biological mechanisms of multimodal perception. However, existing work on multimodal SNNs is still limited: most of it focuses only on audiovisual fusion and lacks a systematic comparison of model performance and robustness. In this paper, we propose a novel fusion module called Cross-modality Current Integration (CMCI) for multimodal SNNs and systematically compare it with other fusion methods on visual, auditory and olfactory fusion recognition tasks. In addition, a regularization technique called Modality-wise Dropout (ModDrop) is introduced to further improve the robustness of multimodal SNNs when modalities are missing. Experimental results show that our method outperforms the compared fusion methods in both modality-complete and modality-missing conditions without any additional networks or parameters.
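The abstract does not spell out how modality-wise dropout is applied, so the following is only a minimal sketch of the general idea, not the authors' implementation: during training, the input of an entire modality is occasionally replaced by zeros so that the network learns not to depend on any single sense. The function name `modality_dropout`, the `drop_prob` parameter, the zero-masking scheme and the rule of always keeping at least one modality are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional

# Hypothetical container: one feature vector per modality name.
Features = Dict[str, List[float]]


def modality_dropout(
    features: Features,
    drop_prob: float,
    rng: Optional[random.Random] = None,
) -> Features:
    """Zero out whole modalities at random, always keeping at least one.

    Illustrative assumption only: the paper's exact ModDrop schedule
    and masking scheme are not given in the abstract.
    """
    if rng is None:
        rng = random.Random()
    names = list(features)
    # Decide independently for each modality whether to drop it.
    dropped = [name for name in names if rng.random() < drop_prob]
    if len(dropped) == len(names):
        # Never drop everything: restore one modality chosen at random.
        dropped.remove(rng.choice(names))
    return {
        name: [0.0] * len(vec) if name in dropped else list(vec)
        for name, vec in features.items()
    }


if __name__ == "__main__":
    sample = {
        "visual": [0.2, 0.9],
        "auditory": [0.5],
        "olfactory": [0.1, 0.3, 0.7],
    }
    # Which modalities are zeroed depends on the random draw.
    print(modality_dropout(sample, drop_prob=0.5, rng=random.Random(0)))
```

In a training loop, a function like this would be applied to each sample before fusion, while evaluation under a missing-modality condition would zero the chosen modality deterministically instead.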

Supported by the National Key Research and Development Program of China under Grant 2020AAA0105900 and the National Natural Science Foundation of China under Grant 62236007.

Notes

1. The sixth location is excluded due to the missing data.

Acknowledgments

This work was supported by the National Key Research and Development Program of China under Grant 2020AAA0105900 and the National Natural Science Foundation of China under Grant 62236007.

Author information

Authors and Affiliations

College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Runhao Jiang, Jianing Han & Huajin Tang
Biosensor National Special Laboratory Key Laboratory for Biomedical Engineering of Education Ministry, Department of Biomedical Engineering, Zhejiang University, Hangzhou, China
Yingying Xue & Ping Wang
Zhejiang Lab, Hangzhou, China
Huajin Tang

Corresponding author

Correspondence to Huajin Tang.

Editor information

Editors and Affiliations

Central South University, Changsha, China
Biao Luo
Chinese Academy of Sciences, Beijing, China
Long Cheng
Zhejiang University, Hangzhou, China
Zheng-Guang Wu
Guangdong University of Technology, Guangzhou, China
Hongyi Li
UNSW Sydney, Sydney, NSW, Australia
Chaojie Li

About this paper

Cite this paper

Jiang, R., Han, J., Xue, Y., Wang, P., Tang, H. (2024). CMCI: A Robust Multimodal Fusion Method for Spiking Neural Networks. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14449. Springer, Singapore. https://doi.org/10.1007/978-981-99-8067-3_12

DOI: https://doi.org/10.1007/978-981-99-8067-3_12
Published: 16 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8066-6
Online ISBN: 978-981-99-8067-3
eBook Packages: Computer Science, Computer Science (R0)
