Abstract
In this paper, we propose MIMF, an adaptive multimodal fusion network driven by the mutual information between the input data and the target recognition pattern. Owing to varying weather and road conditions, real scenes can be far more complicated than those in the training dataset, which poses a non-negligible challenge for multimodal fusion models that follow fixed fusion schemes, especially in autonomous driving. To address this problem, we leverage mutual information, which measures the dependence between the input and the target output, for adaptive modality selection during fusion. We design an MI-based weight-fusion module and integrate it into our feature-fusion lane line segmentation network. We evaluate the method on the KITTI and A2D2 datasets, where we simulate extreme sensor malfunctions such as modality loss. The results demonstrate the benefit of our method in practical applications and inform future research on multimodal fusion.
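The paper text here does not include code, so the following PyTorch-style sketch is an illustration only of the general idea: per-modality feature maps (e.g. camera and LiDAR) are scored, the scores are normalised into fusion weights, and the weighted maps are summed. The class name MIWeightedFusion and the small per-modality score heads are hypothetical stand-ins for the learned MI-based measure described in the abstract (cf. Belghazi et al., mutual information neural estimation); this is not the authors' implementation.

    # Illustrative sketch (not the authors' code): adaptively weighted fusion
    # of per-modality feature maps. The score heads below are hypothetical
    # stand-ins for a learned mutual-information estimator.
    import torch
    import torch.nn as nn

    class MIWeightedFusion(nn.Module):
        def __init__(self, channels, num_modalities=2):
            super().__init__()
            # One lightweight score head per modality; in the paper this role
            # is played by an MI-based measure between modality and target.
            self.score_heads = nn.ModuleList([
                nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, 1))
                for _ in range(num_modalities)
            ])

        def forward(self, features):
            # features: list of (B, C, H, W) tensors, one per modality
            scores = torch.cat([head(f) for head, f in zip(self.score_heads, features)], dim=1)
            weights = torch.softmax(scores, dim=1)    # (B, M), adaptive per sample
            stacked = torch.stack(features, dim=1)    # (B, M, C, H, W)
            return (weights[:, :, None, None, None] * stacked).sum(dim=1)

    # Example: fuse camera and LiDAR feature maps of the same shape.
    cam, lidar = torch.randn(1, 64, 32, 96), torch.randn(1, 64, 32, 96)
    fused = MIWeightedFusion(channels=64)([cam, lidar])   # shape (1, 64, 32, 96)

Under this scheme, a degraded or missing modality would receive a low weight at inference time, which is the behaviour the modality-loss experiments on KITTI and A2D2 are meant to probe.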
References
Feng, D., et al.: Deep multi-modal object detection and semantic segmentation for autonomous driving: datasets, methods, and challenges. IEEE Trans. Intell. Transport. Syst. (2019)
Caltagirone, L., Bellone, M., Svensson, L., Wahde, M.: LIDAR-camera fusion for road detection using fully convolutional neural networks. Robot. Autonom. Syst. 111, 125–131 (2019)
Bijelic, M., et al.: Seeing through fog without seeing fog: deep multimodal sensor fusion in unseen adverse weather. In: CVPR (2020)
Carballo, A., et al.: LIBRE: the multiple 3D LiDAR dataset. arXiv preprint arXiv:2003.06129 (2020)
Vora, S., Lang, A., Helou, B., Beijbom, O.: PointPainting: sequential fusion for 3D object detection. In: CVPR (2020)
Qi, C.R., Liu, W., Wu, C., Su, H., Guibas, L.: Frustum PointNets for 3D object detection from RGB-D data. In: CVPR (2018)
Su, Y., Gao, Y., Zhang, Y., Álvarez, J.M., Yang, J., Kong, H.: An illumination-invariant nonparametric model for urban road detection. IEEE Trans. Intell. Vehicles 4, 14–23 (2019)
Yang, B., Liang, M., Urtasun, R.: HDNET: exploiting HD maps for 3D object detection. In: CoRL (2018)
Kim, J., Koh, J., Kim, Y., Choi, J., Hwang, Y., Choi, J.W.: Robust deep multi-modal learning based on gated information fusion network. In: ACCV (2018)
Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 (1948)
Gabrié, M., et al.: Entropy and mutual information in models of deep neural networks. In: NeurIPS (2018)
Belghazi, M.I., et al.: Mutual information neural estimation. In: ICML (2018)
Hjelm, R.D., Fedorov, A., Lavoie-Marchildon, S., Grewal, K., Trischler, A., Bengio, Y.: Learning deep representations by mutual information estimation and maximization. In: ICLR (2019)
Bramon, R., et al.: Multimodal data fusion based on mutual information. IEEE Trans. Visual. Comput. Graph. 18, 1574–1587 (2012)
Yousef, A., Iftekharuddin, K.: Shoreline extraction from the fusion of LiDAR DEM data and aerial images using mutual information and genetic algorithms. In: IJCNN (2014)
Pan, X., Shi, J., Luo, P., Wang, X., Tang, X.: Spatial as deep: spatial CNN for traffic scene understanding. In: AAAI (2018)
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: CVPR (2012)
Geyer, J., et al.: A2D2: Audi autonomous driving dataset. arXiv preprint arXiv:2004.06320 (2020)
Acknowledgements
This work was supported by the National High Technology Research and Development Program of China under Grant No. 2018YFE0204300, the Beijing Science and Technology Plan Project No. Z191100007419008, the Guoqiang Research Institute Project No. 2019GQG1010, and the National Natural Science Foundation of China under Grant No. U1964203.
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
Cite this paper
Zou, Z., Zhao, L., Zhang, X., Li, Z., Jin, D., Luo, T. (2021). MIMF: Mutual Information-Driven Multimodal Fusion. In: Sun, F., Liu, H., Fang, B. (eds) Cognitive Systems and Signal Processing. ICCSIP 2020. Communications in Computer and Information Science, vol 1397. Springer, Singapore. https://doi.org/10.1007/978-981-16-2336-3_13
Print ISBN: 978-981-16-2335-6
Online ISBN: 978-981-16-2336-3