Fusing multimodal information in multimedia data usually improves retrieval performance. One of the major issues in multimodal fusion is how to determine the best modalities. To combine the modalities more effectively, we propose a RELIEF-based modality weighting approach named RELIEF-MM. The original RELIEF algorithm is extended to address its weaknesses in several areas: class-specific feature selection, multi-labeled and noisy data, unbalanced datasets, and use with classifier predictions. RELIEF-MM employs an improved weight estimation function that exploits the representation and reliability capabilities of modalities, as well as their discrimination capability, without any increase in computational complexity. Comprehensive experiments conducted on the TRECVID 2007, TRECVID 2008 and CCV datasets validate RELIEF-MM as an efficient, accurate and robust way of weighting modalities for multimedia data.
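For orientation, the following is a minimal sketch of the original two-class RELIEF weight update that RELIEF-MM builds on. It is not the RELIEF-MM estimator described in this paper. The function name `relief`, the Manhattan distance, and the assumption that features are scaled to [0, 1] are illustrative choices, and each class is assumed to contain at least two instances.

```python
import random


def relief(X, y, n_samples=None, seed=0):
    """Original two-class RELIEF (illustrative sketch, not RELIEF-MM).

    X: list of feature vectors with values assumed scaled to [0, 1].
    y: list of class labels; each class needs at least two instances.
    Returns one weight per feature; larger means more discriminative.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    m = n_samples or n
    w = [0.0] * d

    def dist(a, b):
        # Manhattan distance (an assumption; other metrics are possible).
        return sum(abs(p - q) for p, q in zip(a, b))

    for _ in range(m):
        i = rng.randrange(n)
        # Nearest hit: closest instance of the same class, excluding i itself.
        hit = min((j for j in range(n) if j != i and y[j] == y[i]),
                  key=lambda j: dist(X[i], X[j]))
        # Nearest miss: closest instance of a different class.
        miss = min((j for j in range(n) if y[j] != y[i]),
                   key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            # Reward features that differ on the miss, penalize those
            # that differ on the hit.
            w[f] += (abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])) / m
    return w
```

On a toy dataset in which the first feature separates the classes and the second is noise, this update gives the first feature the larger weight.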

This paper is a revised and extended version of [54].
The final goal of this study is to select the effective modalities by weighting the available modalities, where each modality is a multi-dimensional feature. Thus, from now on, the phrases ‘modality selection’, ‘modality weighting’ and ‘multimodal feature selection’ are used interchangeably.
This two-step process is applied for the TRECVID 2007 and 2008 datasets, where the number of modalities leads to inefficient situations. For the CCV dataset, an exhaustive weight search process is performed with a precision of 0.01.
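An exhaustive weight search at 0.01 precision can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the experiments: the weights are assumed to be non-negative and to sum to 1, and `weight_grid`, `exhaustive_search` and the `score_fn` callback (for example, a retrieval metric computed on validation data) are hypothetical names.

```python
from itertools import product


def weight_grid(n_modalities, step=0.01):
    """Yield every weight vector on the grid whose entries sum to 1.

    Weights are enumerated as integer multiples of `step` to avoid
    accumulating floating-point drift.
    """
    units = round(1 / step)  # 100 units for step 0.01
    # Pick units for the first n-1 modalities; the last gets the remainder.
    for head in product(range(units + 1), repeat=n_modalities - 1):
        rest = units - sum(head)
        if rest >= 0:
            yield tuple(u / units for u in head + (rest,))


def exhaustive_search(score_fn, n_modalities, step=0.01):
    """Return the grid weight vector that maximizes score_fn(weights)."""
    return max(weight_grid(n_modalities, step), key=score_fn)
```

The grid grows combinatorially with the number of modalities: three modalities at step 0.01 give 5151 candidate vectors, and each additional modality multiplies the cost considerably, which is why such a search is only practical for a small number of modalities.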
The measurements are taken on a machine with an “Intel(R) Xeon(R) CPU E5530 @2.40GHz”. The values in the graph and table are obtained without any parallel programming.
Yilmaz, T., Yazici, A. & Kitsuregawa, M. RELIEF-MM: effective modality weighting for multimedia information retrieval. Multimedia Systems 20, 389–413 (2014). https://doi.org/10.1007/s00530-014-0360-6
