Improving acoustic event detection using generalizable visual features and multi-modality modeling | IEEE Conference Publication | IEEE Xplore