
Spectrogram Analysis Via Self-Attention for Realizing Cross-Model Visual-Audio Generation


Abstract:

Human cognition is supported by the combination of multimodal information from different sources of perception, the two most important modalities being vision and audio. Cross-modal visual-audio generation synthesizes data in one modality from data acquired in the other, providing the richer experience that only the combination of the two can deliver. In this paper, the Self-Attention mechanism is applied to cross-modal visual-audio generation for the first time; it is used to help the model capture the structural characteristics of the spectrogram. A series of experiments is conducted to identify the best-performing configuration. The resulting comparison shows that the Self-Attention module substantially improves the generation and classification of audio data. Furthermore, the presented method outperforms existing cross-modal visual-audio generative models. A minimal, hedged sketch of such a self-attention block over spectrogram features is given below.
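The paper does not publish its implementation here, but the idea of applying self-attention to spectrogram feature maps can be illustrated with a SAGAN-style 2D attention block. The module, layer names, channel-reduction factor, and tensor shapes below are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpectrogramSelfAttention(nn.Module):
    """Illustrative SAGAN-style self-attention over spectrogram feature maps.

    Assumed design: 1x1 convolutions form query/key/value projections, attention
    is computed over all frequency-time positions, and the result is added back
    through a learned residual weight (gamma).
    """

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # starts at 0: pure residual at init

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape                              # h = frequency bins, w = time frames
        q = self.query(x).flatten(2).transpose(1, 2)      # (b, h*w, c//r)
        k = self.key(x).flatten(2)                        # (b, c//r, h*w)
        attn = F.softmax(torch.bmm(q, k), dim=-1)         # (b, h*w, h*w) pairwise position weights
        v = self.value(x).flatten(2)                      # (b, c, h*w)
        out = torch.bmm(v, attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                       # residual connection


if __name__ == "__main__":
    # Toy check with hypothetical encoder features from a mel-spectrogram.
    feats = torch.randn(4, 64, 16, 16)
    print(SpectrogramSelfAttention(64)(feats).shape)      # torch.Size([4, 64, 16, 16])
```

In this kind of design, the attention map lets every time-frequency position attend to every other, which is one plausible way to model the long-range structure of a spectrogram inside a generator or classifier.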
Date of Conference: 04-08 May 2020
Date Added to IEEE Xplore: 09 April 2020
Conference Location: Barcelona, Spain

