ISCA Archive Interspeech 2022

Improving Spoken Language Understanding with Cross-Modal Contrastive Learning

Jingjing Dong, Jiayi Fu, Peng Zhou, Hao Li, Xiaorui Wang

Spoken language understanding (SLU) is conventionally based on a pipeline architecture, which suffers from error propagation. To mitigate this problem, end-to-end (E2E) models have been proposed that directly map speech input to the desired semantic outputs. Meanwhile, other approaches leverage linguistic information in addition to acoustic information by adopting a multi-modal architecture. In this work, we propose a novel multi-modal SLU method, named CMCL, which uses cross-modal contrastive learning to learn better multi-modal representations. In particular, a two-stream multi-modal framework is designed, and a contrastive learning task is performed across speech and text representations. Moreover, CMCL combines a multi-modal shared classification task with the contrastive learning task to guide the learned representations toward better performance on intent classification. We also investigate the efficacy of employing cross-modal contrastive learning during pretraining. CMCL achieves 99.69% and 92.50% accuracy on the FSC and Smartlights datasets, respectively, outperforming state-of-the-art comparative methods. In addition, accuracy decreases by only 0.32% and 2.8% when the model is trained on 10% and 1% of the FSC dataset, respectively, indicating its advantage in few-shot scenarios.
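
As an illustration of the contrastive task performed across speech and text representations, the following is a minimal sketch of a symmetric InfoNCE-style loss over a batch of paired embeddings. The abstract does not state the exact loss used in CMCL, so the loss form, the function name `cross_modal_contrastive_loss`, the `temperature` value, and the toy two-dimensional embeddings are all assumptions made for this example, not the authors' implementation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax_row(row):
    """Numerically stable log-softmax of one row of similarity scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def cross_modal_contrastive_loss(speech, text, temperature=0.1):
    """Symmetric InfoNCE-style loss (assumed form, not taken from the paper).

    speech[i] and text[i] are embeddings of the same utterance (a positive
    pair); every other pairing in the batch is treated as a negative.
    """
    s = [l2_normalize(v) for v in speech]
    t = [l2_normalize(v) for v in text]
    n = len(s)
    # sim[i][j]: temperature-scaled cosine similarity of speech i and text j.
    sim = [[sum(a * b for a, b in zip(s[i], t[j])) / temperature
            for j in range(n)] for i in range(n)]
    # Speech-to-text direction: row i should select column i.
    s2t = -sum(log_softmax_row(sim[i])[i] for i in range(n)) / n
    # Text-to-speech direction: transpose, then row j should select column j.
    sim_t = [list(col) for col in zip(*sim)]
    t2s = -sum(log_softmax_row(sim_t[j])[j] for j in range(n)) / n
    return 0.5 * (s2t + t2s)


# Toy batch: matched pairs give a loss near zero, mismatched pairs a large one.
aligned = cross_modal_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                       [[1.0, 0.0], [0.0, 1.0]])
swapped = cross_modal_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                       [[0.0, 1.0], [1.0, 0.0]])
```

In a setup like the one the abstract describes, a loss of this kind would be added to the intent classification loss during training; how the two terms are weighted is not given in the abstract.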


doi: 10.21437/Interspeech.2022-658

Cite as: Dong, J., Fu, J., Zhou, P., Li, H., Wang, X. (2022) Improving Spoken Language Understanding with Cross-Modal Contrastive Learning. Proc. Interspeech 2022, 2693-2697, doi: 10.21437/Interspeech.2022-658

@inproceedings{dong22_interspeech,
  author={Jingjing Dong and Jiayi Fu and Peng Zhou and Hao Li and Xiaorui Wang},
  title={{Improving Spoken Language Understanding with Cross-Modal Contrastive Learning}},
  year=2022,
  booktitle={Proc. Interspeech 2022},
  pages={2693--2697},
  doi={10.21437/Interspeech.2022-658}
}