Unsupervised Multimodal Machine Translation for Low-resource Distant Language Pairs

Published: 15 April 2024


Unsupervised machine translation (UMT) has recently attracted growing attention from researchers because it enables models to translate between languages that lack parallel corpora. However, existing work mainly considers close language pairs (e.g., English-German and English-French), and the effectiveness of visual content for distant language pairs has yet to be investigated. This article proposes an unsupervised multimodal machine translation model for low-resource distant language pairs. Specifically, we first apply measures such as transliteration and re-ordering to bring distant language pairs closer together. We then use visual content to extend masked language modeling into visual masked language modeling for UMT. Finally, we conduct experiments on our distant language pair dataset and the public Multi30k dataset. Experimental results demonstrate the superior performance of our model, with BLEU score improvements of 2.5 and 2.6 on the distant language pairs English-Uyghur and Chinese-Uyghur. Moreover, our model also performs well on close language pairs, improving by 2.3 BLEU over existing models on English-German.




    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 23, Issue 4
    April 2024
    221 pages


    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 15 April 2024
    Online AM: 09 March 2024
    Accepted: 05 March 2024
    Revised: 26 February 2024
    Received: 07 November 2023

    Author Tags

    1. Visual masked language modeling
    2. unsupervised machine translation
    3. distant language pair
    4. image feature


    • Research-article

    Funding Sources

    NSFC, China



