Special Section on Graphics Interface 2021
Visualization-based improvement of neural machine translation
Introduction
Machine learning, and especially deep learning, is a popular and rapidly growing field in many research areas. The results created with machine learning models are often impressive but sometimes still problematic. Currently, much research is performed to better understand, explain, and interact with these models. In this context, visualization and visual analytics methods are well suited and increasingly used to explore different aspects of these models. Available techniques for visual analytics in deep learning were examined by Hohman et al. [1]. While there is a large amount of work available for explainability in computer vision, less work exists for machine translation.
As it becomes increasingly important to communicate in different languages, and since information should be available to a wide range of people from different countries, many texts have to be translated. Doing this manually takes much effort. Nowadays, online translation systems like Google Translate [2] or DeepL [3] support humans in translating texts. However, translations generated this way often differ from what is expected or from how someone familiar with both languages would translate the text. They may also fail to reflect a translator’s style or to use the correct terminology for a specific domain or occasion. Often, more background knowledge about the text is required to translate documents appropriately.
With the introduction of deep learning methods, the translation quality of machine translation models has improved considerably in recent years. However, there are still difficulties that need to be addressed. Common problems of neural machine translation (NMT) models are, for instance, over- and under-translation [4], where words are translated repeatedly or not at all. Handling rare words [5], which may occur in specific documents, and long sentences are also issues. Domain adaptation [5] is another challenge. In particular, documents from specific domains such as medicine, law, or science require high-quality translations [6]. As many NMT models are trained on general data sets, their translation performance is worse for domain-specific texts.
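To make over- and under-translation concrete, the following minimal sketch sums the attention each source token receives over all target steps and flags tokens with unusually low or high totals. This is an illustrative heuristic only, not the metric used in this work: the function names `source_coverage` and `flag_tokens`, the layout `attention[t][s]`, and the thresholds `low` and `high` are all assumptions chosen for the example.

```python
from typing import List, Tuple


def source_coverage(attention: List[List[float]]) -> List[float]:
    """Sum the attention each source token receives over all target steps.

    Assumed layout: attention[t][s] is the weight that target token t
    places on source token s, with each row summing to roughly 1.
    """
    if not attention:
        return []
    n_source = len(attention[0])
    return [sum(row[s] for row in attention) for s in range(n_source)]


def flag_tokens(
    attention: List[List[float]], low: float = 0.3, high: float = 1.7
) -> Tuple[List[int], List[int]]:
    """Return (under, over): source indices with very low or very high coverage.

    The thresholds are arbitrary values for illustration, not tuned ones.
    """
    coverage = source_coverage(attention)
    under = [s for s, c in enumerate(coverage) if c < low]
    over = [s for s, c in enumerate(coverage) if c > high]
    return under, over


# Toy example: three target tokens attending to three source tokens.
toy_attention = [
    [0.9, 0.1, 0.0],
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
]
under, over = flag_tokens(toy_attention)
print("possibly under-translated source indices:", under)
print("possibly over-translated source indices:", over)
```

In the toy matrix, the first source token attracts most of the attention of two target tokens, while the last source token is almost ignored, which is the pattern such a heuristic is meant to surface.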
If high-quality translations for large texts are required, it is insufficient to use machine translation models alone. These models are computationally efficient and able to translate large documents in little time, but they may create erroneous or inappropriate translations. Humans are very slow compared to these models, but they can detect and correct mistranslations if they are familiar with the languages and the domain terminology. In a visual analytics system, both of these capabilities can be combined. Such a system should provide the translations from an NMT model and possibilities for users to visually explore translation results to find mistranslated sentences, correct them, and steer the machine learning model.
We have developed a visual analytics approach to reach the goals outlined above. First, our system automatically translates a whole, possibly large, document and shows the result in the Document View (Fig. 1). Users can then explore and modify the document in different views [7] (Fig. 2) to improve translations and use these corrections to fine-tune the NMT model. We support different NMT architectures and use both an LSTM-based and a Transformer architecture.
So far, visual analytics systems for deep learning have mostly been available for computer vision and some text-related areas, have focused on smaller parts of machine translation [8], [9], or have been intended for domain experts to gain insight into the models or to debug them [10], [11]. This work contributes to visualization research by introducing the application domain of NMT using a user-oriented visual analytics approach. In our system, we employ different visualization techniques adapted for usage with NMT. Our parallel coordinates plot (Fig. 1(B)) supports the visualization of different metrics related to text quality. The interaction techniques in our graph- and matrix-based visualizations for attention (Fig. 2(B) and (C)) and tree-based visualization for beam search (Fig. 2(D)) are specifically designed for text exploration and modification. They have a strong coupling to the underlying model. Furthermore, our system has a fast feedback loop and allows interaction in real time. We demonstrate our system’s features in a video and provide the source code [12] for our system. The trained models [13] we used in our case study and evaluation are also publicly available.
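As background for the tree-based beam search visualization mentioned above, the following sketch shows how beam search keeps several partial translations alive and how the explored prefixes form a tree. It is a generic textbook-style illustration under stated assumptions, not the implementation in our system: `beam_search`, `toy_model`, the `</s>` end marker, and the probability table are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Tokens = Tuple[str, ...]
Hypothesis = Tuple[Tokens, float]  # (token sequence, log-probability)


def beam_search(
    next_probs: Callable[[Tokens], Dict[str, float]],
    beam_size: int = 2,
    max_len: int = 5,
    eos: str = "</s>",
) -> Tuple[List[Hypothesis], List[Tuple[Tokens, Tokens, float]]]:
    """Return (hypotheses sorted best first, tree edges).

    Each edge is (parent prefix, child prefix, log-probability of child),
    which is enough to draw the kept hypotheses as a tree.
    """
    beams: List[Hypothesis] = [((), 0.0)]
    finished: List[Hypothesis] = []
    edges: List[Tuple[Tokens, Tokens, float]] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for tokens, logp in beams:
            for token, p in next_probs(tokens).items():
                if p > 0.0:
                    candidates.append((tokens + (token,), logp + math.log(p)))
        if not candidates:
            break
        # Keep only the beam_size most probable expansions.
        candidates.sort(key=lambda h: h[1], reverse=True)
        beams = []
        for tokens, logp in candidates[:beam_size]:
            edges.append((tokens[:-1], tokens, logp))
            if tokens[-1] == eos:
                finished.append((tokens, logp))
            else:
                beams.append((tokens, logp))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    finished.sort(key=lambda h: h[1], reverse=True)
    return finished, edges


def toy_model(prefix: Tokens) -> Dict[str, float]:
    """Hypothetical next-token distribution standing in for an NMT decoder."""
    table = {
        (): {"the": 0.6, "a": 0.4},
        ("the",): {"cat": 0.5, "dog": 0.5},
        ("a",): {"cat": 0.9, "dog": 0.1},
    }
    return table.get(prefix, {"</s>": 1.0})


hypotheses, tree_edges = beam_search(toy_model, beam_size=2)
for tokens, logp in hypotheses:
    print(" ".join(tokens), round(math.exp(logp), 2))
```

In this toy setup, a greedy decoder would commit to "the" first, whereas a beam of size two also keeps "a" and ends up with a more probable complete sequence. Alternatives of this kind are what a beam search tree lets users inspect and select.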
This paper is an extended version of our previous work [14]. We improved our visual analytics approach by adding a second interactive attention-based visualization for sentences in the form of a matrix (Fig. 2(C)) that supports subword units. In this context, we discuss the differences between our attention visualizations and the circumstances in which one variant might be preferable. Additionally, we allow users to specify parameters in the user interface for a better analysis of translation results. This is especially useful for more experienced users who want to explore more details about the Transformer architecture or about the attention weights in both architectures we implemented. We provide more information about our implemented machine translation models and explore a different document in our case study. Finally, we include a quantitative computer-based evaluation, demonstrating both the utility of our metrics for detecting mistranslated sentences and how fine-tuning LSTM-based models on domain-specific documents increases the in-domain translation quality.
Related work
This section first discusses visualization, visual analytics, and interaction approaches for language translation in general and then visual analytics of deep learning for text. Afterward, we provide an overview of work that combines both areas in the context of NMT.
Many visualization techniques and visual analytics systems exist for text; see Kucher and Kerren [15] for an overview. However, there is little work on exploring and modifying translation results. An interactive system to explore
Visual analytics approach
Our visual analytics approach allows the automatic translation, exploration, and correction of documents. Its components can be split into multiple parts. First, a document is automatically translated from one language into another one. Next, mistranslated sentences in the document are identified by users. Then, the users can explore and correct individual sentences. Finally, the model can be fine-tuned and the document retranslated. This workflow is also shown in Fig. 3.
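The workflow described above can be sketched as a simple loop over sentences. The code below is only an illustration under stated assumptions: `Sentence`, `translate_document`, `worst_first`, `fine_tuning_pairs`, the word-count `length_ratio` score, and the lookup-table translator are hypothetical stand-ins for the NMT model and the quality metrics, and the actual fine-tuning step is left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Sentence:
    source: str
    translation: str = ""
    score: float = 0.0  # higher means the translation looks more plausible
    correction: Optional[str] = None  # set by the user during exploration


def length_ratio(source: str, translation: str) -> float:
    """Toy quality score: ratio of the shorter to the longer word count."""
    a, b = len(source.split()), len(translation.split())
    return min(a, b) / max(a, b) if max(a, b) > 0 else 0.0


def translate_document(
    sources: List[str],
    translate: Callable[[str], str],
    score: Callable[[str, str], float] = length_ratio,
) -> List[Sentence]:
    """Step 1: translate every sentence and attach a quality score."""
    document = []
    for source in sources:
        hypothesis = translate(source)
        document.append(Sentence(source, hypothesis, score(source, hypothesis)))
    return document


def worst_first(document: List[Sentence]) -> List[Sentence]:
    """Step 2: order sentences so likely mistranslations come first."""
    return sorted(document, key=lambda s: s.score)


def fine_tuning_pairs(document: List[Sentence]) -> List[Tuple[str, str]]:
    """Step 4: collect (source, corrected translation) pairs for fine-tuning."""
    return [(s.source, s.correction) for s in document if s.correction is not None]


# Hypothetical translator: a lookup table standing in for an NMT model.
lookup = {
    "Guten Morgen": "Good morning",
    "Künstliche Intelligenz ist ein Teilgebiet der Informatik": "Artificial",
}
document = translate_document(list(lookup), lambda s: lookup.get(s, ""))

# Step 3: the user inspects the most suspicious sentence and corrects it.
suspicious = worst_first(document)[0]
suspicious.correction = "Artificial intelligence is a subfield of computer science"

print(fine_tuning_pairs(document))
```

The collected pairs would then be passed to a fine-tuning routine, after which the remaining, uncorrected sentences can be retranslated with the updated model.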
Our approach has a
Case study
As a typical use case, we take the German Wikipedia article for artificial intelligence (Künstliche Intelligenz) [56] as a document for translation into English. For translation, we used a total of 358 sentences and headings from the article. In the following, we show how to use our system to improve the translation quality of the document. Please see our accompanying video for a demonstration with the Transformer model. The examples in the following were created with both an LSTM and
Evaluation
We conducted a preliminary user study during the development of our approach to evaluate our concept, using a prototype with an LSTM translation model. Our visual analytics system was rated positively in terms of effectiveness, ease of understanding and intuitiveness of visualizations, and ease of interaction. The participants mastered the translation process well using our selected visualizations. Especially our choice of parallel coordinate plots to visualize multiple metrics, and the
Discussion and future work
To conclude, we present a visual analytics approach for exploring, understanding, and correcting translations created by NMT. Our approach supports users in translating large domain-specific documents with interactive visualizations in different views, and it allows sentence correction in real time and model adaptation.
Our qualitative user study results showed that our visual analytics system was rated positively regarding effectiveness, interpretability of visualizations, and ease of interaction.
CRediT authorship contribution statement
Tanja Munz: Conceptualization, Methodology, Software, Formal analysis, Writing – original draft, Visualization, Project administration. Dirk Väth: Conceptualization, Methodology, Software, Writing – original draft. Paul Kuznecov: Conceptualization, Methodology, Software, Formal analysis. Ngoc Thang Vu: Conceptualization, Writing – review & editing, Supervision, Funding acquisition. Daniel Weiskopf: Conceptualization, Writing – review & editing, Supervision, Funding acquisition.
Acknowledgments
This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC-2075 – 390740016.
References (66)
- et al. Towards better analysis of machine learning models: A visual analytics perspective. Vis Informa (2017)
- et al. A task-and-technique centered survey on visual analytics for deep learning model engineering. Comput Graph (2018)
- et al. Visual analytics in deep learning: An interrogative survey for the next frontiers. IEEE Trans Vis Comput Graphics (2018)
- Google Translate (2021)
- DeepL translator (2021)
- Tu Z, Liu Y, Shang L, Liu X, Li H. Neural machine translation with reconstruction. In: Thirty-first AAAI conference on...
- et al. Six challenges for neural machine translation
- et al. A survey of domain adaptation for neural machine translation
- Roberts JC. State of the art: Coordinated multiple views in exploratory visualization. In: Fifth international...
- et al. Interactive visualization and manipulation of attention-based neural machine translation
- Visualizing neural machine translation attention and confidence. Prague Bull Math Linguist
- LSTMVis: A tool for visual analysis of hidden state dynamics in recurrent neural networks. IEEE Trans Vis Comput Graphics
- Seq2Seq-Vis: A visual debugging tool for sequence-to-sequence models. IEEE Trans Vis Comput Graphics
- NMTVis - extended neural machine translation visualization system
- NMTVis - trained models for our visual analytics system
- Visual-interactive neural machine translation
- The Chinese room: Visualization and interaction to understand and correct ambiguous machine translation. Comput Graph Forum
- Natural language translation at the intersection of AI and HCI. Commun ACM
- Visual analytics for explainable deep learning. IEEE Comput Graph Appl
- A survey of visual analytics techniques for machine learning. Comput Vis Media
- Visualizing and understanding recurrent networks
- Teaching machines to read and comprehend
- Neural machine translation by jointly learning to align and translate
- A survey of deep learning techniques for neural machine translation
- Attention is all you need
- A multiscale visualization of attention in the transformer model
- Visualizing attention in transformer-based language models