ABSTRACT
Machine translation software has become heavily integrated into our daily lives owing to recent improvements in the performance of deep neural networks. However, machine translation software has been shown to regularly return erroneous translations, which can lead to harmful consequences such as economic loss and political conflicts. Additionally, the complexity of the underlying neural models makes testing machine translation systems a new challenge. To address this problem, we introduce a novel methodology called PatInv. The main intuition behind PatInv is that sentences with different meanings should not have the same translation. Under this general idea, we provide two realizations of PatInv that, given an arbitrary sentence, generate syntactically similar but semantically different sentences by (1) replacing one word in the sentence using a masked language model or (2) removing one word or phrase from the sentence based on its constituency structure. We then test whether the returned translations of the original and modified sentences are the same; an identical translation indicates a likely error. We have applied PatInv to test Google Translate and Bing Microsoft Translator using 200 English sentences. Two language settings are considered: English-Hindi (En-Hi) and English-Chinese (En-Zh). The results show that PatInv can accurately find 308 erroneous translations in Google Translate and 223 erroneous translations in Bing Microsoft Translator, most of which cannot be found by state-of-the-art approaches.
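The check described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `substitutes` dictionary (standing in for a masked language model's predictions), the `removable_spans` list (standing in for spans chosen from a constituency parse), and the toy translator with its placeholder target tokens are all assumptions made for illustration. The sketch also assumes that every generated variant really does differ in meaning from the original; it does not verify that.

```python
from typing import Callable, Dict, Iterable, List


def replace_one_word(sentence: str, substitutes: Dict[str, List[str]]) -> List[str]:
    """Return variants of `sentence` with exactly one word replaced.

    `substitutes` is a hypothetical stand-in for the candidates a masked
    language model would propose for each position.
    """
    words = sentence.split()
    variants = []
    for i, word in enumerate(words):
        for candidate in substitutes.get(word, []):
            if candidate != word:
                variants.append(" ".join(words[:i] + [candidate] + words[i + 1:]))
    return variants


def remove_one_span(sentence: str, removable_spans: Iterable[str]) -> List[str]:
    """Return variants of `sentence` with one word or phrase removed.

    `removable_spans` is a hypothetical stand-in for the constituents a
    parser would mark as removable. Only the first match of each span is removed.
    """
    words = sentence.split()
    variants = []
    for span in removable_spans:
        span_words = span.split()
        n = len(span_words)
        if n == 0:
            continue
        for start in range(len(words) - n + 1):
            if words[start:start + n] == span_words:
                variants.append(" ".join(words[:start] + words[start + n:]))
                break
    return variants


def find_suspicious(sentence: str, variants: Iterable[str],
                    translate: Callable[[str], str]) -> List[str]:
    """Return the variants whose translation equals that of the original.

    Assuming the variants differ in meaning, an identical translation
    suggests that at least one of the two translations is wrong.
    """
    original_translation = translate(sentence)
    return [v for v in variants if translate(v) == original_translation]


# Toy translator (an assumption, not a real system): it maps known words to
# placeholder target tokens and silently drops unknown words. It is
# deliberately buggy: "cat" and "dog" share a token, and "quietly" is unknown.
TOY_LEXICON = {"the": "X1", "cat": "X2", "dog": "X2", "bird": "X3", "sleeps": "X4"}


def toy_translate(sentence: str) -> str:
    return " ".join(TOY_LEXICON[w] for w in sentence.split() if w in TOY_LEXICON)


if __name__ == "__main__":
    source = "the cat sleeps quietly"
    variants = (replace_one_word(source, {"cat": ["dog", "bird"]})
                + remove_one_span(source, ["quietly"]))
    for variant in find_suspicious(source, variants, toy_translate):
        print("suspicious:", variant)
```

In a real setting, `translate` would wrap a call to the translation system under test, and the two variant generators would be driven by a masked language model and a constituency parser rather than hand-written inputs.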
Index Terms
- Machine translation testing via pathological invariance