Machine translation using deep learning for universal networking language based on their structure

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

This paper presents a deep learning-based machine translation (MT) system that translates sentences from a subject-object-verb (SOV) structured language into a subject-verb-object (SVO) structured language. The system uses recurrent neural networks (RNNs) and encodings: an encoding RNN generates a set of numbers from the input sentence, and a second RNN generates the output sentence from that set of numbers. Three popular datasets of SOV structured languages, namely the EMILLE corpus, the Prothom-Alo corpus and the Punjabi Monolingual Text Corpus ILCI-II, are used in two case studies to validate the system. In case study 1, on the EMILLE and Prothom-Alo corpora, the system achieved scores of 0.742, 4.11 and 0.18 for Bilingual Evaluation Understudy (BLEU), NIST and translation edit rate (TER), respectively. In case study 2, on the Punjabi Monolingual Text Corpus ILCI-II, it achieved a BLEU score of 0.75. These results are comparable with state-of-the-art results.
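The two-RNN arrangement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the cell type, layer sizes, embeddings or training procedure, so the plain Elman-style cell, one-hot inputs, greedy decoding, random untrained weights, and every function and variable name below are assumptions made only for illustration.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained; illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def one_hot(index, size):
    """One-hot vector for a token id (assumed input representation)."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def rnn_step(w_in, w_rec, x, h):
    """One Elman-style step: h' = tanh(W_in x + W_rec h)."""
    a = matvec(w_in, x)
    b = matvec(w_rec, h)
    return [math.tanh(p + q) for p, q in zip(a, b)]

def encode(token_ids, vocab_size, w_in, w_rec, hidden):
    """Encoder RNN: read the source sentence, return the final hidden
    state, i.e. the 'set of numbers' that summarises the input."""
    h = [0.0] * hidden
    for t in token_ids:
        h = rnn_step(w_in, w_rec, one_hot(t, vocab_size), h)
    return h

def decode(context, w_in, w_rec, w_out, vocab_size, bos_id, eos_id, max_len):
    """Decoder RNN: start from the encoder's numbers and emit target
    token ids greedily until EOS or max_len is reached."""
    h = list(context)
    prev = bos_id
    out = []
    for _ in range(max_len):
        h = rnn_step(w_in, w_rec, one_hot(prev, vocab_size), h)
        scores = matvec(w_out, h)
        prev = max(range(vocab_size), key=scores.__getitem__)
        if prev == eos_id:
            break
        out.append(prev)
    return out

# Hypothetical toy sizes and token ids; weights are random, so the
# output is not a meaningful translation.
rng = random.Random(0)
SRC_VOCAB, TGT_VOCAB, HIDDEN = 6, 5, 4
enc_in = make_matrix(HIDDEN, SRC_VOCAB, rng)
enc_rec = make_matrix(HIDDEN, HIDDEN, rng)
dec_in = make_matrix(HIDDEN, TGT_VOCAB, rng)
dec_rec = make_matrix(HIDDEN, HIDDEN, rng)
dec_out = make_matrix(TGT_VOCAB, HIDDEN, rng)

context = encode([1, 3, 2], SRC_VOCAB, enc_in, enc_rec, HIDDEN)
translation = decode(context, dec_in, dec_rec, dec_out,
                     TGT_VOCAB, bos_id=0, eos_id=1, max_len=8)
print(context, translation)
```

Because the decoder sees the source sentence only through `context`, the word order of the output is free to differ from that of the input, which is what an SOV-to-SVO system needs. A trained system would learn the weight matrices from parallel sentences rather than drawing them at random.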

Author information

Corresponding author

Correspondence to K. C. Santosh.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ali, M.N.Y., Rahman, M.L., Chaki, J. et al. Machine translation using deep learning for universal networking language based on their structure. Int. J. Mach. Learn. & Cyber. 12, 2365–2376 (2021). https://doi.org/10.1007/s13042-021-01317-5

  • DOI: https://doi.org/10.1007/s13042-021-01317-5
