The effects of the challenges in the transliteration of Persian names into English on the recall of retrieved results in the web of science

Scientometrics

Abstract

The objective of this study was to examine how challenges in transliterating Persian names into English affect the recall of retrieved results in the Web of Science. The statistical population comprised the names of all Iranian researchers in the Web of Science database who had published an English-language article in the period 2010–2017. The initial number of these names was 3,110,873. After the data were refined, the number of names was reduced to 11,242, of which 3,959 were unique names with different spellings. Bibexcel and Excel were used to analyze the data. The challenges identified were divided into four groups: “consonants”, “vowels”, “omitted or repeated letters”, and “pronunciation”. The effect of each challenge on the recall of retrieved results was examined, and for each challenge the spelling form with the highest retrieval percentage and frequency among the retrieved examples was determined. The results showed that non-uniform transliteration of Persian names into English, and the resulting variety of name spellings, decreased recall.
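As a rough illustration of why spelling variation lowers recall, the sketch below computes the recall of a search on a single spelling when one author's records are spread across several transliterations. The name variants and record counts are invented for illustration only; they are not data from the study, and the function name `recall` is an assumption of this sketch.

```python
def recall(retrieved_relevant: int, total_relevant: int) -> float:
    """Recall = relevant records retrieved / all relevant records."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return retrieved_relevant / total_relevant


# Hypothetical record counts for one author indexed under three spellings.
records_by_spelling = {"Hosseini": 6, "Hoseini": 3, "Husseini": 1}

total = sum(records_by_spelling.values())

# Searching a single spelling retrieves only the records filed under it.
recall_by_spelling = {
    spelling: recall(count, total)
    for spelling, count in records_by_spelling.items()
}

# The spelling form with the highest recall for this (made-up) author.
best_spelling = max(recall_by_spelling, key=recall_by_spelling.get)
```

In this made-up case even the most frequent spelling retrieves only part of the author's records, which is the kind of loss the abstract describes.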

Notes

  1. The Soundex algorithm indexes words by how they sound rather than by how they are spelled. It focuses on consonants and places words in the same group despite slight differences in spelling.
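The grouping described in this note can be sketched with the classic American Soundex rules. This is a minimal illustration, not necessarily the variant the authors had in mind; the function name `soundex` and the example names are assumptions chosen for this sketch.

```python
# Digit codes for consonant groups in classic American Soundex.
CODES = {}
for letters, digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code (a letter plus three digits)."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break a run of equal codes
        code = CODES.get(ch, "")  # vowels and Y have no code
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]  # pad with zeros, then truncate


variants = ["Mohammadi", "Mohamadi", "Muhammadi"]
codes = {soundex(v) for v in variants}
```

Here the three spellings collapse into a single code, so a Soundex-based index would group them together. Because the first letter is kept as-is, variants that differ in their initial letter (for example "Ghasemi" and "Qasemi") still receive different codes under this scheme.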

Author information

Corresponding author

Correspondence to Mahdieh Mirzabeigi.

About this article

Cite this article

Kaveh, M., Mirzabeigi, M., Sotudeh, H. et al. The effects of the challenges in the transliteration of Persian names into English on the recall of retrieved results in the web of science. Scientometrics 127, 1099–1128 (2022). https://doi.org/10.1007/s11192-021-04234-0

  • DOI: https://doi.org/10.1007/s11192-021-04234-0
