Parsisanj: an automatic component-based approach toward search engine evaluation

Abstract

Web search engines play a significant role in answering users' information needs from the huge amount of data available on the internet. Although evaluating the performance of these systems is essential for their improvement, there is no comprehensive, unbiased, low-cost, and reusable method for this purpose. Previous works used small, limited query sets in their evaluation process, which restricts the assessment domain. Moreover, these methods rely mainly on human evaluators for manual assessment of search engines, which makes the evaluation results subject to the opinions of the human evaluators and prone to error. In addition, repeating the evaluation is problematic, as it requires the same level of human effort as the first evaluation. Another drawback of the existing evaluations is that they score a search result based on its position in the retrieved list of relevant pages. This implies that these methods evaluate only the ranker component of a web search engine, leaving all other components unevaluated. In this research, we propose an automatic approach for web search engine evaluation that can run with a query set several times larger than the query sets used in manual evaluations. The automatic nature of our proposed method makes repeating the evaluation low-cost in terms of the required human effort. Moreover, we designed this approach to be component-based, meaning that we have different evaluation tasks for assessing different components of web search engines. For each component, queries are designed differently and are meant to assess the functionality of that component only. Similarly, the way the retrieved results are scored differs for each component. For example, to assess the spell-correction component, the input query contains a typo, and in the results only instances of that word in its correct form are scored positively. Experimental results of applying thousands of queries to two Persian and two language-independent web search engines show that none of the selected search engines dominates the other three across all components; instead, each search engine has its own strengths and weaknesses, which are highlighted through this evaluation.
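
As an illustration of the component-based scoring idea described above, the following is a minimal Python sketch of how a spell-correction test query might be scored. All function and variable names are hypothetical, not taken from the paper: the query carries a deliberate typo, and a retrieved page counts positively only when it contains the corrected form of the word rather than the misspelling.

    # Hypothetical sketch of component-based scoring for the
    # spell-correction component: the query contains a deliberate typo,
    # and a retrieved page scores positively only if it contains the
    # corrected word form and not the misspelled one.

    def score_spell_correction(results, typo, corrected):
        """Score a result list for a spell-correction test query.

        results   -- list of page texts returned by the search engine
        typo      -- the deliberately misspelled word in the query
        corrected -- the correct form the engine should have inferred
        """
        score = 0
        for page_text in results:
            words = page_text.lower().split()
            if corrected.lower() in words and typo.lower() not in words:
                score += 1  # page reflects the corrected query term
        return score / len(results) if results else 0.0

    # Example: query with the typo "recieve" instead of "receive".
    pages = [
        "how to receive email on your phone",
        "recieve is a common misspelling of receive",
    ]
    print(score_spell_correction(pages, "recieve", "receive"))  # 0.5

This mirrors the scoring rule from the abstract at the level of a single query; the paper's actual scorer may weight results by rank or aggregate differently.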

Notes

  1. Information Retrieval.

  2. Text REtrieval Conference.

  3. In this system, the universal set contains the webpages related to a query. These pages are used to derive a value interval for each query feature, by which the relevancy score of a webpage can be calculated in terms of each of its features (a sketch of this idea follows these notes).

  4. The U set.

  5. https://trec.nist.gov/trec_eval/.

  6. P_5 denotes precision after the top 5 documents are retrieved (see the sketch following these notes).

  7. If an irrelevant page is ranked higher than the relevant results of a query, the search engine receives a −1 score for that query (see the sketch following these notes).
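
A minimal sketch of the universal-set idea in Note 3, under the assumption that each feature's admissible interval is simply the min-max range observed over the universal set; the names and the min-max choice are hypothetical, not the paper's actual construction:

    # Hypothetical sketch of Note 3: the pages in the universal set U
    # define a value interval for each query feature; a candidate page
    # scores per feature by whether its value falls inside that interval.

    def feature_intervals(universal_set):
        """universal_set -- list of dicts mapping feature name -> value."""
        intervals = {}
        for name in universal_set[0]:
            values = [page[name] for page in universal_set]
            intervals[name] = (min(values), max(values))
        return intervals

    def relevancy_score(page, intervals):
        # Fraction of features whose value lies inside the U-derived interval.
        hits = sum(1 for name, (lo, hi) in intervals.items()
                   if lo <= page.get(name, float("nan")) <= hi)
        return hits / len(intervals)

    U = [{"title_len": 5, "term_freq": 0.02},
         {"title_len": 9, "term_freq": 0.05}]
    iv = feature_intervals(U)
    print(relevancy_score({"title_len": 7, "term_freq": 0.03}, iv))  # 1.0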
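
To make Notes 6 and 7 concrete, here is a small sketch of the two scoring rules. It assumes one reading of Note 7, namely that the penalty applies when an irrelevant page is ranked above every relevant result; the names are hypothetical and this is not trec_eval's implementation:

    # Hypothetical sketch of the scoring rules in Notes 6 and 7.
    # P_5: precision over the top 5 retrieved documents.
    # Penalty: -1 for a query when an irrelevant page outranks all
    # relevant results (here: the top-ranked result is irrelevant).

    def precision_at_5(ranked_relevance):
        """ranked_relevance -- list of booleans, True if the page at
        that rank is relevant, ordered from rank 1 downward."""
        return sum(ranked_relevance[:5]) / 5.0

    def query_score(ranked_relevance):
        if ranked_relevance and not ranked_relevance[0]:
            return -1.0  # irrelevant page above all relevant results
        return precision_at_5(ranked_relevance)

    print(query_score([True, False, True, True, False]))  # P_5 = 0.6
    print(query_score([False, True, True, True, True]))   # penalized: -1.0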

Acknowledgements

We thank all group members for their support and cooperation. This project was supported by the Iran Telecommunication Research Center under project ID 901952720.

Corresponding author

Correspondence to Amin Heydari Alashti.

Cite this article

Alashti, A.H., Rezaei, A.A., Elahi, A. et al. Parsisanj: an automatic component-based approach toward search engine evaluation. J Supercomput 78, 10690–10711 (2022). https://doi.org/10.1007/s11227-022-04306-9
