Abstract
Web search engines play a significant role in answering users' information needs over the vast amount of data available on the internet. Although evaluating these systems is essential to their improvement, no comprehensive, unbiased, low-cost, and reusable method exists for this purpose. Previous works used small, limited query sets for their evaluation process, which restricts the assessment domain. Moreover, these methods rely mainly on human evaluators for manual assessment of search engines, which makes the results subjective to the evaluators' opinions and prone to error. Repeating such an evaluation is also problematic, since it requires the same level of human effort as the first round. Another drawback of existing evaluations is that they score a search result based solely on its position in the retrieved list of relevant pages; consequently, they evaluate only the ranker component of a web search engine, leaving all other components unevaluated. In this research, we propose an automatic approach to web search engine evaluation that can run with a query set many times larger than those used in manual evaluations. The automatic nature of our proposed method makes repeating the evaluation inexpensive in terms of human effort. Moreover, the approach is component based: we define separate evaluation tasks for different components of a web search engine. For each component, queries are designed specifically to assess the functionality of that component alone, and the retrieved results are scored accordingly. For example, to assess the spell-correction component, the input query contains a typo, and only occurrences of the correctly spelled form in the results are scored positively. Experimental results from applying thousands of queries to two Persian and two language-independent web search engines show that none of the selected engines dominates the other three across all components; instead, each has its own strengths and weaknesses, which this evaluation highlights.
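To make the component-based scoring concrete, the following is a minimal sketch of how the spell-correction task described above could be scored. The function name, regex matching, and averaging are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
import re

def score_spell_correction(results, corrected):
    """Score a result list for a spell-correction query (hypothetical sketch).

    A result counts positively only if it contains the corrected form of
    the misspelled query term; occurrences of the typo earn no credit.
    """
    score = 0
    pattern = r"\b" + re.escape(corrected.lower()) + r"\b"
    for snippet in results:
        if re.search(pattern, snippet.lower()):
            score += 1  # correctly spelled form found: positive score
    return score / len(results) if results else 0.0

# Example: the query contained the typo "recieve"; only results that
# contain the corrected form "receive" are rewarded.
results = ["You will receive a reply soon.", "How to recieve mail"]
print(score_spell_correction(results, "receive"))  # 0.5
```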









Notes
Information Retrieval.
Text REtrieval Conference.
In this system, the universal set contains webpages related to a query. These pages are used to elicit a value interval for each query feature, from which the relevancy score of a webpage can be calculated with respect to each of its features.
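A hypothetical illustration of this interval idea follows; the actual features and aggregation used by the system are not specified here, so both functions below are assumptions.

```python
def feature_intervals(u_set_features):
    """Elicit a [min, max] interval per feature from the U-set pages,
    given their feature vectors as a list of dicts (assumed input)."""
    intervals = {}
    for features in u_set_features:
        for name, value in features.items():
            lo, hi = intervals.get(name, (value, value))
            intervals[name] = (min(lo, value), max(hi, value))
    return intervals

def relevancy_score(page_features, intervals):
    """Score a page by the fraction of its features falling inside the
    elicited intervals (a simple assumed aggregation)."""
    hits = sum(lo <= page_features.get(name, lo - 1) <= hi
               for name, (lo, hi) in intervals.items())
    return hits / len(intervals) if intervals else 0.0
```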
The U set.
P_5 denotes precision after the first five documents have been retrieved (i.e., P@5).
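As a reminder, the standard definition of this metric (not specific to this paper) is

```latex
P@5 = \frac{1}{5} \sum_{i=1}^{5} \mathrm{rel}_i, \qquad \mathrm{rel}_i \in \{0, 1\},
```

where rel_i indicates whether the i-th retrieved document is judged relevant.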
If an irrelevant page is ranked higher than the relevant results of a query, the search engine receives a score of −1 for that query.
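A minimal sketch of this penalty rule, assuming one known relevant page per navigational query; only the −1 penalty is stated in the text, so the +1 credit in the positive case is an assumption.

```python
def navigational_query_score(ranked_urls, relevant_url):
    """Return -1 if an irrelevant page outranks the known relevant page
    of a navigational query, otherwise a positive score (assumed +1)."""
    if not ranked_urls or ranked_urls[0] != relevant_url:
        return -1  # an irrelevant page precedes the relevant one, or it is missing
    return 1
```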
Acknowledgements
We thank all the group members for their support and cooperation. This project was supported by the Iran Telecommunication Research Center under project ID 901952720.
Cite this article
Alashti, A.H., Rezaei, A.A., Elahi, A. et al. Parsisanj: an automatic component-based approach toward search engine evaluation. J Supercomput 78, 10690–10711 (2022). https://doi.org/10.1007/s11227-022-04306-9