Article

Beyond classical measures: how to evaluate the effectiveness of interactive information retrieval system?

Published: 23 July 2007. DOI: 10.1145/1277741.1277984

ABSTRACT

This research explores the relationship between Information Retrieval (IR) systems' effectiveness and users' performance (accuracy and speed) and satisfaction with the retrieved results (precision of the results, completeness of the results, and overall system success). Previous studies have concluded that improvements in IR systems, as measured by increases in IR effectiveness measures, do not translate into improvements in users' performance. This work aims to identify factors that may act as confounding variables in Interactive Information Retrieval (IIR) evaluation. In this research, we look at substantive approaches to evaluating IIR systems. We aim to build an interactive evaluation framework that brings together aspects of system effectiveness and of users' performance and satisfaction. This research also involves developing methods for capturing users' satisfaction with the retrieved results of IR systems, as well as examining how users assess their own performance in task completion. Furthermore, we are interested in identifying evaluation measures that are used in batch-mode (non-interactive) experiments but also correlate well with interactive IR performance. By the end of this research, we hope to have developed valid and reliable metrics for IIR evaluation.

A first study was set up to explore the relationship between system effectiveness, as quantified by traditional measures such as precision and recall, and users' effectiveness and satisfaction with the results, though this study was limited to a small number of users. The tasks involved finding images in recall-oriented tasks. We found no direct relationship between system effectiveness and users' performance: people learn to adapt to a system regardless of its effectiveness. This study recommends that a combination of measures (e.g. system effectiveness, user performance, and satisfaction) be used to evaluate IIR systems. Based on our observations from this study, we also found that users' familiarity with the search topic improved their performance.
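
For reference, the classical set-based measures referred to above can be written as follows, where R denotes the set of relevant documents and A the set of retrieved documents; this is the standard formulation rather than anything specific to this study:

```latex
\mathrm{Precision} = \frac{|R \cap A|}{|A|}, \qquad
\mathrm{Recall} = \frac{|R \cap A|}{|R|}
```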

Thus, we set up a second experiment to investigate how users' satisfaction correlates with IR effectiveness measures such as precision and the suite of Cumulative Gain measures (CG, DCG, NDCG) in simple web searching tasks. Results from this study show that CG and precision are better than NDCG at reflecting users' satisfaction with the results of an IR system. We also concluded that users of web search engines, in the context of simple search tasks, are more concerned with precision than with the completeness of the search: users' satisfaction with the overall success of the search correlated more strongly with their satisfaction with the accuracy of the results than with their satisfaction with the completeness of the search.
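
To make the measures concrete, the sketch below computes CG, DCG, and NDCG for a ranked list of graded relevance judgments and correlates hypothetical per-query satisfaction ratings with a measure. It is a minimal illustration, assuming the common log-2 rank discount of Järvelin and Kekäläinen and SciPy's spearmanr for the rank correlation; the gain scale, the example data, and the choice of Spearman's rho are our assumptions, not details taken from the study.

```python
import math

from scipy.stats import spearmanr  # assumed available; any rank-correlation routine works


def cg(gains):
    """Cumulative Gain: the sum of graded relevance scores over the ranked list."""
    return sum(gains)


def dcg(gains, base=2):
    """Discounted Cumulative Gain: each gain is divided by log_base(rank),
    with ranks at or below the base left undiscounted."""
    return sum(g / max(1.0, math.log(rank, base))
               for rank, g in enumerate(gains, start=1))


def ndcg(gains, base=2):
    """Normalized DCG: DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True), base)
    return dcg(gains, base) / ideal if ideal > 0 else 0.0


# Graded relevance of the top five results for one hypothetical query
# (3 = highly relevant, 0 = not relevant).
run = [3, 0, 2, 1, 0]
print(f"CG={cg(run)}  DCG={dcg(run):.3f}  NDCG={ndcg(run):.3f}")

# Correlating per-query satisfaction ratings (say, a 1-5 Likert scale)
# with per-query NDCG across six queries; all numbers are made up.
satisfaction = [4, 2, 5, 3, 4, 1]
ndcg_scores = [0.9, 0.4, 1.0, 0.6, 0.8, 0.2]
rho, p = spearmanr(satisfaction, ndcg_scores)
print(f"Spearman rho={rho:.2f} (p={p:.3f})")
```

Note that NDCG normalizes by an ideal ranking the user never sees, which is one possible reason a rank-discounted, normalized score could track perceived satisfaction less directly than CG or precision.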

Many scholars, such as [1], [2], [3], and [4], have recommended treating users' perceptions as being as important as IR effectiveness measures, with both interpreted as measures of effectiveness. Therefore, IIR evaluation should not focus on maximizing retrieval performance by refining IR techniques alone, but should also seek to understand users' satisfaction, behaviours, and information needs. This raises the need for further investigation of measures that capture users' performance and satisfaction as criteria of system quality.

Future plans are to incorporate variables such as domain knowledge, motivation, task complexity, and search behaviours, and to study their effect on user performance and on users' evaluations of IR system performance, when evaluating interactive IR systems; this is an attempt to explore the suitability of different measures for IIR evaluation. Thus, the proposed approach adopts a systematic and multidimensional approach to evaluation that includes not only classical evaluation measures, such as precision and recall, but also interactive, non-traditional measures, such as users' characteristics and their satisfaction.

References

  1. Belkin, N. J., Muresan, G. & Zhang, X.-M. Using User's Context for IR Personalization. SIGIR 2004, Sheffield, UK, 2004.
  2. Järvelin, K. & Ingwersen, P. Information seeking research needs extension towards tasks and technology. Information Research, 10, 212, 2004.
  3. Turpin, A. & Scholer, F. User Performance versus Precision Measures for Simple Search Tasks. SIGIR 2006, Seattle, Washington, USA, 2006.
  4. Su, L. T. Evaluation measures for interactive information retrieval. Information Processing & Management, 28, 503–516, 1992.

Published in

SIGIR '07: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, New York, NY, USA, July 2007, 946 pages. ISBN 9781595935977. DOI: 10.1145/1277741. Copyright © 2007 ACM.
