Implicit user behaviours to improve post-retrieval document relevancy
Introduction
With the ever-growing volume of information on the Web, finding high-quality, relevant information within a large collection of texts is a challenging issue. The tremendous growth of both information and usage has introduced an information overload problem in which users find it increasingly difficult to locate the right information at the right time. This phenomenon has prompted widespread research on information retrieval, the science of searching for information in a large collection of documents. Information retrieval deals with the storage, representation and organisation of, and access to, information items (Baeza-Yates & Ribeiro-Neto, 1999). A typical example of an information retrieval system is a web search engine such as Google.
Building effective mechanisms to improve search performance is a challenge, and studies have explored various techniques, including the use of user feedback. Explicit feedback requires users to state their preferences directly, for example by specifying keywords, commenting, answering questions, rating or ranking. This approach requires users to engage in additional activities beyond their normal searching behaviour, resulting in a higher user cost (time and effort). Implicit feedback, which estimates users’ preferences from their interactions and behaviours, such as dwell time (i.e. the amount of time spent on a page), scrolling, printing and bookmarking, is therefore more appealing. User feedback has been shown to improve the relevancy of search results, and many studies have combined multiple feedback approaches (i.e. more than one implicit or explicit feedback), such as Fox et al., 2005, Claypool et al., 2001, Morita and Shinoda, 1994, Joachims et al., 2007, Liu et al., 2011, White et al., 2010, Buscher et al., 2012, Guo and Agichtein, 2012, or combined both implicit and explicit approaches into a hybrid model (Liu et al., 2010, Núñez-Valdéz et al., 2012, Park, 2013, Zhu et al., 2012).
The focus of our study is to explore how multiple implicit feedback techniques can be integrated to improve post-retrieval document relevancy. Although much work has been done on combining implicit feedback, the current study differs in that we propose a re-ranking algorithm which takes into account two common techniques (i.e. dwell time and click-through) and two other techniques, namely page review (i.e. returning to the same document) and text selection. Text selection and page review in particular are deemed interesting additions, as they measure a user’s post-click behaviour. The study also compares the performance of each feedback technique separately with that of the integrated model. The main research question of the current study is therefore “Does integrating dwell time, click-through, page review and text selection improve the search performance for a query?” To evaluate the prediction accuracy of the proposed method, an experiment was conducted using a self-developed prototype search engine. Implicit feedback was gathered while users interacted with the list of ranked results on the search engine results page (SERP). These data were then fed into the re-ranking algorithm so as to improve document relevancy. The overall evaluation results were analysed and compared with the baseline algorithm (TF-IDF). In addition, the results were compared among the various implicit feedback models. All comparisons were made on the top 10, 15 and 25 documents.
The remainder of the paper is structured as follows. Implicit feedback and some of the notable works in this area are described in the following section. The research methodology is then presented, with explanations of the re-ranking algorithm, evaluation metrics, experimental setup, etc. This is followed by the results and discussion. Finally, Section 5 concludes the paper.
Implicit feedback
Implicit feedback unobtrusively obtains information about users’ behaviour by observing their interactions with the system. Common techniques used to gather implicit feedback include dwell time, saving, scrolling, bookmarking, printing and click-through, among others. Compared to explicit feedback, inferences drawn from implicit feedback are considered less reliable; however, large quantities of data can be gathered implicitly without requiring any additional activity from the users (Jung,
Research methodology
Fig. 1 illustrates the method proposed in this study to improve document relevance. When a user performs a new search via a query, a SERP is returned to the user based on the classical information retrieval algorithm, TF-IDF (further details in Salton & Buckley, 1988). As the user interacts with the results displayed, his/her behaviours (i.e. dwell time, click-through, page review and text selection) are captured implicitly and then fed into the re-ranking algorithm which sorts
Results and discussion
The models were compared in three ways: (i) TF-IDF_Integrated versus TF-IDF, (ii) each of the four implicit feedback models versus TF-IDF, and (iii) all five implicit feedback models against one another.
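Comparisons of this kind at fixed cutoffs (the study uses the top 10, 15 and 25 documents) are often expressed as precision at k. The sketch below assumes that metric for illustration only; the evaluation metrics actually used in the paper are not listed in this excerpt, and the function name and inputs are hypothetical.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


def compare_models(rankings, relevant_ids, cutoffs=(10, 15, 25)):
    """Return {model name: {k: precision@k}} for each ranking and cutoff.

    rankings: dict mapping a model name to its ranked list of doc ids
    """
    return {
        name: {k: precision_at_k(ranked, relevant_ids, k) for k in cutoffs}
        for name, ranked in rankings.items()
    }
```

Calling `compare_models` with one ranked list per model (for example the baseline and the integrated model) gives a small table of scores that can be read side by side at each cutoff.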
Conclusion
In this paper we proposed an integrated implicit feedback model to improve post-retrieval document relevancy. The techniques combined were dwell time, click-through, page review and text selection. Our technique first ranks and presents a list of results for a query based on the classical TF-IDF algorithm. As the user interacts with the system, his/her pattern of interaction (i.e. implicit feedback) is captured. A re-ranking algorithm then incorporates the captured interactions to re-rank
Acknowledgements
The authors wish to thank the University of Malaya (RG103-12ICT) for supporting this study. Gratitude also goes to all the students who participated in the user testing.
References (32)
- et al. Post-retrieval search hit clustering to improve information retrieval effectiveness: Two digital forensics case studies. Decision Support Systems (2011).
- et al. Click data as implicit relevance feedback in web search. Information Processing & Management (2007).
- et al. How do users describe their information need: Query recommendation based on snippet click model. Expert Systems with Applications (2011).
- et al. Implicit feedback techniques on recommender systems applied to electronic books. Computers in Human Behavior (2012).
- An adaptive match-making system reflecting the explicit and implicit preferences of users. Expert Systems with Applications (2013).
- et al. Term weighting approaches in automatic text retrieval. Information Processing and Management (1988).
- et al. How people revisit web pages: Empirical findings and implications for the design of history systems. International Journal of Human-Computer Studies, Special issue: World Wide Web Usability (1997).
- et al. User interest modeling and self-adaptive update using relevance feedback technology. Procedia Engineering (2012).
- Agichtein, E., Brill, E., & Dumais, S. (2006). Improving web search ranking by incorporating user behaviour...
- Ahn, J., Brusilovsky, P., He, D., Grady, J., & Li, Q. (2008). Personalized web exploration with task models. In...
- Modern information retrieval.
- A3CRank: An adaptive ranking method based on connectivity, content and click-through data. Information Processing & Management.
- Inferring user interest. IEEE Internet Computing.
- Evaluating implicit measures to improve web search. ACM Transactions on Information Systems.
Cited by (11)
Fuzzy rule based profiling approach for enterprise information seeking and retrieval
2017, Information Sciences
Citation excerpt: This was proven by comparing observational studies with explicit interest measures [37,50,59,51]. Current research shows that the combination of several relevance feedback parameters can produce better results [37,79,15,10,23]. It was found that reading time, along with other user behaviour can be a very reliable indicator of content relevancy.
Comparative analysis of relevance feedback methods based on two user studies
2016, Computers in Human Behavior
Citation excerpt: Similar success was reported by Shapira, Taieb-Maimon and Moskowitz (Shapira, Taieb-Maimon, & Moskowitz, 2006). Balakrishnan and Zhang (Balakrishnan & Zhang, 2014) examined the effect of some implicit indicators on post-retrieval document relevancy. They found that a combination of text selection, dwell time, click-through and page review post-click behaviour can improve the precision of relevance feedback.
How does the first buggy file work well for iterative IR-based bug localization?
2022, Proceedings of the ACM Symposium on Applied Computing
Gender differentials and implicit feedback on online video content: enhancing user interest evaluation
2019, Industrial Management and Data Systems
Context-aware adaptive m-learning: Implicit indicators of learning performance, perceived usefulness, and willingness to use
2019, Computers in Education Journal
A novel hybrid knowledge retrieval approach for online customer service platforms
2018, 26th European Conference on Information Systems: Beyond Digitization - Facets of Socio-Technical Change, ECIS 2018