An information retrieval benchmarking model of satisficing and impatient users’ behavior in online search environments
Graphical abstract
Information retrieval behavior of the patient, satisficing and impatient decision makers (DMs) through the initial nodes of a binary decision tree.
Introduction
Users tend to evaluate the alternatives obtained from a search in the order provided by the online search engine (Epstein and Robertson, 2015, Gao and Shah, 2020, Luo et al., 2011). Eye-tracking technology has validated this feature as well as the biased focus of users towards the highest-ranked alternatives (Lewandowski and Kammerer, 2020, Lorigo et al., 2008). Indeed, the first two alternatives composing the ranking receive a disproportionate number of clicks compared to the remaining ones within the first page of search results (Chitika, 2013, Dean, 2019).
The formalization of the information retrieval behavior observed through standard utility approaches must deal with the cognitive limits of users (Gupta et al., 2018, Lieder and Griffiths, 2020), whose behavior cannot be based on the more than 3.6 million permutations (10! = 3,628,800) that can be computed from the ten results composing the initial page delivered by the engine (Basu, 2018, Victorelli et al., 2020). We are therefore left with the order implicit in the ranking provided by the engine as the only guideline available to replicate the information retrieval behavior of users (European Commission, 2016).
The satisficing approach to information retrieval and user behavior has gained considerable attention in recent years, particularly within experimental settings. The empirical literature comparing the maximization and satisficing approaches to information retrieval emphasizes the difficulties faced when eliciting the utility derived from the outcomes of the search process (Misuraca & Fasolo, 2018). These analyses focus particularly on students, who are also found to generally follow a satisficing approach when identifying optimal information sources (List & Alexander, 2017). Note that, though not explicitly, the intuition behind the satisficing approach implies a certain degree of impatience from users, who may decide to conclude the retrieval process as soon as they find a suitable alternative.
The decision-theoretical literature has managed to identify the main characteristics determining the behavior of impatient users (Ghafurian et al., 2020). The complexity of the interactions taking place between users and programs has been widely documented and ranges from reactions to different response times to connections within the emotional domain (Norman and Kirakowski, 2017, Victorelli et al., 2020). Initial studies concluded that the waiting time users were willing to tolerate for the download of a Web page when retrieving information was approximately two seconds (Nah, 2004). The importance assigned by users to slow interactions and response times has evolved over time, with their demand for timely information adapting to the new technological paradigm (Lohr, 2012). Recently, impatient users have become the focus of analysis in queuing-related environments, providing fertile ground for the development of potential extensions of the algorithmic framework presented (Bolandifar et al., 2019, Li et al., 2018).
Credibility considerations have also gained considerable relevance, particularly in strategic and medical environments (Machackova & Smahel, 2018). The evaluation of information is a complex procedure where credibility is determined by the characteristics of users, contents, and information sources, encompassing also the task motivating the search and its relative difficulty (Lee & Pang, 2018). For instance, cognitive scientists have highlighted the reliance of users on information scent when evaluating the alignment of alternatives with their preferences (Karanam et al., 2016, Ong et al., 2017).
Regarding applicability, tourism research is one of the leading academic fields analyzing information retrieval processes conditioned by the output delivered by different search engines and recommender systems. The empirical findings from this research area describe decision-makers (DMs) overwhelmed by the amount of information that they must assimilate and process (Zillinger, 2020). Thus, DMs must rely on search engines without understanding the algorithmic mechanism delivering the results or even their own search strategies (Pirolli, 2018). As a result, this branch of the literature has emphasized the considerable confusion that exists regarding the actual search strategies of users (Lu & Gursoy, 2015).
The main contribution of the current paper is the design of a series of algorithmic benchmarks allowing researchers to extrapolate the behavior of DMs when facing different types of search frictions. The algorithms are built on basic behavioral assumptions so that further modifications to the incentives driving the retrieval process can be implemented. We aim to provide a reference framework of analysis for empirical studies when determining the consequences of modifications to the information retrieval incentives of DMs.
It is important to emphasize what is not being analyzed by the algorithms. The structural complexity of the algorithms contrasts with the basic characterization of the decision nodes. That is, each decision node is based on a simple command stating that, if the stochastic realization observed satisfies the cutoff condition, then the alternative is evaluated. The value of the stochastic realization reflects the characteristics of the alternative being evaluated, as observed by the DM, which are compared to the subjective preferences defined by the DM as determinants of the cutoff value. The algorithms do not consider:
- how the realizations observed follow from the cognitive capacities of DMs;
- how the cutoff values are defined based on the subjective preferences of DMs.
That is, we do not study how these features are determined but benchmark the response of DMs using the behavioral data provided by search engines. The literature dealing with these characterizations extends through different research fields, ranging from empirical decision-making (Doniec et al., 2020, Jankowski et al., 2016) and psychology (Khamitov et al., 2019) to cognitive sciences (Dou et al., 2010). Each of these areas has provided ample evidence regarding the factors that determine the alignment of the preferences of DMs with the characteristics observed. These factors are quite varied and encompass product features (Lu and Altenbek, 2021, Zhu and Zhang, 2010), and the subjective characteristics of DMs (Lauraéus et al., 2015, Sadiq et al., 2021, Shafiq et al., 2015), ranging from gender differentials (Bae & Lee, 2011) to cognitive (Bartels and Johnson, 2015, Kimmel, 2012) and psychological frictions (Lerner et al., 2015).
We design a benchmark information retrieval algorithm mimicking the click-through rate (CTR) behavior of users when deciding which alternatives to click on from the first page of results displayed by a search engine. The CTR of a given alternative is defined as the number of users who click on the link to the alternative divided by the total number of users performing a search. The decision-tree structure of this benchmark algorithm accounts for the 1,023 binary decision nodes that users may have to consider as they retrieve information from the first page of results and the 1,024 final nodes describing the potential evaluation vectors generated through the different retrieval paths that may be followed by DMs. The only assumption imposed on the information retrieval process of users is that they observe alternatives in the order provided by the engine. In this regard, the benchmark algorithm assumes that users consider clicking on the different alternatives with a decreasing probability as they proceed through the ranking delivered by the engine. We equate the probability of clicking on an alternative to the empirical value of the CTR described in Dean (2019) and simulate a total of 1,000,000 queries per configuration to evaluate the ability of the algorithm to replicate the behavior observed.
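The benchmark described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the names `CLICK_PROBS`, `count_tree_nodes`, and `simulate_patient_ctr` are hypothetical, the click probabilities are placeholders rather than the CTR values reported in Dean (2019), and clicks at different ranks are assumed to be drawn independently.

```python
import random

# Hypothetical click probabilities for ranks 1..10, decreasing with the rank.
# These are illustrative placeholders, NOT the CTRs reported by Dean (2019).
CLICK_PROBS = [0.30, 0.25, 0.18, 0.13, 0.09, 0.06, 0.04, 0.03, 0.03, 0.03]


def count_tree_nodes(n_alternatives):
    """Return (decision_nodes, final_nodes) for a full binary tree with one
    click / no-click decision per ranked alternative."""
    decision_nodes = 2 ** n_alternatives - 1
    final_nodes = 2 ** n_alternatives
    return decision_nodes, final_nodes


def simulate_patient_ctr(click_probs, n_queries, seed=0):
    """Estimate the CTR of each rank for a patient user who considers every
    alternative in the order given by the engine.

    Assumption: each click decision is an independent Bernoulli draw.
    """
    rng = random.Random(seed)
    clicks = [0] * len(click_probs)
    for _ in range(n_queries):
        for rank, prob in enumerate(click_probs):
            if rng.random() < prob:
                clicks[rank] += 1
    # CTR = users clicking on the alternative / users performing a search.
    return [count / n_queries for count in clicks]
```

Calling `count_tree_nodes(10)` reproduces the 1,023 decision nodes and 1,024 final nodes mentioned in the text, and `simulate_patient_ctr(CLICK_PROBS, 1_000_000)` returns simulated CTRs that should lie close to the input probabilities under the independence assumption.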
After illustrating the capacity of the benchmark algorithm to replicate the CTR behavior observed, we analyze the effects of an increase in the impatience of users as they retrieve information on the alternatives provided by the search engine. As emphasized above, impatience represents a consistent characteristic defining the behavior of online users, a feature increasingly exacerbated at the mobile search level (Google, 2016, Varnali et al., 2012). We define two versions of the initial algorithm to illustrate the importance of impatience as a determinant of CTRs conditioned by the ranking position of the alternatives.
- The first version assumes that users proceed sequentially through the ranking until they find an alternative satisficing their expectations. Once they find a satisficing alternative, DMs continue retrieving information until they observe an alternative that violates their expectations.
- The second version increases the impatience of users, who stop retrieving information as soon as an alternative does not satisfy their expectations – even if it is the top-ranked one.
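The stopping rules of the patient benchmark and the two versions listed above can be contrasted in a short sketch. The code below is an illustration under stated assumptions rather than the paper's algorithm: `clicked_ranks` and `simulate_ctr` are hypothetical names, an alternative is assumed to be clicked whenever it satisfies the expectations of the user, and each alternative is assumed to satisfy those expectations independently with a given probability.

```python
import random

BEHAVIORS = ("patient", "satisficing", "impatient")


def clicked_ranks(satisfies, behavior):
    """Return the 1-based ranks clicked by a user.

    satisfies: list of bools in engine order; True means the alternative
    meets the expectations of the user (assumed to trigger a click).
    """
    if behavior not in BEHAVIORS:
        raise ValueError(f"unknown behavior: {behavior}")
    clicks = []
    for rank, ok in enumerate(satisfies, start=1):
        if ok:
            clicks.append(rank)
        elif behavior == "impatient":
            break  # stop at the first violation, even at the top rank
        elif behavior == "satisficing" and clicks:
            break  # stop at the first violation after a satisficing find
        # patient users, and satisficing users still searching, continue
    return clicks


def simulate_ctr(satisfy_probs, behavior, n_queries, seed=0):
    """Estimate the CTR per rank under the chosen stopping rule.

    Assumption: satisfaction at each rank is an independent Bernoulli draw.
    """
    rng = random.Random(seed)
    counts = [0] * len(satisfy_probs)
    for _ in range(n_queries):
        satisfies = [rng.random() < p for p in satisfy_probs]
        for rank in clicked_ranks(satisfies, behavior):
            counts[rank - 1] += 1
    return [c / n_queries for c in counts]
```

For the vector `[False, True, True, False, True]`, the patient user clicks on ranks 2, 3 and 5, the satisficing user on ranks 2 and 3, and the impatient user on none. Under these assumptions, impatience concentrates clicks on the top-ranked alternatives, since reaching rank k requires every earlier alternative to have satisfied the user.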
In all cases, the algorithms must incorporate the whole set of potential branches that may be generated through the information retrieval process. This requirement is relatively simple to meet when accounting for impatient evaluation structures dealing with a total of 21 nodes, but its complexity increases considerably when dealing with the 2,047 nodes required to model the sequential behavior of the benchmark (i.e., patient) users. The resulting scenarios, including the intermediate satisficing setting, introduced to accommodate the bounded rationality approaches implemented in the economics and managerial literature, are developed in the next section.
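The node counts quoted above can be checked with a short calculation. The patient figures follow from a full binary tree with one decision per ranked alternative. The decomposition of the 21 impatient nodes into 10 decision nodes and 11 final nodes is our reading of the stopping rule (one exit after each of the ten possible violations, plus one final node when all ten alternatives satisfy the user) and is stated here as an assumption.

```latex
\begin{align*}
\text{patient decision nodes} &= \sum_{k=0}^{9} 2^{k} = 2^{10} - 1 = 1023,\\
\text{patient final nodes}    &= 2^{10} = 1024,\\
\text{patient total}          &= 1023 + 1024 = 2047,\\
\text{impatient total}        &= \underbrace{10}_{\text{decision nodes}}
                                 + \underbrace{11}_{\text{final nodes}} = 21.
\end{align*}
```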
It should be emphasized that our approach differs completely from the one generally implemented in the systems literature. We do not design an experimental framework whose results are compared to the actual behavior of users (Schneider et al., 2019, Speier-Pero, 2019), a methodology that also represents the standard procedure in the literature on electronic commerce (Sun et al., 2020, Yoo et al., 2016). We define a simulation benchmark determined by the search behavior of users online, whose modifications can be validated through experiments or empirical analyses (Bell and Mgbemena, 2018, Dunke and Nickel, 2020).
That is, the proposed algorithms constitute a malleable structure that allows the simulation of a wide variety of behavioral phenomena, providing a benchmark evaluation framework for the systems literature (Hong et al., 2021, Mahony et al., 2016, Zhang et al., 2020). The algorithmic structures proposed also provide a benchmark counterpart to the big data analyses performed to elicit the behavior of users through the implementation of artificial intelligence techniques (Dell’Aversana & Bucciarelli, 2018).
The paper proceeds as follows. Sections 2 (Decision trees as evaluation techniques) and 3 (Designing the information retrieval algorithms) describe the intuition required to formalize the different information retrieval algorithms. The behavior of users under different willingness to search and impatience scenarios is analyzed in Section 4. Section 5 incorporates search frictions into the analysis and presents the main managerial implications. Section 6 concludes and suggests potential extensions.
Decision trees as evaluation techniques
Decision trees provide a formal structure sufficiently flexible to analyze most sequential information retrieval processes common to the management and operations research literature (Pei and Hu, 2018, Sagi and Rokach, 2020). Despite this fact, the systems literature has not generally considered the application of decision trees to formalize information retrieval incentives in online evaluation environments.
The apparent simplicity of the corresponding information retrieval process may have led
Designing the information retrieval algorithms
The information retrieval algorithms analyzed are intuitively illustrated in Fig. 1, Fig. 2, Fig. 3. The sequential evaluation structures described through these figures incorporate each alternative in the order delivered by the search engine. That is, the first element composing each initial node within all figures, denoted by 1°, corresponds to the first element ranked by the search engine. The same intuition applies to the remaining elements composing the nodes until the tenth and final one,
Impatience and click through rates
The intuition on which the mimicking algorithm is built follows directly from the sequential behavior observed among online users. Dean (2019) analyzed a sample of five million queries and computed the CTRs of the organic results ranked within the first page of Google. The second column of Table 1 describes the CTRs reported by Dean (2019), which are then compared with those obtained from the different information retrieval algorithms determined by the relative impatience of users. The
Managerial implications
The decision trees described throughout the paper are sufficiently flexible to incorporate any subjective constraints inherent to the preferences of DMs. In addition, the trees can be modified at each node depending on the complexity of the decision process that must be analyzed. This latter feature highlights the main implications of the current set of algorithms from a managerial viewpoint.
Managers are provided with a benchmark structure illustrating the main consequences derived from any
Conclusion
We have designed a benchmark algorithm that mimics the information retrieval behavior of users when evaluating the initial page of alternatives ranked by a search engine. Several versions of the algorithm have been defined to account for different degrees of user impatience and their effects on the resulting CTRs. The main results obtained are conditioned by the structural differences existing among the impatient scenarios analyzed, which, together with the corresponding evaluation
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
Dr. Madjid Tavana is grateful for the partial financial support he received from the Czech Science Foundation (GACR 19-13946S).
References (64)
- et al. Connecting cognition and consumer choice. Cognition (2015)
- Information search in the internet markets: Experience versus search goods. Electronic Commerce Research and Applications (2018)
- et al. Purchase intention-based agent for customer behaviours. Information Sciences (2020)
- et al. Neural networks for the metamodeling of simulation models with online decision making. Simulation Modelling Practice and Theory (2020)
- et al. Toward creating a fairer ranking in search engine results. Information Processing & Management (2020)
- et al. Big data with cognitive computing: A review for the future. International Journal of Information Management (2018)
- et al. CPIN: Comprehensive present-interest network for CTR prediction. Expert Systems with Applications (2021)
- et al. Fuzzy multi-objective modeling of effectiveness and user experience in online advertising. Expert Systems with Applications (2016)
- et al. Approaches to displaying information to assist decisions under uncertainty. Omega (2012)
- et al. How do users describe their information need: Query recommendation based on snippet click model. Expert Systems with Applications (2011)