An information retrieval benchmarking model of satisficing and impatient users’ behavior in online search environments

https://doi.org/10.1016/j.eswa.2021.116352

Highlights

  • We study the effect of user impatience and search rankings on click-through rates (CTR).

  • We design a stochastic information retrieval algorithm to mimic real-life CTR behavior.

  • Two versions of the algorithm are defined to incorporate the degree of user impatience.

  • The CTRs of the top three ranked alternatives remain stable as users grow impatient.

  • CTR differences widen as growingly impatient users proceed halfway through the ranking.

Abstract

This study analyzes the effects that the position of the alternatives ranked by a search engine and the relative impatience of users have on their information retrieval behavior. We design a stochastic information retrieval algorithm calibrated to mimic the click-through rates (CTRs) of users observed in real-life environments. We introduce two versions of the mimicking algorithm designed to demonstrate the importance of impatience as a determinant of CTRs conditioned by the alternatives’ ranking position. The first version assumes that users proceed sequentially through the ranking until they find an alternative satisficing their expectations. Once they find a satisficing alternative, they continue retrieving information until they observe an alternative that violates their expectations. The second version increases users’ impatience, who stop retrieving information as soon as an alternative does not satisfy their expectations – even if it is the top-ranked one. All three algorithmic structures are sufficiently malleable to incorporate any potential modification to users’ beliefs and preferences. We simulate sets of 1,000,000 queries to illustrate how the CTRs of the top three ranked alternatives remain stable as users grow impatient, with differences widening as growingly impatient users proceed halfway through the ranking.

Graphical abstract

Information retrieval behavior of the patient, satisficing and impatient decision makers (DMs) through the initial nodes of a binary decision tree.


Introduction

Users tend to evaluate the alternatives obtained from a search in the order provided by the online search engine (Epstein and Robertson, 2015, Gao and Shah, 2020, Luo et al., 2011). Eye-tracking technology has validated this feature as well as the biased focus of users towards the highest-ranked alternatives (Lewandowski and Kammerer, 2020, Lorigo et al., 2008). Indeed, the first two alternatives composing the ranking receive a disproportionate number of clicks compared to the remaining ones within the first page of search results (Chitika, 2013, Dean, 2019).

The formalization of the information retrieval behavior observed through standard utility approaches must deal with the cognitive limits of users (Gupta et al., 2018, Lieder and Griffiths, 2020), whose behavior cannot be based on the almost four million permutations that can be computed from the ten results composing the initial page delivered by the engine (Basu, 2018, Victorelli et al., 2020). We are therefore left with the order implicit in the ranking provided by the engine as the only guideline available to replicate the information retrieval behavior of users (European Commission, 2016).

The satisficing approach to information retrieval and user behavior has gained considerable attention in recent years, particularly within experimental settings. The empirical literature comparing the maximization and satisficing approaches to information retrieval emphasizes the difficulties faced when eliciting the utility derived from the outcomes of the search process (Misuraca & Fasolo, 2018). These analyses focus particularly on students, who are also found to generally follow a satisficing approach when identifying optimal information sources (List & Alexander, 2017). Note that, though not explicitly, the intuition behind the satisficing approach implies a certain degree of impatience from the user, who may decide to conclude the retrieval process as soon as they find a suitable alternative.

The decision-theoretical literature has managed to identify the main characteristics determining the behavior of impatient users (Ghafurian et al., 2020). The complexity of the interactions taking place between users and programs has been widely documented and ranges from reactions to different response times to connections within the emotional domain (Norman and Kirakowski, 2017, Victorelli et al., 2020). Initial studies concluded that users were willing to tolerate a wait of approximately two seconds for the download of a Web page when retrieving information (Nah, 2004). The importance assigned by users to slow interactions and response times has evolved through time, with their demand for timely information adapting to the new technological paradigm (Lohr, 2012). Recently, impatient users have become the focus of analysis in queuing-related environments, providing fertile ground for the development of potential extensions of the algorithmic framework presented (Bolandifar et al., 2019, Li et al., 2018).

Credibility considerations have also gained considerable relevance, particularly in strategic and medical environments (Machackova & Smahel, 2018). The evaluation of information is a complex procedure where credibility is determined by the characteristics of users, contents, and information sources, encompassing also the task motivating the search and its relative difficulty (Lee & Pang, 2018). For instance, cognitive scientists have highlighted the reliance of users on information scent when evaluating the alignment of alternatives with their preferences (Karanam et al., 2016, Ong et al., 2017).

Regarding applicability, tourism research is one of the leading academic fields analyzing information retrieval processes conditioned by the output delivered by different search engines and recommender systems. The empirical findings from this research area describe decision-makers (DMs) overwhelmed by the amount of information that they must assimilate and process (Zillinger, 2020). Thus, DMs must rely on search engines without understanding the algorithmic mechanism delivering the results and even their own search strategies (Pirolli, 2018). As a result, this branch of the literature has emphasized the considerable confusion that exists regarding the actual search strategies of users (Lu & Gursoy, 2015).

The main contribution of the current paper is the design of a series of algorithmic benchmarks allowing researchers to extrapolate the behavior of DMs when facing different types of search frictions. The algorithms are built on basic behavioral assumptions so that further modifications to the incentives driving the retrieval process can be implemented. We aim to provide a reference framework of analysis for empirical studies when determining the consequences of modifications to the information retrieval incentives of DMs.

It is important to emphasize what is not being analyzed by the algorithms. The structural complexity of the algorithms contrasts with the basic characterization of the decision nodes. That is, each decision node is based on a simple command stating that if a random uniform realization > the cutoff value assigned to the alternative, then the alternative is evaluated. The value of the stochastic realization reflects the characteristics of the alternative being evaluated – as observed by the DM –, which are compared to the subjective preferences defined by the DM as determinants of the cutoff value. The algorithms do not consider

  • how the realizations observed follow from the cognitive capacities of DMs;

  • how the cutoff values are defined based on the subjective preferences of DMs.

That is, we do not study how these features are determined but benchmark the response of DMs using the behavioral data provided by search engines. The literature dealing with these characterizations extends through different research fields, ranging from empirical decision-making (Doniec et al., 2020, Jankowski et al., 2016) and psychology (Khamitov et al., 2019) to cognitive sciences (Dou et al., 2010). Each of these areas has provided ample evidence regarding the factors that determine the alignment of the preferences of DMs with the characteristics observed. These factors are quite varied and encompass product features (Lu and Altenbek, 2021, Zhu and Zhang, 2010), and the subjective characteristics of DMs (Lauraéus et al., 2015, Sadiq et al., 2021, Shafiq et al., 2015), ranging from gender differentials (Bae & Lee, 2011) to cognitive (Bartels and Johnson, 2015, Kimmel, 2012) and psychological frictions (Lerner et al., 2015).
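The decision-node rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names `is_evaluated` and `estimate_evaluation_rate`, the use of a uniform draw on [0, 1), and the seed are assumptions introduced here. Under these assumptions, an alternative with cutoff value c is evaluated with probability 1 − c.

```python
import random


def is_evaluated(cutoff: float, rng: random.Random) -> bool:
    """Single decision node (hypothetical helper).

    The alternative is evaluated when a uniform realization on [0, 1)
    exceeds the cutoff value assigned to the alternative.
    """
    return rng.random() > cutoff


def estimate_evaluation_rate(cutoff: float, trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of how often the node fires for a given cutoff."""
    rng = random.Random(seed)
    hits = sum(is_evaluated(cutoff, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # With a cutoff of 0.7 the estimate should be close to 1 - 0.7.
    print(estimate_evaluation_rate(0.7))
```

How the cutoff is derived from the preferences of the DM is deliberately left outside the sketch, in line with the scope stated above.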

We design a benchmark information retrieval algorithm mimicking the click-through rate (CTR) behavior of users when deciding on which alternatives to click from the first page of results displayed by a search engine. The CTR of a given alternative is defined as the number of users who click on the link to the alternative divided by the total number of users performing a search. The decision-tree structure of this benchmark algorithm accounts for the 1,023 binary decision nodes that users may have to consider as they retrieve information from the first page of results and the 1,024 final nodes describing the potential evaluation vectors generated through the different retrieval paths that may be followed by DMs. The only assumption imposed on the information retrieval process of users is that they observe alternatives in the order provided by the engine. In this regard, the benchmark algorithm assumes that users consider clicking on the different alternatives with a decreasing probability as they proceed through the ranking delivered by the engine. We equate the probability of clicking on an alternative to the empirical value of the CTR described in Dean (2019) and simulate a total of 1,000,000 queries per configuration to evaluate the ability of the algorithm to replicate the behavior observed.
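The node counts and the CTR definition given in this paragraph can be checked with a short Python sketch. It assumes that the benchmark tree is a full binary tree of depth ten, one level per ranked alternative, which is our reading of the description and not a detail confirmed by the text; the helper names `count_nodes`, `evaluation_vectors`, and `ctr` are hypothetical.

```python
from itertools import product

N_ALTERNATIVES = 10  # results on the first page


def count_nodes(n: int) -> tuple[int, int]:
    """Nodes of a full binary tree of depth n (assumed structure).

    Returns (binary decision nodes, final nodes).
    """
    decision_nodes = 2**n - 1  # internal nodes on levels 0 .. n-1
    final_nodes = 2**n         # leaves, one per evaluation vector
    return decision_nodes, final_nodes


def evaluation_vectors(n: int) -> list[tuple[int, ...]]:
    """All click / no-click vectors over n ranked alternatives."""
    return list(product((0, 1), repeat=n))


def ctr(clicks: int, searches: int) -> float:
    """CTR: users clicking on an alternative over users performing a search."""
    return clicks / searches


if __name__ == "__main__":
    decision, final = count_nodes(N_ALTERNATIVES)
    print(decision, final, decision + final)
    print(len(evaluation_vectors(N_ALTERNATIVES)))
```

Under this assumption the two counts add up to 2,047, the total number of nodes quoted for the patient benchmark.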

After illustrating the capacity of the benchmark algorithm to replicate the CTR behavior observed, we analyze the effects of an increase in the impatience of users as they retrieve information on the alternatives provided by the search engine. As emphasized above, impatience represents a consistent characteristic defining the behavior of online users, a feature increasingly exacerbated at the mobile search level (Google, 2016, Varnali et al., 2012). We define two versions of the initial algorithm to illustrate the importance of impatience as a determinant of CTRs conditioned by the ranking position of the alternatives.

  • The first version assumes that users proceed sequentially through the ranking until they find an alternative satisficing their expectations. Once they find a satisficing alternative, DMs continue retrieving information until they observe an alternative that violates their expectations.

  • The second version increases the impatience of users, who stop retrieving information as soon as an alternative does not satisfy their expectations – even if it is the top-ranked one.

In all cases, the algorithms must incorporate the whole set of potential branches that may be generated through the information retrieval process. This requirement is relatively simple to meet for the impatient evaluation structures, which comprise a total of 21 nodes, but its complexity increases considerably when dealing with the 2,047 nodes required to model the sequential behavior of the benchmark (i.e., patient) users. The resulting scenarios, including the intermediate satisficing setting – introduced to accommodate the bounded rationality approaches implemented in the economics and managerial literature –, are developed through the next section.
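The three retrieval behaviors (patient, satisficing, impatient) can be contrasted with a small Monte Carlo sketch in Python. Several elements are assumptions made for illustration only: the click probabilities in `CLICK_PROB` are hypothetical placeholders and not the values reported by Dean (2019), the same probabilities are reused across all three modes without any recalibration, and the function names are ours. The output therefore should not be read as a reproduction of the paper's results.

```python
import random

# Hypothetical click probabilities for ranks 1..10, decreasing with rank.
# Placeholders for illustration, NOT the CTRs reported by Dean (2019).
CLICK_PROB = [0.30, 0.25, 0.20, 0.15, 0.10, 0.08, 0.06, 0.05, 0.04, 0.03]
MODES = ("patient", "satisficing", "impatient")


def simulate_query(mode: str, probs: list[float], rng: random.Random) -> list[bool]:
    """Walk the ranking in order and return the click vector of one query."""
    clicks = [False] * len(probs)
    found = False  # has a satisficing alternative been observed yet?
    for rank, p in enumerate(probs):
        # Decision node: a uniform realization is compared with the cutoff 1 - p.
        if rng.random() > 1.0 - p:
            clicks[rank] = True
            found = True
        elif mode == "impatient" or (mode == "satisficing" and found):
            # Impatient users stop at the first violation of expectations;
            # satisficing users stop at the first violation after a success.
            break
        # Patient users always proceed to the next alternative.
    return clicks


def simulate_ctr(mode: str, probs: list[float] = CLICK_PROB,
                 queries: int = 100_000, seed: int = 0) -> list[float]:
    """Estimate the CTR per ranking position as clicks / queries."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    rng = random.Random(seed)
    totals = [0] * len(probs)
    for _ in range(queries):
        for rank, clicked in enumerate(simulate_query(mode, probs, rng)):
            totals[rank] += clicked
    return [t / queries for t in totals]


if __name__ == "__main__":
    for mode in MODES:
        print(mode, [round(c, 3) for c in simulate_ctr(mode)])
```

In this simplified setting the CTR of the top-ranked alternative is the same in expectation across the three modes, since every user observes it, while lower positions are reached less often as impatience increases.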

It should be emphasized that our approach differs completely from the one generally implemented in the systems literature. We do not design an experimental framework whose results are compared to the actual behavior of users (Schneider et al., 2019, Speier-Pero, 2019), a methodology that also represents the standard procedure in the literature on electronic commerce (Sun et al., 2020, Yoo et al., 2016). We define a simulation benchmark determined by the search behavior of users online, whose modifications can be validated through experiments or empirical analyses (Bell and Mgbemena, 2018, Dunke and Nickel, 2020).

That is, the proposed algorithms constitute a malleable structure that allows the simulation of a wide variety of behavioral phenomena, providing a benchmark evaluation framework for the systems literature (Hong et al., 2021, Mahony et al., 2016, Zhang et al., 2020). The algorithmic structures proposed also provide a benchmark counterpart to the big data analyses performed to elicit the behavior of users through the implementation of artificial intelligence techniques (Dell’Aversana & Bucciarelli, 2018).

The paper proceeds as follows. Sections 2 (Decision trees as evaluation techniques) and 3 (Designing the information retrieval algorithms) describe the intuition required to formalize the different information retrieval algorithms. The behavior of users under different willingness to search and impatience scenarios is analyzed in Section 4. Section 5 incorporates search frictions into the analysis and presents the main managerial implications. Section 6 concludes and suggests potential extensions.

Section snippets

Decision trees as evaluation techniques

Decision trees provide a formal structure sufficiently flexible to analyze most sequential information retrieval processes common to the management and operations research literature (Pei and Hu, 2018, Sagi and Rokach, 2020). Despite this fact, the systems literature has not generally considered the application of decision trees to formalize information retrieval incentives in online evaluation environments.

The apparent simplicity of the corresponding information retrieval process may have led

Designing the information retrieval algorithms

The information retrieval algorithms analyzed are intuitively illustrated in Fig. 1, Fig. 2, Fig. 3. The sequential evaluation structures described through these figures incorporate each alternative in the order delivered by the search engine. That is, the first element composing each initial node within all figures, denoted by 1°, corresponds to the first element ranked by the search engine. The same intuition applies to the remaining elements composing the nodes until the tenth and final one,

Impatience and click through rates

The intuition on which the mimicking algorithm is built follows directly from the sequential behavior observed among online users. Dean (2019) analyzed a sample of five million queries and computed the CTRs of the organic results ranked within the first page of Google. The second column of Table 1 describes the CTRs reported by Dean (2019), which are then compared with those obtained from the different information retrieval algorithms determined by the relative impatience of users. The

Managerial implications

The decision trees described through the paper are sufficiently flexible to incorporate any subjective constraints inherent to the preferences of DMs. In addition, the trees can be modified at each node depending on the complexity of the decision process that must be analyzed. This latter feature highlights the main implications of the current set of algorithms from a managerial viewpoint.

Managers are provided with a benchmark structure illustrating the main consequences derived from any

Conclusion

We have designed a benchmark algorithm that mimics the information retrieval behavior of users when evaluating the initial page of alternatives ranked by a search engine. Several versions of the algorithm have been defined to account for different degrees of user impatience and their effects on the resulting CTRs. The main results obtained are conditioned by the structural differences existing among the impatient scenarios analyzed, which, together with the corresponding evaluation

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

Dr. Madjid Tavana is grateful for the partial financial support he received from the Czech Science Foundation (GACR 19-13946S).

References (64)

  • W. Lu et al.

    A recommendation algorithm based on fine-grained feature analysis

    Expert Systems with Applications

    (2021)
  • W. Luo et al.

    Search advertising placement strategy: Exploring the efficacy of the conventional wisdom

    Information & Management

    (2011)
  • R. Misuraca et al.

    Maximizing versus satisficing in the digital age: Disjoint scales and the case for “construct consensus”

    Personality and Individual Differences

    (2018)
  • H. Machackova et al.

    The perceived importance of credibility cues for the assessment of the trustworthiness of online information by visitors of health-related websites: The role of individual factors

    Telematics and Informatics

    (2018)
  • S. Pei et al.

    Partially monotonic decision trees

    Information Sciences

    (2018)
  • H. Ren et al.

    Modeling customer bounded rationality in operations management: A review and research opportunities

    Computers & Operations Research

    (2018)
  • S. Sadiq et al.

    Discrepancy detection between actual user reviews and numeric ratings of Google App store using deep learning

    Expert Systems with Applications

    (2021)
  • O. Sagi et al.

    Explainable decision forest: Transforming a decision forest into an interpretable tree

    Information Fusion

    (2020)
  • O. Shafiq et al.

    On personalizing Web search using social network analysis

    Information Sciences

    (2015)
  • K. Varnali et al.

    Predictors of attitudinal and behavioral outcomes in mobile advertising: A field experiment

    Electronic Commerce Research and Applications

    (2012)
  • E.Z. Victorelli et al.

    Understanding human-data interaction: Literature review and recommendations for design

    International Journal of Human-Computer Studies

    (2020)
  • B. Yoo et al.

    An analysis of popularity information effects: Field experiments in an online marketplace

    Electronic Commerce Research and Applications

    (2016)
  • J.H. Ahn et al.

    Attention adjustment, renewal, and equilibrium seeking in online search: An eye-tracking approach

    Journal of Management Information Systems

    (2018)
  • S. Bae et al.

    Product type and consumers’ perception of online consumer reviews

    Electronic Markets

    (2011)
  • Baeza-Yates, R. Applications of web query mining. In: Losada, D.E., Fernández-Luna, J.M. (Eds.). Advances in...
  • D. Bell et al.

    Data-driven agent-based exploration of customer behavior

    Simulation

    (2018)
  • E. Bolandifar et al.

    An empirical study of the behavior of patients who leave the emergency department without being seen

    Journal of Operations Management

    (2019)
  • Chitika: The value of Google result positioning. Chitika Insights June 7, 2013. Chitika, Westborough (2013) Available...
  • M. Cristofaro

    Herbert Simon’s bounded rationality: Its historical evolution in management and cross-fertilizing contribution

    Journal of Management History

    (2017)
  • Dean, B. (2019). We analyzed 5 million Google search results. Here’s what we learned about organic click through rate....
  • R. Dell’Aversana et al.

    Towards a natural experiment leveraging big data to analyse and predict users’ behavioural patterns within an online consumption setting

  • A. Dimoka et al.

    On product uncertainty in online markets: Theory and evidence

    MIS Quarterly

    (2012)