Determining the informational, navigational, and transactional intent of Web queries
Introduction
The World Wide Web (Web) has become an indispensable tool in the daily lives of many people, and search engines provide critical access to Web resources. With nearly 70% of Web searchers using a search engine as their point of entry, the major search engines receive millions of queries per day and present billions of results per week in response to these queries (Sullivan, 2006). Search engines are ‘the tool’ that many people use on a daily basis for accessing the information, Internet sites, services, and other resources on the Web. Although search engines are popular, how are people using them to accomplish their intended goals? How can we determine what these people are actually seeking? What task, need, or goal are they trying to address with their Web searching?
Belkin (1993) states that one can classify searching episodes in terms of (1) goal of the interaction, (2) method of interaction, (3) mode of retrieval and (4) type of resource interacted with during the search. Web searching certainly possesses these aspects, so Web searching has continuity with earlier searching interactions, such as library systems. However, Web searching differs in three respects (i.e., context, scale, and variety), making it a unique domain of study. The first difference is that the direct availability of content accessible on the Web is nearly ubiquitous. Web search engines provide access to textual and multimedia content in a wide variety of settings including both home and work, as well as in mobile situations. Second, there is the number of searchers attempting to access this content via Web search engines. The scale of topics submitted by these users is surely unparalleled in pre-Web end user searching. Third, the variety of content, users, and systems is certainly unique. This combined diversity on the Web in both content and users is extreme.
In response to this diversity, Web search engines serve a variety of purposes for users. In addition to satisfying information problems, modern Web search engines are navigational tools that take users to specific uniform resource locators (URLs) or aid in browsing. People use search engines as applications to conduct e-commerce transactions, such as with sponsored search or Google’s payment system. Search engines provide access to content collections of images, songs, and videos rather than directly addressing an information need with a specific object. Search engines provide access to transactional services such as maps, online auctions, driving directions, or even other search engines. Search engines perform social networking functions, as with Yahoo! Answers. Web search engines are spell checkers, thesauruses, and dictionaries. They are games, such as Google Whacking or vanity searching. Modern Web search engines are adding an increasingly diverse range of features. Providers are placing more and highly varied content and services on the Web. In response, people are employing search engines in novel and increasingly diverse ways.
It is this cornucopia of alternatives where Web search engines differ most from classic information search and pre-Web retrieval systems. Referring back to facets outlined by Belkin, the method of interaction has remained the same (i.e., enter query, retrieve results, scan results, view results, refine query as needed). The mode of retrieval is similar, albeit within a hypermedia environment (Marchionini, 1995). In terms of goals and type of resources, however, the changes are dramatic. In fact, the facets of goals and range of resources are classic examples of the long tail effect of the Web. Namely, the Web has extended significantly both the range of search goals for people and the range of resources available (Anderson, 2006), and these resources need not be informational. We refer to the type of resource desired in the user’s expression to the system as user intent. Within this great diversity, Web search engines can better assist people in finding the resources they are looking for by more clearly identifying the intent behind the query.
In this research, we developed a methodology to classify user intent in Web searching. We categorized user searches based on intent in terms of the type of content specified by the query and other user expressions, and we operationalized these classifications with defining characteristics. We implemented these categories in a program that automatically classified Web search engine queries. We discuss how one can use this approach to improve Web search engine performance by providing more results in line with searchers’ underlying intent.
The next section presents related research concerning modeling Web queries.
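As a rough illustration of what automatically classifying queries by intent could look like, the following Python sketch assigns a query to one of the three top-level categories named in the title. The cue lists (`NAVIGATIONAL_PATTERN`, `TRANSACTIONAL_TERMS`) and the function name `classify_intent` are assumptions made for this example only; they are not the defining characteristics or the program developed in this research.

```python
import re

# Hypothetical cues, chosen only for illustration. They are NOT the
# defining characteristics reported in the study.
NAVIGATIONAL_PATTERN = re.compile(r"(www\.|https?://|\.(com|org|net|edu|gov)\b)")
TRANSACTIONAL_TERMS = {"buy", "download", "order", "purchase"}


def classify_intent(query: str) -> str:
    """Return 'navigational', 'transactional', or 'informational'."""
    text = query.lower().strip()
    terms = set(re.findall(r"[a-z0-9]+", text))

    # A URL fragment or domain suffix suggests the user wants a specific site.
    if NAVIGATIONAL_PATTERN.search(text):
        return "navigational"

    # An action word suggests the user wants to obtain or do something.
    if terms & TRANSACTIONAL_TERMS:
        return "transactional"

    # In this sketch, everything else is treated as informational.
    return "informational"


if __name__ == "__main__":
    for q in ["www.example.com", "download free screensaver", "how do volcanoes form"]:
        print(q, "->", classify_intent(q))
```

The rules are checked in a fixed order, so a query that matches several cues receives the first matching label. A fuller classifier would need a principled way to resolve such overlaps.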
Related studies
Research aimed at discovering the intent of Web searchers is a growing area of Web research. Determining the underlying intent of user searches has the potential to drastically improve the performance of Web search engines (Gisbergen, Most, & Aelen, 2007), with impact in the areas of information retrieval, data mining, and e-commerce. User intent research falls into three sub-areas: (1) empirical studies and surveys of search engine use, (2) manual analysis of search engine
Research objectives
The research objectives are described below:
1. Develop a comprehensive classification of Web searching user intent.
For research objective one, we analyzed prior work in the area, along with numerous actual Web searching transaction logs, in order to develop a detailed categorization of Web searching based on user intent. Given the plethora of categories and classifications, it is difficult to compare results across studies and research experiments. Such a comparison is vitally needed in
Classification of Web searching
For research objective one, we performed a comprehensive review of prior work in the area of user intent in Web searching. We cross-correlated reported results from these studies to align user intent classes that were similar but variously labeled. We also supplemented this literature review with results from our own data analysis. From this review and analysis, we derived a comprehensive categorization of Web searching intent and correlated this categorization with prior published works.
Research objective one
For research objective one (Develop a comprehensive classification of Web searching user intent), we present in Table 2 a three-level hierarchical taxonomy, with the topmost level being informational, navigational, and transactional. Each of these level-one categories has multiple level-two classifications. Some classifications can also involve a third-level classification.
Below this developed taxonomy, Table 2 presents user intent studies and their best-fit classification across studies. The
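One way to picture a three-level hierarchy of this kind is as a small tree. The Python sketch below is illustrative only: the three top-level names come from the text, while the class `IntentNode`, the helper `levels_below`, and every "placeholder" entry are hypothetical stand-ins, not the level-two and level-three classes listed in Table 2.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IntentNode:
    """A category in the taxonomy, with optional sub-categories."""
    name: str
    children: list[IntentNode] = field(default_factory=list)


def levels_below(node: IntentNode) -> int:
    """Number of category levels beneath this node (0 for a leaf)."""
    return max((1 + levels_below(child) for child in node.children), default=0)


# Top-level names are from the text; all "placeholder" names are hypothetical.
TAXONOMY = IntentNode("user intent", [
    IntentNode("informational", [
        IntentNode("level-two placeholder A", [
            IntentNode("level-three placeholder A1"),
        ]),
        IntentNode("level-two placeholder B"),
    ]),
    IntentNode("navigational", [IntentNode("level-two placeholder C")]),
    IntentNode("transactional", [IntentNode("level-two placeholder D")]),
])

if __name__ == "__main__":
    print([child.name for child in TAXONOMY.children])
    print(levels_below(TAXONOMY))
```

Because only some branches carry a third level, the depth is computed per branch rather than assumed to be uniform.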
Discussion and implications
In this study, we employed a three-level classification of Web searching that is useful in identifying the intent of the searcher. This model is based on our own analysis and on prior published work, most notably that of Broder (2002) and Rose and Levinson (2004). However, Broder (2002) did not present a description of the process and metrics used to classify the queries. Similarly, Rose and Levinson (2004) did not elaborate on the details of their classifications. In our work, we have
Conclusion and further research
In order for Web search engines to continue to improve, they must leverage an increased knowledge of user behavior in order to identify the underlying intent of searchers. In this research, we highlighted characteristics of Web queries based on user intent. These characteristics were derived from an examination of Web queries from multiple search engine transaction logs. We have also demonstrated an automated method that can successfully classify Web queries based on user intent. Web search
Acknowledgements
We would like to thank Excite, AlltheWeb.com, AltaVista, and especially Infospace.com for providing the data for this analysis, without which we could not have conducted this research. We encourage other search engine companies to engage members of the academic community in Web searching research. The Air Force Office of Scientific Research (AFOSR) and the National Science Foundation (NSF) funded portions of this research.
References (53)
- Seeking and implementing automated assistance during the search process. Information Processing & Management (2005)
- et al. An analysis of Web searching by European Alltheweb.com users. Information Processing & Management (2005)
- et al. Real life, real users, and real needs: A study and analysis of user queries on the Web. Information Processing & Management (2000)
- et al. End user searching: A Web log analysis of NAVER, a Korean Web search engine. Library & Information Science Research (2005)
- The long tail: Why the future of business is selling more of less (2006)
- Baeza-Yates, R., Calderón-Benavides, L., & González, C. (2006). In The intention behind Web queries (pp. 98–109). Paper...
- et al. Automatic classification of Web queries using very large unlabeled query logs. ACM Transactions on Information Systems (2007)
- Anomalous states of knowledge as a basis for information retrieval. Canadian Journal of Information Science (1980)
- Belkin, N. J. (1993). Interaction with texts: Information retrieval as information-seeking behavior. In Information...
- Belkin, N., Cool, C., Croft, W. B., & Callan, J. (1993). In The effect of multiple query representations on information...