Social summarization in collaborative web search

https://doi.org/10.1016/j.ipm.2009.10.011

Abstract

A critical challenge for Web search engines concerns how they present relevant results to searchers. The traditional approach is to produce a ranked list of results with title and summary (snippet) information, and these snippets are usually chosen based on the current query. Snippets play a vital sensemaking role, helping searchers to efficiently make sense of a collection of search results, as well as determine the likely relevance of individual results. Recently researchers have begun to explore how snippets might also be adapted based on searcher preferences as a way to better highlight relevant results to the searcher. In this paper we focus on the role of snippets in collaborative web search and describe a technique for summarizing search results that harnesses the collaborative search behaviour of communities of like-minded searchers to produce snippets that are more focused on the preferences of the searchers. We go on to show how this so-called social summarization technique can generate summaries that are significantly better adapted to searcher preferences and describe a novel personalized search interface that combines result recommendation with social summarization.

Introduction

From a Web search standpoint, the success of a particular result-list depends on a number of factors. Obviously the pages that are retrieved as results are important; missing relevant result pages or including too many irrelevant result pages will significantly compromise the quality of the result-list. In addition, the ability to rank results according to their likely relevance to the query is also critically important and it is well known that the majority of user attention tends to be focused on the top ranking results. Finally, results should be presented in a way that highlights their likely relevance, not just to the query, but to the individual searcher. By convention, today’s search engines present results as a combination of page title, page URL, and result snippet. In this paper we are especially interested in result snippets—those short extracts of page content that serve to summarize a particular result—and the way that they are generated.

In the past researchers have attempted to improve Web search by concentrating on the selection and ranking of search results. For example, many researchers have called for a more personalized approach to Web search, one which takes advantage of the learned preferences of the individual searcher (Dou, Song, & Wen, 2007) or a community of searchers, so as to recommend a ranked list of results that better reflect these interests. This research shares many aspects in common with traditional recommender systems research as it involves the recommendation of items (search results) on the basis of some learned user (or community) preferences. Recently, recommender systems research has begun to look at how recommendations can be explained or justified to users, to help users better understand the reason behind a recommendation, and ultimately improve the perceived quality of the recommendations that are made (McSherry, 2005, Pu and Chen, 2007). In this regard our recent work (Boydell & Smyth, 2007) is relevant and provides the starting point for the research presented here.

One important emerging theme in modern information retrieval highlights the inherently collaborative nature of many information retrieval and Web search scenarios (see, e.g., Morris, 2008, Reddy and Dourish, 2002, Reddy et al., 2001, Reddy and Jansen, 2008, Reddy and Spence, 2008, Smyth, 2007, Smyth et al., 2004). In short, despite the fact that information search systems are often designed to support single-user interaction, there are many situations where groups of searchers effectively come together (either explicitly or implicitly) when they search, leading to a form of explicit or implicit collaboration. Recent work has sought to take advantage of this by building search interfaces that are designed to support a more collaborative style of search (Amershi and Morris, 2008, Morris and Horvitz, 2007a, Morris and Horvitz, 2007b, Paul and Morris, 2009, Smeaton et al., 2007, Smyth, 2007, Smyth et al., 2004).

In this paper we build on recent work in collaborative web search (Boydell and Smyth, 2006, Smyth, 2007, Smyth et al., 2004), where the search actions of communities of searchers are harnessed to provide a more community-focussed search experience by, for example, promoting results that have been liked by community members, for similar queries, ahead of organic search results. In this paper, however, we describe a different approach to personalization. Instead of (or in addition to) re-ranking search results, we focus on adapting the result snippets so that they better reflect a community’s interests. Result snippets are especially important in Web search because they help the searcher to better understand the content, and thus the potential relevance, of a search result. They play an important role in sensemaking (Russell, Stefik, Pirolli, & Card, 1993), helping searchers to quickly find meaning in a collection of search results by extracting segments of text from a search result to present at search time. Conventional result snippets are usually query-focused; for example, the snippet text will usually be chosen because it contains a high density of query terms. These techniques do not always produce the most informative snippet texts, however, especially when a particular result covers many different aspects of a topic, some of which may be more or less interesting to the searcher. For example, consider a motoring enthusiast searching with the query ‘porsche engine parts’. At the time of writing, one of Google’s top results was for a parts specialist supplying, according to the snippet, “OEM Porsche engine parts, OEM Porsche brakes, air filters, fuel filters, engine parts, spark plugs, steering, exhaust, …” as shown in Fig. 1. Clearly this snippet has been generated with reference to the target query and it is likely to appeal to many searchers using this query.
However, consider a searcher who is interested in finding a parts supplier to source rare parts for their classic Porsche 356 coupé. This searcher may very well use the same query but will not be interested in most parts suppliers, only those that deal or specialise in classic Porsche components. As it turns out the supplier above does deal in this niche market but of course the query-sensitive snippet does not reflect this and so the result may be passed over by our searcher.
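To make this limitation concrete, the query-density heuristic described above can be sketched in a few lines of Python. This is a minimal illustration only, not the snippet generator of any real search engine: the tokenizer, the function names, and the three page fragments are assumptions introduced here.

```python
import re


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def query_density(fragment, query):
    """Fraction of the fragment's tokens that are query terms."""
    tokens = tokenize(fragment)
    if not tokens:
        return 0.0
    query_terms = set(tokenize(query))
    return sum(1 for t in tokens if t in query_terms) / len(tokens)


def query_focused_snippet(fragments, query):
    """Return the fragment with the highest query-term density (first one wins ties)."""
    return max(fragments, key=lambda f: query_density(f, query))


# Hypothetical page fragments, loosely modelled on the example in the text.
fragments = [
    "All Classic 356 911 912 Porsche Parts For Sale",
    "OEM Porsche engine parts, OEM Porsche brakes, air filters",
    "Contact us for shipping rates and opening hours",
]

snippet = query_focused_snippet(fragments, "porsche engine parts")
```

With these hypothetical fragments, the one mentioning classic 356 parts has a density of 2/9 against 4/9 for the generic parts fragment, so a purely query-focused selector passes it over, which mirrors the situation described above.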

In this paper we describe how to generate query-focused snippets that are chosen based on the implicit preferences of a community of like-minded searchers. We do this by mining the selection information (search queries and selected results) that is generated as communities of searchers collaborate while they search. For example, a community of classic-car enthusiasts might receive a more relevant snippet for our example above, such as “All Classic 356 911 912 Porsche Parts For Sale …OEM Porsche engine parts, OEM Porsche brakes, air filters, fuel filters, engine parts, …”, if during previous searches for this result other community members have tended to use queries that have generated snippets containing these more relevant terms (terms like “Classic 356”). In this work we extend the basic social summarization technique (Boydell & Smyth, 2007) so that it can be used to generate query-focused, community-based summaries as part of a collaborative web search engine such as that of Smyth (2007). We also evaluate the quality of the generated summaries across eight different search communities, ranging in size and topic, from just over 800 searchers to more than 11,000 searchers. The results indicate that social summaries have the potential to outperform more traditional summarization techniques using standardized ROUGE recall tests (Lin, 2004).
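The general idea can be sketched in code as follows. This is a simplified stand-in rather than the scoring used in this work: it assumes that community preference can be approximated by term frequencies in the past community queries that led to a selection of the result. The mixing parameter alpha, the function names, the past queries, and the page fragments are all hypothetical. The final function is a bare unigram version of ROUGE recall against a single reference, not the ROUGE package itself.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def community_term_weights(past_queries):
    """Relative frequency of each term across past community queries for this result."""
    counts = Counter(t for q in past_queries for t in tokenize(q))
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def social_score(fragment, query, weights, alpha=0.5):
    """Blend query-term density with community-term weight (alpha is an assumed parameter)."""
    tokens = tokenize(fragment)
    if not tokens:
        return 0.0
    query_terms = set(tokenize(query))
    density = sum(1 for t in tokens if t in query_terms) / len(tokens)
    community = sum(weights.get(t, 0.0) for t in set(tokens))
    return alpha * density + (1 - alpha) * community


def rouge1_recall(candidate, reference):
    """Clipped unigram overlap divided by the number of unigrams in the reference."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    total = sum(ref.values())
    if total == 0:
        return 0.0
    return sum(min(count, cand[t]) for t, count in ref.items()) / total


# Hypothetical queries previously used by community members who selected this result.
past_queries = ["classic porsche 356 parts", "porsche 356 coupe engine", "classic 911 parts"]
# Hypothetical page fragments, loosely modelled on the example in the text.
fragments = [
    "All Classic 356 911 912 Porsche Parts For Sale",
    "OEM Porsche engine parts, OEM Porsche brakes, air filters",
    "Contact us for shipping rates and opening hours",
]

weights = community_term_weights(past_queries)
best = max(fragments, key=lambda f: social_score(f, "porsche engine parts", weights))
```

Under these assumptions the fragment mentioning classic 356 parts is preferred for the query ‘porsche engine parts’, because community terms such as “classic” and “356” outweigh its lower query-term density. A function like rouge1_recall could then be used to compare such a snippet against a reference summary.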

Section snippets

Background

The research in this paper touches on a number of areas of interest that fit broadly within the recommender systems remit. First and foremost, our focus is on the use of recommendation technologies to personalize Web search results. Second, we are especially interested in the information that accompanies Web search results – the snippet text – by way of explanations, and how such snippets might also be adapted for the needs of users. Third, we acknowledge the importance of the search interface

Social summarization in collaborative web search

The starting point for our work on community-based social summarization is an approach to Web search, called collaborative web search (CWS). CWS exploits query repetition and selection regularity among communities of like-minded searchers in order to recommend results from some underlying search engine that are likely to be especially relevant for a particular community. A detailed review of CWS is beyond the scope of this paper (see Boydell and Smyth, 2006, Smyth, 2007) but it is worth

Using social summaries in web search

So far we have described a page summarization technique that is based on the interactions of communities of like-minded searchers. As it stands, this technique is well suited to producing full page summaries, as has been previously shown by Boydell and Smyth (2007). However, it is not yet appropriate for producing result snippets because its summaries are not query-focused. In this section we will explain how this technique can be expanded to produce query-focused summaries that can be used as

Evaluation

There are two basic considerations with respect to our social summarization technique, from a summary quality standpoint. In the first instance we might consider the quality of the summaries produced in a generic way: we did this in (Boydell & Smyth, 2007), where we compared page summaries produced by our social summarization technique to those produced by more traditional summarization approaches. Alternatively we might consider the specific hypothesis put forward in this work: that the

Discussion

The primary contribution of this paper has been to explore the generation of community-focused search result summaries, to aid in community-sensemaking, as part of a collaborative web search engine. In the main we have concentrated on presenting and evaluating the core summarization technique, with positive results obtained in comparison to a number of conventional summarization benchmarks. In this section we discuss some recent additional work which explores the role of social summaries as

Conclusions

In this work we have described an approach to Web search that is personalized for the needs of a community of like-minded searchers. The basic collaborative web search technique focuses on recommending search results, which come from a traditional search engine, as promotion candidates because they have been previously considered relevant by a community of searchers. The main focus of this paper has not been on the core recommendation technique (this has been previously presented elsewhere in

Acknowledgement

This material is based on works supported by Science Foundation Ireland under Grants 03/IN.3/I361 and 07/CE/I1147.

References (43)

  • Coyle, M., & Smyth, B. (2007). On the community-based explanation of search results. In IUI ’07: Proceedings of the 12th...
  • Delort, J.-Y., Bouchon-Meunier, B., & Rifqi, M. (2003). Enhanced web document summarization using hyperlinks. In...
  • Dou, Z., Song, R., & Wen, J.-R. (2007). A large-scale evaluation and analysis of personalized search strategies. In WWW...
  • Glover, E., et al. (2001). Web search – Your way. Communications of the ACM.
  • Joho, H., & Jose, J.M. (2006). Slicing and dicing the information space using local contexts. In IIiX: Proceedings of...
  • Lin, C.-Y. (2004). Rouge: A package for automatic evaluation of summaries. In Proceedings of the workshop on text...
  • Lin, C.-Y., & Hovy, E.H. (2003). Automatic evaluation of summaries using n-gram co-occurrence statistics. In...
  • McSherry, D. (2005). Explanation in recommender systems. Artificial Intelligence Review.
  • Mitra, M., Singhal, A., & Buckley, C. (1998). Improving automatic query expansion. In Proceedings of the 21st annual...
  • Morita, M., & Shinoda, Y. (1994). Information filtering based on user behaviour analysis and best match text retrieval....
  • Morris, M.R. (2008). A survey of collaborative web search practices. In CHI (pp....