DOI: 10.1145/3209978.3210149

A User Study on Snippet Generation: Text Reuse vs. Paraphrases

Published: 27 June 2018

ABSTRACT

The snippets in the result list of a web search engine are built with sentences from the retrieved web pages that match the query. Reusing a web page's text for snippets has been considered fair use under the copyright laws of most jurisdictions. Notable recent exceptions to this arrangement include Germany and Spain, where news publishers are entitled to raise claims under a so-called ancillary copyright. Similar legislation is currently under discussion at the European Commission. If this development gains momentum, the reuse of text for snippets will soon incur costs, which in turn will give rise to new solutions for generating truly original snippets. A key question in this regard is whether users will accept any new approach to snippet generation, or whether they will prefer the current model of "reuse snippets." This paper gives a first answer. A crowdsourcing experiment along with a statistical analysis reveals that our test users show no significant preference for either kind of snippet. Notwithstanding the technological difficulty, this result opens the door to a new snippet synthesis paradigm.
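As context for the comparison, the following is a minimal sketch of the "reuse snippet" model described above: sentences are copied verbatim from the retrieved page and ranked by how well they match the query. This is an illustration only; the function names, the scoring by query-term overlap, and the character budget are assumptions, not the procedure used in the paper or by any particular search engine.

import re

# Illustrative query-biased "reuse" snippet generation: copy the
# best-matching sentences verbatim from the page, up to a length budget.
def reuse_snippet(page_text, query, max_chars=160):
    query_terms = set(re.findall(r"\w+", query.lower()))
    # Naive sentence splitting; a real system would use a proper tokenizer.
    sentences = re.split(r"(?<=[.!?])\s+", page_text)

    def overlap(sentence):
        # Score = number of distinct query terms the sentence contains.
        words = set(re.findall(r"\w+", sentence.lower()))
        return len(query_terms & words)

    parts, used = [], 0
    for sentence in sorted(sentences, key=overlap, reverse=True):
        if overlap(sentence) == 0 or used + len(sentence) > max_chars:
            break
        parts.append(sentence.strip())
        used += len(sentence)
    return " ... ".join(parts)

if __name__ == "__main__":
    page = ("Ancillary copyright grants news publishers rights over short excerpts. "
            "Search engines build result snippets from sentences of the retrieved pages. "
            "The weather in Leipzig was sunny.")
    print(reuse_snippet(page, "snippet copyright search"))

A paraphrased snippet, by contrast, would convey the same content in newly generated wording rather than reusing the page's sentences verbatim.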


Published in

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
June 2018
1509 pages
ISBN: 9781450356572
DOI: 10.1145/3209978

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 27 June 2018


          Qualifiers

          • short-paper

          Acceptance Rates

SIGIR '18 Paper Acceptance Rate: 86 of 409 submissions, 21%. Overall Acceptance Rate: 792 of 3,983 submissions, 20%.
