
WikiGaze: Gaze-based Personalized Summarization of Wikipedia Reading Session

Authors:
Neeru Dubey, Indian Institute of Technology Ropar, Rupnagar, Punjab, India
Simran Setia, Indian Institute of Technology Ropar, Rupnagar, Punjab, India
Amit Arjun Verma, Indian Institute of Technology Ropar, Rupnagar, Punjab, India
S. R. S. Iyengar, Indian Institute of Technology Ropar, Rupnagar, Punjab, India

HUMAN '20: Proceedings of the 3rd Workshop on Human Factors in Hypertext, December 2020, Article No.: 4, Pages 1–9
https://doi.org/10.1145/3406853.3432662

Published: 25 November 2020


ABSTRACT

Wikipedia is an open-content encyclopedia that receives billions of page views per month. Wikipedia users have been observed to visit multiple articles in a single reading session. To reduce the problems of information overload and loss of information, the research community has shown growing interest in developing new approaches that present only the necessary information to users. Automatic generation of personalized summaries is a proven remedy for the information overload problem. In this paper, we propose a technique to generate personalized summaries of Wikipedia articles by analyzing the reading patterns of users. To perform reading pattern analysis, we track eye gaze during the article reading session. Eye gaze analysis helps identify how a reader's attention is distributed over an article. We extend the proposed approach to generate a summary of the multiple articles visited during a user's Wikipedia reading session. We capture a dataset representing the reading patterns of Wikipedia users and make it publicly available to the research community.
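
The abstract does not spell out how gaze data is turned into a summary. As a minimal sketch of the general idea only (fixations, then an attention distribution over sentences, then extractive selection), the Python below assumes hypothetical inputs: a list of sentences and a list of fixations, each given as a sentence index and a duration in milliseconds. The function names, the duration-based weighting, and the top-k selection rule are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical fixation record: (sentence_index, duration_ms).
Fixation = Tuple[int, float]


def attention_distribution(num_sentences: int,
                           fixations: List[Fixation]) -> List[float]:
    """Return each sentence's share of the total fixation time.

    Assumption: attention is approximated by summed fixation duration.
    Fixations with an out-of-range index or non-positive duration are ignored.
    """
    totals = [0.0] * num_sentences
    for idx, duration in fixations:
        if 0 <= idx < num_sentences and duration > 0:
            totals[idx] += duration
    grand_total = sum(totals)
    if grand_total == 0:
        return totals  # no usable gaze data: all weights stay zero
    return [t / grand_total for t in totals]


def gaze_summary(sentences: List[str],
                 fixations: List[Fixation],
                 k: int = 2) -> List[str]:
    """Pick the k most-attended sentences and return them in reading order."""
    weights = attention_distribution(len(sentences), fixations)
    # Rank by descending weight; ties are broken by earlier position.
    ranked = sorted(range(len(sentences)), key=lambda i: (-weights[i], i))
    chosen = sorted(i for i in ranked[:k] if weights[i] > 0)
    return [sentences[i] for i in chosen]
```

Under the same assumptions, a session covering several articles could be handled by concatenating their sentences and fixations before calling `gaze_summary`; whether the paper does this is not stated in the abstract.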


Index Terms

1. Human-centered computing
  1. Collaborative and social computing
    1. Empirical studies in collaborative and social computing
  2. Human computer interaction (HCI)
    1. Interactive systems and tools

Published in
HUMAN '20: Proceedings of the 3rd Workshop on Human Factors in Hypertext
December 2020
25 pages
ISBN: 9781450380584
DOI: 10.1145/3406853
Editors:
Claus Atzenbeck, Institute of Information Systems (iisys), Hof University, Germany
Jessica Rubart, Ostwestfalen-Lippe University of Applied Sciences and Arts, Germany
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Gaze detection
Multi-document Summarization
Natural Language Processing
Summary dataset
Wikipedia
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
HUMAN '20 Paper Acceptance Rate: 3 of 5 submissions, 60%
Overall Acceptance Rate: 6 of 9 submissions, 67%