DOI: 10.1145/3406853.3432662

WikiGaze: Gaze-based Personalized Summarization of Wikipedia Reading Session

Published: 25 November 2020

ABSTRACT

Wikipedia is an open-content encyclopedia that receives billions of page views per month. Wikipedia users have been observed to visit multiple articles in a single reading session. To reduce information overload and information loss, the research community has shown growing interest in approaches that present only the necessary information to users. Automatic generation of personalized summaries is a proven remedy for the information overload problem. In this paper, we propose a technique to generate personalized summaries of Wikipedia articles by analyzing users' reading patterns. To analyze reading patterns, we track eye gaze during the article reading session. Eye gaze analysis helps identify how a reader's attention is distributed over an article. We extend the proposed approach to generate a summary of the multiple articles visited during a user's Wikipedia reading session. We also collect a dataset representing the reading patterns of Wikipedia users and make it publicly available to the research community.
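
The abstract does not spell out how gaze data is turned into a summary, so the following is only a minimal sketch of one plausible reading of the idea, not the authors' method. It assumes that each gaze fixation has already been mapped to a sentence index and carries a duration in milliseconds, treats normalized dwell time per sentence as the reader's attention distribution, and extracts the k most attended sentences in their original order. The function names, the `(sentence_index, duration_ms)` record format, and the top-k selection rule are all hypothetical.

```python
def attention_distribution(fixations, num_sentences):
    """Turn fixation records into a per-sentence attention distribution.

    fixations: iterable of (sentence_index, duration_ms) pairs (assumed format).
    Returns a list of weights that sums to 1 when any dwell time was recorded.
    """
    dwell = [0.0] * num_sentences
    for sentence_idx, duration_ms in fixations:
        # Ignore fixations that fall outside the article text.
        if 0 <= sentence_idx < num_sentences:
            dwell[sentence_idx] += duration_ms
    total = sum(dwell)
    if total == 0:
        # No usable gaze data: fall back to uniform attention.
        return [1.0 / num_sentences] * num_sentences if num_sentences else []
    return [d / total for d in dwell]


def summarize(sentences, fixations, k=2):
    """Pick the k most attended sentences, kept in reading order."""
    weights = attention_distribution(fixations, len(sentences))
    # Rank by descending attention; ties go to the earlier sentence.
    ranked = sorted(range(len(sentences)), key=lambda i: (-weights[i], i))
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    article = [
        "Sentence A.",
        "Sentence B.",
        "Sentence C.",
        "Sentence D.",
    ]
    # Hypothetical fixation log: (sentence index, duration in ms).
    gaze_log = [(0, 100), (2, 300), (2, 200), (3, 50)]
    print(summarize(article, gaze_log, k=2))
```

Under the same assumptions, a session-level summary could be sketched by concatenating the sentences of all visited articles, offsetting the fixation indices accordingly, and calling `summarize` once on the pooled list.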

      • Published in

        HUMAN '20: Proceedings of the 3rd Workshop on Human Factors in Hypertext
        December 2020
        25 pages
        ISBN: 9781450380584
        DOI: 10.1145/3406853
        Editors: Claus Atzenbeck, Jessica Rubart

        Copyright © 2020 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        HUMAN '20 Paper Acceptance Rate: 3 of 5 submissions, 60%
        Overall Acceptance Rate: 6 of 9 submissions, 67%
