Exploring and Summarizing Document Collections with Multiple Coordinated Views
Abstract
Knowledge work, such as summarizing related research in preparation for writing, typically requires extracting useful information from scientific literature. Nowadays, researchers' primary source of information is electronic documents available on the Web, accessible through general and academic search engines such as Google Scholar or IEEE Xplore. Yet the vast amount of resources makes retrieving only the most relevant results a difficult task. As a consequence, researchers are often confronted with large amounts of low-quality or irrelevant content. To address this issue, we introduce a novel system that combines a rich, interactive Web-based user interface with different visualization approaches. The system enables researchers to identify key phrases matching their current information needs and to spot potentially relevant literature within hierarchical document collections. The chosen context is the collection and summarization of related work in preparation for scientific writing; the system therefore supports features such as bibliography and citation management, document metadata extraction, and a text editor. This paper introduces the design rationale and components of PaperViz. Moreover, we report the insights gathered in a formative design study addressing usability.
Published In

March 2017
82 pages
ISBN:9781450349031
DOI:10.1145/3038462
- Conference Chairs:
- Dorota Glowacka,
- Evangelos Milios,
- Axel J. Soto,
- Fernando Paulovich
Copyright © 2017 ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Published: 13 March 2017
Qualifiers
- Research-article
Funding Sources
- Austrian COMET Program
- H2020 AFEL Project