Author:
Eduard Hoenkamp
Affiliation:
Science and Engineering Faculty, Queensland University of Technology (QUT), Brisbane, Australia; Institute for Computing and Information Sciences, Radboud University, Nijmegen, The Netherlands
Keyword(s):
Storyline, Topic Models, Document Space, Foreground/Background Separation, Robust PCA, Sparse Recovery, Subspace Tracking, Geometric Optimization, Grassmann Manifolds.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Clustering and Classification Methods; Concept Mining; Foundations of Knowledge Discovery in Databases; Information Extraction; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Symbolic Systems
Abstract:
Many of us struggle to keep up with fast-evolving news stories, viral tweets, or e-mails demanding our attention. Previous studies have tried to contain such overload by reducing the amount of information reaching us, by making it easier to cope with the information that does reach us, or by helping us decide what to do with the information once delivered. Instead, the approach presented here is to mitigate the overload by uncovering and presenting only the information that is worth looking at. We posit that the latter is encapsulated as an underlying storyline that obeys several intuitive cognitive constraints. The paper assesses the efficacy of the two main paradigms of Information Retrieval, the document space model and language modeling, in how well each captures the intuitive idea of a storyline, seen as a stream of topics. The paper formally defines topics as high-dimensional but sparse elements of a (Grassmann) manifold, and a storyline as a trajectory connecting these elements. We show how geometric optimization can isolate the storyline from a stationary low-dimensional story background. The approach is effective and efficient in producing a compact representation of the information stream, to be subsequently conveyed to the end-user.