Publicly Available Published by De Gruyter Oldenbourg January 30, 2015

A semi-supervised method for topic extraction from micro postings

Georg Fuchs, Hendrik Stange, Ahmad Samiei, Gennady Andrienko and Natalia Andrienko

Fraunhofer IAIS, D-53757 Sankt Augustin

Abstract

Social networking services have become a major channel for the digital society to share content, opinions, and experiences on activities or events, as well as on products, services and brands. Evaluating digital feedback on the latter can be a valuable asset for companies seeking product and consumer insights. However, the analysis of short, noisy, fragmented, and often subjective textual data remains a challenge. Typically, the human analyst needs to be actively involved during extraction and modeling to resolve ambiguities that will inevitably arise in such data and to put the model into context. This paper proposes a visual analytics approach that enables a first intuition and exploration of topics appearing in the text corpus, and facilitates the interactive, iterative refinement of the overall topic model describing the stream of tweets. A second contribution is the discussion of efficient graph community detection algorithms that extract initial topics as the starting point of interactive analysis and complement approaches such as LDA. The applicability and utility of the proposed approach are shown for a real-world use case: the analysis of product insights and topic-driven social network analysis for a specific product line of an international hair styling and cosmetics company.

About the authors

Georg Fuchs

Georg Fuchs is a senior research scientist and project manager at Fraunhofer IAIS working in the field of visual analytics with a strong emphasis on spatio-temporal data analysis. His research interests include information visualization in general and visualization of spatio-temporal data in particular, visual analytics methodologies, task-driven adaptation of visual representations and Smart Visual Interfaces, as well as computer graphics and rendering. Georg Fuchs has co-authored 38+ peer-reviewed research papers and journal articles, and received a best short paper award at Smart Graphics 2008.

Fraunhofer IAIS, D-53757 Sankt Augustin

Hendrik Stange

Hendrik Stange is a research fellow at the Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS and a project manager of many research and industrial big data projects with international consortia. He has a background in data mining and spatial business intelligence. Hendrik studied at the Otto-von-Guericke University Magdeburg and received his degree in Business Informatics in 2007. Since then, he has specialized in mobile analytics for outdoor advertising and telecommunications. His current research focuses on learning on streams of heterogeneous poly-structured data and visual analytics for big data applications.

Fraunhofer IAIS, D-53757 Sankt Augustin

Ahmad Samiei

Ahmad Samiei is a PhD student at Fraunhofer IAIS. He recently obtained his MSc in Computer Sciences on the topic of semi-supervised topic extraction from Twitter. His research interests include natural language processing, text mining, linked data and data mining in general.

Fraunhofer IAIS, D-53757 Sankt Augustin

Gennady Andrienko

Gennady Andrienko is a lead scientist responsible for the visual analytics research at the Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS) and professor (part-time) at City University London. He co-authored the monographs “Exploratory Analysis of Spatial and Temporal Data” (Springer, 2006) and “Visual Analytics of Movement” (2013), 60+ peer-reviewed journal papers, 20+ book chapters and more than 100 conference papers. Since 2007, Gennady Andrienko has been chairing the ICA Commission on GeoVisualization. He has co-organized scientific events on visual analytics, geovisualization and visual data mining, and co-edited 10 special issues of journals.

Fraunhofer IAIS, D-53757 Sankt Augustin

Natalia Andrienko

Natalia Andrienko has been working at GMD, now Fraunhofer IAIS, since 1997. Since 2007, she has been a lead scientist responsible for the visual analytics research. Since 2013, she has been a professor (part-time) at City University London. She co-authored the monographs “Exploratory Analysis of Spatial and Temporal Data” (Springer, 2006) and “Visual Analytics of Movement” (2013), over 60 peer-reviewed journal papers, over 20 book chapters and more than 100 conference papers. She received best paper awards at the AGILE 2006 and IEEE VAST 2011 and 2012 conferences, best poster awards at AGILE 2007 and ACM GIS 2011, and a VAST Challenge award in 2008.

Fraunhofer IAIS, D-53757 Sankt Augustin

Received: 2014-8-13
Revised: 2014-11-3
Accepted: 2014-12-5
Published Online: 2015-1-30
Published in Print: 2015-2-28

©2015 Walter de Gruyter Berlin/Boston

Downloaded on 18.4.2024 from https://www.degruyter.com/document/doi/10.1515/itit-2014-1078/html