poster

Extracting topics based on authors, recipients and content in microblogs

Authors:

SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval

Pages 1171 - 1174

https://doi.org/10.1145/2600428.2609537

Published: 03 July 2014

Abstract

Microblogs such as Twitter are important sources for spreading vital information at high speed. They also reflect the general public's reactions to and opinions of major events and stories. Because information travels so quickly, it is helpful to apply unsupervised learning techniques that discover topics for information extraction and analysis. Although graphical models have traditionally been used for topic discovery in microblogs and text streams, previous work may not be as efficient because of the diverse and noisy nature of microblogs.

In this paper, we demonstrate the application of the Author-Topic and Author-Recipient-Topic models to microblogs. We extensively compare these models, under different settings, to an LDA baseline. Our results show that the Author-Recipient-Topic model extracts the most coherent topics, establishing that joint modeling of author-recipient pairs and tweet content leads to quantitatively better topic discovery. This paper also addresses the problem of topic modeling on short text by using clustering techniques, which help boost the performance of our models. Our study reveals interesting traits of Twitter messages, users, and their interactions.
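As a rough illustration of what modeling on author-recipient pairs can involve at the data-preparation stage, the sketch below groups tweets into one pseudo-document per (author, recipient) pair, which also lengthens the short texts a topic model would see. It is not the paper's method: the function name `pool_by_author_recipient`, the use of @mentions as recipients, and the pooling rule itself are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Assumption of this sketch: a recipient is any @mention in the tweet text.
MENTION = re.compile(r"@(\w+)")


def pool_by_author_recipient(tweets):
    """Group tweet texts into one pseudo-document per (author, recipient) pair.

    `tweets` is an iterable of (author, text) tuples. A tweet with no
    @mention is pooled under (author, None). Names are lowercased so that
    "@Bob" and "@bob" refer to the same recipient.
    """
    pools = defaultdict(list)
    for author, text in tweets:
        recipients = [m.lower() for m in MENTION.findall(text)] or [None]
        # Strip mentions so the pooled text holds content words only.
        content = " ".join(MENTION.sub(" ", text).split())
        # dict.fromkeys de-duplicates repeated mentions while keeping order.
        for recipient in dict.fromkeys(recipients):
            pools[(author.lower(), recipient)].append(content)
    return {pair: " ".join(texts) for pair, texts in pools.items()}


if __name__ == "__main__":
    sample = [
        ("alice", "@bob flood warning downtown"),
        ("alice", "@Bob roads closed"),
        ("carol", "stay safe everyone"),
    ]
    for pair, document in pool_by_author_recipient(sample).items():
        print(pair, "->", document)
```

Each pooled pseudo-document could then be passed to a topic model in place of the individual tweets; whether that helps on a given corpus would have to be checked empirically.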

Index Terms

Extracting topics based on authors, recipients and content in microblogs
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources

Published In

SIGIR '14: Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval

July 2014

1330 pages

ISBN:9781450322577

DOI:10.1145/2600428

General Chairs: Shlomo Geva (Queensland University of Technology), Andrew Trotman (University of Dunedin)
Program Chairs: Peter Bruza (Queensland University of Technology), Charles L.A. Clarke (University of Waterloo), Kal Järvelin (University of Tampere)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tag

topic model

Qualifiers

Poster

Conference

SIGIR '14

Sponsor:

SIGIR

SIGIR '14: The 37th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 6 - 11, 2014

Gold Coast, Queensland, Australia

Acceptance Rates

SIGIR '14 Paper Acceptance Rate 82 of 387 submissions, 21%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics

Total Citations: 17
Total Downloads: 510
Downloads (Last 12 months): 3
Downloads (Last 6 weeks): 1

Reflects downloads up to 27 Feb 2025
