DOI: 10.1145/2487788.2488050
research-article

Real time discussion retrieval from twitter

Published: 13 May 2013

Abstract

While social media receive a lot of attention from the scientific community in general, there is little work on high-recall retrieval of messages relevant to a discussion. Hashtag-based search is widely used for data retrieval from social media. This work shows the limitations of this approach: the majority of relevant messages do not contain any hashtag, and unpredictable hashtags come into use as the conversation evolves over time. To overcome these limitations, we propose an alternative retrieval method. Given an input stream of messages as an example of the discussion, our method extracts the most relevant words from it and queries the social network for more messages containing these words. The method filters out messages that do not belong to the discussion using an LDA topic model. We demonstrate this concept on manually built collections of tweets about major sport and music events.
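
The keyword-extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how word relevance is scored, so the smoothed log-ratio against a background sample used here is an assumption, as are the function names `tokenize`, `top_keywords`, and `build_query`, the sample messages, and the `OR`-joined query format. The LDA filtering step is not shown.

```python
import math
import re
from collections import Counter

# Words, with an optional leading '#' or '@' so hashtags and mentions survive.
TOKEN_RE = re.compile(r"[#@]?\w+")


def tokenize(text):
    """Lowercase a message and split it into word, hashtag, and mention tokens."""
    return TOKEN_RE.findall(text.lower())


def top_keywords(discussion, background, k=5, min_count=2):
    """Rank words by a smoothed log-ratio of their relative frequency in the
    discussion sample versus a background sample (an assumed scoring rule,
    not necessarily the one used in the paper)."""
    disc = Counter(t for msg in discussion for t in tokenize(msg))
    back = Counter(t for msg in background for t in tokenize(msg))
    disc_total = sum(disc.values())
    back_total = sum(back.values())
    vocab = len(set(disc) | set(back))
    scores = {}
    for word, count in disc.items():
        if count < min_count:
            continue  # ignore words seen too rarely to be reliable
        # Add-one smoothing keeps unseen background words from dividing by zero.
        p_disc = (count + 1) / (disc_total + vocab)
        p_back = (back[word] + 1) / (back_total + vocab)
        scores[word] = math.log(p_disc / p_back)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


def build_query(keywords):
    """Join keywords into a hypothetical OR-style search query string."""
    return " OR ".join(keywords)


# Hypothetical example messages standing in for the input stream.
discussion = [
    "What a goal in the final tonight",
    "the final is amazing, goal again!",
    "Watching the final with friends",
    "goal goal goal #final",
]
background = [
    "the weather is nice today",
    "I love the new phone",
    "what is for lunch today",
    "the train is late again",
]

keywords = top_keywords(discussion, background, k=2)
query = build_query(keywords)
```

In this sketch a common word such as "the" scores low because it is equally frequent in the background sample, while words specific to the discussion rise to the top. Messages returned for the query would still need the topic-model filter before being accepted as part of the discussion.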




    Published In

    WWW '13 Companion: Proceedings of the 22nd International Conference on World Wide Web
    May 2013
    1636 pages
    ISBN:9781450320382
    DOI:10.1145/2487788

    Sponsors

    • NICBR: Núcleo de Informação e Coordenação do Ponto BR
    • CGIBR: Comitê Gestor da Internet no Brasil


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. discussion retrieval
    2. event data
    3. social media

    Qualifiers

    • Research-article

    Conference

    WWW '13
    Sponsor:
    • NICBR
    • CGIBR
    WWW '13: 22nd International World Wide Web Conference
    May 13 - 17, 2013
    Rio de Janeiro, Brazil

    Acceptance Rates

    WWW '13 Companion Paper Acceptance Rate 831 of 1,250 submissions, 66%;
    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
