DOI: 10.1145/3295750.3298914
demonstration

Relevance-driven Clustering for Visual Information Retrieval on Twitter

Published: 08 March 2019

ABSTRACT

Geo-temporal visualization of Twitter search results is a challenging task, since displaying all matching tweets simultaneously would result in a saturated and unreadable display. In such settings, clustering search results can help users scan a few coherent groups of related tweets rather than many individual tweets. In practice, however, unsupervised clustering methods such as K-Means do not guarantee that the clusters themselves are relevant. Therefore, we develop a novel method of relevance-driven clustering for visual information retrieval that supplies users with highly relevant clusters representing different information perspectives of their queries. Specifically, we propose a Visual Twitter Information Retrieval (Viz-TIR) tool for relevance-driven clustering and ranking of Twitter search results. At the heart of Viz-TIR is a fast greedy algorithm that optimizes an approximation of an expected F1-Score metric to generate these clusters. We demonstrate its effectiveness against K-Means and a baseline method that shows all top matching results, in a scenario related to searching for natural disasters in US-based Twitter data spanning 2013 and 2014. Our demo shows that Viz-TIR is easy to use and more precise than K-Means in extracting geo-temporally coherent clusters for given search queries, thus aiding the user in visually searching and browsing social network content. Overall, we believe this work enables new opportunities for the synthesis of information retrieval as well as combined relevance and display-aware optimization techniques to support query-adaptive visual information exploration interfaces.
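
The abstract does not spell out the objective or the greedy procedure, so the following is only an illustrative sketch of the general idea, not the Viz-TIR algorithm itself. It assumes each tweet comes with an estimated relevance probability and uses a common ratio-of-expectations approximation, E[F1] ≈ 2·E[TP] / (|S| + E[number of relevant tweets]). The function names `approx_expected_f1` and `greedy_select` are hypothetical, and the sketch selects individual tweets rather than geo-temporal clusters.

```python
from typing import List, Sequence


def approx_expected_f1(selected_probs: Sequence[float],
                       total_expected_relevant: float) -> float:
    """Approximate E[F1] by a ratio of expectations (an assumption of this sketch).

    selected_probs: relevance probabilities of the tweets in the selected set S.
    total_expected_relevant: sum of relevance probabilities over all tweets.
    """
    if not selected_probs:
        return 0.0
    expected_true_positives = sum(selected_probs)
    return 2.0 * expected_true_positives / (
        len(selected_probs) + total_expected_relevant
    )


def greedy_select(probs: Sequence[float]) -> List[int]:
    """Greedily add tweets, most probable first, while the score improves."""
    total = sum(probs)
    # Visit candidates in order of decreasing relevance probability.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen: List[int] = []
    chosen_probs: List[float] = []
    best = 0.0
    for i in order:
        score = approx_expected_f1(chosen_probs + [probs[i]], total)
        if score <= best:
            # Later candidates are no more probable, so they cannot help either.
            break
        chosen.append(i)
        chosen_probs.append(probs[i])
        best = score
    return chosen


if __name__ == "__main__":
    # Hypothetical relevance probabilities for four tweets.
    print(greedy_select([0.9, 0.1, 0.8, 0.05]))
```

In this toy setting, a tweet is added only while its relevance probability exceeds half the current approximate score, which is why the loop can stop at the first non-improving candidate.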


  • Published in

    CHIIR '19: Proceedings of the 2019 Conference on Human Information Interaction and Retrieval
    March 2019
    463 pages
    ISBN: 9781450360258
    DOI: 10.1145/3295750

    Copyright © 2019 Owner/Author

    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • demonstration

    Acceptance Rates

    Overall Acceptance Rate: 55 of 163 submissions, 34%
