DOI: 10.1145/1772690.1772725
research-article

Time is of the essence: improving recency ranking using Twitter data

Published: 26 April 2010

ABSTRACT

Realtime web search refers to the retrieval of very fresh content that is in high demand. An effective portal web search engine must support a variety of search needs, including realtime web search. However, supporting realtime web search introduces two challenges not encountered in non-realtime web search: quickly crawling relevant content and ranking documents with impoverished link and click information. In this paper, we advocate the use of realtime micro-blogging data for addressing both of these problems. We propose a method to use the micro-blogging data stream to detect fresh URLs. We also use micro-blogging data to compute novel and effective features for ranking fresh URLs. We demonstrate that these methods improve the effectiveness of the portal web search engine for realtime web search.
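
As a rough illustration of the fresh-URL detection idea mentioned in the abstract, the sketch below scans a small batch of micro-blog posts, counts how many posts mention each URL, and keeps the URLs that are mentioned often enough and are not yet in a crawled index. It is not the method of the paper: the function name `detect_fresh_urls`, the `min_mentions` threshold, the `known_urls` set, and the sample posts are all assumptions made for illustration.

```python
import re
from collections import Counter

# Simple URL matcher: "http://" or "https://" followed by non-whitespace.
URL_PATTERN = re.compile(r"https?://[^\s]+")


def detect_fresh_urls(tweets, known_urls, min_mentions=2):
    """Return sorted URLs seen in at least `min_mentions` posts and absent from `known_urls`.

    All parameters are hypothetical; the abstract does not specify a threshold
    or an index format.
    """
    counts = Counter()
    for text in tweets:
        # Count each URL once per post so one post cannot inflate the count.
        for url in set(URL_PATTERN.findall(text)):
            counts[url] += 1
    return sorted(
        url
        for url, n in counts.items()
        if n >= min_mentions and url not in known_urls
    )


# Made-up sample posts and a made-up set of already-crawled URLs.
tweets = [
    "Breaking: quake reported http://example.com/quake",
    "More details here http://example.com/quake",
    "Old story http://example.com/archive",
    "Reposting http://example.com/archive",
    "Lunch pic http://example.com/lunch",
]
known = {"http://example.com/archive"}
fresh = detect_fresh_urls(tweets, known)
```

Here `fresh` contains only `http://example.com/quake`: the archive URL is already known, and the lunch URL is mentioned just once. A real system would presumably also need URL normalization (for example, resolving shortened links) and spam filtering, which this sketch omits.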

Published in

WWW '10: Proceedings of the 19th international conference on World wide web
April 2010
1407 pages
ISBN: 9781605587998
DOI: 10.1145/1772690

Copyright © 2010 International World Wide Web Conference Committee (IW3C2)

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
