Time is of the essence: improving recency ranking using Twitter data

Authors:
Anlei Dong, Yahoo! Inc., Sunnyvale, CA, USA
Ruiqiang Zhang, Yahoo! Inc., Sunnyvale, CA, USA
Pranam Kolari, Yahoo! Inc., Sunnyvale, CA, USA
Jing Bai, Yahoo! Inc., Sunnyvale, CA, USA
Fernando Diaz, Yahoo! Inc., Sunnyvale, CA, USA
Yi Chang, Yahoo! Inc., Sunnyvale, CA, USA
Zhaohui Zheng, Yahoo! Inc., Sunnyvale, CA, USA
Hongyuan Zha, Georgia Institute of Technology, Atlanta, GA, USA

WWW '10: Proceedings of the 19th international conference on World wide web, April 2010, Pages 331–340
https://doi.org/10.1145/1772690.1772725

Published: 26 April 2010

ABSTRACT

Realtime web search refers to the retrieval of very fresh content that is in high demand. An effective portal web search engine must support a variety of search needs, including realtime web search. However, supporting realtime web search introduces two challenges not encountered in non-realtime web search: quickly crawling relevant content and ranking documents with impoverished link and click information. In this paper, we advocate the use of realtime micro-blogging data for addressing both of these problems. We propose a method to use the micro-blogging data stream to detect fresh URLs. We also use micro-blogging data to compute novel and effective features for ranking fresh URLs. We demonstrate that these methods improve the effectiveness of the portal web search engine for realtime web search.

Index Terms

1. Information systems
  1. Information retrieval

Published in
WWW '10: Proceedings of the 19th international conference on World wide web
April 2010
1407 pages
ISBN: 9781605587998
DOI: 10.1145/1772690
General Chairs: Michael Rappa (North Carolina State University, USA), Paul Jones (University of North Carolina at Chapel Hill, USA)
Program Chairs: Juliana Freire (University of Utah, USA), Soumen Chakrabarti (Indian Institute of Technology, India)
Copyright © 2010 International World Wide Web Conference Committee (IW3C2)
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Twitter
recency modeling
recency ranking
Qualifiers
- research-article
Acceptance Rates
Overall acceptance rate: 1,899 of 8,196 submissions (23%)