ABSTRACT
Realtime web search refers to the retrieval of very fresh content that is in high demand. An effective portal web search engine must support a variety of search needs, including realtime web search. However, supporting realtime web search introduces two challenges not encountered in non-realtime web search: quickly crawling relevant content and ranking documents with impoverished link and click information. In this paper, we advocate the use of realtime micro-blogging data for addressing both of these problems. We propose a method that uses the micro-blogging data stream to detect fresh URLs. We also use micro-blogging data to compute novel and effective features for ranking fresh URLs. We demonstrate that these methods improve the effectiveness of the portal web search engine for realtime web search.
Index Terms
- Time is of the essence: improving recency ranking using Twitter data