Discovering Authoritative News Sources and Top News Stories

  • Conference paper
Information Retrieval Technology (AIRS 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4182)

Abstract

With the popularity of reading news online, the idea of assembling news articles from multiple news sources and digging out the most important stories has become very appealing. In this paper we present a novel algorithm that ranks the assembled news articles by importance and the news sources by authority. We employ the visual layout information of news homepages and exploit the mutual reinforcement relationship between news articles and news sources. Specifically, we propose to use a label-propagation-based semi-supervised learning algorithm to improve the structure of the relation graph between news sources and news articles. Integrating the label propagation algorithm with a HITS-like mutual reinforcement algorithm produces an effective ranking algorithm. We implement a system, TOPSTORY, which automatically generates homepages on which users can browse important news. The results of ranking a set of news articles collected from multiple sources over a period of half a month illustrate the effectiveness of our algorithm.
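The abstract does not state the update equations, so the following is only a minimal sketch of a generic HITS-style mutual reinforcement iteration on a bipartite graph of sources and articles, not the paper's actual method. The function name, the source and article identifiers, and the edge weights (standing in for how prominently a homepage displays a story) are all hypothetical, and the label propagation step that the paper uses to refine the graph is omitted.

```python
from math import sqrt


def normalize(scores):
    """Scale a score dictionary in place to unit Euclidean length."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    for key in scores:
        scores[key] /= norm


def mutual_reinforcement(weights, iterations=50):
    """Generic HITS-style iteration on a bipartite source-article graph.

    weights[s][a] is an assumed, illustrative prominence weight that
    source s gives to article a (e.g. derived from homepage layout).
    Returns (authority per source, importance per article).
    """
    sources = list(weights)
    articles = sorted({a for row in weights.values() for a in row})
    authority = {s: 1.0 for s in sources}
    importance = {a: 1.0 for a in articles}
    for _ in range(iterations):
        # An article is important if authoritative sources display it prominently.
        importance = {
            a: sum(authority[s] * weights[s].get(a, 0.0) for s in sources)
            for a in articles
        }
        normalize(importance)
        # A source is authoritative if it prominently displays important articles.
        authority = {
            s: sum(importance[a] * w for a, w in weights[s].items())
            for s in sources
        }
        normalize(authority)
    return authority, importance


# Hypothetical toy data: three sources, three stories.
weights = {
    "source_a": {"story_1": 1.0, "story_2": 0.5},
    "source_b": {"story_1": 0.8, "story_3": 0.2},
    "source_c": {"story_3": 0.3},
}
authority, importance = mutual_reinforcement(weights)
for name in sorted(importance, key=importance.get, reverse=True):
    print(name, round(importance[name], 3))
```

In this toy graph the story shown prominently by two sources ends up ranked first, and the sources that feature it gain authority in turn, which is the mutual reinforcement effect the abstract describes.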

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hu, Y., Li, M., Li, Z., Ma, Wy. (2006). Discovering Authoritative News Sources and Top News Stories. In: Ng, H.T., Leong, MK., Kan, MY., Ji, D. (eds) Information Retrieval Technology. AIRS 2006. Lecture Notes in Computer Science, vol 4182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880592_18

  • DOI: https://doi.org/10.1007/11880592_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45780-0

  • Online ISBN: 978-3-540-46237-8

  • eBook Packages: Computer Science, Computer Science (R0)
