IPHITS: An Incremental Latent Topic Model for Link Structure

  • Conference paper
Information Retrieval Technology (AIRS 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5839)

Abstract

The link structure of a document collection is dynamic and changes continually. Although several methods exploit link structure to identify hubs and authorities in a set of linked documents, no existing approach deals effectively with these changes. This paper studies how linked documents change and proposes an incremental probabilistic link framework, which we call IPHITS. The model handles online document streams in a fast, scalable way and uses a novel link-updating technique to cope with dynamic changes. Experimental results on two different sources of online information demonstrate the time savings of our method. We also analyze the stability of the rankings under small perturbations to the linkage patterns.
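
As background for the hubs-and-authorities terminology used in the abstract, the following is a minimal sketch of the classic, non-incremental HITS iteration on a static link graph. It is not the IPHITS model, whose details are not given on this page. The function name `hits`, the `iterations` parameter, and the `toy_links` graph are hypothetical and chosen only for illustration.

```python
from math import sqrt


def hits(links, iterations=50):
    """Classic HITS power iteration on a dict {source: [targets]}.

    Illustrative background only; this is NOT the incremental IPHITS model.
    """
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking to the node.
        auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        # Hub score: sum of the authority scores of pages the node links to.
        hub = {n: sum(auth[d] for d in links.get(n, [])) for n in nodes}
        # Normalize both score vectors to unit Euclidean length.
        for scores in (auth, hub):
            norm = sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth


# Hypothetical toy graph: "a" and "b" both link to "c"; "c" links to "d".
toy_links = {"a": ["c"], "b": ["c"], "c": ["d"]}
hub, auth = hits(toy_links)
```

In this toy graph, "c" receives the most inbound links and ends up with the highest authority score, while "a" and "b" share the highest hub score. Rerunning such an iteration from scratch whenever links change is the cost that an incremental approach aims to avoid.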

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ma, H., Zhao, W., Li, Z., Shi, Z. (2009). IPHITS: An Incremental Latent Topic Model for Link Structure. In: Lee, G.G., et al. Information Retrieval Technology. AIRS 2009. Lecture Notes in Computer Science, vol 5839. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04769-5_21

  • DOI: https://doi.org/10.1007/978-3-642-04769-5_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04768-8

  • Online ISBN: 978-3-642-04769-5

  • eBook Packages: Computer Science, Computer Science (R0)
