On the Bursty Evolution of Blogspace

World Wide Web, volume 8, pages 159–178 (2005)

Abstract

We propose two new tools to address the evolution of hyperlinked corpora. First, we define time graphs to extend the traditional notion of an evolving directed graph, capturing link creation as a point phenomenon in time. Second, we develop definitions and algorithms for time-dense community tracking, to crystallize the notion of community evolution.

We develop these tools in the context of Blogspace, the space of weblogs (or blogs). Our study involves approximately 750K links among 25K blogs. We create a time graph on these blogs by an automatic analysis of their internal time stamps. We then study the evolution of connected component structure and microscopic community structure in this time graph.
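As a rough illustration of the time-graph idea, the sketch below stores each link together with its creation time and measures the largest connected component of the graph as it stood at a given time. The class name `TimeGraph`, the method names, and the choice to ignore link direction when computing components are assumptions made for this example; the abstract does not specify the authors' implementation.

```python
from collections import defaultdict


class TimeGraph:
    """Hypothetical time graph: every link carries its creation time."""

    def __init__(self):
        self.edges = []  # list of (time, source, target)

    def add_link(self, source, target, time):
        self.edges.append((time, source, target))

    def prefix_edges(self, t):
        """Return the (source, target) pairs created at or before time t."""
        return [(u, v) for (when, u, v) in self.edges if when <= t]


def largest_component_size(edges):
    """Size of the largest component, treating links as undirected (assumption)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)  # union the two endpoints

    sizes = defaultdict(int)
    for node in list(parent):
        sizes[find(node)] += 1
    return max(sizes.values(), default=0)


# Toy data: two small groups that merge when the link c -> d appears at time 5.
g = TimeGraph()
g.add_link("a", "b", 1)
g.add_link("b", "c", 2)
g.add_link("d", "e", 3)
g.add_link("c", "d", 5)
growth = [largest_component_size(g.prefix_edges(t)) for t in (0, 2, 3, 5)]
```

Evaluating the component size over a sequence of times is one simple way to watch a giant component emerge as links accumulate.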

We show that Blogspace underwent a transition around the end of 2001 and has been expanding rapidly since, not only in metrics of scale but also in metrics of community structure and connectedness.

By randomizing link destinations in Blogspace, but retaining sources and timestamps, we introduce a concept of randomized Blogspace. Herein, we observe similar evolution of a giant component, but no corresponding increase in community structure.
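One way to build such a randomized baseline is sketched below. It keeps every link's source and timestamp and permutes the destinations among the links. Shuffling (rather than, say, drawing each destination uniformly at random from all blogs) is an assumption of this example, as is the function name `randomize_destinations`; the abstract does not say which scheme the authors used.

```python
import random


def randomize_destinations(edges, seed=None):
    """Return a copy of `edges` with destinations shuffled across links.

    `edges` is a list of (time, source, target) triples. Sources and
    timestamps stay attached to their original links; only targets move.
    Permuting targets is an illustrative choice, not the paper's stated method.
    """
    rng = random.Random(seed)
    targets = [target for (_, _, target) in edges]
    rng.shuffle(targets)
    return [(time, source, new_target)
            for (time, source, _), new_target in zip(edges, targets)]


links = [(1, "a", "b"), (2, "b", "c"), (3, "d", "e"), (5, "c", "d")]
shuffled = randomize_destinations(links, seed=0)
```

A permutation preserves each blog's in-degree as well as the timing of link creation, so any community structure that disappears in the shuffled graph cannot be attributed to those quantities alone.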

Having demonstrated the formation of micro-communities over time, we then turn to the ongoing activity within active communities. We extend recent work of Kleinberg (2002) to discover dense periods of “bursty” intra-community link creation. Furthermore, we find that the blogs that give rise to these communities are significantly more enduring than an average blog.
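For readers unfamiliar with the burst model, the following is a minimal sketch of the two-state automaton described by Kleinberg (2002), applied to the gaps between successive link-creation times. It is not the extension developed in this paper. The function name `detect_bursts` and the default parameters `s` and `gamma` are assumptions chosen for illustration.

```python
import math


def detect_bursts(timestamps, s=2.0, gamma=1.0):
    """Label each inter-arrival gap 0 (baseline) or 1 (burst).

    Two-state model: gaps are exponential with rate n/T in state 0 and
    s * n/T in state 1; entering the burst state costs gamma * ln(n),
    leaving it is free. Viterbi finds the minimum-cost state sequence.
    """
    times = sorted(timestamps)
    gaps = [b - a for a, b in zip(times, times[1:])]
    n = len(gaps)
    if n == 0:
        return []
    total = sum(gaps)
    if total <= 0:
        raise ValueError("timestamps must span a positive duration")

    rates = [n / total, s * n / total]
    up_cost = gamma * math.log(n)

    def emit(state, gap):
        # Negative log-density of an exponential gap under the state's rate.
        return rates[state] * gap - math.log(rates[state])

    cost = [0.0, math.inf]  # the automaton starts in the baseline state
    back = []
    for gap in gaps:
        new_cost, pointers = [], []
        for state in (0, 1):
            candidates = [cost[prev] + (up_cost if state > prev else 0.0)
                          for prev in (0, 1)]
            prev = 0 if candidates[0] <= candidates[1] else 1
            new_cost.append(candidates[prev] + emit(state, gap))
            pointers.append(prev)
        back.append(pointers)
        cost = new_cost

    # Trace the cheapest path backwards to recover one state per gap.
    state = 0 if cost[0] <= cost[1] else 1
    states = []
    for pointers in reversed(back):
        states.append(state)
        state = pointers[state]
    states.reverse()
    return states


# Toy stream: sparse links, then a dense run, then sparse links again.
stream = [0, 10, 20, 30, 40, 41, 42, 43, 44, 45, 46, 47, 48, 58, 68, 78, 88]
labels = detect_bursts(stream)
```

Raising `gamma` makes the model more reluctant to enter the burst state, while raising `s` requires a burst to be denser before it is recognized.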

References

  1. R. Agrawal and R. Srikant, “Fast algorithms for mining association rules in large databases,” in Proc. 20th Internat. Conference on Very Large Data Bases, 1994, pp. 487–499.

  2. Z. Bar-Yossef and S. Rajagopalan, “Template detection via data mining and its applications,” in Proc. 11th Internat. World-Wide Web Conference, 2002, pp. 580–591.

  3. N. Z. Bear, “The TTLB Blogosphere Ecosystem,” January 2004, http://www.truthlaidbear.com/ecosystem.php

  4. K. Bharat, B. Chang, M. Henzinger, and M. Ruhl, “Who links to whom: Mining linkage between web sites,” in IEEE Internat. Conference on Data Mining, 2001, pp. 51–58.

  5. B. E. Brewington and G. Cybenko, “Keeping Up with the Changing Web,” Computer 33(5), 2000, 52–58.

  6. D. Eppstein, Z. Galil, and G. Italiano, “Dynamic graph algorithms,” in CRC Handbook of Algorithms and Theory of Computation, ed. M. J. Atallah, CRC Press, 1999, Chapter 8.

  7. P. Erdős and A. Rényi, “On the evolution of random graphs,” Magy. Tud. Akad. Mat. Kut. Intez. Kozl. 5, 1960, 17–61.

  8. U. Feige, D. Peleg, and G. Kortsarz, “The dense k-subgraph problem,” Algorithmica 29(3), 2001, 410–421.

  9. D. Fetterly, M. Manasse, M. Najork, and J. Wiener, “Crawling towards light: A large scale study of the evolution of Web pages,” in Proc. 1st Workshop on Algorithms for the Web Graph, 2002.

  10. G. W. Flake, S. Lawrence, and C. L. Giles, “Efficient identification of web communities,” in Proc. 6th ACM SIGKDD Internat. Conference on Knowledge Discovery and Data Mining, 2000, pp. 150–160.

  11. D. Gruhl, R. Guha, D. Liben-Nowell, and A. Tomkins, “Information Diffusion through Blogspace,” in Proc. 13th Internat. World-Wide Web Conference, 2004.

  12. J. Kleinberg, “Authoritative sources in a hyperlinked environment,” J. ACM 46(5), 1999, 604–632.

  13. J. Kleinberg, “Bursty and hierarchical structure in streams,” in Proc. 8th ACM SIGKDD Internat. Conference on Knowledge Discovery and Data Mining, 2002, pp. 373–397.

  14. R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins, “Extracting large scale knowledge bases from the Web,” in Proc. 25th Internat. Conference on Very Large Data Bases, 1999, pp. 639–650.

  15. R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins, “Trawling the Web for cyber communities,” WWW8/Computer Networks 31(11–16), 1999, 1481–1493.

  16. T. Mitchell, Machine Learning, McGraw-Hill, 1997.

  17. Perseus Development Corporation, “The blogging iceberg,” 2004, http://www.perseusdevelopment.com/blogsurvey/thebloggingiceberg.html

Author information

Correspondence to Ravi Kumar.

Additional information

An extended abstract of this paper appeared in the 12th International World Wide Web Conference, 2003.

About this article

Cite this article

Kumar, R., Novak, J., Raghavan, P. et al. On the Bursty Evolution of Blogspace. World Wide Web 8, 159–178 (2005). https://doi.org/10.1007/s11280-004-4872-4

  • DOI: https://doi.org/10.1007/s11280-004-4872-4
