DOI: 10.1145/1281192.1281213
Article

Structural and temporal analysis of the blogosphere through community factorization

Published: 12 August 2007

Abstract

The blogosphere has unique structural and temporal properties since blogs are typically used as communication media among human individuals. In this paper, we propose a novel technique that captures the structure and temporal dynamics of blog communities. In our framework, a community is a set of blogs that communicate with each other triggered by some events (such as a news article). The community is represented by its structure and temporal dynamics: a community graph indicates how often one blog communicates with another, and a community intensity indicates the activity level of the community that varies over time. Our method, community factorization, extracts such communities from the blogosphere, where the communication among blogs is observed as a set of subgraphs (i.e., threads of discussion). This community extraction is formulated as a factorization problem in the framework of constrained optimization, in which the objective is to best explain the observed interactions in the blogosphere over time. We further provide a scalable algorithm for computing solutions to the constrained optimization problems. Extensive experimental studies on both synthetic and real blog data demonstrate that our technique is able to discover meaningful communities that are not detectable by traditional methods.
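
The factorization idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in plain non-negative matrix factorization with the Lee and Seung multiplicative updates and leaves out the constraints the paper adds. The layout of the matrix X (rows are time steps, columns are blog-to-blog edges), the toy counts, and the names EDGES, factorize, intensity, and graphs are all assumptions made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def squared_error(x, w, h):
    """Sum of squared differences between x and the product w * h."""
    approx = matmul(w, h)
    return sum((xv - av) ** 2 for xr, ar in zip(x, approx) for xv, av in zip(xr, ar))

def factorize(x, k, iterations=200, seed=0, eps=1e-9):
    """Plain NMF (an assumption, not the paper's method): x (T x E) ~= w (T x k) * h (k x E)."""
    rng = random.Random(seed)
    t, e = len(x), len(x[0])
    # Strictly positive random start so the multiplicative updates stay non-negative.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(t)]
    h = [[rng.random() + 0.1 for _ in range(e)] for _ in range(k)]
    for _ in range(iterations):
        # h <- h * (w^T x) / (w^T w h), element-wise
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(e)] for i in range(k)]
        # w <- w * (x h^T) / (w h h^T), element-wise
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(t)]
    return w, h

# Hypothetical toy data: how often each directed edge was used at each time step.
EDGES = [("a", "b"), ("b", "a"), ("c", "d"), ("d", "c")]
X = [
    [2, 1, 0, 0],
    [4, 2, 0, 0],
    [6, 3, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 3, 6],
    [0, 0, 2, 4],
]

# intensity[t][c]: activity of community c at time t (analogue of community intensity).
# graphs[c][e]: weight of edge e in community c (analogue of the community graph).
intensity, graphs = factorize(X, k=2)

for c, weights in enumerate(graphs):
    strongest = max(range(len(EDGES)), key=lambda e: weights[e])
    print("community", c, "strongest edge", EDGES[strongest])
```

Each row of graphs can be read as one community's edge weights, and the matching column of intensity as that community's activity over time. The constrained optimization and the scalable algorithm described in the abstract are not reproduced here.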

Published In

KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2007
1080 pages
ISBN: 9781595936097
DOI: 10.1145/1281192

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. blog
  2. blogosphere
  3. community factorization
  4. iterative search
  5. non-negative matrix factorization
  6. regularization

Qualifiers

  • Article

Conference

KDD07

Acceptance Rates

KDD '07 Paper Acceptance Rate 111 of 573 submissions, 19%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
