DOI: 10.1145/1183614.1183628

Eigen-trend: trend analysis in the blogosphere based on singular value decompositions

Published: 06 November 2006

ABSTRACT

The blogosphere, the totality of blog-related Web sites, has become a rich source for trend analysis in areas such as product surveys, customer relationships, and marketing. Existing approaches are based on simple counts, such as the number of entries or the number of links. In this paper, we introduce a novel concept, the eigen-trend, to represent the temporal trend in a group of blogs with common interests, and we propose two new techniques for extracting eigen-trends from blogs. First, we propose a trend analysis technique based on the singular value decomposition. The extracted eigen-trends provide new insights into multiple trends on the same keyword. Second, we propose a trend analysis technique based on a higher-order singular value decomposition. It analyzes the blogosphere as a dynamic graph structure and extracts eigen-trends that reflect the structural changes of the blogosphere over time. Experimental studies on synthetic data sets and a real blog data set show that the new techniques reveal trend information and insights in the blogosphere that are not obtainable from traditional count-based methods.
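
The two decompositions named in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the matrix layout (rows are blogs, columns are time steps), the tensor layout (blog, blog, time), the function names `eigen_trends` and `mode_factor`, and the synthetic data are all assumptions made for illustration. The first function takes a truncated SVD of a blogs-by-time matrix and reads the right singular vectors as candidate temporal trends. The second computes one factor matrix of a higher-order SVD in the sense of De Lathauwer et al., as the left singular vectors of a mode unfolding of a link tensor.

```python
import numpy as np

def eigen_trends(X, k):
    """Truncated SVD of a blogs-by-time matrix X (assumed layout).

    Returns blog loadings U (n_blogs, k), singular values s (k,),
    and temporal patterns Vt (k, n_times).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def mode_factor(T, mode, k):
    """Leading k left singular vectors of the mode-`mode` unfolding of T.

    In a higher-order SVD, each factor matrix is obtained this way.
    """
    unfolded = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
    U, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    return U[:, :k], s[:k]

# Synthetic data (hypothetical): two groups of blogs mention the same
# keyword, but their activity peaks at different times.
t = np.arange(30)
trend_a = np.exp(-0.5 * ((t - 8) / 3.0) ** 2)   # group A peaks at t = 8
trend_b = np.exp(-0.5 * ((t - 20) / 3.0) ** 2)  # group B peaks at t = 20
weights = np.array([[3, 0], [2, 0], [1, 0],
                    [0, 1], [0, 1], [0, 1]], dtype=float)
X = weights @ np.vstack([trend_a, trend_b])     # shape (6, 30)

# A single count curve (X.sum(axis=0)) mixes both groups; the SVD
# separates them into two temporal components.
U, s, Vt = eigen_trends(X, k=2)
peaks = [int(np.argmax(np.abs(row))) for row in Vt]

# Link tensor (hypothetical): links[i, j, t] is the link strength from
# blog i to blog j at time t, here dense within each group only.
links = (np.einsum("i,j,t->ijt", weights[:, 0], weights[:, 0], trend_a)
         + np.einsum("i,j,t->ijt", weights[:, 1], weights[:, 1], trend_b))
time_factor, time_s = mode_factor(links, mode=2, k=2)
```

The sign of each singular vector is arbitrary, so the sketch inspects absolute values when locating peaks. How the paper scales, centers, or weights its data before decomposition is not stated in the abstract, so none of that is modeled here.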

Published in

CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management
November 2006
916 pages
ISBN: 1595934332
DOI: 10.1145/1183614

Copyright © 2006 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,861 of 8,427 submissions, 22%
