Cross-Domain Analysis of the Blogosphere for Trend Prediction

Part of the book series: Lecture Notes in Social Networks (LNSN, volume 6)

Abstract

In recent years, blogs have become an important part of the web. New technologies such as smartphones enable blogging at any time and make blogs more up to date than ever before. Because of their popularity, blogs are a valuable source of information about public opinion on all kinds of topics. Blog postings that refer to products are of particular interest to companies that want to adjust marketing campaigns or advertising. In this article we compare the blogging characteristics of two domains: music and movies. We investigate how chatter from the blogosphere can be used to predict the success of products. We identify and analyze typical patterns of blogging behavior around the release of a product, point out methods for extracting features from the blogosphere, and show that these features can be exploited to predict the monetary success of movies and music with high accuracy.

Notes

  1. http://blogpulse.com

  2. http://www.blogscope.net/

  3. http://twitter.com

  4. http://search.twitter.com

  5. http://www.amazon.com/gp/help/customer/display.html?nodeId=525376

  6. http://boxofficemojo.com/

  7. http://lucene.apache.org/java/2_4_0/scoring.html

  8. http://blog.spinn3r.com/2007/10/announcing-spin.html

  9. http://www.spinn3r.com/

  10. http://lucene.apache.org/

  11. http://www.cs.waikato.ac.nz/ml/weka/

  12. http://www.imdb.com

  13. http://swn.isti.cnr.it/

Author information

Corresponding author

Correspondence to Patrick Siehndel.

Copyright information

© 2013 Springer-Verlag Wien

About this chapter

Cite this chapter

Siehndel, P., Abel, F., Diaz-Aviles, E., Henze, N., Krause, D. (2013). Cross-Domain Analysis of the Blogosphere for Trend Prediction. In: Özyer, T., Rokne, J., Wagner, G., Reuser, A. (eds) The Influence of Technology on Social Network Analysis and Mining. Lecture Notes in Social Networks, vol 6. Springer, Vienna. https://doi.org/10.1007/978-3-7091-1346-2_12

  • DOI: https://doi.org/10.1007/978-3-7091-1346-2_12

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-7091-1345-5

  • Online ISBN: 978-3-7091-1346-2

  • eBook Packages: Computer Science, Computer Science (R0)
