Improving network response times using social information

Hiremagalore, Sharath; Liang, Chen; Stavrou, Angelos; Rangwala, Huzefa

doi:10.1007/s13278-012-0065-9

Original Article
Published: 08 April 2012

Volume 3, pages 209–220, (2013)
Social Network Analysis and Mining

Abstract

Social networks and discussion boards have become significant outlets for people to communicate and freely express their opinions. Although the social networks themselves are usually well-provisioned, the participating users frequently point to external links in order to substantiate their discussions. Unfortunately, the heavy traffic load suddenly imposed on these externally linked websites makes them unresponsive, leading to the “flash crowd effect.” Flash crowds present a real challenge as their intensity and occurrence times are impossible to predict. Moreover, most present-day web hosting servers and caching systems, although increasingly capable, are designed to handle a nominal load of requests before they become unresponsive due to limited bandwidth or the processing power allocated to the hosting site. In this paper, we quantify the prevalence of flash crowd events for a popular social discussion board (Digg). Using PlanetLab, we measured the response times of 1,289 unique popular websites and verified that 89% of the popular URLs suffered variations in their response times. In an effort to identify flash crowds in advance, we evaluated and compared traffic forecasting mechanisms. We showed that predicting network traffic using network measurements has very limited success and cannot be used for large-scale prediction. However, by analyzing the content and structure of social discussions, we were able to accurately forecast popularity for 86% of the websites within 5 min of a story’s submission and for 95% of the sites when more social content (5 h worth) became available. Our work indicates that we can effectively leverage social activity to forecast network events when it would otherwise be infeasible to anticipate them.
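
As a rough illustration of the measurement step summarized above, the sketch below flags a URL as having variable response times when the coefficient of variation of its samples exceeds a threshold, then reports the share of flagged URLs. The abstract does not state the criterion actually used in the paper, so the function names, the 0.5 threshold, and the sample data are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def response_time_varies(samples_ms, cv_threshold=0.5):
    """Return True when samples fluctuate strongly relative to their mean.

    Assumption: "variation" is judged by the coefficient of variation
    (population standard deviation divided by the mean). The paper's own
    criterion is not given in the abstract.
    """
    if len(samples_ms) < 2:
        return False
    avg = mean(samples_ms)
    if avg <= 0:
        return False
    return pstdev(samples_ms) / avg > cv_threshold


def fraction_affected(measurements, cv_threshold=0.5):
    """Share of URLs whose response times are flagged as variable.

    `measurements` maps a URL to its list of response times in milliseconds.
    """
    if not measurements:
        return 0.0
    flagged = sum(
        1 for samples in measurements.values()
        if response_time_varies(samples, cv_threshold)
    )
    return flagged / len(measurements)


# Hypothetical data: one stable site and one that slows down sharply.
example = {
    "http://stable.example": [100, 105, 98, 102],
    "http://crowded.example": [100, 900, 120, 2500],
}
print(fraction_affected(example))
```

A relative measure such as the coefficient of variation is used here so that slow but steady sites are not flagged merely for having high absolute latency.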

Author information

Authors and Affiliations

Center for Secure Information Systems, George Mason University, Fairfax, VA, 22030, USA
Sharath Hiremagalore, Chen Liang & Angelos Stavrou
Department of Computer Science and Engineering, George Mason University, Fairfax, VA, 22030, USA
Huzefa Rangwala

Corresponding author

Correspondence to Huzefa Rangwala.

About this article

Cite this article

Hiremagalore, S., Liang, C., Stavrou, A. et al. Improving network response times using social information. Soc. Netw. Anal. Min. 3, 209–220 (2013). https://doi.org/10.1007/s13278-012-0065-9

Received: 11 October 2011
Revised: 28 February 2012
Accepted: 21 March 2012
Published: 08 April 2012
Issue Date: June 2013
DOI: https://doi.org/10.1007/s13278-012-0065-9
