research-article

Cost-Optimized Microblog Distribution over Geo-Distributed Data Centers: Insights from Cross-Media Analysis

Published: 20 April 2017

Abstract

The unprecedented growth of microblog services poses significant challenges, in terms of network traffic and service latency, to the underlying infrastructure (i.e., geo-distributed data centers). Furthermore, the dynamic evolution of microblog status generates a heavy workload for data consistency maintenance. In this article, motivated by insights into propagation patterns obtained from cross-media analysis, we propose a novel cache strategy for microblog service systems that reduces inter-data center traffic and consistency maintenance cost while achieving low service latency. Specifically, we first present a microblog classification method that uses external knowledge from correlated domains to categorize microblogs. We then conduct a large-scale measurement on a representative online social network system to study how propagation differs across categories on regional and temporal scales. These insights reveal common social habits in creating and consuming microblogs and further motivate our architecture design. Finally, we formulate the content cache problem as a constrained optimization problem. By jointly using the Lyapunov optimization framework and the simplex gradient method, we find the optimal online control strategy. Extensive trace-driven experiments further demonstrate that our algorithm reduces the system cost by 24.5% compared with traditional approaches at the same service latency.

      Published In

      ACM Transactions on Intelligent Systems and Technology  Volume 8, Issue 3
      Special Issue: Mobile Social Multimedia Analytics in the Big Data Era and Regular Papers
      May 2017
      320 pages
      ISSN:2157-6904
      EISSN:2157-6912
      DOI:10.1145/3040485
      • Editor: Yu Zheng
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 20 April 2017
      Accepted: 01 November 2016
      Revised: 01 August 2016
      Received: 01 July 2015
      Published in TIST Volume 8, Issue 3

      Author Tags

      1. Social media analytics
      2. cross-media analysis
      3. data center
      4. performance optimization

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Article Metrics

      • Downloads (last 12 months): 1
      • Downloads (last 6 weeks): 0
      Reflects downloads up to 28 Feb 2025

      Cited By

      • (2023) SDTP: Accelerating Wide-Area Data Analytics With Simultaneous Data Transfer and Processing. IEEE Transactions on Cloud Computing 11:1 (911-926). DOI: 10.1109/TCC.2021.3119991. Online publication date: 1-Jan-2023
      • (2022) Geo-Distributed IoT Data Analytics With Deadline Constraints Across Network Edge. IEEE Internet of Things Journal 9:22 (22914-22929). DOI: 10.1109/JIOT.2022.3186173. Online publication date: 15-Nov-2022
      • (2019) BBS Posts Time Series Analysis based on Sample Entropy and Deep Neural Networks. Entropy 21:1 (57). DOI: 10.3390/e21010057. Online publication date: 12-Jan-2019
      • (2019) Deep Multi-scale Discriminative Networks for Double JPEG Compression Forensics. ACM Transactions on Intelligent Systems and Technology 10:2 (1-20). DOI: 10.1145/3301274. Online publication date: 15-Feb-2019
      • (2018) Toward Rendering-Latency Reduction for Composable Web Services via Priority-Based Object Caching. IEEE Transactions on Multimedia 20:7 (1864-1875). DOI: 10.1109/TMM.2017.2779041. Online publication date: Jul-2018
