research-article

Stream-monitoring with BlockMon: convergence of network measurements and data analytics platforms

Published: 29 April 2013

Abstract

Recent work in network measurements focuses on scaling the performance of monitoring platforms to 10 Gb/s and beyond. Concurrently, the IT community focuses on scaling the analysis of big data over a cluster of nodes. So far, combinations of these approaches have targeted flexibility and usability over real-timeliness of results and efficient allocation of resources. In this paper we show how to meet both objectives with BlockMon, a network monitoring platform originally designed to work on a single node, which we extended to run distributed stream-data analytics tasks. We compare its performance against Storm and Apache S4, the state-of-the-art open-source stream-processing platforms, by implementing a phone call anomaly detection system and a Twitter trending algorithm: our enhanced BlockMon achieves a performance gain of over 2.5x and 23x, respectively. Given the different nature of those applications and the performance of BlockMon as a single-node network monitor [1], we expect our results to hold for a broad range of applications, making distributed BlockMon a good candidate for the convergence of network-measurement and IT-analysis platforms.

References

  1. A. di Pietro, F. Huici, N. Bonelli, B. Trammell, P. Kastovsky, T. Groleat, S. Vaton, and M. Dusi. Blockmon: Toward high-speed composable network traffic measurement. In Proceedings of the IEEE Infocom Conference (mini-conference), 2013.
  2. G. Iannaccone. Fast prototyping of network data mining applications. In Proceedings of the Passive and Active Measurement Conference, 2006.
  3. N. Bonelli, A. Di Pietro, S. Giordano, and G. Procissi. On multi-gigabit packet capturing with multi-core commodity hardware. In Proceedings of the Passive and Active Measurement Conference, 2012.
  4. J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1):107–113, 2008.
  5. Apache Hadoop. http://hadoop.apache.org (accessed 2012-11-10).
  6. T. Condie, N. Conway, P. Alvaro, J. Hellerstein, K. Elmeleegy, and R. Sears. MapReduce online. In Proceedings of the USENIX NSDI Conference, 2010.
  7. M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: a fault tolerant abstraction for in-memory cluster computing. In Proceedings of the USENIX NSDI Conference, 2012.
  8. Storm. http://storm-project.net (accessed 2012-11-10).
  9. L. Neumeyer, B. Robbins, A. Nair, and A. Kesari. S4: Distributed stream computing platform. In Proceedings of the International Conference on Data Mining Workshops, 2010.
  10. G. Bianchi, N. d'Heureuse, and S. Niccolini. On-demand time-decaying Bloom filters for telemarketer detection. Comput. Commun. Rev., 41(5):5–12, Sep. 2011.
  11. FP7 Demons Project. http://fp7-demons.eu (accessed 2012-11-10).
  12. BlockMon. http://blockmon.github.com/blockmon (accessed 2012-11-10).
  13. The 0MQ Project. http://www.zeromq.org.
  14. The Nimbus Project. http://www.nimbusproject.org.
  15. Apache S4. http://incubator.apache.org/s4 (accessed 2012-11-10).
  16. GNIP. http://gnip.com.
  17. Kestrel Queues. https://github.com/robey/kestrel.
  18. D. Eyers, T. Freudenreich, A. Margara, S. Frischbier, P. Pietzuch, and P. Eugster. Living in the present: on-the-fly information processing in scalable web architectures. In Proceedings of the ACM International Workshop on Cloud Computing Platforms, 2012.
  19. Scribe. https://github.com/facebook/scribe.
  20. J. Kreps, N. Narkhede, and J. Rao. Kafka: A distributed messaging system for log processing. In Proceedings of the International Workshop on Networking Meets Databases, 2011.
  21. Cloud MapReduce. http://code.google.com/p/cloudmapreduce.
  22. HStreaming. http://www.hstreaming.com.
  23. Brisk. http://www.datastax.com/products/enterprise.
  24. C. Bockermann and H. Blom. Processing data streams with the RapidMiner streams-plugin. In Proceedings of the RapidMiner Community Meeting and Conference, 2012.
  25. Y. Lee and Y. Lee. Toward scalable internet traffic measurement and analysis with Hadoop. Comput. Commun. Rev., 43(1):5–13, Jan. 2013.

  • Published in

    ACM SIGCOMM Computer Communication Review, Volume 43, Issue 2
    April 2013
    72 pages
    ISSN: 0146-4833
    DOI: 10.1145/2479957

    Copyright © 2013 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States
