Performance improvement of Apache Storm using InfiniBand RDMA

The Journal of Supercomputing

Abstract

In this paper, we attempt to improve the performance of real-time stream processing by running Apache Storm on InfiniBand. Apache Storm is a representative distributed framework for real-time stream processing, and InfiniBand is a high-performance communication standard. The default approach to running Storm on InfiniBand is to use IP over InfiniBand (IPoIB), which causes serious CPU overload and fails to exploit the high performance of InfiniBand. The CPU overload is mainly caused by frequent context switching and buffer copying. To solve this, we propose a new communication method that uses InfiniBand’s remote direct memory access (RDMA). In the proposed method, we replace Storm's existing communication framework, Netty, with RJ-Netty, an RDMA/JXIO-based communication method. Either Netty or RJ-Netty can be used in Storm, depending on user preference. We also maximize the performance of RJ-Netty by applying multithreading on JXIO servers. Experimental results show that RJ-Netty significantly reduces CPU load while improving message throughput and complete latency compared to both IPoIB and Ethernet. We believe that, as the first attempt to run Storm on InfiniBand, our approach is effective in improving the processing performance of Storm by using InfiniBand RDMA functions.

Notes

  1. All source code of RJ-Netty designed and implemented in this paper is available as open source on GitHub at https://github.com/dke-knu/i2am/tree/master/rdma-based-storm.

Acknowledgements

This research was partly supported by Korea Electric Power Corporation (Grant No. R18XA05) and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2017R1A2B4008991).

Author information

Corresponding author

Correspondence to Yang-Sae Moon.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary Korean version of this work was published as “Reconfiguration of Apache Storm for InfiniBand Communications” in KIPS Trans. on Software and Data Engineering (KTSDE), Vol. 7, No. 8, pp. 297–306, Aug. 2018. This is a fully rewritten and extended English version of the Korean manuscript. The major extensions include (1) the detailed technical components of Accelio, (2) the additional experiments on complete latency, (3) the detailed examples of CPU usage, and (4) the detailed RDMA-related works.

About this article

Cite this article

Yang, S., Son, S., Choi, MJ. et al. Performance improvement of Apache Storm using InfiniBand RDMA. J Supercomput 75, 6804–6830 (2019). https://doi.org/10.1007/s11227-019-02905-7

  • DOI: https://doi.org/10.1007/s11227-019-02905-7
