Fault-Tolerant and Elastic Streaming MapReduce with Decentralized Coordination
- Univ. of Southern California, Los Angeles, CA (United States)
- Indian Inst. of Technology (IIT), Bangalore (India)
The MapReduce programming model, due to its simplicity and scalability, has become an essential tool for processing large data volumes in distributed environments. Recent Stream Processing Systems (SPS) extend this model to provide low-latency analysis of high-velocity continuous data streams. However, integrating MapReduce with streaming poses two challenges: first, runtime variations in data characteristics such as data rates and key distribution cause resource overload, which in turn leads to fluctuations in the Quality of Service (QoS); and second, stateful reducers, whose state depends on the complete tuple history, necessitate efficient fault-recovery mechanisms to maintain the desired QoS in the presence of resource failures. We propose an integrated streaming MapReduce architecture that leverages consistent hashing to support runtime elasticity, along with locality-aware data and state replication, to provide efficient load balancing with low-overhead fault tolerance and parallel fault recovery from multiple simultaneous failures. Our evaluation on a private cloud shows up to 2.8× improvement in peak throughput compared to Apache Storm SPS, and a low recovery latency of 700-1500 ms from multiple failures.
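The abstract names consistent hashing as the mechanism behind runtime elasticity. The following is a minimal, generic sketch of that technique, not the authors' implementation: the class name `HashRing`, the use of MD5, the virtual-node count, and the choice of ring successors as replica holders are all assumptions made for illustration. It shows why adding a reducer remaps only the keys that the new reducer takes over, leaving every other key-to-reducer assignment untouched.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring (64-bit integer). MD5 is an arbitrary choice."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring mapping tuple keys to reducers."""

    def __init__(self, vnodes: int = 64):
        self.vnodes = vnodes   # virtual nodes per reducer (assumed value)
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> reducer name

    def add(self, reducer: str) -> None:
        """Place `vnodes` positions for the reducer on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{reducer}#{i}")
            if point in self._owner:   # skip the (very unlikely) collision
                continue
            bisect.insort(self._points, point)
            self._owner[point] = reducer

    def remove(self, reducer: str) -> None:
        """Drop all positions owned by the reducer, e.g. after a failure."""
        self._points = [p for p in self._points if self._owner[p] != reducer]
        self._owner = {p: r for p, r in self._owner.items() if r != reducer}

    def lookup(self, key: str, replicas: int = 1) -> list:
        """Return the owning reducer followed by distinct successor reducers."""
        if not self._points:
            raise LookupError("ring is empty")
        start = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        found = []
        for step in range(len(self._points)):
            owner = self._owner[self._points[(start + step) % len(self._points)]]
            if owner not in found:
                found.append(owner)
                if len(found) == replicas:
                    break
        return found
```

In this sketch, `lookup(key)[0]` is the reducer that would process a key, and the remaining entries are the next distinct reducers clockwise on the ring, which is one simple way to pick holders for replicated state. How the actual system places replicas is described in the paper, not here.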
- Research Organization:
- City of Los Angeles Department, CA (United States)
- Sponsoring Organization:
- USDOE Office of Electricity (OE)
- DOE Contract Number:
- OE0000192
- OSTI ID:
- 1332339
- Report Number(s):
- DOE-USC-00192-71
- Resource Relation:
- Conference: IEEE International Conference on Distributed Computing Systems, Columbus, OH (United States), 29 Jun-2 Jul 2015
- Country of Publication:
- United States
- Language:
- English
Similar Records
Performance Model of MapReduce Iterative Applications for Hybrid Cloud Bursting
Center for Technology for Advanced Scientific Component Software (TASCS)