MapReduce and Its Applications, Challenges, and Architecture: a Comprehensive Review and Directions for Future Research

Khezr, Seyed Nima; Navimipour, Nima Jafari

doi:10.1007/s10723-017-9408-0

Published: 26 August 2017

Journal of Grid Computing, Volume 15, pages 295–321 (2017)

Abstract

The MapReduce framework has attracted considerable attention across many different areas. It is now a practical model for data-intensive applications because of its simple programming interface, high scalability, and fault tolerance. It can also process large volumes of data in distributed computing environments (DCE), and it has repeatedly proved applicable to a wide range of domains. However, despite the significance of the techniques, applications, and mechanisms of MapReduce, no comprehensive study of them is available. This paper therefore analyzes MapReduce applications and implementations in general, discusses the differences between the various implementations of MapReduce, and offers guidelines for planning future research.

Author information

Authors and Affiliations

Department of Computer Engineering, Tabriz Branch, Islamic Azad University, Tabriz, Iran
Seyed Nima Khezr & Nima Jafari Navimipour

Corresponding author

Correspondence to Nima Jafari Navimipour.

About this article

Cite this article

Khezr, S.N., Navimipour, N.J. MapReduce and Its Applications, Challenges, and Architecture: a Comprehensive Review and Directions for Future Research. J Grid Computing 15, 295–321 (2017). https://doi.org/10.1007/s10723-017-9408-0

Received: 28 October 2016
Accepted: 08 August 2017
Published: 26 August 2017
Issue Date: September 2017
DOI: https://doi.org/10.1007/s10723-017-9408-0
