Hybrid Parallel Linguistic Fuzzy Rules with Canopy MapReduce for Big Data Classification in Cloud

International Journal of Fuzzy Systems

Abstract

With the growing availability of large amounts of information and the benefits of processing it, big data has gained considerable significance in recent years. Because the data must scale, big data applications are processed using the MapReduce programming model. However, applying rule-based models to such datasets is not straightforward, and big data is not classified efficiently. To overcome these problems, a parallel linguistic fuzzy rule with canopy MapReduce (LFR-CM) framework is introduced. The LFR-CM framework classifies big data using a canopy MapReduce function for information sharing in the cloud, with higher classification accuracy and lower time consumption. It comprises three steps for efficient classification in a cloud environment. First, it constructs the fuzzy knowledge base (KB) from the big data training set, where the linguistic fuzzy rules are built. The second step has three operations: (1) the map function is applied in parallel by every cloud user without transmitting any data to other cloud user nodes; (2) the data are processed through the map function across all additional cloud user nodes; and (3) the reduce function is deployed by each cloud user on the partitioned information. Finally, the data are classified with higher classification accuracy and lower time consumption. The LFR-CM framework is implemented and evaluated on Amazon EC2 cloud big data datasets and compared with another classification system that uses MapReduce in terms of runtime, classification time, classification accuracy and input/output cost. The results observed in the study indicate that the LFR-CM framework is more efficient than the existing methods.
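The map and reduce operations described above can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the abstract does not specify the membership functions, the T-norm, or the rule weight, so the triangular sets, the product T-norm, the confidence-style weight, and all names (`map_partition`, `merge`, `build_rule_base`) are assumptions chosen for illustration. The canopy step is not modeled here.

```python
from collections import defaultdict
from functools import reduce

LABELS = ("low", "medium", "high")  # assumed linguistic terms on [0, 1]
PEAKS = (0.0, 0.5, 1.0)             # assumed peaks of the triangular sets


def membership(x, peak, width=0.5):
    """Triangular membership degree of x in the fuzzy set centred at peak."""
    return max(0.0, 1.0 - abs(x - peak) / width)


def best_label(x):
    """Index and degree of the linguistic term that x matches best."""
    degrees = [membership(x, p) for p in PEAKS]
    idx = max(range(len(PEAKS)), key=degrees.__getitem__)
    return idx, degrees[idx]


def new_stats():
    """Empty statistics table: antecedent -> class label -> summed degree."""
    return defaultdict(lambda: defaultdict(float))


def map_partition(rows):
    """Map step: build rule statistics from one partition, using local data only.

    Each row is (features, class_label). The key is the antecedent (one
    linguistic term per feature); the value sums matching degrees per class.
    """
    stats = new_stats()
    for features, label in rows:
        antecedent, degree = [], 1.0
        for x in features:
            idx, mu = best_label(x)
            antecedent.append(LABELS[idx])
            degree *= mu  # product T-norm (an assumption)
        stats[tuple(antecedent)][label] += degree
    return stats


def merge(left, right):
    """Reduce step: add up per-class degrees for identical antecedents."""
    for antecedent, per_class in right.items():
        for label, degree in per_class.items():
            left[antecedent][label] += degree
    return left


def build_rule_base(partitions):
    """Map every partition, reduce the results, then weight each rule."""
    merged = reduce(merge, map(map_partition, partitions), new_stats())
    rules = {}
    for antecedent, per_class in merged.items():
        winner = max(per_class, key=per_class.get)
        total = sum(per_class.values())
        # Assumed weight: share of the matching degree held by the winning class.
        weight = per_class[winner] / total if total else 0.0
        rules[antecedent] = (winner, weight)
    return rules
```

Calling `build_rule_base` with a list of partitions, each a list of `(features, class_label)` pairs with features scaled to [0, 1], returns a dictionary that maps each antecedent to its class label and rule weight. In a real MapReduce deployment, `map_partition` would run on each node and only the small statistics tables would be exchanged, which matches the abstract's point that the map function runs without transmitting data to other nodes.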

Abbreviations

\(\text{CS}\): Cloud servers
\(\text{CU}\): Cloud users
\(R_{i}\): Fuzzy rules
\(P_{i}^{1}\): Antecedent fuzzy set
\(C_{i}\): Class label
\(\text{RW}_{i}\): Rule weight
\(a_{p}\): Membership function
\(C_{\text{mn}}\): Cloud master node
\(\text{MAP}\): Map function
\(\text{FM}_{i}\): Mapping threshold factor
\(\text{DS}_{i}\): Training set
\(\text{CT}\): Classification time
\(A_{i}\): Classification accuracy
\(\text{DCC}\): Data correctly classified
\(N\): Number of data
\(\text{KB}\): Knowledge base
\(n\): Number of instances
\(C_{\text{wn}}\): Cloud worker nodes
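
The symbols for classification accuracy, data correctly classified, and number of data suggest the usual accuracy ratio. The expression below is an assumption based on that standard definition; it is not taken from the article's equations, which are not part of this preview.

```latex
A_{i} = \frac{\text{DCC}}{N} \times 100\%
```

Under this assumed definition, 90 correctly classified items out of \(N = 100\) would give an accuracy of 90%.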

Author information

Corresponding author

Correspondence to V. Vennila.

About this article

Cite this article

Vennila, V., Kannan, A.R. Hybrid Parallel Linguistic Fuzzy Rules with Canopy MapReduce for Big Data Classification in Cloud. Int. J. Fuzzy Syst. 21, 809–822 (2019). https://doi.org/10.1007/s40815-018-0597-x

  • DOI: https://doi.org/10.1007/s40815-018-0597-x
