CCCa Framework - Classification System in Big Data Environment with Clustering and Cache Concepts

Subramanian, Sabitha Malli; Vijayalakshmi, S.; Venkataraman, Balaji; Venkumar, P.; Rathikaa Sre, R. M.

doi:10.1007/978-3-319-60618-7_5

Conference paper
First Online: 19 August 2017

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 614)

Abstract

Data volumes are growing at an astonishing pace. The increasing use of digital technology greatly accelerates the growth of data generated by individuals and organizations, producing big data. Big data environments generally use the MapReduce framework, which handles job execution in Hadoop. Spark is becoming a popular framework that runs on top of Hadoop and raises execution speed through its runtime environment. This paper proposes a novel CCCa framework that combines classification, clustering and cache techniques. The quality of the input data is improved by data cleansing. A similarity-based clustering technique partitions the job data into clusters. The classification phase predicts the behavior of the data: an artificial neural network (ANN) trained with back propagation classifies the big data. A cache substitution technique is recommended to avoid repeated job processing. The proposed framework consumes less memory and computational time and achieves higher accuracy in predicting the behavior of the dataset.
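
The cleansing, clustering and cache stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function and class names (`cleanse`, `cluster_by_similarity`, `JobCache`), the cosine similarity measure, the greedy clustering rule, the 0.9 threshold and the plain lookup cache without any substitution policy are all assumptions chosen for illustration, and the ANN classification phase is omitted.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List, Optional, Sequence

Record = List[float]


def cleanse(records: Sequence[Sequence[Optional[float]]]) -> List[Record]:
    """Assumed cleansing rule: drop any record that has a missing (None) value."""
    return [list(r) for r in records if all(v is not None for v in r)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_similarity(records: Sequence[Record],
                          threshold: float = 0.9) -> List[List[Record]]:
    """Greedy single-pass clustering (an assumed, simple scheme).

    A record joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters: List[List[Record]] = []
    for rec in records:
        for cluster in clusters:
            if cosine_similarity(rec, cluster[0]) >= threshold:
                cluster.append(rec)
                break
        else:  # no existing cluster was similar enough
            clusters.append([rec])
    return clusters


class JobCache:
    """Stores job results keyed by a hash of the job input, so that an
    identical input is processed only once."""

    def __init__(self, job: Callable[[List[Record]], Any]) -> None:
        self._job = job
        self._results: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(cluster: List[Record]) -> str:
        payload = json.dumps(cluster).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def run(self, cluster: List[Record]) -> Any:
        key = self._key(cluster)
        if key in self._results:
            self.hits += 1          # repeated job: reuse the stored result
            return self._results[key]
        self.misses += 1            # first time: actually run the job
        result = self._job(cluster)
        self._results[key] = result
        return result


if __name__ == "__main__":
    raw = [[1.0, 0.0], [0.9, 0.1], [None, 1.0], [0.0, 1.0]]
    clusters = cluster_by_similarity(cleanse(raw))
    cache = JobCache(len)           # stand-in job: count records in a cluster
    for cluster in clusters + clusters:
        cache.run(cluster)
    print(cache.hits, cache.misses)
```

In this sketch the second pass over the same clusters is served entirely from the cache, which is the effect the abstract attributes to its cache technique. A real deployment would also need an eviction or substitution policy to bound memory use.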

Author information

Authors and Affiliations

Bharathiar University & ZF Electronics TVS(for Sabitha), Coimbatore, India
Sabitha Malli Subramanian
ZF Electronics TVS, Madurai, India
Sabitha Malli Subramanian & Balaji Venkataraman
Thiagarajar College of Engineering, Madurai, India
S. Vijayalakshmi
Kalasalingam University, Krishnankoil, India
Balaji Venkataraman & P. Venkumar
Mepco Schlenk Engineering College, Sivakasi, India
R. M. Rathikaa Sre

Corresponding author

Correspondence to Sabitha Malli Subramanian.

Editor information

Editors and Affiliations

Scientific Network for Innovation and Research, Machine Intelligence Research Labs (MIR Labs), Auburn, Washington, USA
Ajith Abraham
VIT University, Vellore, Tamil Nadu, India
Aswani Kumar Cherukuri
School of Engineering, Polytechnic of Porto (ISEP/IPP), Porto, Portugal
Ana Maria Madureira
Universiti Teknikal Malaysia Melaka, Durian Tunggal, Malaysia
Azah Kamilah Muda

About this paper

Cite this paper

Subramanian, S.M., Vijayalakshmi, S., Venkataraman, B., Venkumar, P., Rathikaa Sre, R.M. (2018). CCCa Framework - Classification System in Big Data Environment with Clustering and Cache Concepts. In: Abraham, A., Cherukuri, A., Madureira, A., Muda, A. (eds) Proceedings of the Eighth International Conference on Soft Computing and Pattern Recognition (SoCPaR 2016). SoCPaR 2016. Advances in Intelligent Systems and Computing, vol 614. Springer, Cham. https://doi.org/10.1007/978-3-319-60618-7_5

DOI: https://doi.org/10.1007/978-3-319-60618-7_5
Published: 19 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60617-0
Online ISBN: 978-3-319-60618-7
eBook Packages: Engineering, Engineering (R0)