Abstract
This paper presents an implementation of a classification algorithm based on kernel-based fuzzy C-means clustering. The algorithm is implemented in the Apache Spark environment and is built on Resilient Distributed Datasets (RDDs) and on RDD actions and transformations. This design allows the data to be manipulated in parallel.
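The abstract does not give the update formulas, so the following is only a minimal sketch of one common formulation of kernel-based fuzzy C-means with a Gaussian kernel, not the authors' implementation. It uses plain Python `map` and `functools.reduce` as stand-ins for RDD transformations and actions; no Spark API is called. The function names, the fuzzifier `m`, the kernel width `sigma`, and the guard `eps` are assumptions introduced for illustration.

```python
import math
from functools import reduce

def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2) (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (sigma ** 2))

def memberships(x, centroids, m=2.0, sigma=1.0, eps=1e-12):
    """Fuzzy membership of point x in each cluster, normalised to sum to 1."""
    # Weight (1 - K)^(-1/(m-1)); eps guards against a point equal to a centroid.
    weights = [
        max(1.0 - gaussian_kernel(x, v, sigma), eps) ** (-1.0 / (m - 1.0))
        for v in centroids
    ]
    total = sum(weights)
    return [w / total for w in weights]

def partial_sums(x, centroids, m, sigma):
    """'Map' step: one point's contribution to each centroid's update."""
    u = memberships(x, centroids, m, sigma)
    contributions = []
    for u_i, v in zip(u, centroids):
        coeff = (u_i ** m) * gaussian_kernel(x, v, sigma)
        contributions.append(([coeff * a for a in x], coeff))
    return contributions

def combine(a, b):
    """'Reduce' step: add two sets of per-centroid partial sums."""
    return [
        ([p + q for p, q in zip(num_a, num_b)], den_a + den_b)
        for (num_a, den_a), (num_b, den_b) in zip(a, b)
    ]

def update_centroids(points, centroids, m=2.0, sigma=1.0):
    """One iteration: map over points, reduce, then divide."""
    mapped = map(lambda x: partial_sums(x, centroids, m, sigma), points)
    totals = reduce(combine, mapped)
    # Keep the old centroid if its total weight underflows to zero.
    return [
        [n / den for n in num] if den > 0 else list(v)
        for (num, den), v in zip(totals, centroids)
    ]

def kfcm(points, centroids, iterations=20, m=2.0, sigma=1.0):
    """Run a fixed number of centroid updates and return the centroids."""
    for _ in range(iterations):
        centroids = update_centroids(points, centroids, m, sigma)
    return centroids
```

Because each point's contribution is computed independently and the contributions are merged with an associative addition, the same structure could in principle be expressed as a map followed by a reduce over a distributed dataset.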
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Jȩdrzejowicz, J., Jȩdrzejowicz, P., Wierzbowska, I. (2016). Apache Spark Implementation of the Distance-Based Kernel-Based Fuzzy C-Means Clustering Classifier. In: Czarnowski, I., Caballero, A., Howlett, R., Jain, L. (eds) Intelligent Decision Technologies 2016. IDT 2016. Smart Innovation, Systems and Technologies, vol 56. Springer, Cham. https://doi.org/10.1007/978-3-319-39630-9_26
DOI: https://doi.org/10.1007/978-3-319-39630-9_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39629-3
Online ISBN: 978-3-319-39630-9
eBook Packages: Engineering, Engineering (R0)