
Apache Spark Implementation of the Distance-Based Kernel-Based Fuzzy C-Means Clustering Classifier

  • Conference paper
Intelligent Decision Technologies 2016 (IDT 2016)

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 56)


Abstract

The paper presents an implementation of a classification algorithm based on kernel-based fuzzy C-means clustering. The algorithm is implemented in the Apache Spark environment and is built on Resilient Distributed Datasets (RDDs) together with RDD actions and transformations. This design allows the data to be processed in parallel.
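As a rough illustration of how such a computation might be organised, the sketch below runs a single kernel-based fuzzy C-means update as a map step followed by a reduce step, mirroring the shape of an RDD transformation and an RDD action. It is plain Python rather than Spark code, and it is not the authors' implementation: the Gaussian kernel, the fuzzifier `m`, the width `sigma`, the update formulas, and all function names are assumptions chosen for the example. Only the clustering step is shown, not the classifier built on top of it.

```python
import math
from functools import reduce


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2) (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / sigma ** 2)


def memberships(x, centroids, m=2.0, sigma=1.0, eps=1e-12):
    """Fuzzy membership of point x in each cluster; the values sum to 1."""
    weights = [
        (1.0 - gaussian_kernel(x, v, sigma) + eps) ** (-1.0 / (m - 1.0))
        for v in centroids
    ]
    total = sum(weights)
    return [w / total for w in weights]


def partial_sums(x, centroids, m=2.0, sigma=1.0):
    """Map step: one point's contribution to every centroid update.

    Returns, per cluster, a pair (weighted point, weight).
    """
    u = memberships(x, centroids, m, sigma)
    contributions = []
    for u_i, v in zip(u, centroids):
        w = (u_i ** m) * gaussian_kernel(x, v, sigma)
        contributions.append(([w * a for a in x], w))
    return contributions


def merge(left, right):
    """Reduce step: add two per-cluster contribution lists element-wise."""
    return [
        ([a + b for a, b in zip(l_num, r_num)], l_den + r_den)
        for (l_num, l_den), (r_num, r_den) in zip(left, right)
    ]


def kfcm_step(points, centroids, m=2.0, sigma=1.0):
    """One centroid update. With an RDD of points, the map and reduce
    calls below would correspond to rdd.map(...) and rdd.reduce(...)."""
    mapped = map(lambda x: partial_sums(x, centroids, m, sigma), points)
    totals = reduce(merge, mapped)
    return [[a / den for a in num] for num, den in totals]


# Toy data (made up for illustration): two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (3.0, 3.1), (2.9, 3.0)]
centroids = [[0.5, 0.5], [2.5, 2.5]]
for _ in range(10):
    centroids = kfcm_step(points, centroids)
```

Because `merge` is associative and commutative, the per-point contributions can be combined in any order, which is the property a distributed reduce relies on when the points are spread across partitions.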



Author information


Corresponding author

Correspondence to Izabela Wierzbowska.


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Jȩdrzejowicz, J., Jȩdrzejowicz, P., Wierzbowska, I. (2016). Apache Spark Implementation of the Distance-Based Kernel-Based Fuzzy C-Means Clustering Classifier. In: Czarnowski, I., Caballero, A., Howlett, R., Jain, L. (eds) Intelligent Decision Technologies 2016. IDT 2016. Smart Innovation, Systems and Technologies, vol 56. Springer, Cham. https://doi.org/10.1007/978-3-319-39630-9_26


  • DOI: https://doi.org/10.1007/978-3-319-39630-9_26


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-39629-3

  • Online ISBN: 978-3-319-39630-9

  • eBook Packages: Engineering, Engineering (R0)
