Building a Classification Model Using Affinity Propagation

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11734)

Abstract

Conventional classification splits the data into a training set and a test set. Classifiers such as Naïve Bayes, Artificial Neural Networks, and Support Vector Machines each train on the whole training set. This study explores whether a condensed form of the training set can yield comparable classification accuracy. The technique uses a clustering algorithm to compress the data. For example, is it possible to represent 50 records as a single record? Can this single record train a classifier as well as all 50 records would? The study examines how data compression can be achieved through clustering, which concepts capture the qualities of a compressed dataset, and how to check information gain to ensure the integrity and quality of the compression algorithm. It explores compression through Affinity Propagation on categorical data, uses entropy within cluster sets to calculate integrity and quality, and tests the compressed dataset against the uncompressed dataset with a classifier based on Cosine Similarity.
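The two measurements named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy records, the one-hot encoding of categorical values, the hand-written cluster assignment (standing in for the output of a clustering step such as Affinity Propagation), and the helper names `one_hot`, `cosine`, `label_entropy`, and `classify` are all assumptions made for illustration.

```python
import math
from collections import Counter

def one_hot(record, vocab):
    """Encode a categorical record as a 0/1 vector over (column, value) pairs."""
    present = set(enumerate(record))
    return [1.0 if key in present else 0.0 for key in vocab]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def label_entropy(labels):
    """Shannon entropy, in bits, of the class labels inside one cluster."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def classify(record, exemplars, vocab):
    """Return the label of the exemplar most cosine-similar to the record."""
    vec = one_hot(record, vocab)
    best = max(exemplars, key=lambda ex: cosine(vec, one_hot(ex[0], vocab)))
    return best[1]

# Hypothetical toy data: (categorical attributes, class label).
records = [
    (("red", "round", "smooth"), "edible"),
    (("red", "oval", "smooth"), "edible"),
    (("brown", "flat", "rough"), "poisonous"),
    (("brown", "round", "rough"), "poisonous"),
]
vocab = sorted({(i, v) for rec, _ in records for i, v in enumerate(rec)})

# Assumed clustering result: exemplar index -> member indices.
# In practice this mapping would come from the clustering step.
clusters = {0: [0, 1], 2: [2, 3]}
exemplars = [records[i] for i in clusters]

# Low entropy means a cluster is dominated by one class, so replacing
# its members with the exemplar loses little label information.
entropies = {
    ex: label_entropy([records[i][1] for i in members])
    for ex, members in clusters.items()
}

# Classify an unseen record using only the exemplars (the compressed set).
prediction = classify(("red", "oval", "smooth"), exemplars, vocab)
print(entropies, prediction)
```

Here each cluster is reduced to its exemplar, so the classifier compares a new record against two vectors instead of four. A cluster whose labels are mixed would show a higher entropy, signalling that its exemplar is a poorer stand-in for its members.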

Author information

Correspondence to Christopher Klecker.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Klecker, C., Saad, A. (2019). Building a Classification Model Using Affinity Propagation. In: Pérez García, H., Sánchez González, L., Castejón Limas, M., Quintián Pardo, H., Corchado Rodríguez, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2019. Lecture Notes in Computer Science, vol 11734. Springer, Cham. https://doi.org/10.1007/978-3-030-29859-3_24

  • DOI: https://doi.org/10.1007/978-3-030-29859-3_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29858-6

  • Online ISBN: 978-3-030-29859-3

  • eBook Packages: Computer Science (R0)
