Abstract
We discuss a generalization of the fuzzy (weighted) k-means clustering procedure and point out its relationship to data aggregation in spaces equipped with arbitrary dissimilarity measures. In the proposed setting, a data set is partitioned according to the proximity of its points to generic distance-based penalty minimizers. Moreover, we introduce a new data classification algorithm that resembles the k-nearest neighbors scheme but is less demanding in terms of computation and memory. Rich examples in complex data domains illustrate the usability of the methods and of aggregation theory in general.
Acknowledgments
This study was supported by the National Science Center, Poland, research project 2014/13/D/HS4/01700.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Cena, A., Gagolewski, M. (2016). Fuzzy K-Minpen Clustering and K-nearest-minpen Classification Procedures Incorporating Generic Distance-Based Penalty Minimizers. In: Carvalho, J., Lesot, MJ., Kaymak, U., Vieira, S., Bouchon-Meunier, B., Yager, R. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2016. Communications in Computer and Information Science, vol 611. Springer, Cham. https://doi.org/10.1007/978-3-319-40581-0_36
DOI: https://doi.org/10.1007/978-3-319-40581-0_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40580-3
Online ISBN: 978-3-319-40581-0
eBook Packages: Computer Science (R0)