A new preference disaggregation method for clustering problem: DISclustering

Esmaelian, Majid; Shahmoradi, Hadi; Nemati, Fateme

doi:10.1007/s00500-019-04210-0

Methodologies and Application
Published: 08 July 2019

Volume 24, pages 4483–4503, (2020)

Soft Computing


Abstract

Clustering, a well-known technique in data analysis and data mining, seeks valuable patterns in datasets by partitioning a set of alternatives into logical groups called clusters. The partitioning relies on predefined attributes, so that the alternatives within a cluster are more similar to each other than to those in other clusters. Conventional methods usually define similarity through a distance-based measure. This study instead proposes a new multi-attribute preference disaggregation method, DISclustering, which introduces a new measure of cluster similarity called global utility. In DISclustering, the global utility of each alternative is computed by a feed-forward neural network whose parameters are determined by the simulated annealing (SA) algorithm. Each alternative is assigned to a cluster by comparing its global utility with the cluster boundaries, called utility thresholds, with the aim of minimizing the intra-cluster distances (ICD). To this end, all utility thresholds are estimated with the particle swarm optimization (PSO) algorithm. The proposed method is compared with 18 clustering algorithms on 14 real datasets in terms of F-measure and objective function values (ICD values computed using intra-cluster or Gower distances). The experimental results and statistical hypothesis tests indicate that DISclustering significantly improves clustering results on the F-measure criterion, outperforming nearly 13 of the 18 compared algorithms. Note that DISclustering calculates cluster centroids differently from the other algorithms, so its ICD values are less suitable for a fair comparison.
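
The assignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the global utilities are taken as given inputs (the neural network and the SA and PSO searches are omitted), the function names `assign_clusters` and `intra_cluster_distance` are hypothetical, a utility equal to a threshold is assumed to fall into the upper cluster, and the centroid is assumed to be the plain attribute mean, which the abstract notes differs from how DISclustering computes it.

```python
import math
from bisect import bisect_right
from collections import defaultdict


def assign_clusters(utilities, thresholds):
    """Map each global utility to the index of the interval it falls in.

    Assumption: with sorted thresholds t_1 < ... < t_{k-1}, cluster 0 holds
    utilities below t_1, cluster 1 those in [t_1, t_2), and so on.
    """
    cuts = sorted(thresholds)
    return [bisect_right(cuts, u) for u in utilities]


def intra_cluster_distance(points, labels):
    """Sum of Euclidean distances from each point to its cluster mean.

    Assumption: the centroid is the attribute-wise mean, used here only
    for illustration.
    """
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)

    total = 0.0
    for members in groups.values():
        # Attribute-wise mean of the cluster members.
        centroid = [sum(column) / len(members) for column in zip(*members)]
        total += sum(math.dist(point, centroid) for point in members)
    return total


# Illustrative data only: three alternatives, two utility thresholds.
utilities = [0.1, 0.4, 0.8]
labels = assign_clusters(utilities, thresholds=[0.3, 0.6])
icd = intra_cluster_distance([[0, 0], [0, 2], [5, 5]], [0, 0, 1])
```

In a threshold search such as the PSO step mentioned above, a function like `intra_cluster_distance` would serve as the objective evaluated for each candidate set of thresholds.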



Acknowledgements

The authors would like to thank referees for their helpful comments.

Author information

Authors and Affiliations

Department of Management, University of Isfahan, Hezarjerib St., Azadi Square, Isfahan, 7344181746, Iran
Majid Esmaelian & Hadi Shahmoradi
Department of Artificial Intelligence, Faculty of Computer Engineering, University of Isfahan, Hezarjerib St., Azadi Square, Isfahan, 7344181746, Iran
Fateme Nemati

Corresponding author

Correspondence to Majid Esmaelian.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interests regarding the publication of this paper.

Ethical approval

This article does not contain any studies with human participants or animals performed by the author.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Communicated by V. Loia.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Esmaelian, M., Shahmoradi, H. & Nemati, F. A new preference disaggregation method for clustering problem: DISclustering. Soft Comput 24, 4483–4503 (2020). https://doi.org/10.1007/s00500-019-04210-0

Published: 08 July 2019
Issue Date: March 2020
DOI: https://doi.org/10.1007/s00500-019-04210-0
