Abstract
Effectively measuring the similarity or dissimilarity of two vague concepts is a key step in reasoning and computing with vague concepts. In this paper, we define semantic distances between data instances and vague concepts based on modeling vagueness in a framework called label semantics. We also propose two clustering methods based on these semantic distances, which can cluster data instances and vague concepts simultaneously. To evaluate our approach, we conduct several experimental studies on three datasets: Corel images and labels, Reuters-21578, and TDT2. The results illustrate that the proposed distances can effectively evaluate semantic similarities between data instances and vague concepts.
Notes
In fact, the third equation of this theorem does not require the selection function assumption; it is a general property of appropriateness measures.
Reuters-21578 is available at http://www.daviddlewis.com/resources/testcollections/reuters21578/.
TDT2 is available at http://www.itl.nist.gov/iad/mig/publications/proceedings/darpa99/html/tdt110/tdt110.htm.
LibSVM is available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
References
Bharti K, Singh P (2016) Chaotic gradient artificial bee colony for text clustering. Soft Comput 20(3):1113–1126
Bishop M (2006) Pattern recognition and machine learning. Springer, Berlin
Cambria E (2012) Sentic computing for social media marketing. Multimed Tools Appl 59(2):557–577
Cambria E, Hussain A (2012) Sentic computing: techniques, tools, and applications. Springer, Berlin
Carneiro G, Chan A, Moreno P, Vasconcelos N (2006) Supervised learning of semantic classes for image annotation and retrieval. IEEE Trans PAMI 29(3):394–410
Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):1–27
Chen Y, Garcia E, Gupta M, Rahimi A, Cazzanti L (2009) Similarity-based classification: concepts and algorithms. J Mach Learn Res 10(2):747–776
Cortes C, Vapnik V (1995) Support vector networks. Mach Learn 20(3):273–297
Crosscombe M, Lawry J (2016) A model of multi-agent consensus for vague and uncertain beliefs. Adapt Behav 24(4):249–260
Daniel R, Lawry J, Rico-Ramirez A, Clukie D (2007) Classification of weather radar images using linguistic decision trees with conditional labelling. In: FUZZ-IEEE, pp 1–6
David A (2005) Statistical models: theory and practice. Cambridge University Press, Cambridge
Deng C, He X, Han J (2005) Document clustering using locality preserving indexing. IEEE Trans Knowl Data Eng 17(12):1624–1637
Figueiredo F, Rocha L, Couto T, Salles T, Goncalves M (2011) Word co-occurrence features for text classification. Inf Syst 36(5):843–858
Francisco A, Martinez J, Aguilar C, Roldon C (2016) Estimation of a fuzzy regression model using fuzzy distances. IEEE Trans Fuzzy Syst 24(2):344–359
Goldberger J, Hinton G, Roweis S, Salakhutdinov R (2005) Neighbourhood components analysis. In: NIPS, pp 513–520
Gu B, Sheng VS (2016) A robust regularization path algorithm for \(v\)-support vector classification. IEEE Trans Neural Netw Learn Syst 1:1–8
Gu B, Sheng VS, Tay KY, Romano W, Li S (2015a) Incremental support vector learning for ordinal regression. IEEE Trans Neural Netw Learn Syst 26(7):1403–1416
Gu B, Sheng VS, Wang Z, Ho D, Osman S, Li S (2015b) Incremental learning for \(v\)-support vector regression. Neural Netw 67:140–150
Gu B, Sun X, Sheng VS (2016) Structural minimax probability machine. IEEE Trans Neural Netw Learn Syst 28(7):1646–1656
Drucker H, Burges C (1997) Support vector regression machines. In: NIPS, pp 155–161
Guo H, Wang X, Wang L (2016) Delphi method for estimating membership function of uncertain set. J Uncertain Anal Appl 4(1):1–17
He H, Lawry J (2014) The linguistic attribute hierarchy and its optimisation for classification. Soft Comput 18(10):1967–1984
Janis V, Montes S (2007) Distance between fuzzy sets as a fuzzy quantity. Acta Univ Matthiae Belii Ser Math 14:41–49
Jolliffe I (2005) Principal component analysis. Wiley Online Library, Hoboken
Karaboga D (2005) An idea based on honey bee swarm for numerical optimization. In: Technical report, Engineering faculty, Computer Engineering Department. Erciyes University Press, Erciyes
Lavrenko V, Manmatha R, Jeon J (2004) A model for learning the semantics of pictures. In: NIPS
Lawry J (2006) Modelling and reasoning with vague concepts. Springer, Berlin
Lawry J (2014) Probability, fuzziness and borderline cases. Int J Approx Reason 55(5):1164–1184
Lawry J, Tang Y (2009) Uncertainty modelling for vague concepts: a prototype theory approach. Artif Intell 173:1539–1558
Lewis M, Lawry J (2016) Hierarchical conceptual spaces for concept combination. Artif Intell 237:204–227
Li D (2004) Some measures of dissimilarity in intuitionistic fuzzy structures. J Comput Syst Sci 8:115–122
Hyung LK, Song KLYS (1994) Similarity measure between fuzzy sets and between elements. Fuzzy Sets Syst 62:291–293
Lovasz L, Plummer M (1986) Matching theory. Budapest
MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of 5th Berkeley symposium on mathematical statistics and probability, pp 281–297
McCulloch J, Wagner C, Aickelin U (2013) Measuring the directional distance between fuzzy sets. In: UKCI 2013, the 13th annual workshop on computational intelligence, Surrey University, pp 38–45
Ng A, Jordan M, Weiss Y (2009) On spectral clustering: analysis and an algorithm. J Mach Learn Res 10(2):747–776
Nieradka G, Butkiewicz B (2007) A method for automatic membership function estimation based on fuzzy measures. Foundations of fuzzy logic and soft computing. Springer, Berlin, Heidelberg, pp 451–460
P Groenen UK, Rosmalen JV (2007) Fuzzy clustering with Minkowski distance function. In: Advances in fuzzy clustering and its applications, pp 53–68
Pappis C, Karacapilidis N (1993) A comparative assessment of measures of similarity of fuzzy values. Fuzzy Sets Syst 56:171–174
Qin Z, Lawry J (2005) Decision tree learning with fuzzy labels. Inf Sci 172(1–2):91–129
Qin Z, Lawry J (2008) LFOIL: Linguistic rule induction in the label semantic framework. Fuzzy Sets Syst 159(4):435–448
Qin Z, Tang Y (2014) Uncertainty modeling for data mining: a label semantics approach. Springer, Berlin
Rosch E (1973) Natural categories. Cogn Psychol 4:328–350
Rosch E (1975) Cognitive representation of semantic categories. J Exp Psychol 104:192–233
Rosmalen JV (2006) Fuzzy clustering with Minkowski distance. In: Econometric, pp 53–68
Roweis S, Saul L (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326
Medasani S, Kim J, Krishnapuram R (1998) An overview of membership function generation techniques for pattern recognition. Int J Approx Reason 19:391–417
Scott J (2012) Illusions in regression analysis. Int J Forecast 28(3):689
Smola A, Scholkopf B (2004) A tutorial on support vector regression. Stat Comput 14(3):199–222
Szmidt E, Kacprzyk J (2000) Distances between intuitionistic fuzzy sets. Fuzzy Sets Syst 114:505–518
Turnbull O, Lawry J, Lowenberg M, Richards A (2016) A cloned linguistic decision tree controller for real-time path planning in hostile environments. Fuzzy Sets Syst 293:1–29
Srivastava V, Tripathi BK, Pathak VK (2011) An evolutionary fuzzy clustering with Minkowski distances. In: International conference on neural information processing, pp 753–760
Vapnik V (1998) Statistical learning theory. Wiley, Hoboken
Victor S, Semyon V (2006) A theoretical introduction to numerical analysis. CRC Press, Boca Raton
Weinberger K, Saul L (2009) Distance metric learning for large margin nearest neighbor classification. J Mach Learn Res 10:207–244
Wu H, Luk R, Wong K, Kwok K (2008) Interpreting tf-idf term weights as making relevance decisions. ACM Trans Inf Syst 26(3):55–59
Xiaohui C, Potok T (2005) Document clustering analysis based on hybrid PSO+ k-means algorithm. J Comput Sci Special issue (April 15):27–33
Xing EP, Jordan MI, Russell SJ, Ng AY (2002) Distance metric learning with application to clustering with side-information. In: NIPS, pp 521–528
Zadeh L (1965) Fuzzy sets. Inf Control 8(3):335–353
Zadeh L (1975) The concept of linguistic variable and its application to approximate reasoning part 2. Inf Sci 4:301–357
Zadeh L (1996) Fuzzy logic = computing with words. IEEE Trans Fuzzy Syst 4:103–111
Zhang W, Qin Z, Tao W (2012) Semi-automatic image annotation using sparse coding. In: ICMLC
Zhang Y, Schneider J (2012) Maximum margin output coding. In: ICML
Zheng Y, Jeon B, Xu D, Wu QJ, Zhang H (2015) Image segmentation by generalized hierarchical fuzzy c-means algorithm. Neural Netw 28(2):961–973
Zhu G, Kwong S (2010) Gbest-guided artificial bee colony algorithm for numerical function optimization. Appl Math Comput 217(7):3166–3173
Acknowledgements
This work is supported by the Natural Science Foundation of China (Grant Nos. 61572162 and 61272188) and the Zhejiang Provincial Key Science and Technology Project Foundation (No. 2017C01010).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Zhang, W., Hu, H., Hu, H. et al. Semantic distance between vague concepts in a framework of modeling with words. Soft Comput 23, 3347–3364 (2019). https://doi.org/10.1007/s00500-017-2992-x
DOI: https://doi.org/10.1007/s00500-017-2992-x