Semantic distance between vague concepts in a framework of modeling with words

  • Methodologies and Application
Soft Computing

Abstract

Effectively measuring the similarity or dissimilarity of two vague concepts is a key step in reasoning and computing with vague concepts. In this paper, we define semantic distances between data instances and vague concepts based on modeling vagueness in a framework called label semantics. We also propose two clustering methods based on these semantic distances, which can cluster data instances and vague concepts simultaneously. To evaluate our approach, we conduct several experimental studies on three datasets: Corel images and labels, Reuters-21578, and TDT2. The results illustrate that the proposed distances can effectively evaluate semantic similarities between data instances and vague concepts.

Notes

  1. Actually, the third equation of this theorem does not need the selection function assumption; it is a general property of appropriateness measures.

  2. Reuters-21578 is available at http://www.daviddlewis.com/resources/testcollections/reuters21578/.

  3. TDT2 is available at http://www.itl.nist.gov/iad/mig/publications/proceedings/darpa99/html/tdt110/tdt110.htm.

  4. LibSVM is available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

  5. http://www.lextek.com/manuals/onix/stopwords1.html.

Acknowledgements

This work is supported by the Natural Science Foundation of China (Grant Nos. 61572162 and 61272188) and the Zhejiang Provincial Key Science and Technology Project Foundation (No. 2017C01010).

Author information

Corresponding authors

Correspondence to Hua Hu or Haiyang Hu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Zhang, W., Hu, H., Hu, H. et al. Semantic distance between vague concepts in a framework of modeling with words. Soft Comput 23, 3347–3364 (2019). https://doi.org/10.1007/s00500-017-2992-x

  • DOI: https://doi.org/10.1007/s00500-017-2992-x
