
Towards a Classification of Binary Similarity Measures

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10632)

Abstract

Similarity measures for binary variables are used in many problems of machine learning, pattern recognition and classification. Dozens of such measures have been introduced, which raises the problem of their comparative analysis. One method used for such analysis is to cluster similarity measures based on the correlation between the similarity values that different measures assign to the same data. This paper proposes a method of comparative analysis based on a set-theoretic representation of the measures and a comparison of the algebraic properties of these representations. The results show a relationship between the results of clustering and the classification of measures by their properties. Because the results of clustering depend on the clustering method and on the data used to measure the correlation between measures, we conclude that the classification based on the proposed properties is more suitable for the comparative analysis of similarity measures.
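
The correlation-based comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual 2 × 2 table counts for two binary vectors viewed as sets X and Y (a = |X ∩ Y|, b = |X \ Y|, c = |Y \ X|, d = number of positions absent from both), uses three well-known measures (Jaccard, Dice, simple matching) chosen here only as examples, and assumes Pearson correlation on randomly generated data. All function names are hypothetical.

```python
import math
import random


def counts(x, y):
    """Return the 2 x 2 table counts (a, b, c, d) for two equal-length binary vectors."""
    a = sum(1 for p, q in zip(x, y) if p and q)        # present in both
    b = sum(1 for p, q in zip(x, y) if p and not q)    # present in x only
    c = sum(1 for p, q in zip(x, y) if not p and q)    # present in y only
    d = len(x) - a - b - c                             # absent from both
    return a, b, c, d


def jaccard(x, y):
    a, b, c, _ = counts(x, y)
    # Assumption: two empty sets get similarity 0.0 (a convention, not from the paper).
    return a / (a + b + c) if a + b + c else 0.0


def dice(x, y):
    a, b, c, _ = counts(x, y)
    return 2 * a / (2 * a + b + c) if a + b + c else 0.0


def simple_matching(x, y):
    a, b, c, d = counts(x, y)
    return (a + d) / (a + b + c + d)  # assumes non-empty vectors


def pearson(u, v):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    su = math.sqrt(sum((p - mu) ** 2 for p in u))
    sv = math.sqrt(sum((q - mv) ** 2 for q in v))
    return cov / (su * sv)


def correlation_table(measures, pairs):
    """Correlate the similarity values that each pair of measures assigns to the same data."""
    values = {name: [f(x, y) for x, y in pairs] for name, f in measures.items()}
    return {(m, n): pearson(values[m], values[n]) for m in values for n in values}


# Synthetic data: 200 pairs of random binary vectors of length 20 (an arbitrary choice).
rng = random.Random(0)
pairs = [
    ([rng.randint(0, 1) for _ in range(20)], [rng.randint(0, 1) for _ in range(20)])
    for _ in range(200)
]
measures = {"jaccard": jaccard, "dice": dice, "smc": simple_matching}
table = correlation_table(measures, pairs)
```

The resulting table could then be passed to any clustering routine. As the abstract notes, the outcome of that step depends on the data and on the clustering method chosen, which is the motivation given for classifying measures by their algebraic properties instead.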

Acknowledgements

The work is partially supported by the projects SIP 20171344, BEIFI of IPN and 283778 of CONACYT.

Author information

Corresponding author

Correspondence to Ildar Batyrshin.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Mejia, I.R., Batyrshin, I. (2018). Towards a Classification of Binary Similarity Measures. In: Castro, F., Miranda-Jiménez, S., González-Mendoza, M. (eds) Advances in Soft Computing. MICAI 2017. Lecture Notes in Computer Science, vol 10632. Springer, Cham. https://doi.org/10.1007/978-3-030-02837-4_27

  • DOI: https://doi.org/10.1007/978-3-030-02837-4_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02836-7

  • Online ISBN: 978-3-030-02837-4

  • eBook Packages: Computer Science, Computer Science (R0)
