A fast procedure for the calculation of similarity coefficients in automatic classification

https://doi.org/10.1016/0306-4573(81)90026-1

Abstract

A fast algorithm is described for comparing the lists of terms representing documents in automatic classification experiments. The speed of the procedure arises from the fact that all of the non-zero-valued coefficients for a given document are identified together, using an inverted file to the terms in the document collection. The complexity and running time of the algorithm are compared with previously described procedures.
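
The idea in the abstract can be illustrated with a minimal sketch: for one document, walk the inverted-file postings of each of its terms and count shared terms per other document, so that only documents with a non-zero coefficient are ever touched. This is not the article's own code. The function names, the toy collection, and the choice of the Dice coefficient are all assumptions made for illustration, since the abstract does not name a particular coefficient.

```python
from collections import defaultdict


def build_inverted_file(docs):
    """Map each term to the ids of the documents that contain it."""
    inverted = defaultdict(list)
    for doc_id, terms in enumerate(docs):
        for term in set(terms):
            inverted[term].append(doc_id)
    return inverted


def nonzero_similarities(doc_id, docs, inverted):
    """Return {other_id: coefficient} for every document sharing a term with doc_id.

    The Dice coefficient is used here as an assumed example measure.
    """
    shared = defaultdict(int)
    # Follow the postings of each term in the chosen document; any document
    # that never appears in these postings has a coefficient of zero and is
    # never visited.
    for term in set(docs[doc_id]):
        for other in inverted[term]:
            if other != doc_id:
                shared[other] += 1
    size = len(set(docs[doc_id]))
    return {
        other: 2 * count / (size + len(set(docs[other])))
        for other, count in shared.items()
    }


# Hypothetical toy collection: each document is a list of index terms.
docs = [
    ["cluster", "document", "term"],
    ["document", "retrieval"],
    ["chemistry", "molecule"],
    ["term", "cluster", "similarity", "file"],
]
inverted = build_inverted_file(docs)
sims = nonzero_similarities(0, docs, inverted)
```

In this toy collection, document 2 shares no terms with document 0, so it never appears in `sims`. A pairwise comparison of term lists would have examined it anyway.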

Cited by (26)

  • Molecular similarity analysis (2013, Chemoinformatics for Drug Discovery)
  • High-speed rough clustering for very large document collections (2010, Journal of the American Society for Information Science and Technology)
  • Similarity methods in chemoinformatics (2009, Annual Review of Information Science and Technology)