fastWKendall: an efficient algorithm for weighted Kendall correlation

  • Original Paper
Computational Statistics

Abstract

Kendall correlation is a non-parametric measure of the strength of dependence between two sequences. Like the Pearson and Spearman correlations, it is widely applied in sequence similarity measurement and cluster analysis. We propose an efficient algorithm, fastWKendall, that computes the approximate weighted Kendall correlation in \(O(n\log n)\) time and \(O(n)\) space. This improves on the \(O(n^2)\) time required by the state-of-the-art method. The proposed method can be incorporated into conventional sequence similarity measurement and cluster analysis to make them much faster, which matters for very large datasets such as genome databases, streaming stock market data, and large public datasets on the Internet. The code, implemented in R, is publicly available.
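
To make the complexity claim concrete, the following Python sketch contrasts a brute-force \(O(n^2)\) weighted Kendall computation with an \(O(n\log n)\) one built on a binary indexed (Fenwick) tree. It is an illustration only, not the fastWKendall algorithm or its R implementation. It assumes that each pair \((i, j)\) carries the multiplicative weight \(w_i w_j\) and that neither sequence contains ties; the function names `weighted_kendall_naive` and `weighted_kendall_fast` are hypothetical.

```python
from itertools import combinations


def weighted_kendall_naive(x, y, w):
    """O(n^2) reference: signed sum of w[i]*w[j] over all pairs (no ties assumed)."""
    num = den = 0.0
    for i, j in combinations(range(len(x)), 2):
        pair_w = w[i] * w[j]  # assumed multiplicative pair weight
        concordant = (x[i] - x[j]) * (y[i] - y[j]) > 0
        num += pair_w if concordant else -pair_w
        den += pair_w
    return num / den


def weighted_kendall_fast(x, y, w):
    """O(n log n) variant using a Fenwick tree of weights indexed by y-rank."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])  # visit items by increasing x
    y_rank = {v: r for r, v in enumerate(sorted(y), start=1)}  # 1-based ranks
    tree = [0.0] * (n + 1)

    def add(pos, val):
        # Add val at position pos.
        while pos <= n:
            tree[pos] += val
            pos += pos & -pos

    def prefix(pos):
        # Sum of stored weights at positions 1..pos.
        total = 0.0
        while pos > 0:
            total += tree[pos]
            pos -= pos & -pos
        return total

    discordant = 0.0
    seen = 0.0  # total weight inserted so far
    for i in order:
        r = y_rank[y[i]]
        # Items already inserted have smaller x; those with larger y are discordant.
        discordant += w[i] * (seen - prefix(r))
        add(r, w[i])
        seen += w[i]

    # Total pair weight: sum over i<j of w[i]*w[j].
    total = (sum(w) ** 2 - sum(v * v for v in w)) / 2.0
    # Without ties every pair is concordant or discordant, so C - D = total - 2D.
    return (total - 2.0 * discordant) / total


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [2, 1, 4, 3, 5]
    w = [5.0, 4.0, 3.0, 2.0, 1.0]
    print(weighted_kendall_naive(x, y, w), weighted_kendall_fast(x, y, w))
```

Under these assumptions the fast variant needs one sort plus one tree update and one tree query per element, and it stores only the rank map and the tree, which gives \(O(n\log n)\) time and \(O(n)\) space. Other weighting schemes or tied values would need a different treatment.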

Notes

  1. From Corollary 1.

  2. From Inference 2.

  3. From Corollary 4.

  4. By Lemma 1.

Acknowledgements

This work is supported by the Chinese National Natural Science Foundation (Grant No. 61472082), the Natural Science Foundation of Fujian Province of China (Grant No. 2014J01220), the Scientific Research Innovation Team Construction Program of Fujian Normal University (Grant No. IRTL1702), and the US National Science Foundation (Grant No. IIS-1552860).

Author information

Corresponding author

Correspondence to Yue Jiang.

About this article

Cite this article

Lin, J., Adjeroh, D.A., Jiang, BH. et al. fastWKendall: an efficient algorithm for weighted Kendall correlation. Comput Stat 33, 1823–1845 (2018). https://doi.org/10.1007/s00180-017-0775-6

  • DOI: https://doi.org/10.1007/s00180-017-0775-6
