A comparative study of the parallel wavelet-based clustering algorithm on three-dimensional dataset

The Journal of Supercomputing

Abstract

Cluster analysis, a technique for grouping a set of objects into clusters of similar objects, is an integral part of data analysis and has received wide interest among data mining specialists. The parallel wavelet-based clustering algorithm has been shown to use discrete wavelet transforms to extract the approximation component of the input data, on which the objects of each cluster are detected based on the object connectivity property. However, this algorithm suffers from inefficient I/O operations and from performance degradation due to redundant data processing. We address these issues to improve the parallel algorithm’s efficiency, extend the algorithm by investigating two merging techniques (a merge-table approach and a priority-queue approach), and apply them to three-dimensional data. In this study, we compare two parallel WaveCluster algorithms and a parallel K-means algorithm to evaluate the effectiveness of the implemented algorithms.
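To make the two steps named in the abstract concrete, here is a minimal serial sketch in Python. It is not the authors' parallel implementation. It assumes the data have already been quantized into a three-dimensional grid of cell counts, uses an unnormalized one-level Haar approximation (the average of each 2x2x2 block), and treats cells as connected when they share a face. The function names `haar_approximation` and `label_clusters`, the threshold value, and the toy grid are all hypothetical choices made for illustration.

```python
from collections import deque


def haar_approximation(grid):
    """One-level approximation of a 3-D grid: average each 2x2x2 block.

    This is the Haar low-pass component up to a constant scale factor
    (an assumption made here for readability).
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * (nz // 2) for _ in range(ny // 2)] for _ in range(nx // 2)]
    for x in range(nx // 2):
        for y in range(ny // 2):
            for z in range(nz // 2):
                total = 0.0
                for dx in (0, 1):
                    for dy in (0, 1):
                        for dz in (0, 1):
                            total += grid[2 * x + dx][2 * y + dy][2 * z + dz]
                out[x][y][z] = total / 8.0
    return out


def label_clusters(approx, threshold):
    """Label face-connected cells whose value is at least `threshold`."""
    nx, ny, nz = len(approx), len(approx[0]), len(approx[0][0])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    labels = {}
    count = 0
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                if approx[x][y][z] < threshold or (x, y, z) in labels:
                    continue
                # Start a new cluster and flood-fill it breadth-first.
                count += 1
                labels[(x, y, z)] = count
                queue = deque([(x, y, z)])
                while queue:
                    cx, cy, cz = queue.popleft()
                    for dx, dy, dz in steps:
                        n = (cx + dx, cy + dy, cz + dz)
                        if not (0 <= n[0] < nx and 0 <= n[1] < ny and 0 <= n[2] < nz):
                            continue
                        if n in labels or approx[n[0]][n[1]][n[2]] < threshold:
                            continue
                        labels[n] = count
                        queue.append(n)
    return labels, count


# Toy input: a 4x4x4 grid with two dense 2x2x2 blocks in opposite corners.
grid = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
for x in range(2):
    for y in range(2):
        for z in range(2):
            grid[x][y][z] = 8.0
            grid[x + 2][y + 2][z + 2] = 8.0

approx = haar_approximation(grid)
labels, count = label_clusters(approx, threshold=1.0)
```

In this toy case the approximation grid is 2x2x2, and the two dense cells sit at opposite corners, so they share no face and receive different labels. A parallel version would additionally need to reconcile labels across partition boundaries, which is where merging techniques such as those mentioned in the abstract come in; that step is not sketched here.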

Acknowledgments

Compute, storage, and other resources from the Division of Research Computing in the Office of Research and Graduate Studies at Utah State University are gratefully acknowledged. We would also like to thank Dr. Wei-keng Liao for providing us with the source code of the parallel K-means algorithm.

Author information

Corresponding author

Correspondence to Ahmet Artu Yıldırım.

About this article

Cite this article

Yıldırım, A.A., Watson, D. A comparative study of the parallel wavelet-based clustering algorithm on three-dimensional dataset. J Supercomput 71, 2365–2380 (2015). https://doi.org/10.1007/s11227-015-1385-0

  • DOI: https://doi.org/10.1007/s11227-015-1385-0
