Access the cluster tendency by visual methods for robust speech clustering

  • Original Article
International Journal of System Assurance Engineering and Management

Abstract

Identifying the cues that distinguish speech segments is an indispensable task in speaker clustering. Existing techniques perform speech clustering without any prior knowledge of cluster tendency (CT). We investigated several techniques for estimating cluster tendency in advance and found visual assessment of cluster tendency (VAT) to be a reasonable choice. Speech clustering poses three important problems: modelling the speech data, assessing cluster tendency, and clustering the speech effectively. Modelling defines the shape of a speech segment from the characteristics of the speaker’s voice, and is therefore also useful for speech recognition. The Gaussian mixture model (GMM) is a good choice for obtaining a precise model of speech data. Cluster tendency here means determining the number of speakers (that is, the number of clusters) in the speech. The quality of speech clustering depends on both the modelling and the prior cluster tendency. Classical algorithms, such as k-means and minimum spanning tree (MST)-based clustering, are combined with VAT to obtain effective clustering results together with a prior cluster tendency. We use linear subspace learning to represent the speech segments (or speech utterances) in a projected space of the high-dimensional data, and apply various linear subspace learning techniques to improve the clustering results. The proposed hybrid approaches (k-means-CT and MST–CT-based clustering) involve computationally expensive steps. For this reason, we propose a further method, direct visualized clustering, which derives explicit speaker clustering results directly from VAT without using either k-means or MST-based clustering. We evaluated the proposed methods on the TSP speech datasets and carried out a comparative study to demonstrate their effectiveness.
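
The abstract does not spell out how VAT works, so the following is only a minimal sketch of the commonly described VAT reordering: objects are ordered in a Prim-like fashion over a pairwise dissimilarity matrix, so that well-separated groups appear as dark blocks along the diagonal of the reordered matrix. The function name `vat_order`, the toy 2-D points, and the Euclidean dissimilarity are illustrative assumptions, not the authors' features, distance, or implementation.

```python
import numpy as np

def vat_order(dissim):
    """Return a VAT-style ordering of a symmetric dissimilarity matrix."""
    d = np.asarray(dissim, dtype=float)
    n = d.shape[0]
    # Start from one endpoint of the largest pairwise dissimilarity.
    start = int(np.unravel_index(np.argmax(d), d.shape)[0])
    order = [start]
    remaining = np.ones(n, dtype=bool)
    remaining[start] = False
    # min_dist[j] = smallest dissimilarity from j to any already ordered object.
    min_dist = d[start].copy()
    for _ in range(n - 1):
        # Pick the unordered object closest to the ordered set (Prim-like step).
        candidates = np.where(remaining, min_dist, np.inf)
        nxt = int(np.argmin(candidates))
        order.append(nxt)
        remaining[nxt] = False
        min_dist = np.minimum(min_dist, d[nxt])
    return np.array(order)

# Toy data (assumption): two well-separated groups, deliberately interleaved.
points = np.array([[0, 0], [10, 10], [0, 1], [10, 11], [1, 0], [11, 10]], dtype=float)
labels = np.array([0, 1, 0, 1, 0, 1])

# Pairwise Euclidean dissimilarities between all points.
dissim = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

order = vat_order(dissim)
# Reordered dissimilarity matrix; displaying it as an image gives the VAT picture.
reordered = dissim[np.ix_(order, order)]
# Number of group changes along the ordering; one change means two contiguous blocks.
label_changes = int(np.sum(labels[order][1:] != labels[order][:-1]))
```

Rendering `reordered` as a grayscale image (for example with matplotlib's `imshow`) would show one dark diagonal block per group, and counting those blocks is the visual estimate of the number of clusters. For speech data the dissimilarity matrix would be built from speaker models rather than raw 2-D points.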

Notes

  1. http://www.mmsp.ece.mcgill.ca/Documents/Data/.

Author information

Corresponding author

Correspondence to T. Suneetha Rani.

About this article

Cite this article

Suneetha Rani, T., Krishna Prasad, M.H.M. Access the cluster tendency by visual methods for robust speech clustering. Int J Syst Assur Eng Manag 8 (Suppl 1), 465–477 (2017). https://doi.org/10.1007/s13198-015-0393-z

  • DOI: https://doi.org/10.1007/s13198-015-0393-z
