An enhanced fuzzy c-means algorithm for audio segmentation and classification

Multimedia Tools and Applications

Abstract

Automated audio segmentation and classification play important roles in multimedia content analysis. In this paper, we propose an enhanced approach to audio segmentation and classification based on audio content analysis, called the correlation intensive fuzzy c-means (CIFCM) algorithm. Whereas conventional methods consider the attributes of only the current frame or segment, the proposed CIFCM algorithm efficiently incorporates the influence of neighboring frames or segments in the audio stream. With this method, audio cuts can be detected efficiently even when the signal contains audio effects such as fade-in, fade-out, and cross-fade. A number of audio features are analyzed in this paper to explore the differences between various types of audio data. The proposed CIFCM algorithm detects the boundaries between different kinds of sounds and classifies the sounds into clusters such as silence, speech, music, speech with music, and speech with noise. Our experimental results indicate that the proposed method outperforms the state-of-the-art FCM approach in both audio segmentation and classification.
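The abstract does not give the CIFCM update equations, so the following is only a minimal sketch of the general idea it describes: standard fuzzy c-means membership and center updates, followed by a step that blends each frame's memberships with those of its neighbors. The moving-average smoothing, the `radius` parameter, the toy one-dimensional features, and all function names are assumptions made for illustration and are not the authors' formulation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def update_memberships(frames, centers, m=2.0, eps=1e-12):
    """Standard FCM update: u[i][k] is proportional to D(i, k)^(-1/(m-1)),
    where D is the squared distance from frame i to center k."""
    power = 1.0 / (m - 1.0)
    memberships = []
    for x in frames:
        inv = [(1.0 / (sq_dist(x, v) + eps)) ** power for v in centers]
        total = sum(inv)
        memberships.append([w / total for w in inv])
    return memberships


def smooth_over_neighbors(memberships, radius=1):
    """ASSUMPTION (not from the paper): average each frame's memberships
    with those of frames within `radius` positions in the stream."""
    n, c = len(memberships), len(memberships[0])
    smoothed = []
    for i in range(n):
        window = memberships[max(0, i - radius):min(n, i + radius + 1)]
        smoothed.append([sum(u[k] for u in window) / len(window)
                         for k in range(c)])
    return smoothed


def update_centers(frames, memberships, m=2.0):
    """Standard FCM center update: mean of frames weighted by u^m."""
    dim = len(frames[0])
    centers = []
    for k in range(len(memberships[0])):
        weights = [u[k] ** m for u in memberships]
        total = sum(weights)
        centers.append([sum(w * x[d] for w, x in zip(weights, frames)) / total
                        for d in range(dim)])
    return centers


def neighbor_aware_fcm(frames, centers, m=2.0, radius=1, iterations=20):
    """Alternate membership (with smoothing) and center updates."""
    for _ in range(iterations):
        u = smooth_over_neighbors(update_memberships(frames, centers, m), radius)
        centers = update_centers(frames, u, m)
    u = smooth_over_neighbors(update_memberships(frames, centers, m), radius)
    return u, centers


def find_cuts(memberships):
    """Indices where the most likely cluster changes between frames."""
    labels = [max(range(len(u)), key=u.__getitem__) for u in memberships]
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Toy stream: one made-up energy-like feature per frame, quiet then loud.
frames = [[0.10], [0.12], [0.09], [0.11], [0.90], [0.95], [0.88], [0.92]]
memberships, centers = neighbor_aware_fcm(frames, centers=[[0.0], [1.0]])
cuts = find_cuts(memberships)
print(cuts)
```

Because each smoothed row is an average of rows that sum to one, the memberships stay normalized, which keeps the center update well defined. A real system would replace the toy feature with the audio features the paper analyzes and would use whatever neighbor weighting the full text specifies.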

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2011-0017941).

Author information

Corresponding author

Correspondence to Jong-Myon Kim.

About this article

Cite this article

Haque, M.A., Kim, JM. An enhanced fuzzy c-means algorithm for audio segmentation and classification. Multimed Tools Appl 63, 485–500 (2013). https://doi.org/10.1007/s11042-011-0921-z

  • DOI: https://doi.org/10.1007/s11042-011-0921-z
