An enhanced fuzzy c-means algorithm for audio segmentation and classification

Haque, Mohammad A.; Kim, Jong-Myon

doi:10.1007/s11042-011-0921-z

Published: 18 November 2011

Volume 63, pages 485–500 (2013)

Multimedia Tools and Applications

Abstract

Automated audio segmentation and classification play important roles in multimedia content analysis. In this paper, we propose an enhanced approach, called the correlation intensive fuzzy c-means (CIFCM) algorithm, to audio segmentation and classification that is based on audio content analysis. While conventional methods work by considering the attributes of only the current frame or segment, the proposed CIFCM algorithm efficiently incorporates the influence of neighboring frames or segments in the audio stream. With this method, audio-cuts can be detected efficiently even when the signal contains audio effects such as fade-in, fade-out, and cross-fade. A number of audio features are analyzed in this paper to explore the differences between various types of audio data. The proposed CIFCM algorithm works by detecting the boundaries between different kinds of sounds and classifying them into clusters such as silence, speech, music, speech with music, and speech with noise. Our experimental results indicate that the proposed method outperforms the state-of-the-art FCM approach in terms of audio segmentation and classification.
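
The abstract does not give the CIFCM update equations, so the following is not the authors' algorithm. It is a minimal sketch of standard fuzzy c-means with one assumed extra step: each frame's memberships are averaged with those of its temporal neighbors, which illustrates the general idea of letting neighboring frames influence the current frame. The names `fcm_with_neighbors`, `audio_cuts`, and `window`, as well as the deterministic initialisation, are hypothetical choices made for this example.

```python
import math


def fcm_with_neighbors(frames, c=2, m=2.0, window=1, iters=50):
    """Fuzzy c-means over a frame sequence with temporal membership smoothing.

    frames: list of equal-length feature vectors, in time order (c >= 2).
    Returns (centers, memberships), where memberships[k][i] is the degree
    to which frame k belongs to cluster i.
    """
    n, dim = len(frames), len(frames[0])
    # Assumption: deterministic initialisation from evenly spaced frames.
    centers = [list(frames[round(i * (n - 1) / (c - 1))]) for i in range(c)]
    p = 2.0 / (m - 1.0)
    u = []
    for _ in range(iters):
        # Standard FCM membership update from distances to each center.
        u = []
        for x in frames:
            d = [max(math.dist(x, v), 1e-12) for v in centers]
            u.append([1.0 / sum((d[i] / d[j]) ** p for j in range(c))
                      for i in range(c)])
        # Assumed neighbor step: average memberships over nearby frames.
        # Each row still sums to 1 because it is a mean of rows that do.
        smoothed = []
        for k in range(n):
            lo, hi = max(0, k - window), min(n, k + window + 1)
            smoothed.append([sum(u[t][i] for t in range(lo, hi)) / (hi - lo)
                             for i in range(c)])
        u = smoothed
        # Standard FCM center update, here using the smoothed memberships.
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            centers[i] = [sum(w[k] * frames[k][j] for k in range(n)) / total
                          for j in range(dim)]
    return centers, u


def audio_cuts(memberships):
    """Indices where the most likely cluster changes between adjacent frames."""
    labels = [row.index(max(row)) for row in memberships]
    return [k for k in range(1, len(labels)) if labels[k] != labels[k - 1]]
```

In this sketch, passing the memberships returned by `fcm_with_neighbors` to `audio_cuts` yields candidate segment boundaries. A larger `window` suppresses isolated label flips, which is one plausible way to tolerate gradual transitions such as fades, at the cost of blurring the exact cut position.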

Acknowledgements

This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2011-0017941).

Author information

Authors and Affiliations

School of Electrical Engineering, University of Ulsan, Ulsan, South Korea, 689-749
Mohammad A. Haque & Jong-Myon Kim

Corresponding author

Correspondence to Jong-Myon Kim.

About this article

Cite this article

Haque, M.A., Kim, JM. An enhanced fuzzy c-means algorithm for audio segmentation and classification. Multimed Tools Appl 63, 485–500 (2013). https://doi.org/10.1007/s11042-011-0921-z

Issue Date: March 2013
