Audio Segmentation and Classification Using a Temporally Weighted Fuzzy C-Means Algorithm

  • Conference paper
Advances in Neural Networks – ISNN 2011 (ISNN 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6676)

Abstract

In this paper, we present a novel method to segment and classify an audio stream using a temporally weighted fuzzy c-means (TWFCM) algorithm. The proposed algorithm first determines the boundaries between different kinds of sound in an audio stream and then classifies the resulting audio segments into five classes: music, speech, speech with music background, speech with noise background, and silence. TWFCM enhances the conventional fuzzy c-means (FCM) algorithm, as applied to audio segmentation and classification, by accounting for the temporal correlation between the audio signal at the current time and at previous times. A three-element feature vector is used for segmentation and a five-element feature vector for classification. The method detects audio cuts accurately, eliminates segmentation mistakes caused by audio effects, and improves classification performance. Its application is demonstrated by segmenting and classifying real-world audio data such as television news and radio signals. Experimental results indicate that the proposed method outperforms conventional FCM.
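
The abstract does not give the TWFCM update equations, so the following is only a minimal sketch of the general idea, not the authors' formulation. It implements the standard fuzzy c-means updates and adds an assumed temporal weighting: each frame's membership is blended with the previous frame's membership through a hypothetical parameter `alpha`. The function names, the parameter `alpha`, and the synthetic three-element frames are all illustrative assumptions.

```python
import math
import random


def fcm(points, c, m=2.0, alpha=0.0, iters=100, seed=0):
    """Fuzzy c-means over a time-ordered list of feature vectors.

    alpha = 0 gives conventional FCM. alpha > 0 blends each frame's
    membership with the previous frame's membership. This blending is
    an ASSUMED illustration of temporal weighting, not the paper's rule.
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by u^m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][k] for i in range(n)) / total
                 for k in range(d)]
            )
        # Membership update: u_ij is proportional to dist_ij^(-2/(m-1)).
        new_u = []
        for i in range(n):
            dist = [max(math.dist(points[i], centers[j]), 1e-12)
                    for j in range(c)]
            inv = [dd ** (-2.0 / (m - 1.0)) for dd in dist]
            s = sum(inv)
            row = [v / s for v in inv]
            if alpha > 0 and i > 0:
                # Assumed temporal weighting: convex blend with the
                # previous frame, so the row still sums to 1.
                prev = new_u[i - 1]
                row = [(1 - alpha) * row[j] + alpha * prev[j]
                       for j in range(c)]
            new_u.append(row)
        u = new_u
    return u, centers


def boundaries(u):
    """Indices where the most likely cluster changes between frames."""
    labels = [max(range(len(row)), key=row.__getitem__) for row in u]
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


if __name__ == "__main__":
    # Synthetic three-element frames: 20 frames of one sound type
    # followed by 20 frames of another (made-up values, not real audio).
    frames = [(0.1 * (i % 3), 0.0, 0.05 * (i % 2)) for i in range(20)]
    frames += [(5 + 0.1 * (i % 3), 5.0, 5 + 0.05 * (i % 2))
               for i in range(20)]
    memberships, _ = fcm(frames, c=2, alpha=0.3)
    print(boundaries(memberships))
```

In this sketch a larger `alpha` smooths over isolated frames that differ from their neighbours, at the cost of delaying the detected boundary; whether the paper's weighting behaves the same way cannot be confirmed from the abstract.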

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nguyen, N.T.T., Haque, M.A., Kim, CH., Kim, JM. (2011). Audio Segmentation and Classification Using a Temporally Weighted Fuzzy C-Means Algorithm. In: Liu, D., Zhang, H., Polycarpou, M., Alippi, C., He, H. (eds) Advances in Neural Networks – ISNN 2011. ISNN 2011. Lecture Notes in Computer Science, vol 6676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21090-7_53

  • DOI: https://doi.org/10.1007/978-3-642-21090-7_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21089-1

  • Online ISBN: 978-3-642-21090-7

  • eBook Packages: Computer Science (R0)
