Investigating the Effective Dynamic Information of Spectral Shapes for Audio Classification

Abstract:

The spectral shape holds crucial information for Audio Classification (AC), encompassing the spectrum's envelope, details, and dynamic changes over time. Conventional methods use cepstral coefficients to describe the spectral shape but overlook the details of its variation. Deep-learning approaches capture some dynamics but demand substantial training or fine-tuning resources. The Learning in the Model Space (LMS) framework precisely captures the dynamic information of temporal data through model fitting, even when computational resources and data are limited. However, applying LMS to audio faces two challenges: 1) the high sampling rate of audio hinders efficient data fitting and the capture of dynamic information; 2) only specific spectral shapes are relevant for AC, so the Dynamic Information of Partial Spectral Shapes (DIPSS), rather than that of the whole spectrum, may be what improves classification. This paper extends an AC framework called Effective Dynamic Information Capture (EDIC) to tackle these issues. EDIC constructs Mel-Frequency Cepstral Coefficients (MFCC) sequences within different dimensional intervals as the data to be fitted, which not only reduces the number of sampling points in each sequence but also describes how different parts of the spectral shape change over time. EDIC also makes it possible to apply a topology-based selection algorithm in the model space, selecting the DIPSS that is effective for the current AC task. Results on three tasks confirm the effectiveness of EDIC.
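
The idea of fitting MFCC sequences within different dimensional intervals can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the fitted model, so a ridge-regularized first-order autoregressive map stands in for it here, and the names `interval_sequences`, `fit_ar1`, `model_space_features`, the interval boundaries, and the random stand-in for an MFCC matrix are all assumptions made for illustration. The topology-based selection step is not covered.

```python
import numpy as np


def interval_sequences(mfcc, intervals):
    """Split an MFCC matrix of shape (n_coeffs, n_frames) into one
    sub-sequence per coefficient interval [lo, hi)."""
    return [mfcc[lo:hi, :] for lo, hi in intervals]


def fit_ar1(seq, ridge=1e-3):
    """Fit x[t+1] ~ W x[t] + b by ridge regression and return the
    flattened parameters. This AR(1) map is an assumed stand-in for
    whatever model the LMS framework actually fits."""
    x_prev = seq[:, :-1]
    x_next = seq[:, 1:]
    # Append a row of ones so the last column of the solution is the bias b.
    design = np.vstack([x_prev, np.ones((1, x_prev.shape[1]))])
    gram = design @ design.T + ridge * np.eye(design.shape[0])
    weights = np.linalg.solve(gram, design @ x_next.T).T  # shape (d, d + 1)
    return weights.ravel()


def model_space_features(mfcc, intervals, ridge=1e-3):
    """Represent each coefficient interval by its fitted model parameters."""
    return [fit_ar1(seq, ridge) for seq in interval_sequences(mfcc, intervals)]


# Random data standing in for a 13-coefficient MFCC matrix over 200 frames.
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((13, 200))
intervals = [(0, 4), (4, 9), (9, 13)]  # hypothetical interval boundaries
features = model_space_features(mfcc, intervals)
```

Each entry of `features` is a point in a model space for one part of the spectral shape. Under these assumptions, a classifier or a selection procedure would operate on those parameter vectors rather than on the raw frame sequence.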

Published in: IEEE Transactions on Multimedia (Volume: 27)

Page(s): 1114 - 1126

Date of Publication: 24 December 2024

DOI: 10.1109/TMM.2024.3521837
