DOI: 10.1145/3663976.3664242
research-article

Contrastive Learning with Spectrum Information Augmentation in Abnormal Sound Detection

Published: 27 June 2024

Abstract

Outlier exposure is an effective approach to unsupervised anomalous sound detection. Its key challenge is getting the model to learn the distribution of normal data. Biological perception and data analysis indicate that anomalous audio and noise often contain higher-frequency components. We therefore propose a data augmentation method that targets high-frequency information in contrastive learning. This lets the model pay more attention to the low-frequency information of the audio, which represents the machine's normal operating mode. We evaluated the proposed method on the DCASE 2020 Task 2 dataset, where it outperformed other contrastive learning methods applied to that dataset. We also evaluated the generalizability of our method on the DCASE 2022 Task 2 dataset.
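The abstract does not state how the high-frequency augmentation is carried out, so the following is only a minimal sketch of one plausible reading: two views of the same spectrogram keep identical low-frequency bins while their high-frequency bins are perturbed independently, so a contrastive objective can only align the views through the shared low band. The function name `augment_high_freq`, the `cutoff_bin` and `noise_scale` parameters, and the choice of Gaussian noise are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def augment_high_freq(spec, cutoff_bin, noise_scale=0.5, rng=None):
    """Return a copy of `spec` (n_freq, n_frames) with only the bins at or
    above `cutoff_bin` perturbed. Hypothetical helper, not the paper's code."""
    rng = np.random.default_rng() if rng is None else rng
    spec = np.asarray(spec, dtype=float)
    if not 0 <= cutoff_bin <= spec.shape[0]:
        raise ValueError("cutoff_bin must lie within the frequency axis")
    out = spec.copy()
    high = out[cutoff_bin:]  # a view into `out`, so edits below modify `out`
    if high.size == 0:
        return out  # nothing above the cutoff to perturb
    # Assumption: Gaussian noise scaled by the spread of the high band.
    noise = rng.normal(0.0, 1.0, size=high.shape)
    high += noise_scale * high.std() * noise
    return out

# Two views of one clip form a positive pair for a contrastive loss.
rng = np.random.default_rng(0)
spec = rng.random((128, 64))  # stand-in for a (log-)mel spectrogram
view_a = augment_high_freq(spec, cutoff_bin=64, rng=rng)
view_b = augment_high_freq(spec, cutoff_bin=64, rng=rng)
```

In this sketch the two views agree exactly below the cutoff and differ above it. The cutoff and noise level would need tuning per machine type, and the paper's own augmentation may differ.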

Information

Published In

CVIPPR '24: Proceedings of the 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition
April 2024
373 pages
ISBN:9798400716607
DOI:10.1145/3663976
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. anomaly sound detection
  2. contrastive learning
  3. data augmentation
  4. unsupervised learning

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CVIPPR 2024

Acceptance Rates

Overall Acceptance Rate 14 of 38 submissions, 37%
