Contrastive Learning with Spectrum Information Augmentation in Abnormal Sound Detection

Authors: Yunxiang Zhang, Shun Huang

CVIPPR '24: Proceedings of the 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition

Article No.: 59, Pages 1 - 5

https://doi.org/10.1145/3663976.3664242

Published: 27 June 2024

Abstract

Outlier exposure is an effective approach to unsupervised anomalous sound detection. The key question for this method is how to make the model learn the distribution of normal data. Based on biological perception and data analysis, we find that anomalous audio and noise often contain higher frequencies. We therefore propose a data augmentation method that targets high-frequency information in contrastive learning. This leads the model to pay more attention to the low-frequency information of the audio, which represents the normal operating mode of the machine. We evaluated the proposed method on the DCASE 2020 Task 2 dataset, where it outperformed other contrastive learning methods used on this dataset. We also evaluated the generalizability of our method on the DCASE 2022 Task 2 dataset.
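The abstract does not spell out how the high-frequency augmentation is implemented, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes the input is a 2-D spectrogram shaped (frequency bins, frames), that "high frequency" means every bin at or above a chosen `cutoff_bin`, and that the perturbation is additive Gaussian noise. The function names, `cutoff_bin`, and `noise_std` are all hypothetical. Two independently perturbed views of the same clip share identical low-frequency content, so a contrastive loss that pulls them together would encourage the encoder to rely on the low-frequency band.

```python
import numpy as np


def augment_high_frequency(spec, cutoff_bin, noise_std=0.1, rng=None):
    """Return a copy of `spec` with Gaussian noise added to bins >= cutoff_bin.

    Assumption (not from the paper): `spec` is a 2-D array shaped
    (n_freq_bins, n_frames), for example a log-mel spectrogram, and bins
    below `cutoff_bin` are left exactly as they are.
    """
    if spec.ndim != 2:
        raise ValueError("spec must be 2-D: (n_freq_bins, n_frames)")
    if not 0 <= cutoff_bin <= spec.shape[0]:
        raise ValueError("cutoff_bin out of range")
    rng = np.random.default_rng() if rng is None else rng

    out = spec.astype(np.float64, copy=True)
    high = out[cutoff_bin:]  # view into `out`, so the update below is in place
    high += rng.normal(0.0, noise_std, size=high.shape)
    return out


def make_positive_pair(spec, cutoff_bin, rng=None):
    """Build two views of one clip that differ only in the high-frequency band."""
    rng = np.random.default_rng() if rng is None else rng
    view_a = augment_high_frequency(spec, cutoff_bin, rng=rng)
    view_b = augment_high_frequency(spec, cutoff_bin, rng=rng)
    return view_a, view_b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spec = rng.normal(size=(64, 10))  # stand-in for a real spectrogram
    a, b = make_positive_pair(spec, cutoff_bin=32, rng=rng)
    print(np.array_equal(a[:32], b[:32]))  # low band is shared by both views
```

In a training loop, the two views would be fed to the same encoder and treated as a positive pair. The choice of cutoff and perturbation strength would need to be tuned per machine type, which this sketch does not attempt.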

Index Terms

Contrastive Learning with Spectrum Information Augmentation in Abnormal Sound Detection
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Redundancy
  2. Embedded and cyber-physical systems
    1. Embedded systems
    2. Robotics
2. Networks
  1. Network properties
    1. Network reliability

Information & Contributors

Information

Published In

CVIPPR '24: Proceedings of the 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition

April 2024

373 pages

ISBN: 9798400716607

DOI: 10.1145/3663976

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

CVIPPR 2024

CVIPPR 2024: 2024 2nd Asia Conference on Computer Vision, Image Processing and Pattern Recognition

April 26 - 28, 2024

Xiamen, China

Acceptance Rates

Overall Acceptance Rate 14 of 38 submissions, 37%
