Compression Detection of Audio Waveforms Based on Stacked Autoencoders

Luo, Da; Cheng, Wenqing; Yuan, Huaqiang; Luo, Weiqi; Liu, Zhenghui

doi:10.1007/978-3-030-57881-7_35


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12240)

Included in the following conference series:

International Conference on Artificial Intelligence and Security

Abstract

With digital recordings now easy to acquire, audio forensics has become increasingly prominent, and detecting the compression history of audio is an important problem in the field. In this paper, a framework is proposed to detect whether a given audio waveform is an original waveform or a decompressed one. We extract spectrum features in the frequency domain and then use a stacked autoencoder to classify frame-level audio fragments as original or decompressed. A majority voting algorithm is then applied to make the final decision for an audio clip. Our analysis covers multiply compressed audio, including single, double, triple and even quadruple compression, in three compression formats. The experimental results show that the proposed framework can effectively detect multiply compressed audio. Furthermore, the proposed framework can also estimate the compression bitrate.
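The clip-level decision step described in the abstract can be illustrated with a minimal sketch. It assumes that a frame-level classifier has already produced one label per frame; the function name `majority_vote`, the label strings, and the tie-breaking rule are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(frame_labels: Sequence[Hashable]) -> Hashable:
    """Return the most frequent frame-level label for one audio clip.

    Ties are broken in favour of the label that appears first in the
    sequence (an arbitrary choice made for this sketch).
    """
    if not frame_labels:
        raise ValueError("need at least one frame-level prediction")
    counts = Counter(frame_labels)
    # Counter keeps insertion order, and max() returns the first maximal
    # key, so the earliest-seen label wins a tie.
    return max(counts, key=counts.get)


# Hypothetical frame-level outputs of a classifier for a single clip.
frame_predictions = ["decompressed", "original", "decompressed",
                     "decompressed", "original"]
clip_decision = majority_vote(frame_predictions)
```

The same vote would work if the frame-level labels were bitrate classes rather than a binary original/decompressed decision, although the abstract does not say how the bitrate estimate is actually aggregated.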

Acknowledgments

The work presented in this paper was supported in part by the NSFC (61602318, 61672551, 61631016, 61972090), the Guangzhou Science and Technology Plan Project under Grant 201707010167, and the Science and Technology Planning Project of Guangdong Province (2014A010103039, 2015A010103022).

Author information

Authors and Affiliations

Dongguan University of Technology, Dongguan, People’s Republic of China
Da Luo, Wenqing Cheng & Huaqiang Yuan
Sun Yat-sen University, Guangzhou, People’s Republic of China
Weiqi Luo
Xinyang Normal University, Xinyang, People’s Republic of China
Zhenghui Liu

Corresponding author

Correspondence to Da Luo.

Editor information

Editors and Affiliations

Nanjing University of Information Science, Nanjing, China
Xingming Sun
Nanjing University of Information Science, Nanjing, China
Jinwei Wang
Purdue University, West Lafayette, IN, USA
Elisa Bertino

About this paper

Cite this paper

Luo, D., Cheng, W., Yuan, H., Luo, W., Liu, Z. (2020). Compression Detection of Audio Waveforms Based on Stacked Autoencoders. In: Sun, X., Wang, J., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2020. Lecture Notes in Computer Science, vol 12240. Springer, Cham. https://doi.org/10.1007/978-3-030-57881-7_35

DOI: https://doi.org/10.1007/978-3-030-57881-7_35
Published: 01 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57880-0
Online ISBN: 978-3-030-57881-7
eBook Packages: Computer Science, Computer Science (R0)