Compression Detection of Audio Waveforms Based on Stacked Autoencoders

  • Conference paper
Artificial Intelligence and Security (ICAIS 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12240)

Abstract

As digital recordings have become easy to acquire, audio forensics has grown increasingly prominent, and detecting the compression history of audio is an important problem in this field. In this paper, a detection framework is proposed to determine whether a given audio waveform is an original waveform or a decompressed one. We extract spectral features in the frequency domain and then use a stacked autoencoder to classify frame-level audio fragments as original or decompressed audio frames. A majority voting algorithm is then applied to the frame-level results to make the final decision for an audio clip. Our analysis focuses on audio compressed multiple times, including single, double, triple, and even quadruple compression, in three compression formats. The experimental results show that the proposed framework can effectively detect audio that has been compressed multiple times. Furthermore, the proposed framework can also estimate the compression bitrate.
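The clip-level decision step described in the abstract can be illustrated with a minimal sketch. It assumes that each frame has already been assigned an integer label by some frame-level classifier; the label constants, the function name `majority_vote`, and the tie-breaking rule are hypothetical choices made for illustration and are not taken from the paper.

```python
from collections import Counter
from typing import Sequence

# Hypothetical frame-level labels (not defined in the paper).
ORIGINAL = 0
DECOMPRESSED = 1


def majority_vote(frame_labels: Sequence[int]) -> int:
    """Return the most frequent frame-level label for one audio clip.

    Ties are broken in favour of the smallest label value; this rule is
    an assumption of the sketch, not something stated in the abstract.
    """
    if not frame_labels:
        raise ValueError("need at least one frame-level decision")
    counts = Counter(frame_labels)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


# Example: five frame-level decisions for a single clip.
clip_frames = [DECOMPRESSED, DECOMPRESSED, ORIGINAL, DECOMPRESSED, ORIGINAL]
clip_decision = majority_vote(clip_frames)
```

Because the function only counts labels, the same sketch would also apply if the frame-level labels stood for candidate bitrates rather than a binary original/decompressed decision.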

Acknowledgments

The work presented in this paper was supported in part by the NSFC (61602318, 61672551, 61631016, 61972090), the Guangzhou Science and Technology Plan Project under Grant 201707010167, and the Science and Technology Planning Project of Guangdong Province (2014A010103039, 2015A010103022).

Author information

Corresponding author

Correspondence to Da Luo.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Luo, D., Cheng, W., Yuan, H., Luo, W., Liu, Z. (2020). Compression Detection of Audio Waveforms Based on Stacked Autoencoders. In: Sun, X., Wang, J., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2020. Lecture Notes in Computer Science, vol 12240. Springer, Cham. https://doi.org/10.1007/978-3-030-57881-7_35

  • DOI: https://doi.org/10.1007/978-3-030-57881-7_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-57880-0

  • Online ISBN: 978-3-030-57881-7

  • eBook Packages: Computer Science, Computer Science (R0)
