Identifying Compression History of Wave Audio and Its Applications

Published: 17 April 2014

Abstract

Audio signals are sometimes stored and/or processed in WAV (waveform) format without any knowledge of their previous compression operations. For subsequent processing such as digital audio forensics, audio enhancement, and blind audio quality assessment, it is necessary to identify this compression history. In this article, we investigate how to identify decompressed WAV audio that has gone through one of three popular compression schemes: MP3, WMA (Windows Media Audio), and AAC (Advanced Audio Coding). By analyzing the frequency coefficients, including modified discrete cosine transform (MDCT) coefficients and Mel-frequency cepstral coefficients (MFCCs), of original audio clips and of their decompressed versions under different compression schemes and bit rates, we propose several statistics that identify both the compression scheme and the bit rate previously applied to a given WAV signal. Experimental results on 8,800 audio clips with various contents show the effectiveness of the proposed method. In addition, some potential applications of the proposed method are discussed.
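The abstract does not specify the proposed statistics, so the following is only a minimal sketch of how an MDCT-domain statistic can be computed from a decoded WAV signal. It is not the method of the article. The function names (`mdct_frames`, `high_band_energy_ratio`, `toy_lowpass`), the 1152-sample frame length, the cutoff frequencies, and the use of a simple low-pass filter as a stand-in for a lossy codec are all assumptions made for illustration.

```python
import numpy as np


def mdct_frames(signal, frame_len=1152):
    """Return the MDCT of 50%-overlapping, sine-windowed frames.

    frame_len is the window length 2N; each frame yields N coefficients.
    """
    half = frame_len // 2
    n = np.arange(frame_len)
    k = np.arange(half)
    window = np.sin(np.pi * (n + 0.5) / frame_len)
    # MDCT basis: cos(pi/N * (n + 1/2 + N/2) * (k + 1/2))
    basis = np.cos(np.pi / half * np.outer(n + 0.5 + half / 2, k + 0.5))
    n_frames = (len(signal) - frame_len) // half + 1
    frames = np.stack(
        [signal[i * half : i * half + frame_len] for i in range(n_frames)]
    )
    return (frames * window) @ basis


def high_band_energy_ratio(signal, sample_rate, cutoff_hz, frame_len=1152):
    """Fraction of total MDCT energy that lies at or above cutoff_hz."""
    coeffs = mdct_frames(signal, frame_len)
    half = coeffs.shape[1]
    # Coefficient k is centred at (k + 0.5) * (sample_rate / 2) / half Hz.
    freqs = (np.arange(half) + 0.5) * (sample_rate / 2) / half
    energy = np.sum(coeffs ** 2, axis=0)
    return float(energy[freqs >= cutoff_hz].sum() / energy.sum())


def toy_lowpass(signal, sample_rate, cutoff_hz):
    """Hypothetical stand-in for a lossy codec: zero FFT bins above cutoff_hz."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))


rng = np.random.default_rng(0)
sample_rate = 44100
original = rng.standard_normal(sample_rate)           # 1 s of white noise
processed = toy_lowpass(original, sample_rate, 8000)  # band-limited copy

ratio_original = high_band_energy_ratio(original, sample_rate, 16000)
ratio_processed = high_band_energy_ratio(processed, sample_rate, 16000)
print(ratio_original, ratio_processed)
```

In this toy setting the band-limited copy retains almost no MDCT energy above 16 kHz, while the untouched noise keeps a sizeable share. A real identifier would need features computed on actual MP3, WMA, and AAC decodes and a trained classifier; this sketch only shows the mechanics of extracting one frequency-domain statistic.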

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 10, Issue 3
April 2014
140 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/2602979
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 April 2014
Accepted: 01 January 2014
Revised: 01 July 2013
Received: 01 March 2013
Published in TOMM Volume 10, Issue 3

Author Tags

  1. Audio compression history identification
  2. Mel-frequency cepstral coefficients
  3. modified discrete cosine transform

Qualifiers

  • Research-article
  • Research
  • Refereed
