Identifying Compression History of Wave Audio and Its Applications

Authors:

Jiwu Huang

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 10, Issue 3

Article No.: 30, Pages 1 - 19

https://doi.org/10.1145/2575978

Published: 17 April 2014

Abstract

Audio signals are sometimes stored and/or processed in WAV (waveform) format without any knowledge of their previous compression operations. To perform subsequent processing, such as digital audio forensics, audio enhancement, and blind audio quality assessment, it is necessary to identify that compression history. In this article, we investigate how to identify decompressed WAV audio that has passed through one of three popular compression schemes: MP3, WMA (Windows Media Audio), and AAC (Advanced Audio Coding). By analyzing frequency-domain coefficients, namely modified discrete cosine transform (MDCT) coefficients and Mel-frequency cepstral coefficients (MFCCs), of original audio clips and of their decompressed versions under different compression schemes and bit rates, we propose several statistics that identify both the compression scheme and the bit rate previously applied to a given WAV signal. Experimental results on 8,800 audio clips with varied content show the effectiveness of the proposed method. In addition, some potential applications of the proposed method are discussed.
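The abstract does not spell out the proposed statistics. Purely as an illustration of why MDCT coefficients of a decoded signal can carry traces of earlier lossy coding, the sketch below computes a sine-windowed MDCT in plain Python and reports, for each frequency bin, the fraction of frames in which the coefficient is close to zero. The function names, the frame size of 64, and the relative threshold are assumptions chosen for this example; this is not the statistic defined in the article.

```python
import math


def mdct_frames(x, n=64):
    """Sine-windowed MDCT: frames of length 2*n with hop n, n coefficients each."""
    if len(x) < 2 * n:
        raise ValueError("signal is shorter than one frame")
    window = [math.sin(math.pi * (t + 0.5) / (2 * n)) for t in range(2 * n)]
    # Standard MDCT basis: cos(pi/n * (t + 0.5 + n/2) * (k + 0.5)).
    basis = [
        [math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5)) for t in range(2 * n)]
        for k in range(n)
    ]
    frames = []
    for start in range(0, len(x) - 2 * n + 1, n):
        seg = [x[start + t] * window[t] for t in range(2 * n)]
        frames.append([sum(s * b for s, b in zip(seg, row)) for row in basis])
    return frames


def small_coeff_ratio(x, n=64, rel_threshold=1e-2):
    """Per-bin fraction of frames whose |MDCT coefficient| is below
    rel_threshold times the largest coefficient magnitude in the signal.

    Illustrative statistic only (an assumption, not the article's method).
    """
    frames = mdct_frames(x, n)
    peak = max(abs(c) for row in frames for c in row)
    threshold = rel_threshold * peak
    return [
        sum(abs(row[k]) < threshold for row in frames) / len(frames)
        for k in range(n)
    ]
```

A signal whose upper band has been discarded would be expected to show ratios near 1 in the highest bins, whereas broadband content would show ratios near 0 there. Telling specific codecs and bit rates apart would need the richer statistics the article proposes.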

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 10, Issue 3

April 2014

140 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/2602979

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 April 2014

Accepted: 01 January 2014

Revised: 01 July 2013

Received: 01 March 2013

Published in TOMM Volume 10, Issue 3

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Natural Science Foundation of China
Zhujiang Science and Technology (2011J2200091)
National Science & Technology Pillar Program (No:2012BAK16B06)
Guangdong NSF (S2013010012039)

Article Metrics

Total Citations: 20
Total Downloads: 570
Downloads (last 12 months): 19
Downloads (last 6 weeks): 2

Reflects downloads up to 17 Feb 2025

Cited By

Oreskovic J, Kaufman J, Fossat Y (2024). Impact of Audio Data Compression on Feature Extraction for Vocal Biomarker Detection: Validation Study (Preprint). JMIR Biomedical Engineering. Online publication date: 10-Jan-2024.
https://doi.org/10.2196/56246
Zeng C, Li K, Wang Z (2024). ENFformer: Long-short term representation of electric network frequency for digital audio tampering detection. Knowledge-Based Systems 297 (111938). Online publication date: Aug-2024.
https://doi.org/10.1016/j.knosys.2024.111938
Papadakis N, Aroni I, Stavroulakis G (2022). Effectiveness of MP3 Coding Depends on the Music Genre: Evaluation Using Semantic Differential Scales. Acoustics 4:3 (704-719). Online publication date: 27-Aug-2022.
https://doi.org/10.3390/acoustics4030042
García-Hernández J, Gómez-Flores W (2021). Detection of AAC compression using MDCT-based features and supervised learning. Journal of Experimental & Theoretical Artificial Intelligence (1-18). Online publication date: 31-Jan-2021.
https://doi.org/10.1080/0952813X.2021.1882003
Huang Y, Li B, Barni M, Huang J (2020). Identification of VoIP Speech With Multiple Domain Deep Features. IEEE Transactions on Information Forensics and Security 15 (2253-2267). Online publication date: 2020.
https://doi.org/10.1109/TIFS.2019.2960635
Sampaio J, Nascimento F (2020). Detection of AMR double compression using compressed-domain speech features. Forensic Science International: Digital Investigation 33 (200907). Online publication date: Jun-2020.
https://doi.org/10.1016/j.fsidi.2020.200907
Pop G, Mihalache S, Burileanu D (2019). Forensic Recognition of Narrowband AMR Signals. 2019 International Conference on Speech Technology and Human-Computer Dialogue (SpeD) (1-6). Online publication date: Oct-2019.
https://doi.org/10.1109/SPED.2019.8906279
Kim B, Rafii Z (2018). Lossy Audio Compression Identification. 2018 26th European Signal Processing Conference (EUSIPCO) (2459-2463). Online publication date: Sep-2018.
https://doi.org/10.23919/EUSIPCO.2018.8553611
SHIN S, JANG W, YUN H, PARK H (2018). Encoding Detection and Bit Rate Classification of AMR-Coded Speech Based on Deep Neural Network. IEICE Transactions on Information and Systems E101.D:1 (269-272). Online publication date: 2018.
https://doi.org/10.1587/transinf.2017EDL8155
Luo W, Li H, Yan Q, Yang R, Huang J (2018). Improved Audio Steganalytic Feature and Its Applications in Audio Forensics. ACM Transactions on Multimedia Computing, Communications, and Applications 14:2 (1-14). Online publication date: 25-Apr-2018.
https://dl.acm.org/doi/10.1145/3190575
