Speaker-independent source cell-phone identification for re-compressed and noisy audio recordings

Verma, Vinay; Khanna, Nitin

doi:10.1007/s11042-020-10205-z

Published: 07 January 2021

Volume 80, pages 23581–23603 (2021)

Multimedia Tools and Applications

437 Accesses
9 Citations

Abstract

With the rapid increase in user-generated multimedia content, extensive outreach over social media, and their potential in critical applications such as law enforcement, source identification from re-compressed and noisy multimedia is of great importance. This paper proposes a system for speaker-independent cell-phone identification from recorded audio. The system can handle test audio whose speech content and speaker both differ from those of the training audio. Each recorded audio has the device fingerprint implicitly embedded in it, which encourages us to design a CNN-based system for learning the device-specific signatures directly from the magnitude of the discrete Fourier transform of the audio. This paper also addresses the scenario where the recorded audio is re-compressed to meet storage and network transmission requirements, which is a common phenomenon in this age of social media. Cell-phone classification from audio recordings in the presence of additive white Gaussian noise is addressed as well. We show that our proposed system performs as well as the state-of-the-art systems for the speaker-dependent case with clean audio recordings and exhibits much higher robustness in the speaker-independent case with clean, re-compressed, and noisy audio recordings.
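As a rough illustration of two operations the abstract mentions, the sketch below adds white Gaussian noise to a signal at a chosen signal-to-noise ratio and computes the magnitude of the discrete Fourier transform frame by frame. It is not the authors' implementation: the function names, the 256-sample frame length, the 128-sample hop, the 20 dB SNR, and the synthetic 1 kHz tone standing in for a recording are all assumptions made for this example, and the abstract does not specify any of them.

```python
import cmath
import math
import random


def add_awgn(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise at the given SNR (dB)."""
    rng = rng or random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR (dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames, dropping any short tail."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def dft_magnitude(frame):
    """Magnitude of the DFT of one real frame, bins 0..N/2 (naive O(N^2) DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags


# Hypothetical stand-in for a recording: a 1 kHz tone sampled at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(1024)]

# Assumed parameters, chosen only for illustration.
noisy = add_awgn(tone, snr_db=20, rng=random.Random(0))
frames = frame_signal(noisy, frame_len=256, hop=128)

# One row of DFT magnitudes per frame; a 2-D array of this kind is the sort
# of input a CNN could learn device signatures from.
spectrogram = [dft_magnitude(f) for f in frames]

# The strongest bin of the first frame should sit at 1000 Hz * 256 / 8000.
peak_bin = max(range(len(spectrogram[0])), key=spectrogram[0].__getitem__)
```

The naive DFT keeps the example dependency-free; in practice an FFT routine from a numerical library would be the usual choice for speed.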

Data Availability

The dataset created by the authors might be made publicly available depending on the permissions received from the funding agency.

Code Availability

The code created by the authors for this paper might be made publicly available depending on the permissions received from the funding agency.

References

Aggarwal R, Singh S, Roul AK, Khanna N (2014) Cellphone identification using noise estimates from recorded audio. In: International conference on communications and signal processing (ICCSP). IEEE, pp 1218–1222
Baldini G, Amerini I (2019) Smartphones identification through the built-in microphones with convolutional neural network. IEEE Access 7:158685–158696
Baldini G, Amerini I, Gentile C (2019) Microphone identification using convolutional neural networks. IEEE Sensors Letters
Bellard F, Niedermayer M, et al. (2019) FFmpeg. Available from: http://ffmpeg.org
Buchholz R, Kraetzer C, Dittmann J (2009) Microphone classification using Fourier coefficients. In: International workshop on information hiding. Springer, pp 235–246
Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST) 2(3):1–27
Cuccovillo L, Aichroth P (2016) Open-set microphone classification via blind channel analysis. In: IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 2074–2078
Cuccovillo L, Mann S, Tagliasacchi M, Aichroth P (2013) Audio tampering detection via microphone classification. In: IEEE 15th international workshop on multimedia signal processing (MMSP). IEEE, pp 177–182
Eskidere Ö (2014) Source microphone identification from speech recordings based on a Gaussian mixture model. Turkish Journal of Electrical Engineering & Computer Sciences 22(3):754–767
Eskidere Ö (2016) Identifying acquisition devices from recorded speech signals using wavelet-based features. Turkish Journal of Electrical Engineering & Computer Sciences 24(3):1942–1954
Garcia-Romero D, Espy-Wilson CY (2010) Automatic acquisition device identification from speech recordings. In: International conference on acoustics speech and signal processing (ICASSP). IEEE, pp 1806–1809
Hanilçi C, Ertas F (2013) Optimizing acoustic features for source cell-phone recognition using speech signals. In: Proceedings of the first ACM workshop on information hiding and multimedia security. ACM, pp 141–148
Hanilçi C, Ertas F, Ertas T, Eskidere Ö (2012) Recognition of brand and models of cell-phones from recorded speech signals. IEEE Trans Inform Forensics Secur 7(2):625–634
Hanilçi C, Kinnunen T (2014) Source cell-phone recognition from recorded speech using non-speech segments. Digital Signal Processing 35:75–85
He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE international conference on computer vision, pp 1026–1034
Ikram S, Malik H (2012) Microphone identification using higher-order statistics. In: Audio engineering society conference: 46th international conference: audio forensics. Audio Engineering Society
Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv:1502.03167
Jiang Y, Leung FH (2019) Source microphone recognition aided by a kernel-based projection method. IEEE Trans Inform Forensics Secur 14(11):2875–2886
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980
Kotropoulos C, Samaras S (2014) Mobile phone identification using recorded speech signals. In: 19th international conference on digital signal processing (DSP). IEEE, pp 586–591
Kraetzer C, Oermann A, Dittmann J, Lang A (2007) Digital audio forensics: a first practical evaluation on microphone and environment classification. In: Proceedings of the 9th workshop on multimedia & security. ACM, pp 63–74
Kraetzer C, Schott M, Dittmann J (2009) Unweighted fusion in microphone forensics using a decision tree and linear logistic regression models. In: Proceedings of the 11th ACM workshop on Multimedia and security, pp 49–56
Kurniawan F, Rahim M, Mohd S, Khalil MS, Khan MK (2016) Statistical based audio forensic on identical microphones. International Journal of Electrical & Computer Engineering (2088-8708) 6(5)
Li Y, Zhang X, Li X, Zhang Y, Yang J, He Q (2018) Mobile phone clustering from speech recordings using deep representation and spectral clustering. IEEE Trans Inform Forensics Secur 13(4):965–977
Luo D, Korus P, Huang J (2018) Band energy difference for source attribution in audio forensics. IEEE Trans Inform Forensics Secur 13(9):2179–2189
Luo D, Yang R, Li B, Huang J (2016) Detection of double compressed AMR audio using stacked autoencoder. IEEE Trans Inform Forensics Secur 12(2):432–444
van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9(Nov):2579–2605
O’Dea S (2020) Number of smartphone users worldwide from 2016 to 2021. https://www.statista.com/statistics/330695/number-of-smartphone-users-worldwide/
Panagakis Y, Kotropoulos C (2012) Automatic telephone handset identification by sparse representation of random spectral features. In: Proceedings of the on multimedia and security, pp 91–96
Panagakis Y, Kotropoulos C (2012) Telephone handset identification by feature selection and sparse representations. In: IEEE international workshop on information forensics and security (WIFS). IEEE, pp 73–78
Pandey V, Verma VK, Khanna N (2014) Cell-phone identification from audio recordings using PSD of speech-free regions. In: IEEE students’ conference on electrical, electronics and computer science (SCEECS). IEEE, pp 1–6
Poisel R, Tjoa S (2011) Forensics investigations of multimedia data: a review of the state-of-the-art. In: Sixth international conference on IT security incident management and IT forensics. IEEE, pp 48–61
Qin T, Wang R, Yan D, Lin L (2018) Source cell-phone identification in the presence of additive noise from CQT domain. Information 9(8):205
Rabiner L, Schafer R (1978) Digital processing of speech signals. Prentice-Hall signal processing series. Prentice-Hall
Shen Y, Jia J, Cai L (2012) Detecting double compressed AMR-format audio recordings. In: Proc. of the 10th phonetics conference of China (PCC), pp 1–5
Stamm MC, Wu M, Liu KR (2013) Information forensics: an overview of the first decade. IEEE Access 1:167–200
Verma V, Agarwal N, Khanna N (2018) DCT-domain deep convolutional neural networks for multiple JPEG compression classification. Signal Processing: Image Communication 67:22–33
Verma V, Khanna N (2019) CNN-based system for speaker independent cell-phone identification from recorded audio. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 53–61
Verma V, Khaturia P, Khanna N (2018) Cell-phone identification from recompressed audio recordings. In: Twenty fourth national conference on communications (NCC). IEEE, pp 1–6
Vu HQ, Liu S, Yang X, Li Z, Ren Y (2012) Identifying microphone from noisy recordings by using representative instance one class-classification approach. Journal of Networks
Wang Q, Zhang R (2016) Double JPEG compression forensics based on a convolutional neural network. EURASIP Journal on Information Security 2016:23
Wojcicki K (2020) HTK MFCC MATLAB. https://in.mathworks.com/matlabcentral/fileexchange/32849-htk-mfcc-matlab
Zakariah M, Khan MK, Malik H (2018) Digital multimedia audio forensics: past, present and future. Multimed Tools Applic 77(1):1009–1040
Zou L, He Q, Wu J (2017) Source cell phone verification from speech recordings using sparse representation. Digital Signal Processing 62:125–136

Acknowledgements

We would like to thank Mr. Da Luo of Shenzhen University for providing us with the feature extraction code of [25]. This material is based upon work partially supported by a grant from the Department of Science and Technology (DST), New Delhi, India, under Award Number ECR/2015/000583. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

Funding

This material is based upon work partially supported by a grant from the Department of Science and Technology (DST), New Delhi, India, under Award Number ECR/2015/000583.

Author information

Authors and Affiliations

Multimedia Analysis and Security (MANAS) Lab, Electrical Engineering, Indian Institute of Technology Gandhinagar (IITGN), Gujarat, India
Vinay Verma & Nitin Khanna

Corresponding author

Correspondence to Nitin Khanna.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary version of this paper appeared in IEEE CVPR’19 Workshop on Media Forensics [38].

About this article

Cite this article

Verma, V., Khanna, N. Speaker-independent source cell-phone identification for re-compressed and noisy audio recordings. Multimed Tools Appl 80, 23581–23603 (2021). https://doi.org/10.1007/s11042-020-10205-z

Received: 06 March 2020
Revised: 19 August 2020
Accepted: 25 November 2020
Published: 07 January 2021
Issue Date: June 2021
DOI: https://doi.org/10.1007/s11042-020-10205-z
