research-article

Optimizing acoustic features for source cell-phone recognition using speech signals

Authors: Cemal Hanilçi, Figen Ertas

IH&MMSec '13: Proceedings of the first ACM workshop on Information hiding and multimedia security

Pages 141 - 148

https://doi.org/10.1145/2482513.2482520

Published: 17 June 2013

Abstract

This paper presents a comparison and optimization of acoustic features for source cell-phone recognition using recorded speech signals. Several acoustic feature extraction methods are considered: Mel-frequency, linear-frequency, and Bark-frequency cepstral coefficients (MFCC, LFCC, and BFCC) and linear prediction cepstral coefficients (LPCC). Beyond the feature sets themselves, the paper examines how recognition performance is affected by dynamic features, namely delta and double-delta coefficients (Δ and Δ²), and by feature normalization, namely cepstral mean normalization (CMN), cepstral variance normalization (CVN), and cepstral mean and variance normalization (CMVN). To make the comparison of features and normalization techniques fair, all experiments use the same support vector machine (SVM) classifier with fixed parameters and the same cell-phone dataset.
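
To make the post-processing steps named in the abstract concrete, the following is a minimal Python sketch of CMN, CVN, CMVN, and delta / double-delta computation on a matrix of cepstral frames. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the delta formula, the window width, or the order in which dynamics and normalization are applied. The regression-based delta with edge-frame repetition, the default `width=2`, the toy input, and all function names are assumptions made for this example.

```python
from statistics import fmean, pstdev


def cmn(frames):
    """Cepstral mean normalization: subtract each coefficient's mean over the utterance."""
    means = [fmean(col) for col in zip(*frames)]
    return [[c - m for c, m in zip(frame, means)] for frame in frames]


def cvn(frames, eps=1e-12):
    """Cepstral variance normalization: divide each coefficient by its standard deviation."""
    stds = [pstdev(col) for col in zip(*frames)]
    return [[c / max(s, eps) for c, s in zip(frame, stds)] for frame in frames]


def cmvn(frames, eps=1e-12):
    """Mean and variance normalization: zero mean, unit variance per coefficient."""
    return cvn(cmn(frames), eps)


def delta(frames, width=2):
    """Regression-based delta coefficients (assumed formula, not taken from the paper).

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2),
    with the first and last frames repeated at the utterance edges.
    """
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[0])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamics(frames, width=2):
    """Append delta and double-delta coefficients to each static frame."""
    d1 = delta(frames, width)
    d2 = delta(d1, width)  # double-delta is the delta of the delta
    return [s + a + b for s, a, b in zip(frames, d1, d2)]


# Toy input: 4 frames of 2 static coefficients (hypothetical values).
toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]

# Assumed order for illustration: append dynamics first, then normalize.
features = cmvn(add_dynamics(toy))
```

Each row of `features` holds the static, delta, and double-delta coefficients of one frame, which is the kind of vector that would then be passed to a classifier such as the SVM mentioned above. Applying normalization before computing the dynamics is an equally plausible ordering; the abstract does not say which one was used.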

Index Terms

Information systems

Published In

IH&MMSec '13: Proceedings of the first ACM workshop on Information hiding and multimedia security

June 2013

242 pages

ISBN:9781450320818

DOI:10.1145/2482513

General Chair: William Puech (LIRMM & University of Montpellier, France)

Program Chairs: Marc Chaumont (LIRMM & University of Nimes, France), Jana Dittmann (Otto-von-Guericke University, Germany), Patrizio Campisi (University of Roma TRE, Italy)

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

IH&MMSec '13

Sponsor:

SIGMM

IH&MMSec '13: ACM Information Hiding and Multimedia Security Workshop

June 17 - 19, 2013

Montpellier, France

Acceptance Rates

IH&MMSec '13 Paper Acceptance Rate 27 of 74 submissions, 36%;

Overall Acceptance Rate 128 of 318 submissions, 40%

Bibliometrics

Total citations: 16
Total downloads: 239
Downloads (last 12 months): 9
Downloads (last 6 weeks): 0

Reflects downloads up to 08 Mar 2025
Cited By

Alimohad A, Bengherabi M, Belabbaci E, Bengherabi A (2024). I-vector and variability compensation techniques for mobile phone recognition. Studies in Engineering and Exact Sciences, 5:2 (e9486). Online publication date: 21-Oct-2024.
https://doi.org/10.54021/seesv5n2-368

Zeng C, Zhao Y, Wang Z, Li K, Wan X, Liu M (2024). Squeeze-and-Excitation Self-Attention Mechanism Enhanced Digital Audio Source Recognition Based on Transfer Learning. Circuits, Systems, and Signal Processing, 44:1 (480-512). Online publication date: 13-Sep-2024.
https://doi.org/10.1007/s00034-024-02850-8

Wang Z, Zhan J, Zhang G, Ouyang D, Guo H (2023). An End-to-End Transfer Learning Framework of Source Recording Device Identification for Audio Sustainable Security. Sustainability, 15:14 (11272). Online publication date: 19-Jul-2023.
https://doi.org/10.3390/su151411272

Leonzio D, Cuccovillo L, Bestagini P, Marcon M, Aichroth P, Tubaro S (2023). Audio Splicing Detection and Localization Based on Acquisition Device Traces. IEEE Transactions on Information Forensics and Security, 18 (4157-4172). Online publication date: 2023.
https://doi.org/10.1109/TIFS.2023.3293415

Zhang Y, Luo D (2023). Audio Source Verification Method Based on Structural Re-parameterization Network. 2023 4th International Seminar on Artificial Intelligence, Networking and Information Technology (AINIT) (756-759). Online publication date: 16-Jun-2023.
https://doi.org/10.1109/AINIT59027.2023.10212478

Li C, Wang J, Ding X, Zhang N (2021). Acoustic Imaging Using the Built-In Sensors of a Smartphone. Symmetry, 13:6 (1065). Online publication date: 14-Jun-2021.
https://doi.org/10.3390/sym13061065

Zeng C, Zhu D, Wang Z, Wu M, Xiong W, Zhao N (2021). Spatial and temporal learning representation for end-to-end recording device identification. EURASIP Journal on Advances in Signal Processing, 2021:1. Online publication date: 17-Jul-2021.
https://doi.org/10.1186/s13634-021-00763-1

Verma V, Khanna N (2021). Speaker-independent source cell-phone identification for re-compressed and noisy audio recordings. Multimedia Tools and Applications. Online publication date: 7-Jan-2021.
https://doi.org/10.1007/s11042-020-10205-z

Li X, Yan D, Dong L, Wang R (2019). Anti-Forensics of Audio Source Identification Using Generative Adversarial Network. IEEE Access, 7 (184332-184339). Online publication date: 2019.
https://doi.org/10.1109/ACCESS.2019.2960097

Chen J, Xiang S, Huang H, Liu W (2019). Detecting and locating digital audio forgeries based on singularity analysis with wavelet packet. Multimedia Tools and Applications, 75:4 (2303-2325). Online publication date: 17-Jan-2019.
https://dl.acm.org/doi/10.1007/s11042-014-2406-3
