Standard audio format encapsulation (SAFE)

Beigi, Homayoon; Markowitz, Judith A.

doi:10.1007/s11235-010-9315-1


Published: 26 May 2010

Telecommunication Systems, Volume 47, pages 235–242 (2011)


Abstract

One characteristic that distinguishes speaker recognition (identification, verification, classification, tracking, etc.) from other biometrics is that it is designed to operate with devices and over channels that were created for other technologies and functions. That characteristic supports broad, inexpensive, and speedy deployments. The explosion of mobile devices has exacerbated the mismatch problem and the challenges for interoperability. This paper presents a detailed proposal for interoperability that supports all types of audio interchange operations while, at the same time, limiting the audio formats to a small set of widely-used, open standards. We call this proposal Standard Audio Format Encapsulation (SAFE). The SAFE proposal has been incorporated into speaker-recognition data interchange draft standards by the M1 (biometrics) committee of ANSI/INCITS and ISO/IEC JTC1/SC37 project 19794-13 (Voice data).



Author information

Authors and Affiliations

Recognition Technologies, Inc., 3616 Edgehill Road, Yorktown Heights, NY, 10598-1104, USA
Homayoon Beigi
J. Markowitz Consultants, 5801 North Sheridan Road, Suite 19A, Chicago, IL, 60660, USA
Judith A. Markowitz


Corresponding author

Correspondence to Homayoon Beigi.


About this article

Cite this article

Beigi, H., Markowitz, J.A. Standard audio format encapsulation (SAFE). Telecommun Syst 47, 235–242 (2011). https://doi.org/10.1007/s11235-010-9315-1

Issue Date: August 2011
