On Using Digital Speech Processing Techniques for Synchronization in Heterogeneous Teleconferencing

Lin, Hsiao-Pu; Hsieh, Hung-Yun

doi:10.1007/978-3-642-10625-5_43


Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 22)

Included in the following conference series:

International Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness


Abstract

As the popularity of multi-functional communication devices grows, traditional audio conferencing may now involve heterogeneous teleconferencing devices, including POTS phones, VoIP phones, dual-mode smart phones, and so on. During a multi-party audio conference involving heterogeneous devices, a video conference may be held concurrently among the subset of devices capable of processing video streams, for a better conferencing experience. In such a scenario, the need arises to synchronize circuit-switched audio streams with packet-switched video streams. While the problem of audio-video synchronization has been extensively investigated in related work, existing solutions are limited to synchronization in packet-data networks and hence are not applicable in the target environment. In this work, we consider the problem of supporting such an overlay video conference among dual-mode phones. We first transform the audio-video synchronization problem into the problem of synchronizing circuit-switched and packet-switched audio streams. We then propose an end-to-end solution for audio synchronization that is transparent to the heterogeneous network protocol suites involved. We investigate synchronization algorithms based on digital speech processing using different acoustic features of the speech signal in the waveform, cepstrum, and spectrum domains. We evaluate the effectiveness of the algorithms under various impairments, including codec distortion, line noise, packet losses, and overlapping utterances. Evaluation results show a promising direction for using DSP-based algorithms to address the synchronization problem across heterogeneous telephony systems.
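
To make the idea of waveform-domain audio synchronization concrete, the following is a minimal sketch that estimates the time offset between two copies of the same speech signal by searching for the lag with the highest normalized cross-correlation. It is an illustration only and is not the authors' algorithm, which is not described on this page. The function name `estimate_lag`, the 8 kHz sampling rate, the noise level, and the synthetic random signal standing in for speech are all assumptions.

```python
import numpy as np


def estimate_lag(reference, delayed, max_lag):
    """Find the lag in [0, max_lag] (in samples) at which `delayed` best
    matches `reference`, using normalized cross-correlation.

    Hypothetical helper for illustration; assumes `delayed` lags `reference`.
    """
    ref = np.asarray(reference, dtype=float)
    dly = np.asarray(delayed, dtype=float)
    ref = ref - ref.mean()
    dly = dly - dly.mean()

    # Length of the window compared at every candidate lag.
    n = min(len(ref), len(dly) - max_lag)
    if n <= 0:
        raise ValueError("signals are too short for the requested max_lag")

    a = ref[:n]
    norm_a = np.linalg.norm(a)
    best_lag, best_score = 0, -np.inf
    for lag in range(max_lag + 1):
        b = dly[lag:lag + n]
        denom = norm_a * np.linalg.norm(b)
        score = float(np.dot(a, b) / denom) if denom > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


# Synthetic stand-in for one speech stream received over two paths
# (assumption: 8 kHz sampling, second path delayed by 15 ms plus noise).
SAMPLE_RATE_HZ = 8000
TRUE_LAG = 120  # 120 samples = 15 ms at 8 kHz

rng = np.random.default_rng(0)
stream_a = rng.standard_normal(4000)
stream_b = np.concatenate([np.zeros(TRUE_LAG), stream_a])[:4000]
stream_b = stream_b + 0.1 * rng.standard_normal(4000)

lag_samples, score = estimate_lag(stream_a, stream_b, max_lag=400)
lag_ms = 1000.0 * lag_samples / SAMPLE_RATE_HZ
```

A plain waveform correlation like this is expected to degrade under codec distortion and packet loss, which is presumably one motivation for also considering cepstrum-domain and spectrum-domain features as the abstract states.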



Author information

Authors and Affiliations

Graduate Institute of Communication Engineering, Taiwan
Hsiao-Pu Lin & Hung-Yun Hsieh
Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan, 106
Hung-Yun Hsieh


Editor information

Editors and Affiliations

Computer Science Department, Sapienza University of Rome, Via Salaria 113, 00198, Rome, Italy
Novella Bartolini
Computer Technology Institute, N. Kazantzaki Str. 1, Patras University Campus, 26504, Patras, Rion, Greece
Sotiris Nikoletseas
Ohio State University, 2015 Neil Avenue, 43210, Columbus, OH, USA
Prasun Sinha
University of Roma Tor Vergata, Via del Politecnico 1, 00133, Roma, Italy
Valeria Cardellini
National ICT Australia (NICTA), Garden Street 13, Eveleigh, 2015, NSW, Australia
Anirban Mahanti


About this paper

Cite this paper

Lin, HP., Hsieh, HY. (2009). On Using Digital Speech Processing Techniques for Synchronization in Heterogeneous Teleconferencing. In: Bartolini, N., Nikoletseas, S., Sinha, P., Cardellini, V., Mahanti, A. (eds) Quality of Service in Heterogeneous Networks. QShine 2009. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 22. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10625-5_43


DOI: https://doi.org/10.1007/978-3-642-10625-5_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10624-8
Online ISBN: 978-3-642-10625-5
eBook Packages: Computer Science, Computer Science (R0)
