Remote Recovery of Sound from Speckle Pattern Video Based on Convolutional LSTM

Zhu, Dali; Yang, Long; Zeng, Hualin

doi:10.1007/978-3-030-88052-1_7

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12919)

Included in the following conference series:

International Conference on Information and Communications Security

Abstract

In the field of security surveillance, remotely acquiring the sound signal of a target is an attractive research topic with broad application prospects, such as counter-terrorism, rescue, and medical monitoring. To obtain a clear and accurate sound signal from the target, we propose a sound recovery method based on a convolutional LSTM network. Our method consists of two steps. First, we remotely record speckle images of the target. Then we use the convolutional LSTM network to extract the subtle movement from the speckle images. The results demonstrate that our network is superior to a convolutional neural network in both the accuracy and the efficiency of processing spatio-temporal speckle image data. Appropriately designed experiments reveal the influence of different sampling rates on sound extraction. In addition, we explain why our network has stronger generalization ability than a convolutional neural network. Benefiting from this generalization ability, our method can perform accurate and robust sound extraction on unseen objects. This performance shows that the method is a significant development in the field of remote sound acquisition.
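
The abstract does not specify the network architecture, so the following is only a minimal sketch of the second step: a single-channel convolutional LSTM cell in its common form without peephole connections, written in plain Python. The function names, the 3x3 kernel size, the random untrained weights, and the spatial-mean readout that turns each hidden state into one sound sample are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def conv2d_same(img, kernel):
    """Zero-padded 'same' 2D cross-correlation with an odd-sized square kernel."""
    rows, cols, k = len(img), len(img[0]), len(kernel)
    r = k // 2
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - r, x + dx - r
                    if 0 <= yy < rows and 0 <= xx < cols:
                        acc += img[yy][xx] * kernel[dy][dx]
            out[y][x] = acc
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def convlstm_step(frame, h_prev, c_prev, params):
    """One single-channel ConvLSTM update without peephole terms.

    params maps each gate name ("i", "f", "g", "o") to a tuple
    (input kernel, hidden kernel, scalar bias).
    """
    rows, cols = len(frame), len(frame[0])
    pre = {}
    for gate, (w_x, w_h, bias) in params.items():
        from_x = conv2d_same(frame, w_x)
        from_h = conv2d_same(h_prev, w_h)
        pre[gate] = [[from_x[y][x] + from_h[y][x] + bias for x in range(cols)]
                     for y in range(rows)]
    # Cell state: forget part of the old state, add gated candidate values.
    c_new = [[sigmoid(pre["f"][y][x]) * c_prev[y][x]
              + sigmoid(pre["i"][y][x]) * math.tanh(pre["g"][y][x])
              for x in range(cols)] for y in range(rows)]
    # Hidden state: output gate applied to the squashed cell state.
    h_new = [[sigmoid(pre["o"][y][x]) * math.tanh(c_new[y][x])
              for x in range(cols)] for y in range(rows)]
    return h_new, c_new


def recover_waveform(frames, params):
    """Run the cell over a frame sequence and emit one scalar per frame.

    The spatial mean of the hidden state is a hypothetical readout, chosen
    only to keep the sketch short.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    h = [[0.0] * cols for _ in range(rows)]
    c = [[0.0] * cols for _ in range(rows)]
    samples = []
    for frame in frames:
        h, c = convlstm_step(frame, h, c, params)
        samples.append(sum(map(sum, h)) / (rows * cols))
    return samples


# Toy data: random 8x8 "speckle" frames and random, untrained 3x3 kernels.
rng = random.Random(0)
frames = [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(5)]


def random_kernel():
    return [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(3)]


params = {gate: (random_kernel(), random_kernel(), 0.0) for gate in "ifgo"}
waveform = recover_waveform(frames, params)
print(waveform)
```

Because the hidden and cell states are carried from frame to frame, each output sample depends on the preceding speckle images as well as the current one. A purely convolutional model applied frame by frame has no such memory. With untrained weights the printed values carry no acoustic meaning; a real system would learn the kernels from recorded speckle video paired with reference audio.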

Author information

Authors and Affiliations

Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Dali Zhu, Long Yang & Hualin Zeng
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Dali Zhu, Long Yang & Hualin Zeng

Corresponding author

Correspondence to Hualin Zeng.

Editor information

Editors and Affiliations

Singapore Management University, Singapore, Singapore
Debin Gao
Tsinghua University, Beijing, China
Qi Li
Xi'an Jiaotong University, Xi'an, China
Xiaohong Guan
Chongqing University, Chongqing, China
Xiaofeng Liao

About this paper

Cite this paper

Zhu, D., Yang, L., Zeng, H. (2021). Remote Recovery of Sound from Speckle Pattern Video Based on Convolutional LSTM. In: Gao, D., Li, Q., Guan, X., Liao, X. (eds) Information and Communications Security. ICICS 2021. Lecture Notes in Computer Science, vol 12919. Springer, Cham. https://doi.org/10.1007/978-3-030-88052-1_7

DOI: https://doi.org/10.1007/978-3-030-88052-1_7
Published: 17 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88051-4
Online ISBN: 978-3-030-88052-1
eBook Packages: Computer Science, Computer Science (R0)
