TRINet: Tracking and Re-identification Network for Multiple Targets in Egocentric Videos Using LSTMs

  • Conference paper
Computer Analysis of Images and Patterns (CAIP 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11679)

Abstract

We present a novel recurrent-network-based framework for tracking and re-identifying multiple targets in first-person (egocentric) videos. Although LSTMs can act as sequence classifiers, most previous work on multi-target tracking uses their output together with a distance metric for data association. In this work, we employ an LSTM as a classifier and train it on the memory-cell output vectors, corresponding to different targets, produced by another LSTM. Based on appearance and motion features, this classifier discriminates between targets in two consecutive frames and also re-identifies them over a time interval. We integrate this classifier as an additional block in a detection-free tracking architecture, which improves re-identification of targets and also indicates their absence. We propose a dataset of twenty egocentric videos containing multiple targets to validate our approach.
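The abstract describes a two-LSTM arrangement: one LSTM turns a target's appearance and motion features into memory-cell vectors, and a second LSTM is trained as a classifier over those vectors to decide whether two observations belong to the same target. The following is a minimal, hypothetical sketch of that idea in PyTorch; the module names, feature dimensions, and the binary same/different output are assumptions for illustration, not the authors' implementation.

    # Minimal sketch (assumed, not the authors' code) of the two-LSTM idea from the abstract:
    # a feature LSTM produces memory-cell vectors per target, and a classifier LSTM decides
    # whether two observations belong to the same target. Names and dimensions are hypothetical.
    import torch
    import torch.nn as nn

    class TRINetSketch(nn.Module):
        def __init__(self, feat_dim=512, hidden_dim=256, num_classes=2):
            super().__init__()
            # First LSTM: encodes a target's appearance + motion feature sequence
            # into a memory-cell state.
            self.feature_lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
            # Second LSTM: sequence classifier over the memory-cell vectors of two targets.
            self.classifier_lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, num_classes)  # same-target vs. different-target logits

        def forward(self, feats_a, feats_b):
            # feats_a, feats_b: (batch, time, feat_dim) feature sequences of two candidate targets
            _, (_, c_a) = self.feature_lstm(feats_a)        # memory-cell state of target A
            _, (_, c_b) = self.feature_lstm(feats_b)        # memory-cell state of target B
            pair = torch.stack([c_a[-1], c_b[-1]], dim=1)   # (batch, 2, hidden_dim) sequence
            out, _ = self.classifier_lstm(pair)
            return self.head(out[:, -1])                    # classify the pair: same or different

Trained with a cross-entropy loss on same/different labels, such a block could be appended to a detection-free tracker to re-identify targets after a gap or to flag their absence; the paper's actual training regime and output classes may differ.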

This publication is an outcome of the R&D work undertaken in the project under the Visvesvaraya PhD Scheme of the Ministry of Electronics and Information Technology, Government of India, being implemented by Digital India Corporation.



Author information

Corresponding author

Correspondence to Jyoti Nigam.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Nigam, J., Rameshan, R.M. (2019). TRINet: Tracking and Re-identification Network for Multiple Targets in Egocentric Videos Using LSTMs. In: Vento, M., Percannella, G. (eds.) Computer Analysis of Images and Patterns. CAIP 2019. Lecture Notes in Computer Science, vol. 11679. Springer, Cham. https://doi.org/10.1007/978-3-030-29891-3_38

  • DOI: https://doi.org/10.1007/978-3-030-29891-3_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29890-6

  • Online ISBN: 978-3-030-29891-3

  • eBook Packages: Computer Science, Computer Science (R0)
