TRINet: Tracking and Re-identification Network for Multiple Targets in Egocentric Videos Using LSTMs

  • Conference paper
Computer Analysis of Images and Patterns (CAIP 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11679)

Abstract

We present a novel recurrent-network-based framework for tracking and re-identifying multiple targets in first-person (egocentric) videos. Although LSTMs can act as sequence classifiers, most previous work on multi-target tracking uses their output together with a distance metric for data association. In this work, we employ an LSTM as a classifier and train it on the memory-cell output vectors, corresponding to different targets, produced by another LSTM. Based on appearance and motion features, this classifier discriminates between targets in two consecutive frames and also re-identifies them over a time interval. We integrate this classifier as an additional block in a detection-free tracking architecture, which improves re-identification of targets and also indicates their absence. We propose a dataset of twenty egocentric videos containing multiple targets to validate our approach.
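The abstract describes a two-LSTM arrangement: one LSTM turns a target's appearance and motion features into memory-cell vectors, and a second LSTM is trained as a classifier over those vectors to decide whether two observations belong to the same target. The following is a minimal, hypothetical sketch of that idea in PyTorch; the module names, feature dimensions, and the binary same/different output are assumptions for illustration, not the authors' implementation.

    # Minimal sketch (assumed, not the authors' code) of the two-LSTM idea from the abstract:
    # a feature LSTM produces memory-cell vectors per target, and a classifier LSTM decides
    # whether two observations belong to the same target. Names and dimensions are hypothetical.
    import torch
    import torch.nn as nn

    class TRINetSketch(nn.Module):
        def __init__(self, feat_dim=512, hidden_dim=256, num_classes=2):
            super().__init__()
            # First LSTM: encodes a target's appearance + motion feature sequence
            # into a memory-cell state.
            self.feature_lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
            # Second LSTM: sequence classifier over the memory-cell vectors of two targets.
            self.classifier_lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, num_classes)  # same-target vs. different-target logits

        def forward(self, feats_a, feats_b):
            # feats_a, feats_b: (batch, time, feat_dim) feature sequences of two candidate targets
            _, (_, c_a) = self.feature_lstm(feats_a)        # memory-cell state of target A
            _, (_, c_b) = self.feature_lstm(feats_b)        # memory-cell state of target B
            pair = torch.stack([c_a[-1], c_b[-1]], dim=1)   # (batch, 2, hidden_dim) sequence
            out, _ = self.classifier_lstm(pair)
            return self.head(out[:, -1])                    # classify the pair: same or different

Trained with a cross-entropy loss on same/different labels, such a block could be appended to a detection-free tracker to re-identify targets after a gap or to flag their absence; the paper's actual training regime and output classes may differ.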

This publication is an outcome of the R&D work undertaken in the project under the Visvesvaraya PhD Scheme of the Ministry of Electronics and Information Technology, Government of India, being implemented by Digital India Corporation.



Author information

Corresponding author

Correspondence to Jyoti Nigam.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Nigam, J., Rameshan, R.M. (2019). TRINet: Tracking and Re-identification Network for Multiple Targets in Egocentric Videos Using LSTMs. In: Vento, M., Percannella, G. (eds.) Computer Analysis of Images and Patterns. CAIP 2019. Lecture Notes in Computer Science, vol. 11679. Springer, Cham. https://doi.org/10.1007/978-3-030-29891-3_38

  • DOI: https://doi.org/10.1007/978-3-030-29891-3_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29890-6

  • Online ISBN: 978-3-030-29891-3

  • eBook Packages: Computer Science, Computer Science (R0)
