research-article

Multi-Target Multi-Camera Tracking with Human Body Part Semantic Features

Authors:

Zunlin Fan

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

Pages 199 - 208

https://doi.org/10.1145/3357384.3358029

Published: 03 November 2019

Abstract

Multi-Target Multi-Camera Tracking (MTMCT) has attracted growing attention. It is a challenging task whose major problems include occlusion, background clutter, and variations in pose and camera viewpoint. Compared with single-camera tracking, which benefits from location information and strict time constraints, MTMCT depends more heavily on good appearance features. This motivates us to extract robust and discriminative features for MTMCT. We propose MTMCT_HS, which uses human body part semantic features to address these challenges. A two-stream deep neural network extracts global appearance features and human body part semantic maps separately, and aggregation operations combine them into the final features. We argue that these features are better suited to affinity measurement, which can then be seen as the average of appearance similarity weighted by the corresponding human body part similarity. Our tracker then adopts a hierarchical correlation clustering algorithm that combines targets' appearance feature similarity with motion correlation for data association. We validate MTMCT_HS by demonstrating its superiority over the state-of-the-art method on the DukeMTMC benchmark. Experiments show that features carrying human body part semantics are more effective for MTMCT than global appearance features alone.
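
The affinity described in the abstract, an average of appearance similarity weighted by body part similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of part-map-weighted average pooling as the aggregation step, cosine similarity as the appearance similarity, and the minimum of two part visibility scores as the part similarity are all assumptions made for illustration.

```python
import math


def part_pooled_features(feature_map, part_maps):
    """Pool a feature map into one descriptor per body part (assumed aggregation).

    feature_map: list of C channels, each a list of N spatial values.
    part_maps:   list of K part maps, each a list of N probabilities in [0, 1].
    Returns (descriptors, presence): K descriptors of length C, plus the mean
    probability of each part, used here as a hypothetical visibility score.
    """
    descriptors, presence = [], []
    for pmap in part_maps:
        mass = sum(pmap)
        presence.append(mass / len(pmap))
        if mass == 0.0:
            # Part not visible: emit a zero descriptor.
            descriptors.append([0.0] * len(feature_map))
            continue
        descriptors.append(
            [sum(p * v for p, v in zip(pmap, channel)) / mass
             for channel in feature_map]
        )
    return descriptors, presence


def cosine(u, v):
    """Cosine similarity, defined as 0 when either vector is all zeros."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def part_weighted_affinity(desc_a, pres_a, desc_b, pres_b):
    """Average per-part appearance similarity, weighted by part similarity.

    The weight of part k is min(presence_a[k], presence_b[k]), an assumed
    stand-in for the body part similarity mentioned in the abstract.
    """
    weights = [min(a, b) for a, b in zip(pres_a, pres_b)]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    sims = [cosine(u, v) for u, v in zip(desc_a, desc_b)]
    return sum(w * s for w, s in zip(weights, sims)) / total


# Toy data: 2 channels, 4 spatial positions, 2 parts (upper and lower body).
feat_a = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
parts_full = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
parts_occluded = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]  # lower body hidden
feat_c = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]  # different upper body

desc_a, pres_a = part_pooled_features(feat_a, parts_full)
desc_b, pres_b = part_pooled_features(feat_a, parts_occluded)
desc_c, pres_c = part_pooled_features(feat_c, parts_full)

same_person = part_weighted_affinity(desc_a, pres_a, desc_a, pres_a)
occluded_view = part_weighted_affinity(desc_a, pres_a, desc_b, pres_b)
different_top = part_weighted_affinity(desc_a, pres_a, desc_c, pres_c)
```

In this toy setup the occluded view still scores as high as the unoccluded one, because the hidden part receives zero weight and drops out of the average. A descriptor that differs in one visible part is penalized in proportion to that part's weight. A single global descriptor would instead mix visible and occluded regions into one vector.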

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

November 2019

3373 pages

ISBN:9781450369763

DOI:10.1145/3357384

General Chairs:
Wenwu Zhu (Tsinghua University, China)
Dacheng Tao (University of Massachusetts, USA)
Xueqi Cheng (Institute of Computing Technology, CAS, China)

Program Chairs:
Peng Cui (Tsinghua University, China)
Elke Rundensteiner (Worcester Polytechnic Institute, USA)
David Carmel (Amazon Research, USA)
Qi He (LinkedIn, USA)
Jeffrey Xu Yu (Chinese University of Hong Kong, China)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIKM '19

CIKM '19: The 28th ACM International Conference on Information and Knowledge Management

November 3 - 7, 2019

Beijing, China

Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
