Elsevier

Knowledge-Based Systems

Volume 86, September 2015, Pages 21-32
One global optimization method in network flow model for multiple object tracking

https://doi.org/10.1016/j.knosys.2015.04.018

Abstract

In this paper, we address the task of automatically tracking a variable number of objects in a scene captured by a monocular, uncalibrated camera. We propose a global optimization method in the network flow model for multiple object tracking. This approach extends recent work that formulates tracking-by-detection as a maximum a posteriori (MAP) data association problem. We redefine the observation likelihood and the affinity between observations to handle long-term occlusions. Moreover, an improved greedy algorithm is designed to solve the min-cost flow problem, markedly reducing the number of ID switches. Furthermore, a linear hypothesis method is proposed to fill the gaps in the trajectories. The experimental results demonstrate that our method is effective and efficient, and that it outperforms state-of-the-art approaches on several benchmark datasets.

Introduction

Multiple object tracking, and pedestrian tracking in particular, is an important topic in computer vision. It is used in many tasks, such as video surveillance and autonomous driving. Unlike single object tracking, which follows only one object along the frame sequence, multiple object tracking must follow all the targets in the camera's field of view and handle many complex situations, e.g., objects entering and leaving the scene, or one object being occluded by another object or by an obstacle. In single object tracking, we can concentrate on how to represent the object to be tracked. In multiple object tracking, however, the focus shifts to the data association problem, i.e., how to find the corresponding observation in previous or subsequent frames. Moreover, a multiple object tracker has to track targets of a certain category automatically, so that when a target emerges in the scene, the tracker starts a new track. If the camera is uncalibrated and monocular, much information is lost when the 3D world is mapped to a 2D image, which gives rise to intricate occlusions. Various methods have been proposed in recent years to address the problem of tracking multiple objects with a monocular, uncalibrated camera. Nevertheless, factors such as complicated backgrounds, crowded scenes and low-quality video can make multiple object tracking extremely challenging.

With the improvement of detectors [1], [2], [3], [4] in recent years, tracking-by-detection [5], [6], [7] has become feasible and popular. In the tracking-by-detection framework, it is no longer necessary to observe the targets during the tracking procedure, and we can concentrate on the data association problem, because highly reliable object detections are provided by the detector as the input observations. That is, we only need to connect the target hypotheses generated by the detector according to their similarity across frames. Several models based on graph theory have been introduced to model the tracking problem, such as the Maximum Weight Independent Set (MWIS) model [6], the Generalized Minimum Clique Graphs (GMCP) model [8] and the network flow model [9], [10], [11]. In practice, the detection results may contain many false positives and missing detections. The missing detections are mainly caused by occlusions. Some state-of-the-art detectors [12], [13] perform well under partial occlusion, but they fail under full occlusion. Hence, a tracking-by-detection tracker must be able to eliminate the false positives and fill in the missing detections (see Fig. 1).

Tracking in the network flow model is a global optimization procedure. In this framework, the object responses, i.e., the nodes, compose a large and intricate network graph. An arc connecting two nodes means that they may represent the same target, and the cost of an arc (edge) in the graph indicates the agreement between the two detection observations. Pirsiavash et al. [5] formulated data association as a MAP problem and solved it as a min-cost flow problem. Rather than directly applying algorithms that are proven optimal, such as the push-relabel algorithm [14] and successive shortest paths [15], they proposed a greedy method based on finding the shortest path with a dynamic programming algorithm. Their approach returns all the trajectories in an extremely short time, giving not the globally optimal solution but a high-quality approximate one. Our work is mainly inspired by the min-cost flow network described in [5]. We find that, in our cost flow network model, this greedy algorithm no longer yields high-quality solutions. We aim to reformulate the similarity evaluation so that the network flow model can cope with long-term occlusions, and to improve Pirsiavash's approximate greedy method so that it is not only efficient but also precise. The main contributions of this paper include:

  • A new integrated observation model for evaluating the affinity between two detection observations. This integrated observation model is not only more robust than previous models, but can also handle occlusions.

  • An improved approximate greedy algorithm. Pirsiavash's approximate greedy algorithm may cause many ID switches; the improved algorithm removes a large share of these ID switches, and its running time is much lower than that of the optimal algorithms.

  • A simple and effective linear hypothesis method for reducing the number of false negatives. After the optimization, many gaps caused by occlusion remain in the trajectories, even though the detections have been connected into trajectories. The linear hypothesis mechanism can fill most of these gaps accurately.
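To make the gap-filling idea in the last contribution concrete, the following is a minimal sketch that fills the missing frames of one trajectory by plain linear interpolation between the two detections that bracket each gap. The details of the paper's linear hypothesis method are not given in this excerpt, so the interpolation rule, the `Box` layout and the function name `fill_gaps_linear` are illustrative assumptions rather than the authors' implementation.

```python
from typing import Dict, Tuple

# Hypothetical box layout for illustration: (x, y, width, height).
Box = Tuple[float, float, float, float]


def fill_gaps_linear(track: Dict[int, Box]) -> Dict[int, Box]:
    """Return a copy of `track` in which every missing frame between two
    observed frames is filled by linear interpolation of the boxes.

    `track` maps a frame index to the detection assigned to this trajectory.
    Assumption: the target moves roughly linearly during the occlusion.
    """
    frames = sorted(track)
    filled = dict(track)
    # Walk over consecutive observed frames; a gap exists when f1 - f0 > 1.
    for f0, f1 in zip(frames, frames[1:]):
        span = f1 - f0
        for f in range(f0 + 1, f1):
            t = (f - f0) / span  # interpolation weight in (0, 1)
            filled[f] = tuple(
                (1 - t) * a + t * b for a, b in zip(track[f0], track[f1])
            )
    return filled


if __name__ == "__main__":
    # A trajectory observed at frames 1 and 5, occluded in frames 2 to 4.
    observed = {1: (0.0, 0.0, 10.0, 20.0), 5: (40.0, 8.0, 10.0, 20.0)}
    for frame, box in sorted(fill_gaps_linear(observed).items()):
        print(frame, box)
```

A sketch like this only reduces false negatives when the two bracketing detections really belong to the same target, so it depends on the data association step having linked them correctly.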

In the rest of this paper, we briefly discuss the related work in Section 2 and describe the construction of network flow model in Section 3. A novel optimization algorithm is proposed in Section 4 and a detailed experimental evaluation of the presented method is given in Section 5. Finally, a conclusion is drawn in Section 6.

Related work

Considerable progress has been made in multiple target tracking since the radar tracking work [16] of some twenty years ago. Early approaches follow objects with a local strategy [17], [18]. That is, they solve the data association frame by frame and object by object. Once an object is found in one frame, the tracker keeps looking for it in the next frame based on its state estimate in one or more previous frames. In this framework, the Kalman filter [19] and the particle filter [20], which is known as a

Network flow model

Following [22], we model the tracking problem as a MAP estimation problem. An objective function is then derived from this MAP problem by representing the data association as a Hidden Markov Model (HMM), and the problem is cast as a min-cost flow problem [5], [15], [22].
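For orientation, the min-cost flow objective commonly used in the network flow tracking literature can be written as below. This is a sketch of the standard form, not the redefined costs of this paper, which are not shown in this excerpt. The symbols are assumptions introduced for illustration: f_i, f_ij, f_i^s and f_i^t are binary flow variables indicating that detection i is used, that detections i and j are linked, and that a trajectory starts or ends at i; c_i, c_ij, c_i^s and c_i^t are the corresponding costs, typically negative log-probabilities obtained from the MAP formulation; E is the set of candidate links.

```latex
\begin{aligned}
f^{*} &= \arg\min_{f}\; \sum_{i} c_i^{s} f_i^{s}
        + \sum_{(i,j)\in E} c_{ij} f_{ij}
        + \sum_{i} c_i f_i
        + \sum_{i} c_i^{t} f_i^{t} \\
\text{s.t.}\quad & f_i^{s},\; f_{ij},\; f_i,\; f_i^{t} \in \{0, 1\}, \\
& f_i^{s} + \sum_{j} f_{ji} \;=\; f_i \;=\; f_i^{t} + \sum_{j} f_{ij}
  \qquad \forall i .
\end{aligned}
```

The equality constraint is flow conservation: a detection that is used must be entered exactly once (from the source or from an earlier detection) and left exactly once (to the sink or to a later detection), so each unit of flow from source to sink traces one trajectory.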

Approximate greedy programming algorithm

Pirsiavash et al. [5] proposed an approximate greedy algorithm based on dynamic programming (DP) to find min-cost flows. Compared with the classic push-relabel method [14], with computational complexity O(n^2 m log n), or the Successive Shortest-Paths (SSP) algorithm [15], with computational complexity O(Kn log n), this greedy algorithm reduces the computational complexity to O(Kn), where K is the number of trajectories found, n is the number of nodes and m is the number of edges in the network graph.
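The greedy DP idea described above can be sketched as follows: repeatedly find the cheapest source-to-sink path by dynamic programming over time-ordered nodes, accept it as a trajectory if its total cost is negative, remove its nodes, and repeat. This is a simplified illustration of the baseline greedy scheme, not the improved algorithm proposed in this paper. The function name, the cost arguments and the assumption that nodes are indexed in temporal order are hypothetical choices made for the example. Because the sketch recomputes the full DP table on every pass, each pass costs O(n + m), which is close to the O(Kn) total quoted above only when every node has a bounded number of incoming edges.

```python
from typing import Dict, List, Tuple


def greedy_dp_tracks(
    node_cost: List[float],
    edges: Dict[int, List[Tuple[int, float]]],
    entry_cost: float,
    exit_cost: float,
) -> List[List[int]]:
    """Greedily extract trajectories as successive cheapest paths.

    node_cost[i]  : cost of using detection i (negative = confident detection)
    edges[i]      : list of (j, transition_cost) links with i < j
    entry_cost    : cost of starting a trajectory at any node
    exit_cost     : cost of ending a trajectory at any node
    Assumption: node indices follow temporal order, so the graph is a DAG
    and a single left-to-right sweep is a valid DP order.
    """
    n = len(node_cost)
    incoming: List[List[Tuple[int, float]]] = [[] for _ in range(n)]
    for i, outs in edges.items():
        for j, cost in outs:
            if not i < j:
                raise ValueError("edges must point forward in time")
            incoming[j].append((i, cost))

    used = [False] * n
    tracks: List[List[int]] = []
    while True:
        best = [float("inf")] * n  # cheapest path cost ending at each node
        prev = [-1] * n            # predecessor on that path (-1 = source)
        best_end, best_total = -1, 0.0
        for j in range(n):
            if used[j]:
                continue
            cost, pred = entry_cost, -1  # option 1: start a new track here
            for i, c in incoming[j]:     # option 2: extend a track from i
                if not used[i] and best[i] + c < cost:
                    cost, pred = best[i] + c, i
            best[j], prev[j] = cost + node_cost[j], pred
            total = best[j] + exit_cost
            if total < best_total:
                best_end, best_total = j, total
        if best_end < 0:  # no remaining path lowers the objective
            return tracks
        track = []
        j = best_end
        while j != -1:    # backtrack and remove the nodes from the graph
            track.append(j)
            used[j] = True
            j = prev[j]
        tracks.append(track[::-1])


if __name__ == "__main__":
    # Two frames with two confident detections each, plus one weak detection.
    costs = [-5.0, -5.0, -5.0, -5.0, 3.0]
    links = {0: [(2, 1.0), (3, 4.0)], 1: [(2, 4.0), (3, 1.0)]}
    print(greedy_dp_tracks(costs, links, entry_cost=2.0, exit_cost=2.0))
```

Removing the nodes of each accepted path, instead of adding residual edges as SSP does, is what makes the scheme fast but approximate: an early path can never be revised, which is one way ID switches can arise.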

Datasets

In this section, we show the performance of our method on two popular public datasets: PETS2009 [26] and the TUD datasets [27]. From PETS2009, we select PETS2009-S2L1-view001 as the test sequence. The TUD datasets contain three frame sequences: TUD crossing (201 frames), TUD campus (71 frames) and TUD stadtmitte (179 frames). Both datasets are very challenging and have distinct characteristics. The PETS2009 sequences have a low frame rate (7 fps) and high definition. The camera looks down the

Conclusion

In this paper, we have presented an integrated observation model for measuring the similarity between two observations and an improved greedy algorithm for solving the min-cost flow problem in the typical network flow model. Our integrated observation model can precisely describe the characteristics of the object observations produced by detectors and formulate the affinities between them. Hence, even if a person is occluded for tens of frames, once he/she is detected again, our tracker can find out

Acknowledgements

This study was supported by the Innovation Project of Scholars from Overseas of Shenzhen (KQCX20120801104656658) and the Technology Innovation Project of Shenzhen (Nos. CXZZ20120618155717337, CXZZ20130318162826126). This research was also supported in part by Shenzhen IOT key technology and application systems integration engineering laboratory. The authors would like to thank the anonymous reviewers for their constructive comments and suggestions.

References (30)

  • A. Goldberg, An efficient implementation of a scaling minimum-cost flow algorithm, J. Algor. (1997)

  • L. Zhang et al., Global data association for multi-object tracking using network flows

  • N. Dalal, Finding People in Images and Videos, Ph.D. thesis, Institut National Polytechnique de Grenoble/INRIA...

  • P.F. Felzenszwalb et al., Object detection with discriminatively trained part-based models, IEEE Trans. Patt. Anal. Mach. Intell. (2010)

  • B. Leibe et al., Robust object detection with interleaved categorization and segmentation, Int. J. Comp. Vis. (2008)

  • J. Marin et al., Random forests of local experts for pedestrian detection

  • H. Pirsiavash et al., Globally-optimal greedy algorithms for tracking a variable number of objects

  • W. Brendel et al., Multiobject tracking as maximum weight independent set

  • M. Breitenstein et al., Robust tracking-by-detection using a detector confidence particle filter

  • A. Zamir, A. Dehghan, M. Shah, Gmcp-tracker: global multi-object tracking using generalized minimum clique graphs, in:...

  • J. Berclaz et al., Multiple object tracking using k-shortest paths optimization, IEEE Trans. Patt. Anal. Mach. Intell. (2011)

  • H. Jiang et al., A linear programming approach for multiple object tracking

  • A. Andriyenko, K. Schindler, Globally optimal multi-target tracking on a hexagonal lattice, in: Computer Vision–ECCV...

  • J. Marín et al., Occlusion handling via random subspace classifiers for human detection, IEEE Trans. Cybernet. (2014)

  • M. Enzweiler et al., Multi-cue pedestrian classification with partial occlusion handling
