Abstract
Inside parcel distribution hubs, several tens of the up to 100,000 parcels processed each day get lost. Human operators have to recover these parcels by tediously searching through large amounts of video footage from the installed large-scale camera network. We want to assist these operators and work towards an automatic solution. The challenge lies both in the size of the hub, with its high number of cameras, and in the adverse conditions. We describe and evaluate an industry-scale tracking framework based on state-of-the-art methods such as Mask R-CNN. Moreover, we adapt a Siamese-network-inspired feature vector matching with a novel feature improver network, which increases tracking performance. Our calibration method exploits a calibration parcel and is suitable for both overlapping and non-overlapping camera views. It requires little manual effort and needs only a single drive-by of the calibration parcel for each conveyor belt. With these methods, most parcels can be tracked from start to end.
Acknowledgments
This work was supported by the Central Innovation Programme for SMEs of the Federal Ministry for Economic Affairs and Energy of Germany under grant agreement number 16KN044302.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary material 1 (mp4 6877 KB)
Supplementary material 2 (mp4 11745 KB)
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Clausen, S., Zelenka, C., Schwede, T., Koch, R. (2019). Parcel Tracking by Detection in Large Camera Networks. In: Brox, T., Bruhn, A., Fritz, M. (eds) Pattern Recognition. GCPR 2018. Lecture Notes in Computer Science, vol 11269. Springer, Cham. https://doi.org/10.1007/978-3-030-12939-2_7
DOI: https://doi.org/10.1007/978-3-030-12939-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12938-5
Online ISBN: 978-3-030-12939-2
eBook Packages: Computer Science, Computer Science (R0)