Abstract
This paper proposes a novel bilateral counting network that produces accurate and robust counts for the single-image object counting task. The network has two main components: the concentrated dilated pyramid module and the dual-context extraction path. The concentrated dilated pyramid module uses a pyramid structure to extract multi-scale features from the image, which addresses the scale variation problem in object counting; it also uses a shortcut concentration to ease back-propagation of the gradient and thereby improve counting performance. The dual-context extraction path obtains context at different levels by convolving and down-sampling the image a different number of times. The two components are integrated to improve the final counting result. Extensive experiments on vehicle counting and crowd counting datasets, including TRANCOS, Mall, Shanghaitech_A and WorldExpo’10, demonstrate the feasibility and effectiveness of the proposed network for object counting.
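The multi-scale idea behind a dilated pyramid can be illustrated with a minimal one-dimensional sketch. The abstract does not give kernel sizes, dilation rates, or the exact wiring of the module, so everything below is an assumption made for illustration: the function names are hypothetical, the dilation rates (1, 2, 3) are arbitrary, and the shortcut is modelled simply as keeping the unmodified input next to the branch outputs. It is not the authors' implementation.

```python
def effective_kernel_size(kernel_size, dilation):
    # A k-tap kernel with dilation d spans d * (k - 1) + 1 input positions.
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation):
    # Zero-padded, same-length 1-D dilated cross-correlation.
    # Assumes an odd kernel length so the kernel has a centre tap.
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + (j - half) * dilation
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def dilated_pyramid(signal, kernel, dilations=(1, 2, 3)):
    # Hypothetical pyramid: parallel branches with different dilation rates,
    # so each branch responds to structure at a different scale.
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    # Assumed shortcut: the input is carried through unchanged as the first
    # channel, alongside the branch outputs.
    return [list(signal)] + branches


# A single impulse shows how far each branch reaches.
signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
features = dilated_pyramid(signal, kernel=[1.0, 1.0, 1.0])
for channel in features:
    print(channel)
```

With a 3-tap kernel, the three branches span 3, 5 and 7 input positions respectively, which is the sense in which parallel dilation rates provide multi-scale features without adding kernel weights.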
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (No. 61379065), the Natural Science Foundation of Hebei Province, China (Nos. F2019203285; 2019203526), a project funded by the China Postdoctoral Science Foundation (No. 2018M631763) and the Yanshan University Doctoral Foundation (BL18010).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Li, H., Zhang, S. & Kong, W. Bilateral counting network for single-image object counting. Vis Comput 36, 1693–1704 (2020). https://doi.org/10.1007/s00371-019-01769-5
DOI: https://doi.org/10.1007/s00371-019-01769-5