Abstract
In this paper, we study the problem of recommending where a person should stand in mobile photography, and we propose a learning-based method that distills photographic knowledge from massive collections of social images to improve the robustness and effectiveness of the recommendation. In contrast to existing photography guidance methods, we draw on collaborative web data and learn the distribution of human positions from it. To overcome the challenges of landmark image alignment and relative human position projection, we propose a 3D reconstruction-based method that aligns the background region and the human region in a uniform coordinate system. Finally, we carry out a human position recommendation strategy that is sensitive to the camera view. We collect a dataset of 30,000 photos of ten landmark scenes from Flickr and conduct a group of experiments comparing alternative versions of our method with various baseline methods. Moreover, we develop a mobile phone application that delivers the photographing recommendation in real time. The experimental results show that the proposed framework achieves promising results, demonstrating the robustness and effectiveness of our approach.
Acknowledgements
This work was supported in part by the National Science Foundation of China (No. 61071180 & No. 61133003).
Cite this article
Xu, P., Yao, H., Ji, R. et al. Where should I stand? Learning based human position recommendation for mobile photographing. Multimed Tools Appl 69, 3–29 (2014). https://doi.org/10.1007/s11042-012-1343-2
DOI: https://doi.org/10.1007/s11042-012-1343-2