Neurocomputing

Volume 177, 12 February 2016, Pages 529-542

Photo sundial: Estimating the time of capture in consumer photos

https://doi.org/10.1016/j.neucom.2015.11.050

Abstract

The time of capture of consumer photos provides rich temporal context and has been widely employed for solving various multimedia problems, such as multimedia retrieval and social media analysis. However, we observed that the recorded time stamp in a consumer photo often does not correspond to the true local time at which the photo was taken. This can greatly damage the robustness of time-aware multimedia applications, such as travel route recommendation. Therefore, motivated by the use of traditional sundials, this work proposes a system, Photo Sundial, for estimating the time of capture by exploiting astronomical theory. In particular, we infer the time by establishing its relations to two measurable astronomical factors in a given outdoor photo, i.e. the sun position in the sky and the camera viewing direction at the capture location. In practice, since people often take multiple photos in a single trip, we further develop an optimization framework to jointly estimate the time from multiple photos. Experimental results show that the average estimated time error of the proposed approach is less than 0.9 h, a significant 65% relative improvement over the state-of-the-art method (2.5 h). To the best of our knowledge, this work is the first study in multimedia research to explicitly address the problem of time of capture estimation in consumer photos, and the achieved performance strongly supports the use of our system in practical applications.

Introduction

Most modern digital cameras can automatically add a time stamp to the photos that users take. The camera stores this information in the digital photo itself, in the exchangeable image file (EXIF) format. Since the time information provides rich contextual cues related to capture conditions independent of the captured scene contents, it is widely employed in multimedia research to benefit the understanding of content semantics and the creation of various time-aware applications, such as time-based photo clustering [1], semantic scene classification [2], [3], automatic image annotation [4], travel route recommendation [5], [6], [7], and virtual navigation in photos [8], [9].
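
As background, the EXIF time stamp referred to above can be read programmatically. The following is a minimal sketch (not part of the paper's pipeline) using the Pillow library's legacy _getexif() accessor for JPEG files; the file name is a placeholder.

```python
# Minimal sketch: read the EXIF capture time of a JPEG photo with Pillow.
from PIL import Image
from PIL.ExifTags import TAGS

def read_capture_time(path):
    """Return the recorded capture time string from EXIF, or None if absent."""
    exif = Image.open(path)._getexif() or {}          # flattened tag-id -> value dict (JPEG)
    named = {TAGS.get(k, k): v for k, v in exif.items()}
    # DateTimeOriginal is the shutter time; DateTime is the last modification time.
    return named.get("DateTimeOriginal") or named.get("DateTime")

print(read_capture_time("holiday_photo.jpg"))         # e.g. '2015:07:21 14:32:05'
```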

In the literature [1], [5], [6], a time stamp is commonly expected to be the local time at which the photo was taken in a geographical location. However, this is often not a valid assumption, especially for consumer photos. For instance, Fig. 1(a) shows three examples downloaded from Flickr. The captured scene contents are obviously inconsistent with the associated time stamps. Further, Fig. 1(b) gives the time stamp distributions of two large sets of GPS-geotagged photos that we randomly collected from Flickr using the GPS coordinates of two Sydney attractions, i.e. Bondi Beach and the Royal Botanic Gardens. Peculiar phenomena caused by the problematic time stamps can also be observed. For example, according to the time distribution, many photos appear to have been taken in the Royal Botanic Gardens at midnight, when the gardens are closed. Moreover, it is unusual that more photos were taken between 2 AM and 5 AM than between 2 PM and 5 PM in the afternoon.
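
The hour-of-day distributions in Fig. 1(b) can be reproduced with a few lines of code. The sketch below is our own illustration (not the authors' code); it simply bins EXIF-style time stamps by hour, so a spike in the small hours for a daytime-only attraction hints at corrupted stamps.

```python
# Sketch: hour-of-day histogram of EXIF-style time stamps ('YYYY:MM:DD HH:MM:SS').
from collections import Counter
from datetime import datetime

def hour_histogram(timestamps):
    """Return a 24-element list with the number of photos per hour of day."""
    hours = Counter()
    for ts in timestamps:
        try:
            hours[datetime.strptime(ts, "%Y:%m:%d %H:%M:%S").hour] += 1
        except ValueError:
            continue                                  # skip malformed stamps
    return [hours.get(h, 0) for h in range(24)]

print(hour_histogram(["2014:06:01 03:12:45", "2014:06:01 15:40:02"]))
```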

In general, problematic time stamps arise from two main cases: (1) the camera time is set incorrectly; (2) some camera models reset the time to the default setting once the battery pack is removed. Besides, the widespread use of social networking websites brings additional issues. For example, many modern websites allow visitors to upload photographs, and the metadata of these photos is commonly stripped during the uploading process. Moreover, in the past few years, there has been a considerable rise in the availability of digital image editing software as well as photo editing mobile apps, and the metadata of photos processed by such software is often overwritten. As a result, the correctness of photograph metadata is often in doubt.

Therefore, the development of effective techniques for recovering the correct time of capture is desirable not only to help efficiently organize people's personal photos but also to ensure the robustness of multimedia applications that significantly depend on time information. For example, travel route recommendation is an active topic of multimedia research [5], [6], [7]. It often utilizes social media sites to collect time-labelled images and identify "the best time of day" for visiting travel spots. Unreliable time stamps would greatly degrade the recommendation quality. Similarly, advanced time-sensitive services, such as the "Any Time" feature in Google's image search, would suffer if images were indexed by incorrect time stamps.

In this paper, to address the above problem, we propose a system, Photo Sundial, to compute the time information of outdoor photos. Motivated by the idea of the traditional sundial [10], we first measure two astronomical factors from a given photo, i.e. the sun position in the sky and the camera viewing direction. According to astronomical theory [11], [12], we then infer an initial estimated time by formulating the mathematical relation between the involved astronomical factors. Considering that people tend to take multiple photos in a single trip (possibly covering a number of different locations and even indoor environments) [5], [6], [13], we further use this fact to optimize the initial estimated times of capture by the proposed joint alignment over each set of multiple photos. To validate the effectiveness of our system, in the experiments, we collected a dataset of 2102 consumer photos annotated with the ground truth of the time of capture and GPS coordinates. The experimental results show that the average estimated time error of our approach is less than 0.9 h, a significant 65% relative improvement over the state-of-the-art method (2.5 h).
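
As a rough illustration of why multiple photos help, suppose the camera clock within a trip is wrong by a single constant offset while the relative spacing between shots is preserved; the per-photo astronomical estimates and the recorded stamps can then be reconciled by a least-squares offset. The toy sketch below reflects only this simplified reading and is not the actual optimization framework proposed in Section 4.

```python
# Toy sketch: reconcile recorded stamps s_i with per-photo astronomical estimates t_i
# by a single clock offset d minimizing sum_i (s_i + d - t_i)^2 (closed form: mean residual).
def joint_offset(recorded_hours, estimated_hours):
    """Both arguments are equal-length lists of times in hours."""
    residuals = [t - s for s, t in zip(recorded_hours, estimated_hours)]
    d = sum(residuals) / len(residuals)               # least-squares clock offset
    return [s + d for s in recorded_hours]            # corrected capture times

print(joint_offset([9.0, 9.5, 11.0], [13.2, 13.4, 15.1]))   # ~[13.07, 13.57, 15.07]
```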

Our main contributions are threefold: (1) To the best of our knowledge, this work is the first study in multimedia research to explicitly address the problem of time of capture estimation by exploiting the temporal relationship among multiple photos, and the achieved performance strongly supports the use of our system in practical applications. (2) An approximation method is proposed to effectively calculate the initial estimated time of capture when the measured sun position is inaccurate. Also, general algorithms for determining the camera viewing direction are developed to help obtain the true position of the sun. (3) The collected dataset is, to our knowledge, the first to provide reliable contextual information about photo capture conditions, which is lacking in existing image datasets.

In the rest of this paper, Sections 2 and 3 review the related work in the literature and give the prior knowledge of astronomical theory, respectively. The main framework of the proposed approach for estimating the time of capture is presented in Section 4. Section 5 shows the experimental results and discussions. Finally, Section 6 concludes our work and gives directions for future research.

Section snippets

Related work

Since the literature lacks studies on estimating the time of capture of consumer photos and videos, we briefly summarize relevant research on a broader range of aspects, including the measurement of time-related factors, camera pose determination, and multimedia applications based on geo-temporal contexts, as follows.

Prior knowledge

In astronomy [11], [12], a celestial coordinate system is a metric geometry (a geometry in which distance can be measured) for mapping positions on the celestial sphere. As shown in Fig. 2(a), the celestial coordinate system uses two parameters of the spherical coordinate system, i.e. the azimuth angle ϕs and the elevation angle θs, to express the position of a point of interest, such as the sun or a star. The azimuth angle is an angle between a reference vector and the vector from an observer to
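
For concreteness, the standard spherical-astronomy relations [11], [12] that link the sun's elevation and azimuth to the observer's latitude, the solar declination, and the hour angle can be written down directly. The sketch below uses only these textbook formulas and is illustrative rather than the paper's implementation.

```python
# Textbook solar geometry: elevation and azimuth (from north, clockwise) of the sun,
# given the observer latitude, the solar declination, and the hour angle (all in degrees).
import math

def sun_position(lat_deg, decl_deg, hour_angle_deg):
    lat, decl, H = map(math.radians, (lat_deg, decl_deg, hour_angle_deg))
    # sin(elevation) = sin(lat)sin(decl) + cos(lat)cos(decl)cos(H)
    elev = math.asin(math.sin(lat) * math.sin(decl) +
                     math.cos(lat) * math.cos(decl) * math.cos(H))
    cos_az = (math.sin(decl) - math.sin(elev) * math.sin(lat)) / (math.cos(elev) * math.cos(lat))
    az = math.acos(max(-1.0, min(1.0, cos_az)))
    if H > 0:                                         # afternoon: sun west of the meridian
        az = 2 * math.pi - az
    return math.degrees(elev), math.degrees(az)

# Sydney latitude, austral-summer declination, two hours after solar noon (H = 30 degrees)
print(sun_position(-33.87, -23.0, 30.0))              # roughly (61.6, 284.6)
```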

The proposed method

An overview of the proposed method is shown in Fig. 4. Let P = (P1, …, PN) denote the sequence of N given outdoor photos Pi, i ∈ {1, …, N}. Since the sky portions are exploited for estimating the sun position, each Pi is first segmented into sky and non-sky regions by using a pixel-wise labeling algorithm [24]. Next, given the segmented sky regions, we calculate the position of the sun ϕc by a machine learning based sky modeling method [14]. Further, the photo's initial estimated time of capture (i.e. T
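
Conceptually, the initial per-photo estimate amounts to inverting the elevation relation of Section 3 for the hour angle and converting it into a local solar time, with the measured azimuth disambiguating the morning and afternoon solutions. The sketch below shows this inversion under simplifying assumptions: it ignores the longitude and equation-of-time corrections and assumes an accurate sun measurement, whereas the paper additionally proposes an approximation method for inaccurate sun positions.

```python
# Sketch: invert the elevation relation for the hour angle H and map it to local solar time.
import math

def time_from_sun(lat_deg, decl_deg, elev_deg, azimuth_deg):
    lat, decl, elev = map(math.radians, (lat_deg, decl_deg, elev_deg))
    cos_H = (math.sin(elev) - math.sin(lat) * math.sin(decl)) / (math.cos(lat) * math.cos(decl))
    H = math.degrees(math.acos(max(-1.0, min(1.0, cos_H))))   # hour-angle magnitude in degrees
    if azimuth_deg < 180:                                     # sun east of the meridian: morning
        H = -H
    return 12.0 + H / 15.0                                    # 15 degrees of hour angle per hour

print(time_from_sun(-33.87, -23.0, 61.6, 284.6))              # close to 14.0 (solar time)
```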

The dataset

In the experiments, a testing photo is required to have the ground truth of the local time of capture and GPS coordinates. All of this information is easy to obtain using a modern smartphone, but it is difficult to find an existing large photo source (e.g. image databases and social media sites) that provides all of it at once. In particular, it is extremely difficult to verify whether a recorded time stamp is trustworthy or not, as discussed in Section 1. Also, for the task of joint alignment from

Conclusions and future work

In this work, we proposed a novel system, Photo Sundial, for estimating the time of capture of outdoor photos by exploiting astronomical theory, motivated by the use of traditional sundials. The proposed system was validated on a large dataset of 2102 consumer photos and achieved competitive performance. Our joint alignment approach can effectively reduce the average estimated time error to less than 0.9 h, with a significant 65% relative improvement compared to the

Acknowledgements

This work was supported by the Ministry of Science and Technology of Taiwan under Grants MOST-103-2221-E-001-007-MY2 and MOST-103-2221-E-011-105. The authors would like to extend their sincere appreciation to the Deanship of Scientific Research at King Saud University for its funding of this International Research Group (IRG14-28).

References (31)

  • M. Boutell et al., Beyond pixels: exploiting camera metadata for photo classification, Pattern Recognit. (2005)
  • I. Reda et al., Solar position algorithm for solar radiation applications, Solar Energy (2004)
  • M. Cooper et al., Temporal event clustering for digital photo collections, ACM Trans. Multimed. Comput. Commun. Appl. (2005)
  • J. Yuan et al., Mining compositional features from GPS and visual cues for event recognition in photo collections, IEEE Trans. Multimed. (2010)
  • L. Cao, J. Luo, T.S. Huang, Annotating photo collections by label propagation according to multiple similarity cues, ...
  • Y. Arase, X. Xie, T. Hara, S. Nishio, Mining people's trips from large scale geo-tagged photos, in: Proceedings of the ...
  • X. Lu, C. Wang, J.-M. Yang, Y. Pang, L. Zhang, Photo2trip: generating travel routes from geo-tagged photos for trip ...
  • C.-Y. Fu, M.-C. Hu, J.-H. Lai, H. Wang, J.-L. Wu, Travelbuddy: interactive travel route recommendation with a visual ...
  • C.-C. Hsieh, W.-H. Cheng, C.-H. Chang, Y.-Y. Chuang, J.-L. Wu, Photo navigator, in: Proceedings of the ACM ...
  • N. Snavely et al., Finding paths through the world's photos, ACM Trans. Graph. (2008)
  • R.N. Mayall et al., Sundials: Their Construction and Use (2000)
  • F.H. Shu, The Physical Universe: An Introduction to Astronomy (1982)
  • H. Karttunen et al., Fundamental Astronomy (2007)
  • A.-J. Cheng, Y.-Y. Chen, Y.-T. Huang, W.H. Hsu, H.-Y. M. Liao, Personalized travel recommendation by mining people ...
  • J.-F. Lalonde, A.A. Efros, S.G. Narasimhan, Estimating natural illumination from a single outdoor image, in: ...

    Tsung-Hung Tsai received the B.S. degree from the Department of Computer Science and Information Engineering, National Central University, Taiwan, in 2007, and the M.S. degree in electronics engineering from National Taiwan University in 2009. During 2010–2012, he was a Research Assistant with the Multimedia Computing Laboratory (MCLab), Research Center for Information Technology Innovation (CITI), Academia Sinica, Taipei, Taiwan. His research interests include multimedia content analysis, pattern recognition and computer vision.

    Wei-Cih Jhou received the B.S. and M.S. degrees in computer science and information engineering from National Taiwan University, Taipei, Taiwan, in 2009 and 2011. She is currently working as a Research Assistant in the Multimedia Computing Laboratory at the Research Center for Information Technology Innovation (CITI), Academia Sinica, Taipei, Taiwan. Her current research interests include image rendering, image and video processing, deep learning and multimedia content analysis.

    Wen-Huang Cheng received the B.S. and M.S. degrees in computer science and information engineering from National Taiwan University, Taipei, Taiwan, in 2002 and 2004, respectively, where he received the Ph.D. (Hons.) degree from the Graduate Institute of Networking and Multimedia in 2008. He is an Associate Research Fellow with the Research Center for Information Technology Innovation (CITI), Academia Sinica, Taipei, Taiwan, where he is the Founding Leader of the Multimedia Computing Laboratory (MCLab), CITI, and an Assistant Research Fellow with a joint appointment in the Institute of Information Science. Before joining Academia Sinica, he was a Principal Researcher with MagicLabs, HTC Corporation, Taoyuan, Taiwan, from 2009 to 2010. His current research interests include multimedia content analysis, multimedia big data, deep learning, computer vision, mobile multimedia computing, social media, and human computer interaction. He has received numerous research awards, including the Outstanding Youth Electrical Engineer Award from the Chinese Institute of Electrical Engineering in 2015, the Top 10% Paper Award from the 2015 IEEE International Workshop on Multimedia Signal Processing, the Outstanding Reviewer Award from the 2015 ACM International Conference on Internet Multimedia Computing and Service, the Prize Award of Multimedia Grand Challenge from the 2014 ACM Multimedia Conference, the K. T. Li Young Researcher Award from the ACM Taipei/Taiwan Chapter in 2014, the Outstanding Young Scholar Awards from the Ministry of Science and Technology in 2014 and 2012, the Outstanding Social Youth of Taipei Municipal in 2014, the Best Reviewer Award from the 2013 Pacific-Rim Conference on Multimedia, and the Best Poster Paper Award from the 2012 International Conference on 3D Systems and Applications. Post-doctoral fellows under his supervision received the Academia Sinica Postdoctoral Fellowship in 2011 and 2013.

    Min-Chun Hu is also known as Min-Chun Tien and Ming-Chun Tien. She is an Assistant Professor with the Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan. She received the B.S. and M.S. degrees in computer science and information engineering from National Chiao-Tung University, Hsinchu, Taiwan, in 2004 and 2006, respectively, and the Ph.D. degree from the Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, Taiwan, in 2011. She was a Post-Doctoral Research Fellow with the Research Center for Information Technology Innovation, Academia Sinica, from 2011 to 2012. Her research interests include digital signal processing, digital content analysis, pattern recognition, computer vision, and multimedia information system.

    I-Chao Shen received the B.B.A. and M.B.A. degrees in information management from National Taiwan University, Taipei City, Taiwan, in 2009 and 2011, respectively, and he is currently working toward the Ph.D. degree in computer science at the University of British Columbia, Vancouver, BC, Canada. His research interests include visual data analysis, geometry processing, and digital fabrication.

    Tekoing Lim received the B.S. and M.S. degrees in Mathematics from the University of Lille 1, France, in 2005 and 2007, respectively, and the Ph.D. degree in Physics from École Polytechnique in 2011 for his research at CEA (Commissariat à l'Énergie Atomique, France) on computational electromagnetics. He is currently working on multimedia content analysis at Academia Sinica, Taiwan. His research interests include machine learning, computer vision and artificial intelligence.

    Kai-Lung Hua received the B.S. degree in electrical engineering from National Tsing Hua University in 2000, and the M.S. degree in communication engineering from National Chiao Tung University in 2002, both in Hsinchu, Taiwan. He received the Ph.D. degree from the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, in 2010. Since 2010, Dr. Hua has been with National Taiwan University of Science and Technology, where he is currently an Associate Professor in the Department of Computer Science and Information Engineering. He is a member of Eta Kappa Nu and Phi Tau Phi, as well as a recipient of the MediaTek Doctoral Fellowship. His current research interests include digital image and video processing, computer vision, and multimedia networking.

    Ahmed Ghoneim received his M.Sc. degree in Software Modeling from the University of Menoufia, Egypt, and the Ph.D. degree from the University of Magdeburg, Germany, in the area of software engineering, in 1999 and 2007, respectively. He is currently an Assistant Professor at the Department of Software Engineering, King Saud University. His research activities address software evolution, service-oriented engineering, software development methodologies, quality of service, net-centric computing, and human-computer interaction (HCI).

    M. Anwar Hossain is an Associate Professor in the Department of Software Engineering, College of Computer and Information Sciences (CCIS) at King Saud University (KSU), Riyadh. He completed his master's and Ph.D. degrees in Electrical and Computer Engineering at the University of Ottawa, Canada, where he was associated with the Multimedia Computing Research Laboratories. At KSU, Dr. Hossain received the IBM Faculty Award. His current research interests include multimedia cloud, multimedia surveillance and privacy, Internet of Things, smart cities and ambient intelligence. He has authored/co-authored over 90 research articles, including journal papers, conference papers, and book chapters. Dr. Hossain has co-organized more than ten IEEE/ACM workshops, including IEEE ICME AAMS-PS 2011-13, IEEE ICME AMUSE 2014, ACM MM EMASC-2014, and IEEE ISM CMAS-CITY2015. He is also involved as a TPC member in several other conferences. He served as a guest editor of the Springer Multimedia Tools and Applications journal, and currently serves as a guest editor of another MTAP issue and of the International Journal of Distributed Sensor Networks. He has secured several grants for research and innovation totaling more than $5 million. He is currently supervising a number of research students at KSU. He is a member of IEEE, ACM and ACM SIGMM. He is also the co-editor of SIG MM Records.

    Shintami C. Hidayati received the B.S. degree in informatics from Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia, in 2009, and the M.S. degree in computer science and information engineering from National Taiwan University of Science and Technology, Taipei, Taiwan, in 2012, where she is currently working towards her Ph.D. degree. Prior to joining the master's program, she worked at Institut Teknologi Sepuluh Nopember as a research staff member. Her research interests include machine learning and data mining and their applications to multimedia analysis, information retrieval, and computer vision.
