World-scale mining of objects and events from community photo collections

Published: 07 July 2008

Abstract

In this paper, we describe an approach for mining images of objects (such as tourist sights) from community photo collections in an unsupervised fashion. Our approach relies on retrieving geotagged photos from community photo-sharing websites using a grid of geospatial tiles. The downloaded photos are clustered into potentially interesting entities through a processing pipeline that combines several modalities, including visual, textual and spatial proximity. The resulting clusters are analyzed and automatically classified into objects and events. Using mining techniques, we then find text labels for these clusters, which are in turn used to assign each cluster to a corresponding Wikipedia article in a fully unsupervised manner. A final verification step uses the contents (including images) of the selected Wikipedia article to verify the cluster-article assignment. We demonstrate this approach on several urban areas, densely covering an area of over 700 square kilometers and mining over 200,000 photos, making it probably the largest experiment of its kind to date.
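
The tile-based retrieval step can be illustrated with a minimal sketch. The abstract does not give the tile size, the coordinate handling, or the query interface, so everything below is an assumption made for illustration: the function name tile_grid, the tile_km parameter, and the flat-earth approximation of roughly 111.32 km per degree of latitude are hypothetical choices, not the authors' method. The sketch only shows how a bounding box might be split into tiles that could each be queried for geotagged photos.

```python
import math
from typing import Iterator, NamedTuple


class Tile(NamedTuple):
    """A geospatial tile given by its bounding coordinates in degrees."""
    south: float
    west: float
    north: float
    east: float


def tile_grid(south: float, west: float, north: float, east: float,
              tile_km: float = 0.2) -> Iterator[Tile]:
    """Yield tiles that cover a bounding box, row by row from the south-west.

    tile_km is a hypothetical edge length; the abstract does not state one.
    """
    if not (south < north and west < east):
        raise ValueError("bounding box must satisfy south < north and west < east")
    if tile_km <= 0:
        raise ValueError("tile_km must be positive")

    # Approximate conversion between kilometers and degrees (assumption:
    # the area is small enough that one longitude scale fits the whole box).
    km_per_deg_lat = 111.32
    mid_lat = (south + north) / 2.0
    km_per_deg_lon = km_per_deg_lat * math.cos(math.radians(mid_lat))

    dlat = tile_km / km_per_deg_lat
    dlon = tile_km / km_per_deg_lon

    rows = math.ceil((north - south) / dlat)
    cols = math.ceil((east - west) / dlon)

    for r in range(rows):
        for c in range(cols):
            s = south + r * dlat
            w = west + c * dlon
            # Clip the last row and column so tiles never leave the box.
            yield Tile(s, w, min(s + dlat, north), min(w + dlon, east))


if __name__ == "__main__":
    # Example bounding box (arbitrary coordinates chosen for illustration).
    tiles = list(tile_grid(47.35, 8.50, 47.40, 8.58, tile_km=1.0))
    print(len(tiles), "tiles; first:", tiles[0])
```

Each tile produced this way could serve as the spatial constraint of one photo query, which keeps individual result sets small; how the original system issues and paginates those queries is not described in the abstract.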

    Published In

    CIVR '08: Proceedings of the 2008 international conference on Content-based image and video retrieval
    July 2008
    674 pages
    ISBN:9781605580708
    DOI:10.1145/1386352

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. database
    2. geo-referenced
    3. image
    4. mining
    5. object recognition
    6. photo collection
    7. retrieval
    8. web

    Qualifiers

    • Research-article

    Conference

    CIVR08
