Automatic construction of POI address lists at city streets from geo-tagged photos and web data: a case study of San Jose City

Multimedia Tools and Applications

Abstract

Points of Interest (POIs) are crucial data sources for location-based applications. Social media and the traditional web can carry up-to-date information about POIs as they emerge in the real world and are shared by users and organizations, and both have been demonstrated to be potential data sources for enriching existing POI databases and online map services. A POI is commonly associated with a street for navigation, access, indexing and search, and this association is recorded in the POI address. This paper proposes a novel approach for automatically constructing POI address lists for the streets of a city from geo-tagged social media photos and web data. The proposed method can yield POI addresses that are missing from Google Maps, OpenStreetMap or Wikimapia, so it can potentially be applied to enrich POI data and enhance online digital map services. In our approach, we first specify the relation between a POI name discovered from geo-tagged photos and its related streets and candidate addresses; we then use this relation to mine the POI address from web snippets returned by a search engine. We present a case study of San Jose City, California, USA. The analysis results demonstrate the effectiveness of the proposed method, which offers a promising solution for automatically constructing POI address lists at city streets from geo-tagged social media photos and web data.
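The last step described in the abstract, mining a POI address from web snippets given a POI name and its related streets, might be sketched roughly as follows. This is only an illustration under stated assumptions and not the paper's algorithm: the function names `build_pattern` and `mine_address`, the simple "house number + street name" pattern, the majority vote over matches, and all sample data (the POI, the streets, the snippets) are invented for this example.

```python
import re
from collections import Counter


def build_pattern(street):
    """Hypothetical pattern: a 1-5 digit house number followed by the street name."""
    return re.compile(r"\b(\d{1,5})\s+" + re.escape(street) + r"\b", re.IGNORECASE)


def mine_address(poi_name, streets, snippets):
    """Return the most frequent '<number> <street>' among snippets mentioning the POI.

    Assumption: a simple frequency vote stands in for whatever ranking
    the actual method applies to candidate addresses.
    """
    votes = Counter()
    for snippet in snippets:
        # Ignore snippets that do not mention the POI name at all.
        if poi_name.lower() not in snippet.lower():
            continue
        for street in streets:
            for match in build_pattern(street).finditer(snippet):
                votes[f"{match.group(1)} {street}"] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Invented sample data, for illustration only.
streets = ["Sample St", "Demo Ave"]
snippets = [
    "Example Museum, 12 Sample St, San Jose, CA",
    "The Example Museum is located at 12 Sample St near downtown.",
    "Parking at 40 Sample St is limited.",
    "Example Museum annex: 40 Sample St",
]

address = mine_address("Example Museum", streets, snippets)
missing = mine_address("Nowhere Cafe", streets, snippets)
print(address, missing)
```

In this sketch the related streets act as a filter on which number-plus-street strings count as candidate addresses, which mirrors the role the abstract gives to the relation between a POI name and its streets. A real pipeline would obtain the snippets from a search engine query rather than from a hard-coded list.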

Notes

  1. http://www.flickr.com/services/api/

Acknowledgements

This research is funded by the University of Economics Ho Chi Minh City, Vietnam.

Author information

Corresponding author

Correspondence to Thanh-Hieu Bui.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bui, TH. Automatic construction of POI address lists at city streets from geo-tagged photos and web data: a case study of San Jose City. Multimed Tools Appl 82, 34749–34770 (2023). https://doi.org/10.1007/s11042-023-14862-8

  • DOI: https://doi.org/10.1007/s11042-023-14862-8
