DOI: 10.1145/3330089.3330128
research-article

Extracting Geographic Knowledge from Wikipedia

Published: 26 December 2018

ABSTRACT

Geographic information systems (GIS) are becoming a necessity in a wide variety of application domains, and the extraction of geographic information has become an important topic in computer science.

The objective of this thesis is to extract geographic data from Wikipedia so that users can more easily obtain the information they want.

One difficulty is processing the very large XML file; we apply text mining and machine learning techniques to address it.
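To make the large-file concern concrete, here is a minimal sketch of one common way to read such a file without loading it into memory: streaming it with the standard library's `xml.etree.ElementTree.iterparse`. This is not taken from the paper. The helper names `local_name` and `iter_pages` and the tiny `SAMPLE` document are hypothetical, and the sketch assumes the file follows the usual MediaWiki export layout of `page`, `title`, `revision`, and `text` elements.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, wikitext) pairs one page at a time from an XML stream."""
    title, text = None, None
    # "end" events fire once an element has been read completely.
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text or ""
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, None
            # Drop the finished page's children and text so memory stays low;
            # the emptied element itself remains attached to the root.
            elem.clear()


# Hypothetical two-page sample standing in for the real multi-gigabyte file.
SAMPLE = b"""<mediawiki>
  <page><title>Sampletown</title><revision><text>{{Infobox settlement|name=Sampletown}}</text></revision></page>
  <page><title>Apple</title><revision><text>A fruit.</text></revision></page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
for page_title, wikitext in pages:
    print(page_title, len(wikitext))
```

With a real file, the `io.BytesIO(SAMPLE)` argument would be replaced by a file path or an open binary file object; the loop body stays the same.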

In this work, we present and evaluate an approach that extracts geographic data from a very large Wikipedia XML file and creates a geographic database. Our technique extracts infoboxes from geographic articles using a supervised machine learning technique, the support vector machine (SVM). We then create tables containing geographic data (name, longitude, latitude, etc.) and join the tables to structure the result.
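The table-building and join steps described above might look roughly like the following sketch. It is an illustration under stated assumptions, not the authors' implementation: the SVM classification step is left out, the infobox parser assumes a flat template with no nested templates or pipes inside values, and the names `parse_infobox`, `store`, the `place` and `country` tables, and the sample article (with made-up coordinates) are all hypothetical.

```python
import re
import sqlite3


def parse_infobox(wikitext):
    """Return the key/value fields of the first {{Infobox ...}} template, or None.

    Simplification: assumes no nested templates and no pipes inside values.
    """
    match = re.search(r"\{\{Infobox([^{}]*)\}\}", wikitext, re.IGNORECASE)
    if match is None:
        return None
    fields = {}
    # The first chunk is the infobox type (e.g. "settlement"); skip it.
    for part in match.group(1).split("|")[1:]:
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def store(conn, fields):
    """Insert one place row, creating its country row if needed."""
    conn.execute("INSERT OR IGNORE INTO country (name) VALUES (?)",
                 (fields["country"],))
    (country_id,) = conn.execute("SELECT id FROM country WHERE name = ?",
                                 (fields["country"],)).fetchone()
    conn.execute("INSERT INTO place VALUES (?, ?, ?, ?)",
                 (fields["name"], float(fields["latitude"]),
                  float(fields["longitude"]), country_id))


# Hypothetical article text with illustrative, made-up values.
ARTICLE = """{{Infobox settlement
| name = Sampletown
| country = Sampleland
| latitude = 10.5
| longitude = 20.25
}}
Sampletown is a fictional place."""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute("CREATE TABLE place (name TEXT, latitude REAL, longitude REAL, "
             "country_id INTEGER REFERENCES country(id))")

fields = parse_infobox(ARTICLE)
store(conn, fields)

# Join the two tables to get one structured row per place.
rows = conn.execute(
    "SELECT place.name, place.latitude, place.longitude, country.name "
    "FROM place JOIN country ON place.country_id = country.id"
).fetchall()
print(rows)
```

Splitting countries into their own table is only one possible schema; it is used here because it gives the join something to do, and the paper's actual tables may differ.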


  • Published in

    ICSENT 2018: Proceedings of the 7th International Conference on Software Engineering and New Technologies
    December 2018
    201 pages
    ISBN:9781450361019
    DOI:10.1145/3330089

    Copyright © 2018 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited