ABSTRACT
Geographic information systems (GIS) are becoming a necessity in a wide variety of application domains, and the extraction of geographic information has become an important topic in computer science.
The objective of this thesis is to extract geographic data from Wikipedia so that users can more easily obtain the information they want.
One difficulty is processing a very large XML file; we use text mining and machine learning techniques to address it.
In this work, we present and evaluate an approach that extracts geographic data from a very large Wikipedia XML file and builds a geographic database. Our technique extracts infoboxes from geographic articles using a supervised machine learning method, the support vector machine (SVM). We then create tables containing geographic data (name, longitude, latitude, etc.) and join the tables to structure the result.
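The pipeline described above can be illustrated with a minimal sketch. It is not the thesis implementation: it streams a tiny, made-up XML dump page by page so the whole file never has to fit in memory, pulls the infobox fields out of each page, and keeps only pages that carry coordinates. The sample dump, the field names `name`, `latitude` and `longitude`, and the helper names are all assumptions for illustration, and a simple "has coordinate fields" rule stands in for the SVM classifier, which is omitted here along with the table joins.

```python
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical miniature dump; a real Wikipedia export is far larger and namespaced.
SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>Tlemcen</title>
    <revision><text>{{Infobox settlement
| name = Tlemcen
| latitude = 34.8828
| longitude = -1.3167
}}
Tlemcen is a city in Algeria.</text></revision>
  </page>
  <page>
    <title>Support vector machine</title>
    <revision><text>An SVM is a supervised learning model.</text></revision>
  </page>
</mediawiki>"""

# Matches infobox lines of the form "| key = value" (assumed layout).
FIELD_RE = re.compile(r"^\|\s*(\w+)\s*=\s*(.+?)\s*$", re.MULTILINE)


def local(tag):
    """Strip an XML namespace prefix such as '{uri}page' down to 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, wikitext) for each <page>, parsing the file incrementally."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title, text = "", ""
        for child in elem.iter():
            if local(child.tag) == "title":
                title = child.text or ""
            elif local(child.tag) == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # release the page's text once it has been processed


def extract_geo_record(title, wikitext):
    """Return a dict with name/latitude/longitude, or None if the page has none."""
    start = wikitext.find("{{Infobox")
    if start == -1:
        return None
    end = wikitext.find("\n}}", start)
    if end == -1:
        return None
    fields = dict(FIELD_RE.findall(wikitext[start:end]))
    try:
        lat = float(fields["latitude"])
        lon = float(fields["longitude"])
    except (KeyError, ValueError):
        return None  # no usable coordinates: treated as non-geographic here
    return {"name": fields.get("name", title), "latitude": lat, "longitude": lon}


def extract_all(source):
    """Collect the geographic records found in a dump (file path or file object)."""
    records = []
    for title, text in iter_pages(source):
        record = extract_geo_record(title, text)
        if record is not None:
            records.append(record)
    return records


if __name__ == "__main__":
    for row in extract_all(io.BytesIO(SAMPLE_DUMP.encode("utf-8"))):
        print(row)
```

In a fuller version, the coordinate rule would be replaced by a trained classifier deciding which articles are geographic, and the resulting rows would be loaded into database tables and joined.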