ABSTRACT
As the volume of online geoscience data grows, geoscientists find it increasingly difficult to rapidly discover and extract valuable information from large numbers of documents. Flexible and efficient tools are needed to help them navigate unstructured text quickly and reveal hidden patterns and trends. This paper presents a workflow for the multidimensional analysis of geoscience literature. NLP techniques and ontologies are used to automatically identify and extract the domain-specific concepts and entities buried in unstructured text. From these extracted data, we define a multidimensional representation of geoscience text documents that facilitates quantitative and exploratory analysis for knowledge discovery. To illustrate the potential of the proposed workflow, we implemented a pilot system that lets the user perform multidimensional analysis on a large collection of documents through interactive, user-friendly visualizations. As an example, we analyzed the research topic of rare earth elements and carbonatites. The resulting visualizations show the usefulness and efficiency of the proposed system for discovering knowledge and identifying potential research gaps.
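The idea of a multidimensional representation can be illustrated with a minimal Python sketch. It is not the pilot system described above: the tiny `ONTOLOGY` dictionary, the dimension names, the sample documents, and the functions `extract_concepts` and `cross_tabulate` are all hypothetical, and simple term matching stands in for the NLP and ontology-based extraction the paper uses.

```python
import re
from collections import Counter

# Hypothetical mini-ontology (an assumption for illustration only):
# dimension -> concept -> surface terms that signal the concept.
ONTOLOGY = {
    "rock_type": {"carbonatite": ["carbonatite", "carbonatites"]},
    "commodity": {
        "rare earth elements": ["rare earth elements", "rare earth element", "ree"]
    },
    "location": {"China": ["china"], "Algeria": ["algeria"]},
}


def extract_concepts(text):
    """Return {dimension: set of concepts} found in text by whole-word matching."""
    lowered = text.lower()
    found = {}
    for dimension, concepts in ONTOLOGY.items():
        hits = set()
        for concept, terms in concepts.items():
            if any(re.search(r"\b" + re.escape(t) + r"\b", lowered) for t in terms):
                hits.add(concept)
        found[dimension] = hits
    return found


def cross_tabulate(documents, dim_a, dim_b):
    """Count documents mentioning each (dim_a concept, dim_b concept) pair."""
    counts = Counter()
    for text in documents:
        concepts = extract_concepts(text)
        for a in concepts[dim_a]:
            for b in concepts[dim_b]:
                counts[(a, b)] += 1
    return counts


# Made-up sample sentences standing in for a document collection.
DOCS = [
    "Carbonatites in China host major rare earth element deposits.",
    "REE mineralization in a carbonatite complex, China.",
    "A carbonatite occurrence in Algeria.",
]

TABLE = cross_tabulate(DOCS, "rock_type", "location")
for pair, n in sorted(TABLE.items()):
    print(pair, n)
```

Each document becomes a point described along several dimensions (here rock type, commodity, and location), so counting documents per pair of dimension values gives the kind of aggregate that an interactive visualization could display. Pairs with few or no documents hint at possible research gaps.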