Structuring Taxonomies from Texts: A Case-Study on Defining Soil Classes

  • Conference paper
Computational Science and Its Applications – ICCSA 2012 (ICCSA 2012)

Abstract

Most digitally available information is currently presented in textual form, and it is widely acknowledged that, in many fields, the advance of knowledge may benefit strongly from this source. Processing this vast amount of text with Text Mining (TM) techniques has produced useful information in fields such as Competitive Intelligence and Bibliometrics, which need to make sense of textual descriptions of facts. In this paper we approach the problem of taxonomy generation from texts, a need common to many scientific disciplines. Taxonomy generation refers to building a hierarchical structure that organizes the concepts of a knowledge domain. We applied TM techniques to help experts in Pedology build a taxonomy from redundant soil descriptions. The motivation for the application is that, in the early 1980s, different organizations mapped and described equivalent classes of soils from the Brazilian savannas, generating redundant descriptions with different class labels. In total, 28 soil maps were produced, covering 4,101 descriptions of soil classes. This profusion of redundant soil descriptions amounts to a Tower of Babel that hampers tasks such as environmental management and food production. The proposed process is based on cluster analysis and runs over the soil descriptions, successively refining the abstractions found in them. The method builds a frame that shows, for each cluster formed, the prototype (a representative word vector) and the soil descriptions related to that cluster. The results have been analyzed by a team of experts as input to the laborious reasoning process involved in building concepts from the semantic relations among the soil descriptions. Without an aid like the present process, the experts would have to visually compare at least 4,101 × 4,100 × … × 1 soil descriptions to define the clusters, which is far more laborious.
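
The idea of grouping descriptions around a prototype word vector can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm, the term weighting, or any parameters, so everything here is an assumption made for illustration: plain term-frequency vectors, cosine similarity, a single-pass "leader" clustering, and a hypothetical `threshold` value. The function names (`vectorize`, `cosine`, `centroid`, `cluster`) are likewise invented and are not taken from the paper.

```python
from collections import Counter
import math


def vectorize(text):
    """Turn one description into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Average the member vectors; used here as the cluster prototype."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})


def cluster(descriptions, threshold=0.5):
    """Single-pass clustering (an assumed stand-in for the paper's method).

    Each description joins the most similar existing cluster if the
    similarity to its prototype reaches `threshold`; otherwise it
    starts a new cluster. Returns a list of frames, each holding the
    prototype and the descriptions assigned to it.
    """
    clusters = []
    for text in descriptions:
        vec = vectorize(text)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["prototype"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"prototype": vec, "members": [text], "vectors": [vec]})
        else:
            best["members"].append(text)
            best["vectors"].append(vec)
            best["prototype"] = centroid(best["vectors"])
    return clusters
```

Calling `cluster` on a handful of made-up descriptions (for example, two differently worded "red latosol" entries and two "quartz sand" entries) yields one frame per group, each listing its prototype and members. Running the same function again inside a cluster with a higher threshold would be one possible way to approximate the successive refinement the abstract mentions, though that too is an assumption.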

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

do Prado, H.A., Ferneda, E., da Luz Rodrigues, F.C., de Souza, É.M., de Carvalho, O.A., Luiz, A.J.B. (2012). Structuring Taxonomies from Texts: A Case-Study on Defining Soil Classes. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2012. ICCSA 2012. Lecture Notes in Computer Science, vol 7335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31137-6_50

  • DOI: https://doi.org/10.1007/978-3-642-31137-6_50

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31136-9

  • Online ISBN: 978-3-642-31137-6

  • eBook Packages: Computer Science, Computer Science (R0)
