Learning Textologies: Networks of Linked Word Clusters

Abstract

Ontologies have been used in important applications such as information extraction, grammar generation, and query expansion for information retrieval. However, building a comprehensive ontology is a time-consuming process, and a full-fledged ontology is not necessary for every application that needs to model semantic classes and the relations between them. In this chapter we propose an alternative: learning a textology, that is, a graph of word clusters connected by co-occurrence relations. We use the properties of this graph to generate grammars, and we suggest a procedure for upgrading the model into an ontology. Preliminary experiments show encouraging results.
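
To make the idea of a graph of word clusters connected by co-occurrence relations concrete, the following is a minimal sketch and not the chapter's actual learning procedure. It assumes the clusters are already given and links two clusters whenever words from both appear in the same sentence. The cluster names, the toy sentences, and the function `build_cluster_graph` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def build_cluster_graph(clusters, sentences):
    """Count how often words from two different clusters share a sentence.

    Assumption: sentence-level co-occurrence stands in for whatever
    co-occurrence relation a real textology would use.
    """
    # Map each word to the name of the cluster it belongs to.
    word_to_cluster = {
        word: name for name, words in clusters.items() for word in words
    }
    edges = Counter()
    for sentence in sentences:
        # Distinct clusters with at least one word in this sentence,
        # sorted so that each edge has one canonical (a, b) key.
        present = sorted(
            {
                word_to_cluster[token]
                for token in sentence.lower().split()
                if token in word_to_cluster
            }
        )
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical word clusters (nodes) and a toy corpus.
clusters = {
    "PERSON": {"man", "woman", "soldier"},
    "ATTACK": {"shot", "attacked", "wounded"},
    "WEAPON": {"gun", "knife"},
}
sentences = [
    "A man shot a soldier with a gun",
    "The woman was attacked with a knife",
    "A soldier was wounded",
]

graph = build_cluster_graph(clusters, sentences)
for (a, b), weight in sorted(graph.items()):
    print(f"{a} -- {b}: {weight}")
```

Each key of `graph` is an undirected edge between two clusters, and its value is the number of sentences in which both clusters occur. A weighted edge list of this kind is one simple way to represent such a network; the chapter itself may define nodes, links, and weights differently.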

Author information

Corresponding author

Correspondence to Hristo Tanev.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Tanev, H. (2014). Learning Textologies: Networks of Linked Word Clusters. In: Biemann, C., Mehler, A. (eds) Text Mining. Theory and Applications of Natural Language Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-12655-5_2

  • DOI: https://doi.org/10.1007/978-3-319-12655-5_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12654-8

  • Online ISBN: 978-3-319-12655-5

  • eBook Packages: Computer Science, Computer Science (R0)
