Learning Textologies: Networks of Linked Word Clusters

Tanev, Hristo

doi:10.1007/978-3-319-12655-5_2

Part of the book series: Theory and Applications of Natural Language Processing (NLP)

Abstract

Ontologies have been used in important applications such as information extraction, grammar generation, and query expansion for information retrieval. However, building a comprehensive ontology is time-consuming, and a full-fledged ontology is not necessary for every application that needs to model semantic classes and the relations between them. In this chapter we propose an alternative: learning a textology, that is, a graph of word clusters connected by co-occurrence relations. We use the properties of the graph to generate grammars and also suggest a procedure for upgrading the model into an ontology. Preliminary experiments show encouraging results.
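To make the definition concrete, the following is a minimal sketch of a textology as a data structure: word clusters as nodes, with an edge weighted by the number of sentences in which words from both clusters appear. It is not the chapter's learning procedure. The function name build_textology, the hand-written clusters, the toy sentences, and the choice of sentence-level co-occurrence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_textology(clusters, sentences):
    """Return co-occurrence edge weights between cluster names.

    clusters: dict mapping a cluster name to a set of words (illustrative input).
    sentences: iterable of token lists.
    """
    # Invert the clusters so each word points to its cluster name.
    word_to_cluster = {w: name for name, words in clusters.items() for w in words}
    edges = Counter()
    for tokens in sentences:
        # Clusters that have at least one member word in this sentence.
        present = sorted({word_to_cluster[t] for t in tokens if t in word_to_cluster})
        # Count one co-occurrence per unordered pair of distinct clusters.
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical clusters and sentences, not taken from the chapter.
clusters = {
    "PERSON": {"soldier", "civilian"},
    "WEAPON": {"rifle", "bomb"},
    "PLACE": {"village", "market"},
}
sentences = [
    ["the", "soldier", "carried", "a", "rifle"],
    ["a", "bomb", "exploded", "in", "the", "market"],
    ["the", "civilian", "left", "the", "village"],
    ["a", "soldier", "found", "a", "bomb", "in", "the", "village"],
]
textology = build_textology(clusters, sentences)
```

In this sketch each edge key is an alphabetically ordered pair of cluster names, so the graph is undirected. A learned textology would obtain the clusters themselves from a corpus rather than from a hand-written dictionary.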

Author information

Authors and Affiliations

European Commission, Joint Research Centre, Ispra, Italy
Hristo Tanev

Corresponding author

Correspondence to Hristo Tanev.

Editor information

Editors and Affiliations

Computer Science Department, Technische Universität Darmstadt, FG Language Technology, Darmstadt, Germany
Chris Biemann
Computer Science Department, Goethe University, WG Text Technology, Frankfurt am Main, Hessen, Germany
Alexander Mehler

About this chapter

Cite this chapter

Tanev, H. (2014). Learning Textologies: Networks of Linked Word Clusters. In: Biemann, C., Mehler, A. (eds) Text Mining. Theory and Applications of Natural Language Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-12655-5_2

DOI: https://doi.org/10.1007/978-3-319-12655-5_2
Published: 13 December 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12654-8
Online ISBN: 978-3-319-12655-5
eBook Packages: Computer Science, Computer Science (R0)
