Improving Robustness and Flexibility of Concept Taxonomy Learning from Text

Leuzzi, Fabio; Ferilli, Stefano; Rotella, Fulvio

doi:10.1007/978-3-642-37382-4_12

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7765)

Included in the following conference series:

International Workshop on New Frontiers in Mining Complex Patterns

Abstract

The spread and abundance of electronic documents require automatic techniques for extracting useful information from the text they contain. The availability of conceptual taxonomies can be of great help, but building them manually is a complex and costly task. Building on previous work, we propose a technique to automatically extract conceptual graphs from text and reason with them. Since automated learning of taxonomies needs to be robust with respect to missing or partial knowledge and flexible with respect to noise, this work proposes a way to deal with both problems. The case of poor data and sparse concepts is tackled by finding generalizations among disjoint pieces of knowledge. Noise is handled by introducing soft relationships among concepts rather than hard ones, and by applying a probabilistic inferential setting. In particular, we propose to reason on the extracted graph using different kinds of relationships among concepts, where each arc (relationship) is associated with a weight that represents its likelihood among all possible worlds, and to address the problem of sparse knowledge by using generalizations among distant concepts as bridges between disjoint portions of knowledge.
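
The possible-worlds reading of arc weights mentioned in the abstract can be illustrated with a small sketch. This is not the chapter's implementation: it is a brute-force Python example that assumes each arc is present independently with the probability given by its weight, and it sums the probability of every world in which a path between two concepts exists. The concept names, the weights, and the functions `reachable` and `path_probability` are all hypothetical and chosen only for illustration.

```python
from itertools import product


def reachable(edges, source, target):
    """Return True if target can be reached from source along directed edges."""
    frontier, seen = [source], {source}
    while frontier:
        node = frontier.pop()
        if node == target:
            return True
        for a, b in edges:
            if a == node and b not in seen:
                seen.add(b)
                frontier.append(b)
    return False


def path_probability(weighted_edges, source, target):
    """Sum the probability of every possible world that contains a path.

    Assumption: arcs are independent, and each arc's weight is the
    probability that the arc is present in a world.
    """
    edges = list(weighted_edges)
    total = 0.0
    # Enumerate all 2^n worlds (only feasible for very small graphs).
    for world in product([True, False], repeat=len(edges)):
        p = 1.0
        present = []
        for edge, keep in zip(edges, world):
            w = weighted_edges[edge]
            p *= w if keep else 1.0 - w
            if keep:
                present.append(edge)
        if reachable(present, source, target):
            total += p
    return total


# Hypothetical soft relationships among concepts.
graph = {
    ("dog", "animal"): 0.9,
    ("animal", "organism"): 0.8,
    ("dog", "organism"): 0.5,
}

# Hypothetical sparse knowledge: two disjoint pieces with no connecting path.
sparse = {
    ("cat", "pet"): 0.9,
    ("wolf", "predator"): 0.7,
}

# A hypothetical generalization ("mammal") used as a bridge between them.
bridged = {
    **sparse,
    ("cat", "mammal"): 0.6,
    ("mammal", "wolf"): 0.6,
}

if __name__ == "__main__":
    print(path_probability(graph, "dog", "organism"))
    print(path_probability(sparse, "cat", "predator"))
    print(path_probability(bridged, "cat", "predator"))
```

In this toy setting the first query evaluates to about 0.86, since the direct arc (0.5) and the two-step path (0.9 × 0.8 = 0.72) are independent alternatives. The second query yields 0 because the two pieces of knowledge are disjoint, while adding the bridging generalization makes the third query non-zero. Exhaustive enumeration is exponential in the number of arcs, so it serves only to make the semantics concrete.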

Author information

Authors and Affiliations

Dipartimento di Informatica, Università di Bari, Italy
Fabio Leuzzi, Stefano Ferilli & Fulvio Rotella
Centro Interdipartimentale per la Logica e sue Applicazioni, Università di Bari, Italy
Stefano Ferilli

Editor information

Editors and Affiliations

Dipartimento di Informatica, Università degli Studi di Bari Aldo Moro, Via Orabona 4, 70125, Bari, Italy
Annalisa Appice, Michelangelo Ceci & Corrado Loglisci
Institute for High Performance Computing and Networks (ICAR), National Research Council (CNR), Via Pietro Bucci 41C, 87036, Rende, CS, Italy
Giuseppe Manco & Elio Masciari
Department of Computer Science, University of North Carolina, 9201 University City Boulevard, 28223, Charlotte, NC, USA
Zbigniew W. Ras

About this paper

Cite this paper

Leuzzi, F., Ferilli, S., Rotella, F. (2013). Improving Robustness and Flexibility of Concept Taxonomy Learning from Text. In: Appice, A., Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z.W. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2012. Lecture Notes in Computer Science, vol 7765. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37382-4_12

DOI: https://doi.org/10.1007/978-3-642-37382-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37381-7
Online ISBN: 978-3-642-37382-4
eBook Packages: Computer Science, Computer Science (R0)
