Knowledge Extraction from Classification Schemas

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3290)

Abstract

The availability of formal ontologies is crucial for the success of the Semantic Web. Manual construction of ontologies is difficult and time-consuming, and easily causes a knowledge acquisition bottleneck. Semi-automatic ontology generation eases that problem. This paper presents a method for semi-automatic knowledge extraction from underlying classification schemas such as folder structures or web directories. Both the explicit and the implicit semantics contained in the classification schema have to be considered to create a formal ontology. The extraction process is composed of five main steps: identification of concepts and instances, word sense disambiguation, taxonomy construction, identification of non-taxonomic relations, and ontology population. Finally, the process is evaluated using a prototypical implementation and a set of real-world folder structures.
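
To make the shape of such an extraction concrete, the following is a minimal sketch, not the authors' implementation. It assumes a folder structure given as a nested dictionary, and it naively treats every folder as a concept, every file as an instance, and every nesting as a subclass link. The names `Ontology` and `extract` are hypothetical. Word sense disambiguation and non-taxonomic relations are left out, and folder nesting does not always express a subclass relation, which is exactly the implicit semantics the abstract says must be considered.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Ontology:
    """Hypothetical container for the extracted knowledge."""
    concepts: set = field(default_factory=set)
    subclass_of: set = field(default_factory=set)  # (child, parent) pairs
    instance_of: set = field(default_factory=set)  # (instance, concept) pairs


def extract(schema: dict, parent: Optional[str] = None,
            onto: Optional[Ontology] = None) -> Ontology:
    """Walk a nested dict: dict values are folders, None values are files."""
    if onto is None:
        onto = Ontology()
    for name, children in schema.items():
        if isinstance(children, dict):
            # Assumption: a folder label names a concept.
            label = name.strip().lower()
            onto.concepts.add(label)
            # Assumption: nesting is read as a taxonomic (subclass) link.
            if parent is not None:
                onto.subclass_of.add((label, parent))
            extract(children, label, onto)
        elif parent is not None:
            # Assumption: a file populates the concept of its folder.
            onto.instance_of.add((name, parent))
    return onto


schema = {"Projects": {"SWAP": {"report.pdf": None}, "notes.txt": None}}
onto = extract(schema)
print(sorted(onto.concepts))
print(sorted(onto.subclass_of))
print(sorted(onto.instance_of))
```

A real pipeline would insert a disambiguation step between reading the labels and building the taxonomy, so that each label is mapped to a word sense before any relation is asserted.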

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lamparter, S., Ehrig, M., Tempich, C. (2004). Knowledge Extraction from Classification Schemas. In: Meersman, R., Tari, Z. (eds) On the Move to Meaningful Internet Systems 2004: CoopIS, DOA, and ODBASE. OTM 2004. Lecture Notes in Computer Science, vol 3290. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30468-5_40

  • DOI: https://doi.org/10.1007/978-3-540-30468-5_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23663-4

  • Online ISBN: 978-3-540-30468-5

  • eBook Packages: Springer Book Archive
