Abstract
The availability of formal ontologies is crucial for the success of the Semantic Web. Manual construction of ontologies is difficult and time-consuming, and it easily causes a knowledge acquisition bottleneck. Semi-automatic ontology generation eases that problem. This paper presents a method for semi-automatic knowledge extraction from underlying classification schemas such as folder structures or web directories. To create a formal ontology, both the explicit and the implicit semantics contained in the classification schema have to be considered. The extraction process is composed of five main steps: identification of concepts and instances, word sense disambiguation, taxonomy construction, identification of non-taxonomic relations, and ontology population. Finally, the process is evaluated using a prototypical implementation and a set of real-world folder structures.
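As a rough illustration of what three of the five steps could look like on a folder structure, the following minimal sketch treats folder names as concept candidates, files as instances, and folder nesting as a candidate taxonomy. It is not the authors' implementation: the function names `normalize` and `extract_from_paths`, the label normalization rule, and the example paths are all assumptions made for illustration. Word sense disambiguation and non-taxonomic relations are omitted because they require lexical resources that the abstract does not specify.

```python
import re
from pathlib import PurePosixPath


def normalize(label):
    """Hypothetical label clean-up: separators to spaces, lower case."""
    return re.sub(r"[_\-]+", " ", label).strip().lower()


def extract_from_paths(file_paths):
    """Derive candidate concepts, taxonomy edges, and instances.

    Assumption: every path points to a file, each folder on the path is
    a concept candidate, and nesting is read as a candidate
    subconcept relation (a simplification that a human should review).
    """
    concepts = set()
    subconcept_of = set()  # (child concept, parent concept)
    instance_of = set()    # (instance name, concept)

    for raw in file_paths:
        parts = PurePosixPath(raw).parts
        folders = [normalize(p) for p in parts[:-1]]

        # Step 1: identification of concepts (folders) and instances (files)
        concepts.update(folders)

        # Step 3: taxonomy construction from folder nesting
        for parent, child in zip(folders, folders[1:]):
            subconcept_of.add((child, parent))

        # Step 5: ontology population with the file as an instance
        if folders:
            instance_of.add((parts[-1], folders[-1]))

    return concepts, subconcept_of, instance_of


if __name__ == "__main__":
    example = [
        "projects/semantic_web/paper.pdf",
        "projects/semantic_web/ontologies/diligent.owl",
    ]
    for result in extract_from_paths(example):
        print(sorted(result))
```

Reading nesting directly as subsumption is the weakest assumption here: a folder placed inside another is not necessarily a subconcept of it, which is one reason a semi-automatic process with human review is a natural fit.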
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lamparter, S., Ehrig, M., Tempich, C. (2004). Knowledge Extraction from Classification Schemas. In: Meersman, R., Tari, Z. (eds) On the Move to Meaningful Internet Systems 2004: CoopIS, DOA, and ODBASE. OTM 2004. Lecture Notes in Computer Science, vol 3290. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30468-5_40
DOI: https://doi.org/10.1007/978-3-540-30468-5_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23663-4
Online ISBN: 978-3-540-30468-5
eBook Packages: Springer Book Archive