Abstract
Classifications have been used for centuries to catalogue and search large sets of objects. In the early days these objects were mainly books; more recently they have come to include Web pages, pictures and digital resources of any kind. Classifications describe their contents using natural language labels, an approach that has proved very effective in manual classification. However, natural language labels show their limitations when one tries to automate the process, as they make it very hard to reason about classifications and their contents. In this paper we introduce the novel notion of Formal Classification: a graph structure whose labels are written in a propositional concept language. Formal Classifications turn out to be a form of lightweight ontology. This, in turn, allows us to reason about them, to associate with each node a normal form formula that univocally describes its contents, and to reduce document classification and query answering to reasoning about subsumption.
This paper is an integrated and extended version of two papers: the first, titled “Towards a Theory of Formal Classification”, was presented at the 2005 International Workshop on Context and Ontologies; the second, titled “Encoding Classifications into Lightweight Ontologies”, was presented at the 2006 European Semantic Web Conference.
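To make the reduction described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a node's formula is the conjunction of the labels on the path from the root, and it checks subsumption by brute-force enumeration of truth assignments. The toy tree, the atom names and the helper functions (`subsumed_by`, `node_formula`, `classify`, `answer`) are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Formulas are nested tuples: an atom is a str; compound forms are
# ("not", f), ("and", f, g, ...), ("or", f, g, ...).

def atoms(f):
    """Collect the atomic concepts occurring in formula f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def holds(f, world):
    """Evaluate f under a truth assignment (dict: atom -> bool)."""
    if isinstance(f, str):
        return world[f]
    op, args = f[0], f[1:]
    if op == "not":
        return not holds(args[0], world)
    if op == "and":
        return all(holds(g, world) for g in args)
    if op == "or":
        return any(holds(g, world) for g in args)
    raise ValueError(f"unknown operator: {op}")

def subsumed_by(f, g):
    """True if f is subsumed by g, i.e. f -> g holds under every assignment."""
    names = sorted(atoms(f) | atoms(g))
    for values in product([False, True], repeat=len(names)):
        world = dict(zip(names, values))
        if holds(f, world) and not holds(g, world):
            return False
    return True

# Hypothetical toy classification: node -> (parent, label formula).
TREE = {
    "Europe": (None, "europe"),
    "Europe/Pictures": ("Europe", "picture"),
    "Europe/Pictures/Italy": ("Europe/Pictures", "italy"),
}

def node_formula(node):
    """Assumption: conjunction of the labels on the path from the root."""
    parts = []
    while node is not None:
        parent, label = TREE[node]
        parts.append(label)
        node = parent
    return ("and", *reversed(parts))

def classify(doc_formula):
    """Nodes whose formula subsumes the document's formula."""
    return [n for n in TREE if subsumed_by(doc_formula, node_formula(n))]

def answer(query_formula):
    """Nodes whose formula is subsumed by the query formula."""
    return [n for n in TREE if subsumed_by(node_formula(n), query_formula)]

doc = ("and", "europe", "picture", "italy", "venice")
print(classify(doc))
print(answer(("and", "europe", "picture")))
```

In this toy setting the document is subsumed by all three node formulas, while the query `europe ∧ picture` returns only the two `Pictures` nodes. The truth-table check is exponential in the number of atoms, so it only illustrates the idea; a practical system would presumably use a SAT solver or a description logic reasoner, together with background knowledge relating the atomic concepts.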
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Giunchiglia, F., Marchese, M., Zaihrayeu, I. (2007). Encoding Classifications into Lightweight Ontologies. In: Spaccapietra, S., et al. Journal on Data Semantics VIII. Lecture Notes in Computer Science, vol 4380. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70664-9_3
DOI: https://doi.org/10.1007/978-3-540-70664-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70663-2
Online ISBN: 978-3-540-70664-9
eBook Packages: Computer Science (R0)