Abstract
In many areas of the life sciences, such as biology and medicine, ontologies are now commonly used to annotate objects of interest, such as biological samples, clinical pictures, or species, in a standardized way. In these applications, an ontology is merely a structured vocabulary in the form of a tree or a directed acyclic graph (DAG) of concepts. Typically, ontologies are stored together with the data they annotate in relational databases. Queries over such annotations must respect the semantics encoded in the structure of the ontology, i.e., the relationships between terms, which is not possible using standard SQL alone.
In this paper, we develop a new method for querying DAGs using a pre-computed index structure. Our new indexing method extends the pre-/postorder ranking scheme, which has been studied intensively for trees, to DAGs. Using typical queries on ontologies, we compare our approach to two other commonly used methods, namely a recursive database function and the pre-computation of the transitive closure of a DAG.
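As an illustration of the pre-/postorder ranking scheme in its well-known form for trees, the following minimal sketch assigns both ranks in a single depth-first traversal and answers descendant queries by rank comparison alone. It is not the DAG extension developed in the paper; the toy hierarchy and all names (`rank_tree`, `is_descendant`, `descendants`, `TOY_TREE`) are hypothetical and chosen only for this example.

```python
def rank_tree(children, root):
    """Assign pre- and postorder ranks with one depth-first traversal.

    `children` maps a node to the list of its child nodes (a tree, not a DAG).
    """
    pre, post = {}, {}
    counters = {"pre": 0, "post": 0}

    def visit(node):
        # Preorder rank: assigned when the node is first reached.
        pre[node] = counters["pre"]
        counters["pre"] += 1
        for child in children.get(node, []):
            visit(child)
        # Postorder rank: assigned after all descendants are finished.
        post[node] = counters["post"]
        counters["post"] += 1

    visit(root)
    return pre, post


def is_descendant(pre, post, ancestor, node):
    """In a tree, `node` lies below `ancestor` exactly when it is
    entered later (larger preorder) and left earlier (smaller postorder)."""
    return pre[ancestor] < pre[node] and post[node] < post[ancestor]


def descendants(pre, post, ancestor):
    """All descendants of `ancestor`, found by rank comparison only."""
    return sorted(n for n in pre if is_descendant(pre, post, ancestor, n))


# Hypothetical toy hierarchy, used only for illustration.
TOY_TREE = {
    "root": ["process", "function"],
    "process": ["metabolism", "transport"],
    "metabolism": ["glycolysis"],
}

PRE, POST = rank_tree(TOY_TREE, "root")
print(descendants(PRE, POST, "process"))
```

Because the descendant test reduces to two comparisons on stored integers, the same condition could be expressed as a range predicate in SQL over a table holding the two rank columns, without any recursion.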
We show that pre-computed indexes are an order of magnitude faster than recursive methods. As expected, our new scheme is slower than using the transitive closure, but it requires only a fraction of the space and is therefore applicable even to very large ontologies with more than 200,000 concepts.
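The space cost of a pre-computed transitive closure can be seen on a toy example. The sketch below, which makes no claim to match the implementation evaluated in the paper, materializes every (ancestor, descendant) pair of a small DAG and compares the number of pairs with the number of edges. The edge list and the names `transitive_closure` and `TOY_EDGES` are hypothetical.

```python
def transitive_closure(edges):
    """Return the set of all (ancestor, descendant) pairs of a DAG.

    `edges` is an iterable of (parent, child) pairs. Assumes the graph
    is acyclic and small enough for plain recursion.
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)

    reachable = {}  # memo: node -> set of all nodes reachable from it

    def reach(node):
        if node not in reachable:
            result = set()
            for child in children.get(node, ()):
                result.add(child)
                result |= reach(child)
            reachable[node] = result
        return reachable[node]

    nodes = {n for edge in edges for n in edge}
    return {(n, d) for n in nodes for d in reach(n)}


# Hypothetical toy DAG: "binding_process" has two parents.
TOY_EDGES = [
    ("root", "process"),
    ("root", "function"),
    ("process", "binding_process"),
    ("function", "binding_process"),
    ("binding_process", "leaf"),
]

CLOSURE = transitive_closure(TOY_EDGES)
# An ancestor test becomes a single membership lookup ...
print(("root", "leaf") in CLOSURE)
# ... at the price of storing more pairs than there are edges.
print(len(TOY_EDGES), len(CLOSURE))
```

Even in this five-edge example the closure holds nine pairs. The number of stored pairs can grow much faster than the number of edges as the graph gets deeper, which is the trade-off between lookup speed and space described above.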
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Trißl, S., Leser, U. (2005). Querying Ontologies in Relational Database Systems. In: Ludäscher, B., Raschid, L. (eds) Data Integration in the Life Sciences. DILS 2005. Lecture Notes in Computer Science, vol 3615. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11530084_7
DOI: https://doi.org/10.1007/11530084_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27967-9
Online ISBN: 978-3-540-31879-8
eBook Packages: Computer Science, Computer Science (R0)