Semantic-distance based evaluation of ranking queries over relational databases

Journal of Intelligent Information Systems

Abstract

Traditional database search relies on pattern matching: for a query containing search words, a tuple is selected only if its words exactly match the query words. In this paper, we propose a new method for evaluating relational ranking queries (or top-N queries) with text attributes. The method defines semantic distance functions and uses semantic matching between words in database search. The aim is to fetch not only tuples that exactly match the query but also tuples that are close to it according to semantic distance. The basic idea is to create an index based on WordNet that expands the tuple words semantically. The candidate results for a query are retrieved using the index and a simple SQL selection statement, and the top-N answers are then obtained from these candidates. Extensive experiments are carried out to measure the performance of this strategy for evaluating ranking queries over relational databases.
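
The pipeline outlined in the abstract (expand tuple words, retrieve candidates through an index and one SQL selection, keep the N closest) can be illustrated with a minimal sketch. This is not the authors' implementation: the `TOY_DISTANCE` table is a hypothetical stand-in for WordNet-derived distances, and the `products` table, the `expand`, `build_index`, and `top_n` helpers, and all distance values are assumptions made for illustration only.

```python
import heapq
import sqlite3

# Hypothetical stand-in for WordNet: hand-picked distances between related words.
# A real system would derive these from WordNet; the values are illustrative only.
TOY_DISTANCE = {
    ("car", "automobile"): 0.1,
    ("car", "truck"): 0.4,
    ("car", "vehicle"): 0.3,
}


def expand(word):
    """Return {related_word: distance} for a tuple word, including the word itself."""
    related = {word: 0.0}
    for (a, b), d in TOY_DISTANCE.items():
        if a == word:
            related[b] = d
        elif b == word:
            related[a] = d
    return related


def build_index(rows):
    """Map every expanded word to {tuple_id: distance to that tuple's word}."""
    index = {}
    for tuple_id, word in rows:
        for term, d in expand(word).items():
            index.setdefault(term, {})[tuple_id] = d
    return index


def top_n(conn, index, query_word, n):
    """Fetch candidates via the index and one SQL selection, then keep the n closest."""
    candidates = index.get(query_word, {})
    if not candidates:
        return []
    placeholders = ",".join("?" * len(candidates))
    sql = f"SELECT id, item FROM products WHERE id IN ({placeholders})"
    fetched = conn.execute(sql, list(candidates)).fetchall()
    ranked = [(candidates[tid], tid, item) for tid, item in fetched]
    return heapq.nsmallest(n, ranked)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, item TEXT)")
rows = [(1, "automobile"), (2, "truck"), (3, "banana"), (4, "car"), (5, "vehicle")]
conn.executemany("INSERT INTO products VALUES (?, ?)", rows)

index = build_index(rows)
results = top_n(conn, index, "car", 3)
print(results)
```

In this toy setting a query for "car" returns the exact match first, followed by the semantically closest tuples, while "banana" is never a candidate because no entry in the distance table links it to "car".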

Notes

  1. A relationship \(L\) on the sets \(X_{1}, \ldots, X_{k}\) is a subset of their Cartesian product, written \(L \subseteq X_{1} \times \cdots \times X_{k}\).

Acknowledgements

This work is supported in part by the NSFC (30971693, 60873145), a Hebei University PhD grant (2009-160), and the key project of applied fundamental research and the NSF of Hebei Province (08963522D, F2008000635). The authors would also like to express their gratitude to Professor Weiyi Meng for providing many helpful suggestions, and to Shenda Ji and Yanchao Feng, who helped to identify the semantic matches between queries and answers.

Author information

Corresponding author

Correspondence to Qin Ma.

About this article

Cite this article

Zhu, L., Ma, Q., Liu, C. et al. Semantic-distance based evaluation of ranking queries over relational databases. J Intell Inf Syst 35, 415–445 (2010). https://doi.org/10.1007/s10844-009-0116-5
