Abstract
The amount of data available on the Web, in databases, and in other systems is constantly increasing, as is the number of users who wish to access it. Such data is often available in forms that are not easy for non-expert users to access. Keyword search approaches aim to abstract from specific data representations, allowing users to retrieve information by providing a few terms of interest. Many solutions build on dedicated indexing techniques and on search algorithms that aim to find substructures connecting the data elements that match the keywords. In this paper, we present the development of Yaanii, a tool for effective keyword search over semantic datasets. Yaanii is based on a novel keyword search paradigm for graph-structured data, focusing in particular on the RDF data model. We provide a clustering technique that identifies and groups graph substructures based on template matching. An IR-inspired scoring function evaluates the relevance of the substructures and of the clusters, and supports the generation of the Top-k solutions within the first k steps of execution. Experiments demonstrate the effectiveness of our approach.
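To make the idea of grouping substructures by template more concrete, the following is a minimal sketch and not Yaanii's actual algorithm. It assumes that a substructure is a path alternating nodes and predicates, that a template is the path with its nodes replaced by a wildcard, and that relevance is the fraction of path elements containing a keyword. The toy data and the names `template`, `cluster_by_template`, `score`, and `top_k` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy paths: node, predicate, node (nodes sit at even positions).
paths = [
    ("ex:paper1", "dc:creator", "ex:alice"),
    ("ex:paper2", "dc:creator", "ex:bob"),
    ("ex:paper1", "ex:publishedIn", "ex:conf2009"),
]


def template(path):
    """Keep predicates (odd positions) and replace every node with '?'."""
    return tuple(el if i % 2 == 1 else "?" for i, el in enumerate(path))


def cluster_by_template(paths):
    """Group paths that share the same template."""
    clusters = defaultdict(list)
    for p in paths:
        clusters[template(p)].append(p)
    return dict(clusters)


def score(path, keywords):
    """Assumed toy score: share of path elements containing any keyword."""
    hits = sum(any(k.lower() in el.lower() for k in keywords) for el in path)
    return hits / len(path)


def top_k(paths, keywords, k):
    """Return the k highest-scoring paths (ties keep input order)."""
    return sorted(paths, key=lambda p: score(p, keywords), reverse=True)[:k]


clusters = cluster_by_template(paths)
best = top_k(paths, ["bob"], 1)
```

Here the two `dc:creator` paths fall into the same cluster because they share a template, while the `ex:publishedIn` path forms its own. A real system would use a richer IR-style score and would also rank the clusters themselves, as the abstract describes.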
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
De Virgilio, R., Cappellari, P., Miscione, M. (2009). Cluster-Based Exploration for Effective Keyword Search over Semantic Datasets. In: Laender, A.H.F., Castano, S., Dayal, U., Casati, F., de Oliveira, J.P.M. (eds) Conceptual Modeling - ER 2009. ER 2009. Lecture Notes in Computer Science, vol 5829. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04840-1_17
DOI: https://doi.org/10.1007/978-3-642-04840-1_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04839-5
Online ISBN: 978-3-642-04840-1
eBook Packages: Computer Science, Computer Science (R0)