Abstract
Our system applies authority-based ranking to keyword search in databases modeled as labeled graphs. Three ranking factors are used: the relevance of the result to the query, its specificity, and its importance. All three factors are handled using authority-flow techniques that exploit the link structure of the data graph, in contrast to traditional Information Retrieval. We address the performance challenges of computing authority flows in databases by using precomputation and by exploiting the database schema, if present. We conducted user surveys and performance experiments on multiple real and synthetic datasets to assess the semantic meaningfulness and performance of our system.
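To make the idea of authority flow concrete, the following is a minimal sketch of a keyword-specific, PageRank-style iteration over a small labeled graph. It is an illustration under stated assumptions, not the system's actual algorithm: the function name `authority_flow`, the per-edge `transfer_rate` weights, the damping value 0.85, and the toy graph are all hypothetical. Random jumps return only to the "base set" of nodes that contain the query keyword, so scores reflect authority flowing outward from keyword-matching nodes along the links.

```python
def authority_flow(edges, base_set, damping=0.85, iterations=50):
    """Propagate authority from keyword-matching nodes through a data graph.

    edges: dict mapping node -> list of (target, transfer_rate) pairs.
           The transfer rates are hypothetical per-edge weights.
    base_set: nodes containing the query keyword (must be non-empty).
    """
    if not base_set:
        raise ValueError("base_set must contain at least one node")
    nodes = set(edges) | set(base_set)
    for outs in edges.values():
        nodes.update(target for target, _ in outs)
    # Random jumps land only on base-set nodes, uniformly.
    jump = {n: (1.0 / len(base_set) if n in base_set else 0.0) for n in nodes}
    score = dict(jump)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * jump[n] for n in nodes}
        for src, outs in edges.items():
            for dst, rate in outs:
                # Each edge forwards a damped share of the source's score.
                nxt[dst] += damping * rate * score[src]
        score = nxt
    return score


# Toy graph: p3 contains the keyword and links to p1 and p2; p1 links to p2.
edges = {
    "p1": [("p2", 1.0)],
    "p2": [],
    "p3": [("p1", 0.5), ("p2", 0.5)],
}
scores = authority_flow(edges, base_set={"p3"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph, `p2` does not contain the keyword, yet it outranks `p1` because it receives authority both directly from `p3` and indirectly through `p1`. That is the sense in which link structure, rather than text matching alone, drives the ranking.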
Index Terms
- Authority-based keyword search in databases