Personalized ranking in web databases: establishing and utilizing an appropriate workload

Telang, Aditya; Chakravarthy, Sharma; Li, Chengkai

doi:10.1007/s10619-012-7106-2

Published: 16 August 2012

Volume 31, pages 47–70 (2013)

Distributed and Parallel Databases

Aditya Telang¹,
Sharma Chakravarthy¹ &
Chengkai Li¹

Abstract

The emergence of the deep Web has given a new connotation to the concept of ranking database query results. Earlier approaches for ranking either resorted to analyzing frequencies of database values and query logs or establishing user profiles. In contrast, an integrated approach, based on the notion of a similarity model, for holistically supporting user- and query-dependent ranking has been recently proposed (Telang et al. in IEEE Transactions on Knowledge and Data Engineering (TKDE), 2011). An important component of this framework is a workload consisting of ranking functions, wherein each function represents an individual user’s preferences towards the results of a specific query. At the time of answering a query for which no prior ranking function exists, the similarity model is employed, and is expected to ensure a good quality of ranking as long as a ranking function for a very similar user-query pair exists in this workload.

In this paper, we address the problem of determining an appropriate set of user-query pairs to form a workload of ranking functions to support user- and query-dependent ranking for Web databases. We propose a novel metric, termed workload goodness, that quantifies the notion of a “good” workload into an absolute value. The process of finding such a workload of optimal goodness is a combinatorially explosive problem; therefore, we propose a heuristic solution, and advance three approaches for determining an acceptable workload, in a static as well as a dynamic environment. We discuss the effectiveness of our proposal analytically as well as experimentally over two Web databases.

Notes

The concept of a workload here differs significantly from the one used in traditional databases. Here, a workload is a collection of ranking functions together with the user-query pairs for which the functions were derived; in traditional databases, it refers to a log of queries.
A ranking function is obtained via a learning model, proposed in [30], that analyzes a user’s preferences towards the results of a query.
The functional details of the similarity-based ranking framework are elaborated in Sect. 2.
Given that we focus on establishing only W_K, we use the terms workload and W_K interchangeably for the rest of the paper.
Typically, the numbers of users and queries on most real Web databases, such as Yahoo! Autos and Google Base, are extremely large, whereas the value of K is much smaller.
The value ‘any’ matches all possible values in the domain of the particular attribute. For example, a value of ‘any’ for the Transmission attribute in a Vehicle database retrieves cars with ‘manual’ as well as ‘auto’ transmission.
Without loss of generality, we assume {Q_1, Q_2, …, Q_r} are the common queries for U and U′, although they can be any queries.
As elaborated in Sect. 2, since the highest rank of 0 is assigned to the pair itself, the next highest possible rank, computed by (1), of a user-query pair with respect to a given pair is 1.
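
The ‘any’ semantics described in the notes above can be illustrated with a minimal Python sketch. It is not taken from the paper: the `matches` helper, the `Make` attribute, and the sample records are hypothetical; only the Transmission attribute and its ‘manual’ and ‘auto’ values come from the note.

```python
ANY = "any"  # wildcard value: matches every value in an attribute's domain


def matches(record: dict, query: dict) -> bool:
    """Return True if the record satisfies every attribute condition in the query."""
    return all(
        wanted == ANY or record.get(attr) == wanted
        for attr, wanted in query.items()
    )


# Hypothetical Vehicle records, for illustration only.
vehicles = [
    {"Make": "Honda", "Transmission": "manual"},
    {"Make": "Honda", "Transmission": "auto"},
    {"Make": "Toyota", "Transmission": "auto"},
]

# 'any' on Transmission leaves that attribute unconstrained.
query = {"Make": "Honda", "Transmission": ANY}
results = [v for v in vehicles if matches(v, query)]
```

Under these assumptions, `results` holds both Honda records, one with ‘manual’ and one with ‘auto’ transmission, while the Toyota record is excluded by the `Make` condition.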

References

1. Agrawal, R., Rantzau, R., Terzi, E.: Context-sensitive ranking. In: SIGMOD Conference, pp. 383–394. ACM, New York (2006)
2. Agrawal, S., Chaudhuri, S., Das, G., Gionis, A.: Automated ranking of database query results. In: Conference on Innovations in Database Research (CIDR) (2003)
3. Balabanovic, M., Shoham, Y.: Content-based collaborative recommendation. ACM Commun. 40(3), 66–72 (1997)
4. Basilico, J., Hofmann, T.: A joint framework for collaborative and content filtering. In: SIGIR, pp. 550–551 (2004)
5. Basu, C., Hirsh, H., Cohen, W.W.: Recommendation as classification: using social and content-based information in recommendation. In: AAAI/IAAI, pp. 714–720 (1998)
6. Bergman, M.K.: The deep web: surfacing hidden value. J. Electron. Publ. 7(1) (2001)
7. Billsus, D., Pazzani, M.J.: Learning collaborative information filters. In: International Conference on Machine Learning (ICML), pp. 46–54 (1998)
8. Blum, M., Floyd, R.W., Pratt, V., Rivest, R.L., Tarjan, R.E.: Time bounds for selection. J. Comput. Syst. Sci. 7, 448–461 (1973)
9. Chang, K.C.-C., He, B., Li, C., Patil, M., Zhang, Z.: Structured databases on the web: observations and implications. SIGMOD Rec. 33(3), 61–70 (2004)
10. Chaudhuri, S., Das, G., Hristidis, V., Weikum, G.: Probabilistic ranking of database query results. In: VLDB, pp. 888–899 (2004)
11. Chaudhuri, S., Das, G., Hristidis, V., Weikum, G.: Probabilistic information retrieval approach for ranking of database query results. TODS 31(3), 1134–1168 (2006)
12. Foltz, P.W., Dumais, S.T.: Personalized information delivery: an analysis of information filtering methods. ACM Commun. 35(12), 51–60 (1992)
13. Gauch, S., Speretta, M., Chandramouli, A., Micarelli, A.: User profiles for personalized information access. In: The Adaptive Web, pp. 54–89 (2007)
14. Google. Google base. http://www.google.com/base
15. Hofmann, T.: Collaborative filtering via gaussian probabilistic latent semantic analysis. In: SIGIR, pp. 259–266 (2003)
16. Hwang, S.-W.: Supporting ranking for data retrieval. Ph.D. thesis, University of Illinois, Urbana Champaign (2005)
17. Ilyas, I.F., Soliman, M.A.: Probabilistic Ranking Techniques in Relational Databases. Synthesis Lectures on Data Management (2011). Morgan & Claypool Publishers
18. Kanungo, T., Mount, D.: An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 881–892 (2002)
19. Koutrika, G.: Database query personalization. In: EDBT, pp. 147–152 (2005)
20. Koutrika, G., Ioannidis, Y.E.: Personalization of queries in database systems. In: ICDE, pp. 597–608 (2004)
21. Koutrika, G., Ioannidis, Y.E.: Constrained optimalities in query personalization. In: SIGMOD Conference, pp. 73–84 (2005)
22. Li, C., Chang, K.C.-C., Ilyas, I.F., Song, S.: Ranksql: query algebra and optimization for relational top-k queries. In: SIGMOD Conference, pp. 131–142 (2005)
23. Marian, A., Bruno, N., Gravano, L.: Evaluating top-k queries over web-accessible databases. ACM Trans. Database Syst. 29(2), 319–362 (2004)
24. Ortega-Binderberger, M., Chakrabarti, K., Mehrotra, S.: An approach to integrating query refinement in sql. In: EDBT, pp. 15–33 (2002)
25. Schein, A.I., Popescul, A., Ungar, L.H., Pennock, D.M.: Methods and metrics for cold-start recommendations. In: SIGIR, pp. 253–260 (2002)
26. Soliman, M.A., Ilyas, I.F., Ben-David, S.: Supporting ranking queries on uncertain and incomplete data. VLDB J. 19(4), 477–501 (2010)
27. Soliman, M.A., Ilyas, I.F., Martinenghi, D., Tagliasacchi, M.: Ranking with uncertain scoring functions: semantics and sensitivity measures. In: SIGMOD Conference, pp. 805–816 (2011)
28. Su, W., Wang, J., Huang, Q., Lochovsky, F.: Query result ranking over e-commerce web databases. In: Conference on Information and Knowledge Management (CIKM), pp. 575–584 (2006)
29. Telang, A., Li, C., Chakravarthy, S.: One size does not fit all: towards user- and query-dependent ranking for web databases. Technical report 6, University of Texas at Arlington (2009)
30. Telang, A., Li, C., Chakravarthy, S.: One size does not fit all: towards user- and query-dependent ranking for web databases. IEEE Transactions on Knowledge and Data Engineering (TKDE) (2011)
31. Werner, K.: Foundations of preferences in database systems. In: VLDB. VLDB Endowment, pp. 311–322 (2002)
32. Yu, H., Hwang, S.-w., Chang, K.C.-C.: Enabling soft queries for data retrieval. Inf. Syst. 32(4), 560–574 (2007)
33. Yu, H., Kim, Y., won Hwang, S.: Rv-svm: an efficient method for learning ranking svm. In: PAKDD, pp. 426–438 (2009)

Author information

Authors and Affiliations

Dept. of Comp. Sci. & Engg., The University of Texas at Arlington, Arlington, TX, USA
Aditya Telang, Sharma Chakravarthy & Chengkai Li

Corresponding author

Correspondence to Aditya Telang.

Additional information

Communicated by: Kaushik Chakrabarti.

About this article

Cite this article

Telang, A., Chakravarthy, S. & Li, C. Personalized ranking in web databases: establishing and utilizing an appropriate workload. Distrib Parallel Databases 31, 47–70 (2013). https://doi.org/10.1007/s10619-012-7106-2

Published: 16 August 2012
Issue Date: March 2013
DOI: https://doi.org/10.1007/s10619-012-7106-2
