A data-mining approach to preference-based data ranking founded on contextual information
Highlights
► The paper proposes a methodology to mine contextual preferences on tuples and attributes of a relational database.
► Preferences are used to personalize context-aware views over a database.
► Preferences are mined by extracting association rules from log data, requiring no user intervention.
► Test data is collected by having real users interact with a prototype of our system.
► Our approach shows better recall than other methodologies in the literature.
Introduction
The current ecosystem of available digital information represents an unprecedented opportunity for users, but at the same time risks overwhelming them during decision-making [1]. The effect of this problem is amplified for users who access data by means of mobile devices, which are equipped with limited resources and connectivity and thus require that only the most valuable information be kept on board. Imagine you want to keep on your smartphone some information for on-line trading, but also to support your shopping activity and your travels: some of the personal data you need for these operations resides on your device, but keeping what is necessary for all three operations on the smartphone all the time is not really sensible. Instead, eliminating the redundant information at any given time will speed up your work, both in terms of device efficiency and of the effectiveness you can achieve by working in the absence of information noise.
However, distinguishing useful data from all the information that is irrelevant to the specific application or user is not a trivial task, since the same piece of information can be considered differently, even by the same user, in different situations or places: in a word, in a different context.
This emergent problem has been tackled in the literature by introducing context models (see [2], [3], [4] for surveys) allowing the personalization of data repositories on the basis of a set of perspectives, or dimensions, such as the user's role and location, the time, his or her interests and the situations he or she is involved in [5]. However, data personalization based on context may be only a partial solution, since the tailoring of the available dataset may still be too coarse-grained. For example, if we consider a movie dataset and Bob – a young teenager who is interested in movies – a contextual system will suggest the movies played in cinemas close to Bob's location and appropriate for people of his age, but will not be able to propose any ranking or further filtering of this contextual data according to Bob's personal tastes: for example, Bob might like watching comedies when alone and thrillers when with his friends.
Therefore, to attain more effective personalization, this work couples the notion of context with the user's personal preferences: this makes it possible to rank the information delivered to Bob differently in each context (alone or with friends).
The approaches already proposed for personalizing relational data (tuples or attributes) on the basis of contextual preferences [6], [7], [8] rely on the collaboration of the users for preference indication. However, with a large variety of data and a considerable number of possible contexts, the manual specification of an extensive list of preferences may be a trying experience which discourages the user. A way around this problem is exploiting other information, implicitly provided by the past querying activity of the user. This activity can be of various kinds, e.g. Bob might formulate queries to visualize the titles of the available comedies, then select “The Muppets” to see further details and subsequently repeat the same operation for other Disney movies in the list, and finally decide to watch one of them. A system analyzing Bob's activity may discover that he is often attracted by Disney comedies.
Given this rationale, this paper's contribution is the PREMINE (PREference MINEr) methodology and the related system, which use data mining algorithms to learn the contextual preferences of the users on both tuples and attributes of relational databases. Our interest in relational technology is motivated by the fact that most commercial databases, and also a significant part of the deep web, rely on it; handling relational preferences has therefore long been recognized as an important issue [9].
Contextual preferences are thus used to further personalize the set of data associated with each context (called contextual view) and can be applied with two goals: (1) to minimize the information noise, presenting a list of the data ordered by their relevance for the user with the effect of “recommending” the highest-ranked data, (2) to fulfill the memory requirements imposed by small devices, by loading only the data which have been ranked high according to the user preferences. Our approach starts from the contextual preference model introduced in [7] and adds a sophisticated technique to mine contextual association rules (that is, co-occurrences between each context and the browsed data) from the past interaction of the user with the contextual views over a given relational dataset.
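To make the idea of co-occurrences between a context and the browsed data concrete, the following minimal sketch computes support and confidence for rules of the form context → property over a small interaction log. It is only an illustration under stated assumptions: the log format, the threshold values and the names `LOG` and `mine_context_rules` are hypothetical, and the code is not the mining algorithm used by PREMINE.

```python
from collections import Counter

# Hypothetical interaction log: one entry per browsed tuple, recording the
# context of the access and one property of the tuple that was browsed.
LOG = [
    ("alone", "genre=comedy"),
    ("alone", "genre=comedy"),
    ("alone", "genre=thriller"),
    ("with_friends", "genre=thriller"),
    ("with_friends", "genre=thriller"),
    ("with_friends", "genre=comedy"),
    ("with_friends", "genre=thriller"),
]


def mine_context_rules(log, min_support=0.1, min_confidence=0.6):
    """Return rules context -> item as (context, item, support, confidence).

    support    = fraction of all log entries containing both context and item
    confidence = fraction of the entries of that context containing the item
    The thresholds are illustrative assumptions.
    """
    total = len(log)
    pair_counts = Counter(log)
    context_counts = Counter(context for context, _ in log)
    rules = []
    for (context, item), count in pair_counts.items():
        support = count / total
        confidence = count / context_counts[context]
        if support >= min_support and confidence >= min_confidence:
            rules.append((context, item, support, confidence))
    # Most confident rules first; ties are broken alphabetically.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


RULES = mine_context_rules(LOG)
for context, item, support, confidence in RULES:
    print(f"{context} -> {item} (support={support:.2f}, confidence={confidence:.2f})")
```

On this toy log, only the rules linking "with_friends" to thrillers and "alone" to comedies pass both thresholds, which mirrors the Bob example given earlier.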
Although there are several degrees of freedom for the personalization, leading to a large set of possible approaches, in this paper we focus on the preference mining part and give a quick account of how the mined preferences are used to produce the personalized contextual view. Also, we remark that our proposal, unlike the majority of recommendation systems, does not require any explicit input from the users about their preferences.
The procedure goes as follows: a client application running on the user's device accesses a contextual view of the global database. This portion of data is initially selected only on the basis of the user's current context; the user's querying activity and subsequent browsing of the list of returned tuples allow the PREMINE server-side application to gain knowledge about the correlations between a context and the properties of the data preferred in that context. Afterwards, when the device connects to the application server, this knowledge is used to further filter and personalize the contextual view.
Note that the proposed approach does not completely exclude the manual specification of preferences; in fact, the two approaches can be used in conjunction: the user can manually add preferences or adjust the mined ones when they no longer reflect his or her actual needs. Some encouraging experiments performed with real users interacting with the dataset of a European video-on-demand company show the practical impact of our proposal.
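The last step of this procedure, filtering and ranking a contextual view with the mined knowledge, can be sketched as follows. This is a simplified illustration, not PREMINE's actual scoring scheme: the rule format, the use of the maximum matching weight as a tuple's score, the tuple budget `max_tuples` and all names and sample data are assumptions made for the example.

```python
def personalize_view(view, rules, context, max_tuples):
    """Rank the tuples of a contextual view and keep the best max_tuples.

    view:  list of dicts, one per tuple of the contextual view
    rules: list of (context, attribute, value, weight) entries; this layout
           is a hypothetical encoding of the mined preferences
    """
    # Keep only the weights learned for the current context.
    weights = {
        (attr, value): weight
        for ctx, attr, value, weight in rules
        if ctx == context
    }

    def score(row):
        # Assumption: a tuple scores the highest weight among the properties
        # it matches; 0.0 means that no preference is known for it.
        return max(
            (w for (attr, value), w in weights.items() if row.get(attr) == value),
            default=0.0,
        )

    # sorted() is stable, so tuples with equal scores keep their view order.
    ranked = sorted(view, key=score, reverse=True)
    return ranked[:max_tuples]


# Hypothetical sample data.
VIEW = [
    {"title": "Movie A", "genre": "thriller"},
    {"title": "Movie B", "genre": "comedy"},
    {"title": "Movie C", "genre": "drama"},
]
MINED = [
    ("alone", "genre", "comedy", 0.67),
    ("with_friends", "genre", "thriller", 0.75),
]

alone_view = personalize_view(VIEW, MINED, "alone", max_tuples=2)
friends_view = personalize_view(VIEW, MINED, "with_friends", max_tuples=1)
```

The `max_tuples` cut-off stands in for the memory constraint of a small device: only the tuples ranked highest for the current context are loaded.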
Running example: Fig. 1 shows the relational schema of the running example we use throughout the paper (a simplification of the mentioned case study), namely the information system of a company offering video-on-demand and movie-ticket reservation services. All the applications composing the information system rely on a central database storing all the managed information. This database is also used for the experimental session at the end of the paper.
Paper structure: The structure of the paper is as follows. Section 2 presents the state of the art, Section 3 introduces some preliminary notions and Section 4 presents the mining framework. Sections 5 and 6 describe our strategies for mining preferences on tuples and attributes, respectively. Section 7 shows the effectiveness of the approach by illustrating the experiments we have performed and, finally, Section 8 draws the conclusions.
Section snippets
State of the Art
The technique of mining contextual association rules has been proposed in [10] with the purpose of analyzing frequent user accesses to available services; however, in that case the authors focus on the mining process without using the discovered association rules for data personalization.
The problem of learning user preferences has been recognized as important in many applications. For example, problems related to ordering data on the basis of preference information explicitly provided by the
Preliminaries
PREMINE is an extension of the context-based personalization framework presented in [7]: thus, before introducing the innovative aspects of our proposal, in this section we quickly describe the background notions our work relies on.
Preference mining framework
Our approach for mining contextual preferences is integrated within the wider framework for contextual-view tailoring and personalization shown in Fig. 3.
Users interact with the information system by means of different kinds of portable devices, like PDAs or smartphones, but also by other systems, such as hot-spot terminals or desktop computers, depending on the application scenario. The users' devices run the client applications accessing the context-relevant data portions, i.e., the
Mining σ-preferences
In this section, we describe how the phases depicted in Fig. 4 are implemented for σ-preferences, i.e., preferences on tuples.
Mining π-preferences
In this section, we describe how the phases depicted in Fig. 4 are implemented for π-preferences, i.e., preferences on attributes.
Experiments
PREMINE is implemented in Java and integrated within the personalization methodology described in our previous work [7].
To evaluate our approach, we studied the user experience of a set of candidates with the PREMINE client prototype by collecting their activities. Specifically, we built a client in the movie domain, allowing users to browse a commercial database actually adopted by a European company of
Conclusions
This paper has proposed PREMINE, a methodology – and associated tool – exploiting data mining for the automatic extraction of contextual preferences on relational databases, in order to determine the personalized portion of data that will be provided to the end user at run time, in the current context.
The overall approach has been tested with real users, proving it an effective means for context-aware view personalization for relational databases. As future work, we plan to study how new
Acknowledgments
This research has been partially funded by the European Commission, Programme IDEAS ERC, Project 227977-SMScom and by the Italian project Industria 2015, Program no. MI01 00091 SENSORI. Sincere thanks are due to Paolo Garza for carefully reading the paper and for his valuable suggestions, especially on the experimental section. We also thank Paolo Cremonesi and Roberto Turrin for the dataset they provided.
References (42)
- et al. Context-aware systems: a literature review and classification. Expert Systems with Applications (2009)
- et al. CARVE: context-aware automatic view definition over relational databases. Information Systems (2013)
- D. Agrawal, P. Bernstein, E. Bertino, S. Davidson, U. Dayal, M. Franklin, J. Gehrke, L. Haas, A. Halevy, J. Han, H.V....
- et al. A survey on context-aware systems. International Journal of Ad Hoc and Ubiquitous Computing (2007)
- et al. A data-oriented survey of context models. SIGMOD Record (2007)
- et al. And what can context do for data? Communications of the ACM (2009)
- K. Stefanidis, E. Pitoura, P. Vassiliadis, Adding context to preferences, in: Proceedings of the ICDE 2007, 23rd...
- A. Miele, E. Quintarelli, L. Tanca, A methodology for preference-based personalization of contextual data, in:...
- P. Ciaccia, R. Torlone, Modeling the propagation of user preferences, in: Proceedings of the ER 2011, 30th...
- et al. Personalizing queries based on networks of composite preferences. ACM Transactions on Database Systems (2010)
- Automatic personalization based on web usage mining. Communications of the ACM
- Preference formulas in relational queries. ACM Transactions on Database Systems
- A survey on representation, composition and application of preferences in database systems. ACM Transactions on Database Systems
- A statistical model for user preference. IEEE Transactions on Knowledge and Data Engineering