
Dynamic personalization in conversational recommender systems

  • Original Article
  • Information Systems and e-Business Management

Abstract

Conversational recommender systems are e-commerce applications that interactively assist online users in achieving their interaction goals during their sessions. In our previous work, we proposed and validated a methodology for conversational systems that autonomously learns which web page to display to the user at each step of the session. We employed reinforcement learning to learn an optimal strategy, i.e., one that is personalized for a real user population. In this paper, we extend our methodology so that it autonomously learns and updates the optimal strategy dynamically (at run-time), and individually for each user. This learning continues after every session, for as long as the user keeps interacting with the system. We evaluate our approach in an off-line simulation with four simulated users, as well as in an online evaluation with thirteen real users. The results show that an optimal strategy is learnt and updated for each real and simulated user. For each simulated user, the optimal behavior is reasonably well adapted to that user's characteristics, but converges only after several hundred sessions. For each real user, the optimal behavior converges within just a few sessions. It provides assistance only in certain situations, allowing many users to buy several products together in less time, with more page views and fewer query executions. We argue that our approach is novel and show how its current limitations can be addressed.
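
As a minimal illustration of the tabular Q-learning update this kind of per-user strategy learning relies on (Watkins and Dayan 1992, cited in the references), the following Python sketch keeps one Q-table per user and backs it up after every observed session step. The function names and the ALPHA/GAMMA values are illustrative assumptions, not the authors' settings.

```python
from collections import defaultdict

ALPHA = 0.1   # learning rate (assumed value, not from the paper)
GAMMA = 0.9   # discount factor (assumed value, not from the paper)

Q = defaultdict(float)   # per-user table: (state, action) -> Q-value

def update(state, action, reward, next_state, next_actions):
    """One Q-learning backup, applied after each observed session step."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

def greedy_action(state, actions):
    """Greedy policy over the learnt table (exploration omitted)."""
    return max(actions, key=lambda a: Q[(state, a)])
```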



Notes

  1. The concept paper of this methodology appeared at the Malaysian Joint Conference on Artificial Intelligence (MJCAI) (Mahmood et al. 2010a).

  2. We are not computing the state transition probability model of the user’s behavior in advance.

  3. We acquired NutKing’s data from eCTRL Solutions, an Italian company offering tourism-based technologies for conversational recommender systems.

  4. We set these values after analyzing some simulated sessions with our user models.

  5. http://www.useit.com/alertbox/ecommerce.html.

  6. The example is given only for the 3 states with UR = SelectPromotion, but the same explanation applies to the corresponding states with UR = SelectTop10.

  7. http://www.richrelevance.com.

  8. http://www.intershop.com.

  9. http://www.oracle.com/us/products/applications/siebel/index.html.

  10. http://www.omniture.com/en/products/conversion/recommendations.

  11. http://www.locayta.com/.

  12. Suggesting items bought by users with similar preferences; see Resnick and Varian (1997) for more details on using these preference levels to make recommendations.

  13. These two actions are representative of real user behaviors; users rejected tightening for result sizes close to 100.

References

  • Aha D, Breslow L (1997) Refining conversational case libraries. In: Case-based reasoning research and development, proceedings of the 2nd international conference on case-based reasoning (ICCBR-97), Springer, pp 267–278

  • Anderson CR, Domingos P, Weld DS (2001) Adaptive web navigation for wireless devices. In: Proceedings of the 17th international joint conference on artificial intelligence, vol 2, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, IJCAI’01, pp 879–884. http://dl.acm.org/citation.cfm?id=1642194.1642211

  • Brusilovsky P (2001) Adaptive hypermedia. User Model User Adapt Interact 11(1–2):87–110

  • Brusilovsky P, Kobsa A, Nejdl W (2007) The adaptive web: methods and strategies of web personalization, 1st edn. Lecture notes in computer science, Springer, Berlin

  • Ceaparu I, Lazar J, Bessiere K, Robinson J, Shneiderman B (2004) Determining causes and severity of end-user frustration. Int J Hum Comput Interaction. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.157.8407

  • Cheng Y (2009) Real time demand learning-based Q-learning approach for dynamic pricing in e-retailing setting. In: IEEC ’09: Proceedings of the 2009 international symposium on information engineering and electronic commerce, IEEE Computer Society, Washington, DC, USA, pp 594–598. doi:10.1109/IEEC.2009.131

  • De Meo P, Rosaci D, Sarnè GM, Ursino D, Terracina G (2007) EC-XAMAS: supporting e-commerce activities by an XML-based adaptive multi-agent system. Appl Artif Intell 21(6):529–562. doi:10.1080/08839510701409052

  • Golovin N, Rahm E (2004) Reinforcement learning architecture for web recommendations. In: International conference on information technology: coding and computing (ITCC’04), vol 1, April 5–7, 2004, Las Vegas, Nevada, USA, pp 398–402

  • Goy A, Ardissono L, Petrone G (2007) Personalization in e-commerce applications. In: The adaptive web: methods and strategies of web personalization, chap 16, pp 485–520

  • Hirohiko Morita EU, Yamakawa T (2009) Markov model based adaptive web advertisement system by tracking a user’s taste. Int J Innov Comput Info Control 5(3):811–819

  • Kazienko P, Kolodziejski P (2006) Personalized integration of recommendation methods for e-commerce. IJCSA 3(3):12–26

  • Kim Y, Yum BJ, Song J, Kim SM (2005) Development of a recommender system based on navigational and behavioral patterns of customers in e-commerce sites. Expert Syst Appl 28(2):381–393

  • Kobsa A, Koenemann J, Pohl W (2001) Personalized hypermedia presentation techniques for improving online customer relationships. Knowl Eng Rev 16:111–155

  • Li L, Yang Z, Wang B, Kitsuregawa M (2007) Dynamic adaptation strategies for long-term and short-term user profile to personalize search. In: APWeb/WAIM’07: proceedings of the joint 9th Asia-Pacific web and 8th international conference on web-age information management conference on advances in data and web management, Springer, Berlin, Heidelberg, pp 228–240

  • Mahmood T, Ahmed SH, Mahmood S (2010a) Optimal dynamic personalization in conversational recommender systems. In: MJCAI: Malaysian joint conference on artificial intelligence, Malaysia

  • Mahmood T, Ricci F, Venturini A (2010b) Improving recommendation effectiveness by adapting the dialogue strategy in online travel planning. J Info Technol Tour 11(3):285–302

  • Mirzadeh N, Ricci F (2007) Cooperative query rewriting for decision making support and recommender systems. Appl Artif Intell 21:1–38

  • Nakada T, Kanai H, Kunifuji S (2007) Dynamic book recommendation model for real bookstores. In: The 5th international conference on pervasive computing (Pervasive 2007), Canada

  • Nielsen J, Molich R, Snyder C, Farrell S (2001) E-commerce user experience. Nielsen Norman Group

  • Peterson ET (2011) The big book of key performance indicators, 1st edn. No. 2 in Web Analytics Demystified, webanalyticsdemystified.com

  • Resnick P, Varian HR (1997) Recommender systems. Commun ACM 40(3):56–58

  • Rojanavasu P, Srinil P, Pinngern O (2005) New recommendation system using reinforcement learning. In: Proceedings of the fourth international conference on eBusiness, Bangkok, Thailand

  • Rosaci D, Sarné GM (2012) A multi-agent recommender system for supporting device adaptivity in e-commerce. J Intell Inf Syst 38(2):393–418. doi:10.1007/s10844-011-0160-9

  • Schwartz B (2005) The paradox of choice: why more is less. Harper Perennial, New York

  • Smith M, Lee-Urban S, Muñoz-Avila H (2007) Retaliate: learning winning policies in first-person shooter games. In: AAAI, pp 1801–1806

  • Song X, Lin CY, Tseng BL, Sun MT (2006) Modeling evolutionary behaviors for community-based dynamic recommendation. In: SDM

  • Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press, Cambridge. http://www.cs.ualberta.ca/~sutton/book/the-book.html

  • Tsang SL, Clarke S (2007) Mining user models for effective adaptation of context-aware applications. In: IPC ’07: proceedings of the 2007 international conference on intelligent pervasive computing, IEEE, Washington, DC, USA, pp 178–187. doi:10.1109/IPC.2007.76

  • Watkins C, Dayan P (1992) Technical note: Q-learning. Mach Learn 8(3):279–292. doi:10.1023/A:1022676722315


Author information


Corresponding author

Correspondence to Tariq Mahmood.

Appendices

Appendix 1: Details of user models

  1. WillUM: This user accepts tightening if any of the suggested attributes has a non-NULL value in the test item. The user accepts this attribute even if it is not her next preferred attribute, according to the sorted order of the attributes' frequency of usage. This allows us to model the "willingness" of the user, because the user accepts this attribute although she does not really prefer it. If none of the suggested attributes has a non-NULL value, then acceptance cannot be simulated. In this situation, WillUM acts as follows:

     • if the result size is smaller than 100, i.e., when CRS = small or CRS = medium, the user rejects tightening and executes the original query, and

     • if the result size is larger than (or equal to) 100, i.e., when CRS is either large or very large, the user rejects tightening and autonomously modifies her query (as in Case 1) (T-modq) (Note 13).

  2. ModwillUM: The user considers accepting tightening only 26 % of the times that Sugg is executed during a session. In doing so, ModwillUM simulates the real users' response to Sugg (Mirzadeh and Ricci 2007). We make a random selection (from a uniform distribution) of the situations in which the user will accept tightening. If the user considers accepting tightening, acceptance is simulated as in WillUM. If acceptance cannot be simulated in this case, or if the user does not consider accepting tightening (74 % of the time), then the user either rejects tightening or manually modifies her query, as in WillUM.

  3. UnwillUM: The user never accepts tightening; if \(CRS \in \{small, medium\}\), the user rejects tightening and executes the query. Otherwise, if \(CRS \in \{large, very\ large\}\), the user modifies her query as in WillUM.

  4. PopRandomUM: This model represents the behavior of a user population: each time Sugg is executed, we randomly select (from a uniform distribution) and simulate one of the above three user behaviors. A code sketch of all four models follows this list.
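
For concreteness, the following Python sketch implements the four user models as decision functions. The dict-based test-item representation and the function names are illustrative assumptions; the returned PUR labels (T-acct, T-rejt, T-modq) match those used in Appendix 2.

```python
import random

def willum(suggested_attrs, test_item, crs):
    """WillUM: accept tightening on any suggested attribute with a non-NULL
    value in the test item, even if it is not the next preferred attribute."""
    for attr in suggested_attrs:
        if test_item.get(attr) is not None:
            return ("T-acct", attr)            # accept tightening
    # Acceptance cannot be simulated: fall back on the result-set size (CRS).
    if crs in ("small", "medium"):             # result size below 100
        return ("T-rejt", None)                # reject, execute original query
    return ("T-modq", None)                    # reject, modify the query

def modwillum(suggested_attrs, test_item, crs, p_consider=0.26):
    """ModwillUM: consider accepting on only 26 % of Sugg executions
    (Mirzadeh and Ricci 2007); otherwise reject or modify as WillUM does."""
    if random.random() < p_consider:
        return willum(suggested_attrs, test_item, crs)
    return ("T-rejt", None) if crs in ("small", "medium") else ("T-modq", None)

def unwillum(crs):
    """UnwillUM: never accept tightening."""
    return ("T-rejt", None) if crs in ("small", "medium") else ("T-modq", None)

def poprandomum(suggested_attrs, test_item, crs):
    """PopRandomUM: uniformly simulate one of the three behaviors above."""
    behavior = random.choice(["will", "modwill", "unwill"])
    if behavior == "will":
        return willum(suggested_attrs, test_item, crs)
    if behavior == "modwill":
        return modwillum(suggested_attrs, test_item, crs)
    return unwillum(crs)
```

For example, willum(["price", "category"], {"price": 20, "category": None}, "large") returns ("T-acct", "price"), since the first suggested attribute has a non-NULL value in the test item.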

Appendix 2: Q-values logged in off-line evaluation

The Q-values for all pairs in which PUR = G are always 100, since G is a goal state; hence, we do not log the Q-values for such pairs. We now count the number of remaining pairs. The pair associated with the initial state is {PUR = S-go, CRS = very large}_ShowQF. There are also four states in which PUR = QF-execq (as mentioned above), and in each of these the Agent can execute either Sugg or Exec, which gives \(4\times2=8\) pairs. For each of the remaining 5 PUR values, there are 4 possible states, one for each possible value of CRS. Moreover, when PUR = T-acct, PUR = T-rejt, PUR = T-modq, PUR = R-modq, and PUR = R-add, the Agent can only take the actions Exec, Exec, Modify, Modify and Add, respectively. This gives \(5\times4\times1=20\) state-action pairs, so the total number of pairs for which we log the Q-values is \(1+8+20=29\); the sketch below reproduces this enumeration.
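
A short Python sketch that reproduces the count of logged (state, action) pairs derived above; the CRS values and the one-admissible-action constraints are taken directly from the text, while the data layout is an assumption.

```python
CRS_VALUES = ["small", "medium", "large", "very large"]

# 1 pair for the initial state
pairs = [({"PUR": "S-go", "CRS": "very large"}, "ShowQF")]

# PUR = QF-execq: 4 states x 2 actions (Sugg or Exec) = 8 pairs
for crs in CRS_VALUES:
    for action in ("Sugg", "Exec"):
        pairs.append(({"PUR": "QF-execq", "CRS": crs}, action))

# Remaining 5 PUR values: 4 states each, exactly one admissible action
ONLY_ACTION = {"T-acct": "Exec", "T-rejt": "Exec", "T-modq": "Modify",
               "R-modq": "Modify", "R-add": "Add"}
for pur, action in ONLY_ACTION.items():
    for crs in CRS_VALUES:
        pairs.append(({"PUR": pur, "CRS": crs}, action))

assert len(pairs) == 1 + 8 + 20 == 29
```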

Appendix 3: Q-values logged in online evaluation

The 23 Q-values for our online evaluation are listed below; the original list contained one duplicated entry, which has been removed here (a code rendering follows the list):

  1. {UR = Login, PB = False, TE = Less}_SWP
  2. {UR = BuyPromotion, PB = True, TE = Less}_ATC
  3. {UR = BuyPromotion, PB = True, TE = More}_ATC
  4. {UR = BuyPromotion, PB = False, TE = Less}_ATC
  5. {UR = BuyPromotion, PB = False, TE = More}_ATC
  6. {UR = BuyTop10, PB = True, TE = Less}_ATC
  7. {UR = BuyTop10, PB = True, TE = More}_ATC
  8. {UR = BuyTop10, PB = False, TE = Less}_ATC
  9. {UR = BuyTop10, PB = False, TE = More}_ATC
  10. {UR = SelectPromotion, PB = True, TE = Less}_SPP
  11. {UR = SelectPromotion, PB = True, TE = More}_STP
  12. {UR = SelectPromotion, PB = True, TE = More}_SPP
  13. {UR = SelectPromotion, PB = False, TE = Less}_SPP
  14. {UR = SelectPromotion, PB = False, TE = Less}_STP
  15. {UR = SelectPromotion, PB = False, TE = More}_STP
  16. {UR = SelectPromotion, PB = False, TE = More}_SPP
  17. {UR = SelectTop10, PB = True, TE = Less}_STP
  18. {UR = SelectTop10, PB = True, TE = More}_STP
  19. {UR = SelectTop10, PB = True, TE = More}_SPP
  20. {UR = SelectTop10, PB = False, TE = Less}_STP
  21. {UR = SelectTop10, PB = False, TE = Less}_SPP
  22. {UR = SelectTop10, PB = False, TE = More}_STP
  23. {UR = SelectTop10, PB = False, TE = More}_SPP
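
As a compact, machine-checkable rendering of the list above, the following Python sketch encodes each logged pair as a (UR, PB, TE, action) tuple; the tuple encoding itself is an assumption made for illustration.

```python
LOGGED_PAIRS = [
    ("Login", False, "Less", "SWP"),
    # BuyPromotion and BuyTop10: all four (PB, TE) states, action ATC
    *[("BuyPromotion", pb, te, "ATC") for pb in (True, False) for te in ("Less", "More")],
    *[("BuyTop10", pb, te, "ATC") for pb in (True, False) for te in ("Less", "More")],
    # SelectPromotion: seven logged (state, action) combinations
    ("SelectPromotion", True, "Less", "SPP"),
    ("SelectPromotion", True, "More", "STP"),
    ("SelectPromotion", True, "More", "SPP"),
    ("SelectPromotion", False, "Less", "SPP"),
    ("SelectPromotion", False, "Less", "STP"),
    ("SelectPromotion", False, "More", "STP"),
    ("SelectPromotion", False, "More", "SPP"),
    # SelectTop10: seven logged (state, action) combinations
    ("SelectTop10", True, "Less", "STP"),
    ("SelectTop10", True, "More", "STP"),
    ("SelectTop10", True, "More", "SPP"),
    ("SelectTop10", False, "Less", "STP"),
    ("SelectTop10", False, "Less", "SPP"),
    ("SelectTop10", False, "More", "STP"),
    ("SelectTop10", False, "More", "SPP"),
]
# 23 pairs in total, all distinct
assert len(LOGGED_PAIRS) == 23 and len(set(LOGGED_PAIRS)) == 23
```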


Cite this article

Mahmood, T., Mujtaba, G. & Venturini, A. Dynamic personalization in conversational recommender systems. Inf Syst E-Bus Manage 12, 213–238 (2014). https://doi.org/10.1007/s10257-013-0222-3
