The Short-term User Modeling for Predictive Applications

Journal on Data Semantics

Abstract

One important purpose of data mining on the web is to reveal hidden characteristics of users, including their behavior. These characteristics are often used to analyze previous user actions and preferences, and to predict future behavior. An average user session consists of only a few actions, which complicates user modeling and subsequent prediction tasks. Such tasks are usually studied from the long-term point of view (e.g., contract renewal or course dropout). In contrast, short-term user modeling plays an important role in web applications, where it helps to improve the user experience. Its shortcoming is that it often requires rich data, which are rarely available. For this reason, we propose a novel user model focused on capturing changes in the user’s behavior at the level of specific actions. The model is based on enriching user actions by comparing the data of the current session with those of previous sessions. As the model is based on generally available data sources, the approach is applicable to a wide range of existing systems. We evaluate our model on the task of session end intent prediction in the e-learning and news domains. By reflecting differences in user behavior, we are able to predict a particular user’s intent to end the session within his/her next few actions. The results show that the proposed model achieves higher precision, accuracy and session hit ratio than baseline models.

Acknowledgements

This work was partially supported by the Slovak Research and Development Agency under contract No. APVV-15-0508, by the Scientific Grant Agency of the Slovak Republic under Grants No. VG 1/0667/18 and VG 1/0646/15, and is a partial result of the Research & Development Operational Programme for the projects ITMS 26240120039 and ITMS 26240220084, co-funded by the European Regional Development Fund.

Author information

Corresponding author

Correspondence to Michal Kompan.

Appendix

The proposed model consists of two attribute types (Fig. 1):

  1. Comparative attributes \(A_C\)

  2. Descriptive attributes \(A_D\)

There are nine comparative attributes in total, \(ac_1,\ldots ,ac_9\), described in Table 1. Each comparative attribute is computed for each of the eight time layers \(tl \in TL\) and each of the two model parts \(mp \in MP\), resulting in \(9 \times 8 \times 2 = 144\) comparative attribute values. The nine comparative attributes (Table 1) are formally defined as follows.

Attribute. Let \(S_{u,n} \in S_u\), \(n\in {\mathbb {N}}\), be the current session of a user u. Then \(Act_{u,n}\) is the set of actions the user performed in session \(S_{u,n}\).

$$\begin{aligned} Attr1= & {} \frac{\left| Act_{u,n} \right| }{\overline{\left| Act_{u,1..n-1} \right| }} \end{aligned}$$
(5)
$$\begin{aligned} Attr2= & {} \left| Act_{u,n} \right| - \overline{\left| Act_{u,1..n-1} \right| } \end{aligned}$$
(6)
$$\begin{aligned} Attr3= & {} \text{if } \left| Act_{u,n} \right| \ge \overline{\left| Act_{u,1..n-1} \right| } \text{ then } 1 \text{ else } 0 \end{aligned}$$
(7)

Let \(ts_{u,n}\) be the time spent by user u in the session \(S_{u,n}\).

$$\begin{aligned} Attr4= & {} \frac{\left| ts_{u,n} \right| }{\overline{\left| ts_{u,1..n-1} \right| }} \end{aligned}$$
(8)
$$\begin{aligned} Attr5= & {} \left| ts_{u,n} \right| - \overline{\left| ts_{u,1..n-1} \right| } \end{aligned}$$
(9)
$$\begin{aligned} Attr6= & {} \text{if } \left| ts_{u,n} \right| \ge \overline{\left| ts_{u,1..n-1} \right| } \text{ then } 1 \text{ else } 0 \end{aligned}$$
(10)
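Eqs. 5–10 share one pattern: the current session's value is compared with the mean over the user's previous sessions as a ratio, a difference, and a binary flag. A minimal Python sketch of that pattern follows. The function name `comparative_triple` and the choice to return `None` when there is no usable history are assumptions of this sketch; the text above does not say how a user's first session is handled.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple


def comparative_triple(
    current: float, previous: Sequence[float]
) -> Optional[Tuple[float, float, int]]:
    """Return (ratio, difference, flag) of `current` against the mean of `previous`.

    With action counts this corresponds to Attr1-Attr3 (Eqs. 5-7);
    with session durations it corresponds to Attr4-Attr6 (Eqs. 8-10).
    """
    if not previous:
        return None  # assumption: no history means the attributes are undefined
    avg = mean(previous)
    if avg == 0:
        return None  # assumption: avoid division by zero in the ratio
    ratio = current / avg                 # Eq. 5 / Eq. 8
    difference = current - avg            # Eq. 6 / Eq. 9
    flag = 1 if current >= avg else 0     # Eq. 7 / Eq. 10
    return ratio, difference, flag


# Current session has 6 actions; previous sessions had 2 and 4 actions.
print(comparative_triple(6, [2, 4]))
```

Passing durations instead of action counts, for example `comparative_triple(30.0, [60.0, 20.0])`, gives the time-based variants.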

Let \(S_{cat,u}\) be the set of all sessions of user u that terminate after visiting a page with category cat, and let \(S_{u}\) be the set of all sessions of user u.

$$\begin{aligned} Attr7=\frac{\left| S_{cat,u} \right| }{\left| S_u \right| } \times \left| Act_{u,n} \right| \end{aligned}$$
(11)

Let \(Act_{cat,u,n}\) be the subset of actions \(Act_{u,n}\) with category cat.

$$\begin{aligned} Attr8= & {} \frac{\left| S_{cat,u} \right| }{\left| S_u \right| } \times \left| Act_{cat,u,n} \right| \end{aligned}$$
(12)
$$\begin{aligned} Attr9= & {} \frac{\left| S_{cat,u} \right| }{\left| S_u \right| } \times ts_{u,n} \end{aligned}$$
(13)
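Eqs. 11–13 can be sketched in Python as follows. The `Session` class, the function name, and the reading of "terminating after visiting a page with category cat" as "the last visited page has category cat" are assumptions of this sketch. Whether \(S_u\) includes the current session is not stated above, so the caller decides what to pass as `history`.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Session:
    categories: List[str]  # category of each action, in visit order
    duration: float        # time spent in the session (unit is an assumption)


def category_exit_attributes(
    history: Sequence[Session], current: Session, cat: str
) -> Optional[Tuple[float, float, float]]:
    """Return (Attr7, Attr8, Attr9) for category `cat`."""
    if not history:
        return None  # assumption: undefined without any sessions in S_u
    # |S_cat,u| / |S_u|: share of sessions whose last action has category `cat`
    ended_in_cat = sum(1 for s in history if s.categories and s.categories[-1] == cat)
    exit_share = ended_in_cat / len(history)
    n_actions = len(current.categories)                            # |Act_u,n|
    n_cat_actions = sum(1 for c in current.categories if c == cat)  # |Act_cat,u,n|
    return (
        exit_share * n_actions,         # Eq. 11
        exit_share * n_cat_actions,     # Eq. 12
        exit_share * current.duration,  # Eq. 13
    )
```

For instance, if two of four historical sessions ended on a `"sport"` page, the exit share is 0.5, and a current session with three actions (two of them in `"sport"`) lasting 40 time units yields `(1.5, 1.0, 20.0)`.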

Let \(pageseq_m\) be the sequence of m consecutively visited pages, i.e., user actions \(Act_{u,n}\). Let \(pageseq_{m,end}\) be the sequence of m consecutively visited pages, i.e., user actions \(Act_{u,n}\), which results in the session end.

$$\begin{aligned} Attr10= & {} \frac{\left| pageseq_{m,end} \right| }{\left| pageseq_m \right| }; \, \, \, m=4 \end{aligned}$$
(14)
$$\begin{aligned} Attr11= & {} \frac{\left| pageseq_{m,end} \right| }{\left| pageseq_m \right| }; \, \, \, m=3 \end{aligned}$$
(15)
$$\begin{aligned} Attr12= & {} \frac{\left| pageseq_{m,end} \right| }{\left| pageseq_m \right| }; \, \, \, m=2 \end{aligned}$$
(16)
$$\begin{aligned} Attr13= & {} \frac{\left| pageseq_{m,end} \right| }{\left| pageseq_m \right| }; \, \, \, m=1 \end{aligned}$$
(17)
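One possible reading of Eqs. 14–17 is a frequency ratio: how often the user's last m pages occurred in past sessions, and in what share of those occurrences the sequence closed the session. The Python sketch below implements that reading. The interpretation, the function name `end_ratio`, and the `None` result for unseen sequences are assumptions, since the text does not spell out how the counts are obtained.

```python
from typing import Optional, Sequence


def end_ratio(
    history: Sequence[Sequence[str]], recent: Sequence[str], m: int
) -> Optional[float]:
    """Share of occurrences of the last `m` items of `recent` that ended a session.

    `history` is a collection of past sessions, each a sequence of page ids.
    """
    if m < 1 or len(recent) < m:
        return None  # not enough actions yet to form a sequence of length m
    pattern = tuple(recent[-m:])
    total = 0   # |pageseq_m|: all occurrences of the pattern
    at_end = 0  # |pageseq_m,end|: occurrences that close a session
    for pages in history:
        for i in range(len(pages) - m + 1):
            if tuple(pages[i:i + m]) == pattern:
                total += 1
                if i + m == len(pages):
                    at_end += 1
    return at_end / total if total else None
```

The function does not depend on what the items are, so calling it with sequences of categories instead of page ids gives the analogous ratio for category sequences.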

Let \(catseq_{m}\) be the sequence of m consecutively visited categories, i.e., user actions \(Act_{u,n}\) with category cat. Let \(catseq_{m,end}\) be the sequence of m consecutively visited categories, i.e., user actions \(Act_{u,n}\) with category cat, which results in the session end.

$$\begin{aligned} Attr14= & {} \frac{\left| catseq_{m,end} \right| }{\left| catseq_m \right| }; \, \, \, m=4 \end{aligned}$$
(18)
$$\begin{aligned} Attr15= & {} \frac{\left| catseq_{m,end} \right| }{\left| catseq_m \right| }; \, \, \, m=3 \end{aligned}$$
(19)
$$\begin{aligned} Attr16= & {} \frac{\left| catseq_{m,end} \right| }{\left| catseq_m \right| }; \, \, \, m=2 \end{aligned}$$
(20)
$$\begin{aligned} Attr17= & {} \frac{\left| catseq_{m,end} \right| }{\left| catseq_m \right| }; \, \, \, m=1 \end{aligned}$$
(21)
$$\begin{aligned} Attr18= & {} \left| Act_{u,n} \right| \end{aligned}$$
(22)
$$\begin{aligned} Attr19= & {} ts_{u,n} \end{aligned}$$
(23)

Let \(Cat_{distinct}\) be the set of distinct categories visited in actions \(Act_{u,n}\) within session \(S_{u,n}\).

$$\begin{aligned} Attr20=\left| Cat_{distinct} \right| \end{aligned}$$
(24)

Let \(cat_{last}\) be the category of the page currently visited by the user.

$$\begin{aligned} Attr21= & {} \left| Act_{cat_{last},u,n} \right| \end{aligned}$$
(25)
$$\begin{aligned} Attr22= & {} \left| Act_{cat_{last},u,n} \right| \times \left| Cat_{distinct} \right| \end{aligned}$$
(26)
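Eqs. 22–26 depend only on the current session, so they can be computed in one pass. The Python sketch below assumes that each action carries exactly one category and that the session is given as a list of categories in visit order. The function name and the dictionary keys are illustrative.

```python
from typing import Dict, Sequence, Union


def session_attributes(
    categories: Sequence[str], duration: float
) -> Dict[str, Union[int, float]]:
    """Compute Attr18-Attr22 for the current session."""
    if not categories:
        raise ValueError("the session must contain at least one action")
    distinct = set(categories)  # Cat_distinct
    cat_last = categories[-1]   # category of the currently visited page
    # |Act_cat_last,u,n|: actions in the session sharing the last category
    last_count = sum(1 for c in categories if c == cat_last)
    return {
        "Attr18": len(categories),             # Eq. 22
        "Attr19": duration,                    # Eq. 23
        "Attr20": len(distinct),               # Eq. 24
        "Attr21": last_count,                  # Eq. 25
        "Attr22": last_count * len(distinct),  # Eq. 26
    }
```

For a session visiting `["news", "sport", "news", "tech", "news"]`, the last category is `"news"` with three actions and there are three distinct categories, so Attr21 is 3 and Attr22 is 9.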

As described in Sect. 3, comparative attributes are calculated for two model parts (personal and global). We defined these attributes (Attr1–Attr9) in Eqs. 5–13 from the perspective of the personal part. The global part attributes 1–9 are calculated in the same way, but all users, not only user u, are considered. In other words, all metrics for user u are replaced by the average over all users. For example, in the personal part \(ts_{u,n}\) is defined as the time spent by user u within session n (the last session). In the global part, by contrast, \(ts_{u,n}\) is the average time spent by all users within their last sessions.

Following the idea of the proposed model, each of the comparative attributes (Attr1–Attr9) is also calculated for several time layers. In other words, for a specific time layer (e.g., month), all attributes are calculated based only on the last month's data.
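The time-layer restriction can be sketched as a filter applied before the historical average is taken. In the Python sketch below, the window lengths, the half-open interval convention, and the function name `layer_average` are assumptions; the eight time layers themselves are not listed in this appendix.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional, Sequence, Tuple


def layer_average(
    sessions: Sequence[Tuple[datetime, float]],
    now: datetime,
    window: timedelta,
) -> Optional[float]:
    """Mean session value over sessions that started within `window` before `now`.

    `sessions` holds (start time, value) pairs, where the value is for example
    the number of actions or the time spent in the session.
    """
    # assumption: a session belongs to the layer if now - window <= start < now
    values = [v for start, v in sessions if now - window <= start < now]
    return mean(values) if values else None


now = datetime(2018, 6, 30)
log = [
    (datetime(2018, 6, 20), 4),
    (datetime(2018, 6, 1), 8),
    (datetime(2018, 3, 1), 30),
]
print(layer_average(log, now, timedelta(days=30)))   # month-like layer
print(layer_average(log, now, timedelta(days=365)))  # year-like layer
```

Passing one user's sessions corresponds to the personal part; passing the sessions of all users corresponds to the global part described above.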

Cite this article

Kompan, M., Kassak, O. & Bielikova, M. The Short-term User Modeling for Predictive Applications. J Data Semant 8, 21–37 (2019). https://doi.org/10.1007/s13740-018-0095-1
