Maximal paths recipe for constructing Web user sessions

World Wide Web

Abstract

This paper introduces a new method for the session construction problem, the first main step of the Web usage mining process. The proposed method defines user sessions as sets of navigation paths in the Web graph and produces the complete set of all possible maximal paths. It can generate navigation paths that previous greedy approaches cannot extract. Experiments on real data show that the new technique outperforms previous approaches in Web usage mining applications such as next-page prediction. Our analysis of Web user sessions also yields an important observation: the navigation graphs in these sessions contain only a small number of nodes at which users branch their navigation into multiple paths.
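To make the notion of maximal paths and branching nodes concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a session has already been reduced to a small acyclic navigation graph given as an adjacency dictionary; the names `maximal_paths`, `branch_nodes`, and the toy pages A to E are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical toy representation: page -> pages the user navigated to from it.
NavGraph = Dict[str, List[str]]


def maximal_paths(graph: NavGraph) -> List[List[str]]:
    """Enumerate every source-to-sink path of an acyclic navigation graph.

    Assumption: the graph has no cycles. A path that starts at a node with
    no incoming edge and ends at a node with no outgoing edge cannot be
    extended at either end, so it is maximal within this graph.
    """
    targets = {t for successors in graph.values() for t in successors}
    nodes = set(graph) | targets
    sources = sorted(n for n in nodes if n not in targets)
    paths: List[List[str]] = []

    def walk(node: str, prefix: List[str]) -> None:
        prefix = prefix + [node]
        successors = graph.get(node, [])
        if not successors:  # sink reached: the path cannot grow further
            paths.append(prefix)
            return
        for nxt in successors:
            walk(nxt, prefix)

    for source in sources:
        walk(source, [])
    return paths


def branch_nodes(graph: NavGraph) -> List[str]:
    """Return the nodes where navigation splits into more than one path."""
    return sorted(n for n, successors in graph.items() if len(successors) > 1)


# The user opens A, follows a link to B and then D, and also goes from A to C and E.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
print(maximal_paths(graph))  # [['A', 'B', 'D'], ['A', 'C', 'E']]
print(branch_nodes(graph))   # ['A']
```

In this toy session a single branching node (A) yields two maximal paths, which mirrors the observation that only a few nodes account for the branching.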

Code Availability

The source code and the datasets are available at the following links:

https://github.com/MBayir/CSRA

https://github.com/MBayir/CSRA-Data

Notes

  1. In this context, the subsequence relation is equivalent to the substring relation.

  2. The ⊔ operation denotes sequence concatenation, which is the same as the string concatenation operator.
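The two notes can be illustrated with a short Python sketch under stated assumptions: paths are lists of page identifiers, the substring relation is read as "occurs contiguously inside", and ⊔ is plain list concatenation. The helper names `is_contiguous_subsequence`, `concat`, and `keep_maximal` are hypothetical and do not come from the paper or its repositories.

```python
from typing import List, Sequence


def is_contiguous_subsequence(short: Sequence[str], long: Sequence[str]) -> bool:
    """Substring-style relation: `short` occurs inside `long` without gaps."""
    n, m = len(short), len(long)
    return any(list(long[i:i + n]) == list(short) for i in range(m - n + 1))


def concat(left: Sequence[str], right: Sequence[str]) -> List[str]:
    """Sequence concatenation, analogous to string concatenation."""
    return list(left) + list(right)


def keep_maximal(paths: Sequence[Sequence[str]]) -> List[List[str]]:
    """Drop duplicates and every path contained in another, longer path."""
    unique = [list(p) for p in dict.fromkeys(tuple(p) for p in paths)]
    return [
        p for p in unique
        if not any(p != q and is_contiguous_subsequence(p, q) for q in unique)
    ]


candidates = [["A", "B"], ["A", "B", "D"], ["B", "D"], ["A", "C"]]
print(concat(["A", "B"], ["D"]))  # ['A', 'B', 'D']
print(keep_maximal(candidates))   # [['A', 'B', 'D'], ['A', 'C']]
```

Here ["A", "B"] and ["B", "D"] are discarded because both occur contiguously inside ["A", "B", "D"], while ["A", "C"] survives because no other candidate contains it.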

Acknowledgements

The authors would like to thank Arzu Bayir for proofreading the paper and reviewing the format of the figures.

Author information

Corresponding author

Correspondence to Murat Ali Bayir.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The work was done while the author was employed at Microsoft. The author is currently with Meta Platforms Inc.

This article belongs to the Topical Collection: Special Issue on Computational Aspects of Network Science

Guest Editors: Apostolos N. Papadopoulos and Richard Chbeir

About this article

Cite this article

Bayir, M.A., Toroslu, I.H. Maximal paths recipe for constructing Web user sessions. World Wide Web 25, 2455–2485 (2022). https://doi.org/10.1007/s11280-022-01024-3

  • DOI: https://doi.org/10.1007/s11280-022-01024-3
