The Curious Case of Session Identification

  • Conference paper
Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12260)

Abstract

Dividing interaction logs into meaningful segments has been a core problem in supporting users in search tasks for over 20 years. Research has brought up many different definitions: from simplistic mechanical sessions to complex search missions spanning multiple days. Having meaningful segments is essential for many tasks depending on context, yet many research projects over the last years still rely on early proposals. This position paper gives a quick overview of session identification development and questions the widespread use of the industry standard.

Author information

Corresponding author

Correspondence to Florian Dietz.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Dietz, F. (2020). The Curious Case of Session Identification. In: Arampatzis, A., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2020. Lecture Notes in Computer Science, vol 12260. Springer, Cham. https://doi.org/10.1007/978-3-030-58219-7_6

  • DOI: https://doi.org/10.1007/978-3-030-58219-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58218-0

  • Online ISBN: 978-3-030-58219-7

  • eBook Packages: Computer Science, Computer Science (R0)
