SAHN with SEP/COP and SPADE, to Build a General Web Navigation Adaptation System Using Server Log Information

Arbelaitz, Olatz; Gurrutxaga, Ibai; Lojo, Aizea; Muguerza, Javier; Perona, Iñigo

doi:10.1007/978-3-642-25274-7_42

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7023)

Abstract

Over the last decades, the amount of information on the web has grown drastically, but larger quantities of data do not by themselves provide added value for web visitors: they need easier access to the required information and adaptation to their preferences or needs. Using machine learning techniques to build user models makes it possible to take their real preferences into account. In this work we present the design of a complete system, based on the collaborative filtering approach, that identifies interesting links for users while they navigate and makes those links easier to reach. Starting from web navigation logs and adding a generalization procedure to the preprocessing step, we use agglomerative hierarchical clustering (SAHN) combined with SEP/COP, a novel methodology for obtaining the best partition from a hierarchy, to group users with similar navigation behavior or interests. We then use SPADE as the sequential pattern discovery technique to obtain the most probable transactions for the users in each group, so that the navigation of future users can be adapted according to those profiles. The experiments show that the designed system performs efficiently on a web-accessible database and is even able to tackle the cold start or 0-day problem.
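The two stages named in the abstract (group sessions, then mine frequent sequences per group) can be illustrated with a small sketch. It is not the authors' implementation, and several choices are assumptions the abstract does not confirm: edit distance as the session dissimilarity, average linkage as the merge criterion, a fixed `n_clusters` in place of SEP/COP's automatic partition selection, and a simple count of ordered page pairs standing in for SPADE. The page labels and the helper names `edit_distance`, `average_linkage`, and `frequent_pairs` are hypothetical.

```python
from itertools import combinations

def edit_distance(a, b):
    """Levenshtein distance between two sequences of page labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute x by y
        prev = curr
    return prev[-1]

def average_linkage(sessions, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    with the smallest mean pairwise distance until n_clusters remain.
    Assumption: n_clusters is fixed by hand; SEP/COP is NOT implemented."""
    clusters = [[i] for i in range(len(sessions))]
    dist = {(i, j): edit_distance(sessions[i], sessions[j])
            for i, j in combinations(range(len(sessions)), 2)}

    def linkage(c1, c2):
        total = sum(dist[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]   # a < b, so index b stays valid
        del clusters[b]
    return clusters

def frequent_pairs(sessions, min_support):
    """Ordered pairs (x visited before y) present in at least a
    min_support fraction of the sessions. A stand-in for SPADE, which
    mines frequent sequences of arbitrary length."""
    counts = {}
    for s in sessions:
        seen = {(s[i], s[j]) for i in range(len(s)) for j in range(i + 1, len(s))}
        for pair in seen:
            counts[pair] = counts.get(pair, 0) + 1
    return {p for p, c in counts.items() if c / len(sessions) >= min_support}

# Hypothetical sessions, already reduced to sequences of page labels.
sessions = [
    ["home", "news", "launch"],
    ["home", "news", "launch", "images"],
    ["home", "faq", "contact"],
    ["home", "faq", "contact", "jobs"],
]
clusters = average_linkage(sessions, n_clusters=2)
profiles = [frequent_pairs([sessions[i] for i in c], min_support=1.0)
            for c in clusters]
```

In a sketch like this, each entry of `profiles` plays the role of a group profile: a new session would be assigned to the closest cluster, and the pairs in that profile would suggest which links to surface next.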

Author information

Authors and Affiliations

Dept. of Computer Architecture and Technology, University of the Basque Country, M. Lardizabal, 1, 20018, Donostia, Spain
Olatz Arbelaitz, Ibai Gurrutxaga, Aizea Lojo, Javier Muguerza & Iñigo Perona

Editor information

Editors and Affiliations

Computer Science School, University of the Basque Country, Pº Manuel de Lardizabal 1, 20018, Donostia-San Sebastian, Spain
Jose A. Lozano
Computing Systems Department, University of Castilla-La Mancha, Campus Universitario s/n, 02071, Albacete, Spain
José A. Gámez
Dep. Statistics, O.R. and Computation, University of La Laguna, 38271, La Laguna, S.C. Tenerife, Spain
José A. Moreno

Cite this paper

Arbelaitz, O., Gurrutxaga, I., Lojo, A., Muguerza, J., Perona, I. (2011). SAHN with SEP/COP and SPADE, to Build a General Web Navigation Adaptation System Using Server Log Information. In: Lozano, J.A., Gámez, J.A., Moreno, J.A. (eds) Advances in Artificial Intelligence. CAEPIA 2011. Lecture Notes in Computer Science, vol 7023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25274-7_42

DOI: https://doi.org/10.1007/978-3-642-25274-7_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25273-0
Online ISBN: 978-3-642-25274-7
eBook Packages: Computer Science, Computer Science (R0)
