Building and Exploiting Ad Hoc Concept Hierarchies for Web Log Analysis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2454)

Abstract

Web usage mining aims to discover interesting usage patterns in Web server log files. “Interestingness” relates to the business goals of the site owner. However, business goals refer to business objects rather than to the page hits and script invocations recorded by the site server. Hence, Web usage analysis requires a preparatory mechanism that incorporates the business goals, the concepts reflecting them, and the expert’s background knowledge about them into the mining process. To this end, we present a methodology and a mechanism for establishing and exploiting application-oriented concept hierarchies in Web usage analysis. We demonstrate our approach on a real data set and show how it can substantially improve both the search for interesting patterns by the mining algorithm and the interpretation of the mining results by the analyst.
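
The idea of lifting raw page hits to business concepts can be illustrated with a minimal sketch. It is not the mechanism of the paper: the hierarchy `PARENT`, the prefix rules in `URL_RULES`, and the functions `concept_of`, `generalize`, and `abstract_session` are all hypothetical names invented here, and the example assumes a simple tree-shaped hierarchy in which each concept has exactly one parent.

```python
# Hypothetical concept hierarchy: each concept maps to its parent (None = root).
PARENT = {
    "Site": None,
    "Catalog": "Site",
    "Checkout": "Site",
    "ProductPage": "Catalog",
    "SearchResult": "Catalog",
    "Payment": "Checkout",
}

# Hypothetical URL-prefix rules assigning raw requests to leaf concepts.
URL_RULES = [
    ("/product/", "ProductPage"),
    ("/search", "SearchResult"),
    ("/pay", "Payment"),
]


def concept_of(url):
    """Return the leaf concept for a URL, or the root "Site" if no rule matches."""
    for prefix, concept in URL_RULES:
        if url.startswith(prefix):
            return concept
    return "Site"


def generalize(concept, level):
    """Return the ancestor of `concept` at depth `level` (root = 0).

    If the concept is shallower than `level`, it is returned unchanged.
    """
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    path.reverse()  # root first
    return path[min(level, len(path) - 1)]


def abstract_session(urls, level):
    """Rewrite a session of raw URLs as a sequence of concepts at one level."""
    return [generalize(concept_of(u), level) for u in urls]


session = ["/search?q=lamp", "/product/17", "/product/42", "/pay/card"]
print(abstract_session(session, level=2))
# ['SearchResult', 'ProductPage', 'ProductPage', 'Payment']
print(abstract_session(session, level=1))
# ['Catalog', 'Catalog', 'Catalog', 'Checkout']
```

Under these assumptions, the same session can be handed to a mining algorithm at whichever abstraction level matches the analyst's question, so that patterns are expressed in terms of concepts rather than individual URLs.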

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pohle, C., Spiliopoulou, M. (2002). Building and Exploiting Ad Hoc Concept Hierarchies for Web Log Analysis. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2002. Lecture Notes in Computer Science, vol 2454. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46145-0_9

  • DOI: https://doi.org/10.1007/3-540-46145-0_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44123-6

  • Online ISBN: 978-3-540-46145-6

  • eBook Packages: Springer Book Archive
