
Web Log Mining and Parallel SQL Based Execution

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1966)

Abstract

We performed association rule mining and sequential pattern mining on the access log accumulated at the NTT Software Mobile Info Search portal site. This paper reports the detailed web log mining process and the rules we derived. Integrating web data with a relational database enables better management of that data, and some researchers have even tried to implement applications such as web mining in SQL. Commercial RDBMSs support parallel execution of SQL, and parallelism is key to improving performance. We show that a commercial RDBMS can achieve substantial speedup for web mining.
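As a rough illustration of what implementing web mining in SQL can look like, the sketch below counts frequently co-occurring page pairs per session with a self-join, which is the support-counting step of association rule mining. It is a minimal sketch under stated assumptions and not the queries used in the paper: the table name `access_log`, its columns, the toy rows, and the function `frequent_pairs` are all hypothetical, and it runs on Python's built-in SQLite rather than a parallel commercial RDBMS.

```python
import sqlite3

# Hypothetical toy access log as (session_id, page) rows; not data from the paper.
LOG = [
    (1, "weather"), (1, "map"), (1, "restaurant"),
    (2, "weather"), (2, "map"),
    (3, "map"), (3, "restaurant"),
    (4, "weather"), (4, "map"),
]


def frequent_pairs(rows, min_support):
    """Return (page_a, page_b, support) for page pairs seen together
    in at least min_support sessions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE access_log (session_id INTEGER, page TEXT)")
    conn.executemany("INSERT INTO access_log VALUES (?, ?)", rows)
    # Self-join on session_id; a.page < b.page keeps each unordered pair once.
    # DISTINCT ensures repeated visits within a session count only once.
    query = """
        SELECT a.page, b.page, COUNT(*) AS support
        FROM (SELECT DISTINCT session_id, page FROM access_log) AS a
        JOIN (SELECT DISTINCT session_id, page FROM access_log) AS b
          ON a.session_id = b.session_id AND a.page < b.page
        GROUP BY a.page, b.page
        HAVING COUNT(*) >= ?
        ORDER BY support DESC, a.page, b.page
    """
    result = conn.execute(query, (min_support,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for page_a, page_b, support in frequent_pairs(LOG, 2):
        print(page_a, page_b, support)
```

Because the counting is expressed as a join followed by a grouped aggregate, it is the kind of query a parallel RDBMS can partition across nodes without changes to the SQL itself.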

IBM Japan Co.,Ltd. 1-1, Nakase, Mihama-ku, Chiba-shi, Chiba 261-8522, Japan

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kitsuregawa, M., Shintani, T., Yoshizawa, T., Pramudiono, I. (2000). Web Log Mining and Parallel SQL Based Execution. In: Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2000. Lecture Notes in Computer Science, vol 1966. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44431-9_2

  • DOI: https://doi.org/10.1007/3-540-44431-9_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41395-0

  • Online ISBN: 978-3-540-44431-2

  • eBook Packages: Springer Book Archive
