
Improving Web Sites with Web Usage Mining, Web Content Mining, and Semantic Analysis

  • Conference paper
SOFSEM 2006: Theory and Practice of Computer Science (SOFSEM 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3831)


Abstract

With the emergence of the World Wide Web, Web sites have become a key communication channel for organizations. In this context, analyzing and improving Web communication is essential to better satisfy the objectives of the target audience. Web communication analysis is traditionally performed by Web analytics software, which produces long lists of audience metrics. These metrics contain little semantics and are too detailed to be exploited by organization managers and chief editors, who need summarized and conceptual information to make decisions. Our solution to obtain such conceptual metrics is to analyze the content of the Web pages output by the Web server. In this paper, we first present a list of methods that we conceived to mine the output Web pages. Then, we explain how term weights in these pages can be used as audience metrics, and how they can be aggregated using OLAP tools to obtain concept-based metrics. Finally, we present the concept-based metrics that we obtained with our prototype WASA and SQL Server OLAP tools.
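The idea of turning term weights into concept-based metrics can be illustrated with a minimal sketch. It is not the WASA prototype and does not use SQL Server OLAP tools: the page texts, view counts, term-to-concept mapping, and function names are all hypothetical, and the weighting (raw term frequency multiplied by page views) is an assumption, since the abstract does not state the exact scheme.

```python
from collections import Counter, defaultdict

# Hypothetical data: text of pages output by the Web server, and views per page.
pages = {
    "/courses/databases": "database olap database sql",
    "/courses/ai": "learning ontology learning",
}
page_views = {"/courses/databases": 3, "/courses/ai": 1}

# Hypothetical two-level hierarchy mapping terms to broader concepts.
term_to_concept = {
    "database": "Data Management",
    "olap": "Data Management",
    "sql": "Data Management",
    "learning": "Artificial Intelligence",
    "ontology": "Artificial Intelligence",
}


def term_metrics(pages, page_views):
    """Sum term frequencies over all pages, counting each page once per view."""
    totals = Counter()
    for url, text in pages.items():
        counts = Counter(text.lower().split())
        for term, tf in counts.items():
            # Assumed weighting: term frequency times number of page views.
            totals[term] += tf * page_views.get(url, 0)
    return totals


def roll_up(term_totals, term_to_concept):
    """Aggregate term-level metrics to concept level, similar to an OLAP roll-up."""
    concept_totals = defaultdict(int)
    for term, weight in term_totals.items():
        concept = term_to_concept.get(term, "Unclassified")
        concept_totals[concept] += weight
    return dict(concept_totals)


terms = term_metrics(pages, page_views)
concepts = roll_up(terms, term_to_concept)
print(concepts)
```

In this sketch the concept totals play the role of the summarized metrics a manager or chief editor would read, while the term totals correspond to the detailed level beneath them.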




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Norguet, JP., Zimányi, E., Steinberger, R. (2006). Improving Web Sites with Web Usage Mining, Web Content Mining, and Semantic Analysis. In: Wiedermann, J., Tel, G., Pokorný, J., Bieliková, M., Štuller, J. (eds) SOFSEM 2006: Theory and Practice of Computer Science. SOFSEM 2006. Lecture Notes in Computer Science, vol 3831. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11611257_41


  • DOI: https://doi.org/10.1007/11611257_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31198-0

  • Online ISBN: 978-3-540-32217-7

  • eBook Packages: Computer Science, Computer Science (R0)
